From greg@cosc.canterbury.ac.nz  Mon Jul  1 02:00:58 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 Jul 2002 13:00:58 +1200 (NZST)
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <01f201c21f69$f7600f10$ced241d5@hagrid>
Message-ID: <200207010100.g6110wJ28436@oma.cosc.canterbury.ac.nz>

Fredrik Lundh <fredrik@pythonware.com>:

> Tim warned me that the mere attempt to read sources for
> existing RE implementations was a sure way to destroy my
> brain, so I avoided that.

Oh, no! Does that mean I should attach a health
warning to the source of Plex???

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From xscottg@yahoo.com  Mon Jul  1 03:02:55 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 30 Jun 2002 19:02:55 -0700 (PDT)
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
In-Reply-To: <006801c22072$00c890a0$88e97ad1@othello>
Message-ID: <20020701020255.17441.qmail@web40110.mail.yahoo.com>

--- Raymond Hettinger <python@rcn.com> wrote:
> RH> > The change was based on the advice I got.
> TP > Wasn't that an empty set?
> 
> Not unless Scott Gilbert is a null:
> 
> SG > > >  "... So the best bet would be to have it just always return a
> string..."
> 

I'm pretty close to a null.  :-)


Besides I don't think my comments to you made the list.   At least I don't
remember CC'ing them to python-dev...  I'd be happy to forward those
messages to the list if there is interest, but I don't think there is <0.5
grin>.

I personally don't have much stake in the buffer object.  It looked like
something that would be useful for several things that I'm interested in,
but when I looked closer I realized it just isn't.  If it's politically the
correct thing to leave it broken, then that gets my unneeded blessing.  It
would be nice if _that_ decision was documented somewhere instead of
everything just getting quiet when the topic is brought up.  Tim has said
before that this is one of those yearly pointless discussions.  I would
have read Guido's essay on the topic if I knew how to find it...

As I've said before though, a mutable byte array object that pickled
efficiently, could be constructed from a pointer & destructor, and promised
not to invalidate your pointer when the GIL is released would be useful. 
And it looks like Guido's long lost essay seems to concur with this in a
few places.  

Asynchronous file I/O, concurrent calculation on numeric arrays, page
aligned memory for DMA transfers, all sorts of other goodies could use
something like this.  Of course the buffer object can't be used for any of
these.  Guido's essay seems to indicate that one of the reasons not to add
something like this is because there is no equivalent in Java, and
therefore Jython.  I don't find that motivating.  Let Jython be portable in
the Java sense of the word, and let Python be powerful everywhere there is
a working C compiler...









From bsder@mail.allcaps.org  Mon Jul  1 03:23:29 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Sun, 30 Jun 2002 19:23:29 -0700 (PDT)
Subject: [Python-Dev] XML module causes profiler to throw
Message-ID: <20020630191302.J5810-100000@mail.allcaps.org>

This was reported in bug 534864, but seems to have been left to rot.
At the very least, I'd like to bump its importance up in case there is a
later bugfix version for 2.2 (aka 2.2.2 or something).

What is its status?  Is there a workaround?  What is the diagnosis?

I find it hard to believe that others haven't tripped across this
(especially somebody in the Zope team).  I can at least add my voice such
that it is not specific to one type of installation (his is Red Hat
7.1--mine is FreeBSD 4.6)

Thanks,
-a




From tim.one@comcast.net  Mon Jul  1 05:04:37 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 00:04:37 -0400
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To: <20020630191302.J5810-100000@mail.allcaps.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDBABAB.tim.one@comcast.net>

[Andrew P. Lentvorski, about <http://www.python.org/sf/534864>]
> This was reported in bug 534864, but seems to have been left to rot.
> At the very least, I'd like to bump its importance up in case there is a
> later bugfix version for 2.2 (aka 2.2.2 or something).
>
> What is its status?  Is there a workaround?  What is the diagnosis?

Everything known about it is in the bug report.

> I find it hard to believe that others haven't tripped across this
> (especially somebody in the Zope team).  I can at least add my voice such
> that it is not specific to one type of installation (his is Red Hat
> 7.1--mine is FreeBSD 4.6)

Posting this info to Python-Dev doesn't do any good.  Add it to the bug
report!  Be sure to say which version of Python you were using (the report
only mentioned 2.2; there's not even any info there about 2.2.1).  The last
activity that report saw was Martin saying he couldn't reproduce it in
2.3a0, and to date nobody has added a comment saying they could reproduce
it.

Based on what's there (an unconfirmed report against 2.2, and a failure to
reproduce under CVS), I wouldn't give it high priority either.




From tim.one@comcast.net  Mon Jul  1 06:45:20 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 01:45:20 -0400
Subject: [Python-Dev] Some dull gc stats
Message-ID: <LNBBLJKPBEHFEDALKOLCCEDFABAB.tim.one@comcast.net>

I checked in a surprisingly large patch to change the way we collect a
generation.

Before:  Things directly reachable from outside the young generation were
moved into a 'reachable' set in one pass.  Things indirectly reachable from
outside the young generation were moved into 'reachable' in a second pass.
The 'young' list then contained the unreachable objects.

After:  Things unreachable from outside the young generation are moved into
an 'unreachable' set in one pass.  The 'young' list contains the reachable
objects when that's done.

The point was that almost everything is reachable in the end, and moving an
object between lists costs six pointer stores (updating prev and next
pointers in the object, and in each of the two lists).  So if most stuff is
doomed to be reachable in the end, better to move the unreachable stuff than
to move the reachable stuff.
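
A minimal sketch of the "after" scheme, as a toy model in Python rather
than the real C code.  The dict-based graph, the deque standing in for
the 'young' list, and every name except move_unreachable are
assumptions made for illustration:

```python
from collections import deque

def move_unreachable(young, refcount, edges):
    """Toy model: split `young` into (reachable, unreachable, stats)."""
    # Start from the true refcount, then subtract every reference that
    # originates inside the young generation.  What remains counts the
    # references coming from outside.
    gc_refs = {obj: refcount[obj] for obj in young}
    for obj in young:
        for target in edges.get(obj, ()):
            if target in gc_refs:
                gc_refs[target] -= 1

    pending = deque(young)          # stands in for the 'young' list
    reachable, unreachable = [], []
    stats = {"scanned": 0, "moved": 0, "movedback": 0}
    while pending:
        obj = pending.popleft()
        stats["scanned"] += 1
        if gc_refs[obj] == 0:
            # No outside reference seen so far: guess "unreachable".
            unreachable.append(obj)
            stats["moved"] += 1
            continue
        reachable.append(obj)
        for target in edges.get(obj, ()):
            if target not in gc_refs:
                continue            # outside the young generation
            if target in unreachable:
                # The guess was wrong: move it back and rescan it later.
                unreachable.remove(target)
                pending.append(target)
                stats["movedback"] += 1
            if gc_refs[target] == 0:
                gc_refs[target] = 1  # now known to be reachable

    return reachable, unreachable, stats

# "a" is referenced once from outside and points at "b";
# "c" and "d" only reference each other (a garbage cycle).
young = ["b", "a", "c", "d"]
refcount = {"a": 1, "b": 1, "c": 1, "d": 1}
edges = {"a": ["b"], "c": ["d"], "d": ["c"]}
reachable, unreachable, stats = move_unreachable(young, refcount, edges)
```

Here "b" is scanned before the object that keeps it alive, so it is
first guessed unreachable and then moved back; only the cycle stays in
the unreachable set.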

This seems to be a nice little win.  If you want to know more, read the
comments.  An instrumented version showed this over a run of the Python test
suite:

scanned   7437363
moved       36854
movedback   34389

where

    scanned
        # of times the loop in move_unreachable() went around == # of
        objects
    moved
        # of times "if (gc->gc.gc_refs == 0)" in move_unreachable()
        triggered == # of objects moved into an unreachable set
    movedback
        # of times "if (gc_refs == GC_TENTATIVELY_UNREACHABLE)"
        in visit_reachable() triggered == the number of times
        move_unreachable() guessed wrong and an object had to be moved
        back into a reachable set

So the change saved about 7e6 object moves here, for 6x as many pointer
stores, and gc is finding very little that's unreachable in the end.
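
The arithmetic behind that estimate can be sketched as follows.  It
assumes the old scheme moved every reachable object exactly once and
treats "scanned" as the object count; both are simplifications, and the
function name is made up for the example:

```python
def estimate_savings(scanned, moved, movedback):
    """Rough estimate of the list moves saved by the new scheme."""
    unreachable = moved - movedback      # objects left in 'unreachable'
    old_moves = scanned - unreachable    # assumed: one move per reachable object
    new_moves = moved + movedback        # each wrong guess costs a second move
    saved = old_moves - new_moves
    return unreachable, saved, 6 * saved  # six pointer stores per move

python_suite = estimate_savings(7437363, 36854, 34389)
zope3_suite = estimate_savings(649444, 56124, 43576)
```

For the Python test suite numbers this gives 2465 unreachable objects
and 7363655 moves saved, which is where "about 7e6" comes from.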

Surprisingly, the worst (for some technical meaning of "worst" I'll leave to
your imagination) stats I've seen came out of running Zope3's test suite:

scanned    649444
moved       56124
movedback   43576

It's surprising for lots of reasons, including how relatively little work gc
is doing in total, and how relatively much was found to be unreachable
(about 12 thousand objects).  The latter is surprising because Zope code has
traditionally tried like heck not to create cycles.

The Python test suite *almost* gave "the best" (most favorable to the
change) stats I've seen.  Only a variant of Kevin Jacobs's little test case
looked better so far:

scanned    12322200
moved           244
movedback       244

It would be nicer if we could drive scanned there down to 0 <wink>.




From martin@v.loewis.de  Mon Jul  1 06:56:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Jul 2002 07:56:22 +0200
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To: <20020630191302.J5810-100000@mail.allcaps.org>
References: <20020630191302.J5810-100000@mail.allcaps.org>
Message-ID: <m3eleoat4p.fsf@mira.informatik.hu-berlin.de>

"Andrew P. Lentvorski" <bsder@mail.allcaps.org> writes:

> What is its status?  Is there a workaround?  What is the diagnosis?

The status is that it is unreproducible. I just tried with Python
2.2. Unless there is some independent verification of the problem, I'm
going to close it as unreproducible.

Regards,
Martin



From martin@v.loewis.de  Mon Jul  1 07:30:38 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Jul 2002 08:30:38 +0200
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEDFABAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEDFABAB.tim.one@comcast.net>
Message-ID: <m3y9cw9cz5.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> I checked in a surprisingly large patch to change the way we collect a
> generation.

Do you think this should be backported to 2.2.2 as well?

Regards,
Martin



From oren-py-d@hishome.net  Mon Jul  1 07:31:20 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 09:31:20 +0300
Subject: [Python-Dev] String interning
Message-ID: <20020701093120.A3499@hishome.net>

I was looking into the string interning code to see how it works. Here are
my observations:

A Python string can be in one of three states:

1. Not interned: s->ob_sinterned == NULL
2. Directly-interned: s->ob_sinterned == s
3. Indirectly interned. ob_sinterned points to another string:
     s->ob_sinterned != s && s->ob_sinterned != NULL

Indirectly interned strings are quite rare.  Creating one requires that a
string already exist in the interned dictionary and that an equal string 
with multiple references be interned.  The reference used for interning 
is replaced with the previously interned string.  The other references 
to the same string will become indirectly interned.
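
The three states can be modelled with a toy class in Python.  This is
only a sketch of the behaviour described above, not the C code: PyStr,
intern_in_place and state are names invented for the example, and
returning the surviving object stands in for replacing the caller's
reference:

```python
class PyStr:
    """Toy stand-in for a string object with an ob_sinterned field."""
    def __init__(self, value):
        self.value = value
        self.ob_sinterned = None      # state 1: not interned

interned = {}                         # toy version of the interned dictionary

def intern_in_place(s):
    """Return the object the caller's reference should point to."""
    if s.ob_sinterned is not None:
        return s.ob_sinterned         # already interned, directly or not
    t = interned.get(s.value)
    if t is not None:
        s.ob_sinterned = t            # state 3: indirectly interned
        return t
    interned[s.value] = s
    s.ob_sinterned = s                # state 2: directly interned
    return s

def state(s):
    if s.ob_sinterned is None:
        return "not interned"
    if s.ob_sinterned is s:
        return "directly interned"
    return "indirectly interned"

a = PyStr("spam")
b = PyStr("spam")                     # equal to a, but a distinct object
c = PyStr("eggs")
first = intern_in_place(a)
second = intern_in_place(b)
```

After this, a is directly interned, b is indirectly interned (its
ob_sinterned points at a), and c is not interned at all.  Any other
reference still holding b sees an indirectly interned string.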

References to the ob_sinterned field are found in stringobject.c and 
dictobject.c and, inexplicably, in Mac/Python/macimport.c.

In stringobject.c most references to ob_sinterned are to initialize it. The
only place that uses it is string_hash:  if ob_sinterned is not NULL it uses 
the hash of the string it points to instead of the current string object. 
If the string is directly interned this is just a longer way of doing the
same thing.  If the string is indirectly interned this is merely redundant 
because the hash of the two strings should be equal (I hope so!). The only 
thing this test could have saved is recalculating the hash from the string 
if the cached hash is zero. This doesn't happen because if the string
is indirectly interned it has been used as a key during interning which
initializes its cached hash.

In dictobject.c the only reference to ob_sinterned is in PyDict_SetItem: if
the string is interned it uses the ob_sinterned pointer as the key instead
of the argument. This could only make a difference if the string is 
indirectly interned. It turns out that this never happens. I couldn't find
one occurrence of SetItem with an indirectly interned string as key in the
regression tests or any other Python code I have tested.  Even if this did
happen it wouldn't cause any problems to ignore this case - the lookup 
would still function correctly.

Summary: As far as I can tell, indirectly interned strings are redundant. 
Without them the ob_sinterned field is effectively a boolean flag. The size 
of all string objects can be reduced by 3 bytes.

Can anyone explain why interning is implemented the way it is?  Can anyone
explain why Mac/Python/macimport.c is messing with ob_sinterned?

	Oren




From martin@v.loewis.de  Mon Jul  1 08:03:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 01 Jul 2002 09:03:22 +0200
Subject: [Python-Dev] String interning
In-Reply-To: <20020701093120.A3499@hishome.net>
References: <20020701093120.A3499@hishome.net>
Message-ID: <m3sn349bgl.fsf@mira.informatik.hu-berlin.de>

Oren Tirosh <oren-py-d@hishome.net> writes:

> In stringobject.c most references to ob_sinterned are to initialize it. The
> only place that uses it is string_hash:  if ob_sinterned is not NULL it uses 
> the hash of the string it points to instead of the current string object. 

This is not true: PyString_InternInPlace has

	if ((t = s->ob_sinterned) != NULL) {

which checks whether the string being interned had been interned
before.

> Summary: As far as I can tell, indirectly interned strings are redundant. 
> Without them the ob_sinterned field is effectively a boolean flag.
> 
> Can anyone explain why interning is implemented the way it is?  Can anyone
> explain why Mac/Python/macimport.c is messing with ob_sinterned?

I'm not sure what meaning you would associate with the boolean
flag. If this is meant to denote "this is an interned string", then

	if ((t = s->ob_sinterned) != NULL) {
		if (t == (PyObject *)s)
			return;

would become

        if (s->ob_isinterned) return;

To see the difference, I added

	if ((t = s->ob_sinterned) != NULL) {
		if (t == (PyObject *)s)
			return;
		fprintf(stderr, "reinterning\n");

If that code prints "reinterning", it can efficiently intern the
argument, but couldn't with your change.

I agree that this is very rare, but in the test suite, it triggers 5
times in test_descr.

> The size of all string objects can be reduced by 3 bytes.

That is not true. Taking a 32-bit architecture, and considering that
each string has 16 bytes minimum storage (without ob_sinterned), and
taking into account the 8-byte clustering of pymalloc, we get

stringsize  current-storage  new-storage  savings
0           24               24           0
1           24               24           0
2           24               24           0
3           24               24           0
4           32               24           8
5           32               24           8
6           32               24           8
7           32               32           0

So the size reduction depends on the actual length of the strings;
it's 3 bytes only on average, assuming a uniform distribution of
string sizes.
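
The table can be reproduced with a few lines of Python.  The constants
are assumptions chosen to match the figures above: a 4-byte pointer for
ob_sinterned, a 1-byte flag replacing it, and one trailing NUL byte per
string:

```python
HEADER = 16    # minimum storage without ob_sinterned (from the text above)
POINTER = 4    # assumed: 32-bit ob_sinterned pointer
FLAG = 1       # assumed: one-byte boolean replacing the pointer
NUL = 1        # assumed: trailing NUL byte after the characters
ALIGN = 8      # pymalloc rounds requests up to a multiple of 8

def round_up(nbytes, align=ALIGN):
    return (nbytes + align - 1) // align * align

def current_storage(length):
    return round_up(HEADER + POINTER + length + NUL)

def new_storage(length):
    return round_up(HEADER + FLAG + length + NUL)

savings = [current_storage(n) - new_storage(n) for n in range(8)]
average = sum(savings) / len(savings)
```

Only lengths 4, 5 and 6 cross an 8-byte boundary, so savings is 8 for
those three rows and 0 for the rest, which averages out to 3 bytes.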

Regards,
Martin



From oren-py-d@hishome.net  Mon Jul  1 09:00:21 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 04:00:21 -0400
Subject: [Python-Dev] String interning
In-Reply-To: <m3sn349bgl.fsf@mira.informatik.hu-berlin.de>
References: <20020701093120.A3499@hishome.net> <m3sn349bgl.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020701080020.GA62710@hishome.net>

On Mon, Jul 01, 2002 at 09:03:22AM +0200, Martin v. Loewis wrote:
> Oren Tirosh <oren-py-d@hishome.net> writes:
> 
> > In stringobject.c most references to ob_sinterned are to initialize it. The
> > only place that uses it is string_hash:  if ob_sinterned is not NULL it uses 
> > the hash of the string it points to instead of the current string object. 
> 
> This is not true: PyString_InternInPlace has

I meant references to ob_sinterned outside the actual implementation of 
PyString_InternInPlace.

> > Summary: As far as I can tell, indirectly interned strings are redundant. 
> > Without them the ob_sinterned field is effectively a boolean flag.
> > 
> > Can anyone explain why interning is implemented the way it is?  Can anyone
> > explain why Mac/Python/macimport.c is messing with ob_sinterned?
> 
> I'm not sure what meaning you would associate with the boolean
> flag. 

"This string is interned. It is equal to another interned string iff they 
are the same object"

...
> If that code prints "reinterning", it can efficiently intern the
> argument, but couldn't with your change.
> 
> I agree that this is very rare, but in the test suite, it triggers 5
> times in test_descr.

test_descr is not exactly typical Python code...  

What bothers me is that of the two places that check if a string is interned 
one is a no-op and the other never happens.

	Oren



From bsder@mail.allcaps.org  Mon Jul  1 09:08:48 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Mon, 1 Jul 2002 01:08:48 -0700 (PDT)
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To: <m3eleoat4p.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020701010104.L6236-100000@mail.allcaps.org>

Thanks, that was the info I needed.  I'll work on trying to create a small
program that people can run.

It is not *completely* unreproducible.  I have found three different
filings about this floating around in various places.  I will collect the
references tomorrow and put them into the main bug report.

-a

On 1 Jul 2002, Martin v. Loewis wrote:

> "Andrew P. Lentvorski" <bsder@mail.allcaps.org> writes:
>
> > What is its status?  Is there a workaround?  What is the diagnosis?
>
> The status is that it is unreproducible. I just tried with Python
> 2.2. Unless there is some independent verification of the problem, I'm
> going to close it as unreproducible.
>
> Regards,
> Martin
>




From mal@lemburg.com  Mon Jul  1 09:21:50 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 01 Jul 2002 10:21:50 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid>
Message-ID: <3D20111E.2090203@lemburg.com>

Fredrik Lundh wrote:
> raymond wrote:
> 
> 
> 
>>As far as I can tell, buffer() is one of the least used or known about
>>Python tools.  What do you guys think about this as a candidate for silent
>>deprecation (moving out of the primary documentation)?
> 
> 
> +1, in theory.
> 
> does anyone have any real-life use cases?  I've never been
> able to use it for anything, and cannot recall ever seeing it
> being used by anyone else...
> 
> (it sure doesn't work for the use cases I thought of when
> first learning about the API...)

-1.

I use it in real-life applications to wrap binary data.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From Oleg Broytmann <phd@phd.pp.ru>  Mon Jul  1 11:39:14 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Mon, 1 Jul 2002 14:39:14 +0400
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIECEABAB.tim.one@comcast.net>; from tim.one@comcast.net on Sun, Jun 30, 2002 at 04:32:35PM -0400
References: <20020701002535.A1510@phd.pp.ru> <LNBBLJKPBEHFEDALKOLCIECEABAB.tim.one@comcast.net>
Message-ID: <20020701143913.A6446@phd.pp.ru>

On Sun, Jun 30, 2002 at 04:32:35PM -0400, Tim Peters wrote:
> [Oleg Broytmann]
> >    I think I can reduce this, but I am afraid the data structure
> > still will be large,
> 
> That doesn't matter.  It's the amount of *code* we don't understand and have
> to learn that matters.  If you could reduce this to a gigabyte of pickle
> input that we only need to feed into pickle, that would be great.

   Ok. From today I have a lot of spare time and a very good, almost free
Internet connection, so I can investigate things. I can post the results of
my investigation to the developers list or to c.l.py, if anyone is
interested.

> >    That what I don't want to do - file a mysterious bug report.
> 
> That's what bug reports are best for!

   Hmm, I thought they were not, as those mysterious bug reports take up
space and time - someone has to read them, at least; but they do not help
in any way.

> Now you've got comments about your
> bug scattered across comp.lang.python and python-dev, and nobody will be
> able to find them again.  Attaching new info to a shared bug report is much
> more effective.

   Ah, I see now. I am strictly attached to email and email archives, and I
have always hated web-based collaboration tools, but you made a good point.
   Still, life is too short to spend it in the SF slooow interface :(

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From jacobs@penguin.theopalgroup.com  Mon Jul  1 11:44:52 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 1 Jul 2002 06:44:52 -0400 (EDT)
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEDFABAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207010620001.20256-100000@penguin.theopalgroup.com>

On Mon, 1 Jul 2002, Tim Peters wrote:
> I checked in a surprisingly large patch to change the way we collect a
> generation.
>[...] 
> The point was that almost everything is reachable in the end, and moving an
> object between lists costs six pointer stores (updating prev and next
> pointers in the object, and in each of the two lists).  So if most stuff is
> doomed to be reachable in the end, better to move the unreachable stuff than
> to move the reachable stuff.
>[...]
> It would be nicer if we could drive scanned there down to 0 <wink>.

This change may be a short-term win if I can get Jeremy's idea working.  It
involves temporarily untracking objects with known external roots, so many
more objects become unreachable.  These include objects stored on the
c-eval stack, local variables in the current frame, and possibly other
select places.

I have no idea if this approach will make enough of a difference to be
worthwhile, but it seems like a worthy experiment.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From jepler@unpythonic.net  Mon Jul  1 12:46:44 2002
From: jepler@unpythonic.net (jepler@unpythonic.net)
Date: Mon, 1 Jul 2002 06:46:44 -0500
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <20020630234820.A1006@phd.pp.ru>
References: <20020630234820.A1006@phd.pp.ru>
Message-ID: <20020701064639.A1003@unpythonic.net>

Is this posted as an SF bug report yet?

I have a small program which can hit the recursion limit in pickle and
cause sig11 in cPickle.  It's a very deep nested tuple in this case.

If the first 'pickle.dump()' call is not commented out, the program dies
with "RuntimeError: maximum recursion depth exceeded".  If the second bit
of code is executed, it dies with segmentation violation.

On my system, redhat 7.2, the stack in the main thread is very large, but
the stack in other threads is very small.  On systems where the main stack
is smaller, just running
    cPickle.dump(x, open("/dev/null", "w"))
should show the problem, no threads needed.

Probably some sort of stack check should be present in cPickle, but there's
nothing much that can be done about data structures that are so deeply
recursive that they fill the stack.  Well, pickle could be rewritten to be
iterative, but that's a tall order.
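
To illustrate what "iterative" would mean, here is a sketch that walks
the same kind of nested tuple with an explicit stack instead of
recursive calls.  It is not pickle and says nothing about how hard the
real rewrite would be; max_depth is a name made up for the example:

```python
def max_depth(obj):
    """Depth of tuple nesting, computed without recursion."""
    deepest = 0
    stack = [(obj, 0)]                # explicit stack replaces the C stack
    while stack:
        item, depth = stack.pop()
        deepest = max(deepest, depth)
        if isinstance(item, tuple):
            for child in item:
                stack.append((child, depth + 1))
    return deepest

x = ()
for i in range(100000):
    x = (x,)

depth = max_depth(x)
```

The explicit stack lives on the heap, so the nesting depth is limited by
available memory rather than by the thread's stack size.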

(traceback from cPickle:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1026 (LWP 1198)]
0x4004912c in __new_sem_post (sem=0x8154358) at semaphore.c:137
137     semaphore.c: No such file or directory.
        in semaphore.c
(gdb) where
#0  0x4004912c in __new_sem_post (sem=0x8154358) at semaphore.c:137
#1  0x08095ec2 in PyThread_release_lock (lock=0x8154358)
    at Python/thread_pthread.h:412
#2  0x08079a4d in PyEval_SaveThread () at Python/ceval.c:329
#3  0x4025b09d in write_file (self=0x80ff3e8, s=0x4025db54 "(", n=1)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:414
#4  0x4025396a in save_tuple (self=0x80ff3e8, args=0x4062a66c)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:1340
#5  0x40254f18 in save (self=0x80ff3e8, args=0x4062a66c, pers_save=0)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:1944
#6  0x402539b1 in save_tuple (self=0x80ff3e8, args=0x4062a68c)
    at /home/jepler/cvs/python/dist/src/Modules/cPickle.c:1350
...)

import pickle, cPickle

x = ()
for i in range(100000):
    x = (x,)


#pickle.dump(x, open("/dev/null", "w"))


import thread, time
thread.start_new_thread(cPickle.dump, (x, open("/dev/null", "w")))
time.sleep(1000)



From Oleg Broytmann <phd@phd.pp.ru>  Mon Jul  1 12:54:31 2002
From: Oleg Broytmann <phd@phd.pp.ru> (Oleg Broytmann)
Date: Mon, 1 Jul 2002 15:54:31 +0400
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <20020701064639.A1003@unpythonic.net>; from jepler@unpythonic.net on Mon, Jul 01, 2002 at 06:46:44AM -0500
References: <20020630234820.A1006@phd.pp.ru> <20020701064639.A1003@unpythonic.net>
Message-ID: <20020701155431.B6446@phd.pp.ru>

On Mon, Jul 01, 2002 at 06:46:44AM -0500, jepler@unpythonic.net wrote:
> Is this posted as an SF bug report yet?

   Not yet.

> I have a small program which can hit the recursion limit in pickle and
> cause sig11 in cPickle.  It's a very deep nested tuple in this case.

   In my case the data structure is more complex, but less deep. It is a
tree of objects (about 3000 objects). I tried to create smaller trees, but
the bug disappeared.
   One interesting thing to note is that when I changed builtin list back
to UserList the problem disappeared. That is, my problem is related to
pickling new classes.
   I narrowed the code to just 180 lines. The problem manifests itself
after loading the initial tree, running the inverse linker, and saving the
data back. The inverse linker runs over all objects in the tree and adds a
link to its parent to every object. So I think the bug is in the pickling
of a data structure with loops.
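
   The shape of the data after the inverse linker has run can be sketched
like this. It is only an illustration, not the 180-line program: plain
dicts stand in for the real objects, so it does not exercise new classes
and does not reproduce the bug, and make_node and link_parents are
made-up names:

```python
import pickle

def make_node(name):
    return {"name": name, "parent": None, "children": []}

def link_parents(node, parent=None):
    """Toy inverse linker: give every node a link back to its parent."""
    node["parent"] = parent
    for child in node["children"]:
        link_parents(child, node)

root = make_node("root")
for i in range(3):
    child = make_node("child%d" % i)
    child["children"].append(make_node("leaf%d" % i))
    root["children"].append(child)
link_parents(root)                     # every child now points back up

restored = pickle.loads(pickle.dumps(root))
```

   For a small tree like this the loops survive the round trip: each
restored child's parent is the restored root itself, not a copy.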

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd@phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.



From pinard@iro.umontreal.ca  Mon Jul  1 13:39:41 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 01 Jul 2002 08:39:41 -0400
Subject: [Python-Dev] Re: Infinie recursion in Pickle
In-Reply-To: <20020701143913.A6446@phd.pp.ru>
References: <20020701002535.A1510@phd.pp.ru>
 <LNBBLJKPBEHFEDALKOLCIECEABAB.tim.one@comcast.net>
 <20020701143913.A6446@phd.pp.ru>
Message-ID: <oqadpbsjua.fsf@titan.progiciels-bpi.ca>

[Oleg Broytmann]

> On Sun, Jun 30, 2002 at 04:32:35PM -0400, Tim Peters wrote:

> > Attaching new info to a shared bug report is much more effective.

You know, it is only effective when it works!  The SF tracker did not work
for me.  I filed a bug, and checked it was correctly saved (through tedious
paging all over).  Then, much later, I received a message from a maintainer
saying that my report was empty.  It surely was not after I filed it.

I guess the SF tracker works only for those using it very often :-).

> Ah, I see now. I am strictly attached to email and email archives, and I
> have always hated web-based collaboration tools, but you made a good point.
> Still, life is too short to spend it in the SF slooow interface :(

Slow, hardly usable, and not even dependable.  Email might be less
black-holish, after all.  Moreover, most people (maintainers included)
know how to read and file an email.  I've a hard time believing people
who tell me that maintainers are unable to sort emails without losing
them, or that I can really sort their own email better than they can.
I usually praise maintainers as intelligent people. :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From fredrik@pythonware.com  Mon Jul  1 14:15:49 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 1 Jul 2002 15:15:49 +0200
Subject: [Python-Dev] Re: Infinie recursion in Pickle
References: <20020701002535.A1510@phd.pp.ru><LNBBLJKPBEHFEDALKOLCIECEABAB.tim.one@comcast.net><20020701143913.A6446@phd.pp.ru> <oqadpbsjua.fsf@titan.progiciels-bpi.ca>
Message-ID: <05a601c22101$741ac2f0$0900a8c0@spiff>

François Pinard wrote:
> Slow, hardly usable, and not even dependable.  Email might be less
> black-holish, after all.  Moreover, most people (maintainers included)
> know how to read and file an email.

might so be, but such archives are not shared.

> I usually praise maintainers as intelligent people. :-)

so why not do as we tell you, and post bug reports on SF?

or better, join the roundup team, and make sure it's good enough
to replace the SF tracker.  (as far as I know, it already is -- but
someone still needs to set it up, write a script that pulls all data
out of the old system, etc).

</F>




From Jack.Jansen@cwi.nl  Mon Jul  1 14:19:15 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 1 Jul 2002 15:19:15 +0200
Subject: [Python-Dev] String interning
In-Reply-To: <20020701093120.A3499@hishome.net>
Message-ID: <241819E8-8CF5-11D6-94DE-0030655234CE@cwi.nl>

On Monday, July 1, 2002, at 08:31 , Oren Tirosh wrote:
>   Can anyone
> explain why Mac/Python/macimport.c is messing with ob_sinterned?

It's all explained in the comment a few lines above where ob_sinterned 
is used:
	/*
	** If we have interning find_module takes care of interning all
	** sys.path components. We then keep a record of all sys.path
	** components for which GetFInfo has failed (usually because the
	** component in question is a folder), and we don't try opening these
	** as resource files again.
	*/

This code gives a considerable speedup for module searches. The reason 
it's mac-specific is that MacPython allows files on sys.path as well as 
directories (and these files are searched for PYC resources).
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -




From oren-py-d@hishome.net  Mon Jul  1 14:57:25 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 16:57:25 +0300
Subject: [Python-Dev] String interning
In-Reply-To: <241819E8-8CF5-11D6-94DE-0030655234CE@cwi.nl>; from Jack.Jansen@cwi.nl on Mon, Jul 01, 2002 at 03:19:15PM +0200
References: <20020701093120.A3499@hishome.net> <241819E8-8CF5-11D6-94DE-0030655234CE@cwi.nl>
Message-ID: <20020701165725.A10327@hishome.net>

On Mon, Jul 01, 2002 at 03:19:15PM +0200, Jack Jansen wrote:
> 
> On Monday, July 1, 2002, at 08:31 , Oren Tirosh wrote:
> >   Can anyone
> > explain why Mac/Python/macimport.c is messing with ob_sinterned?
> 
> It's all explained in the comment a few lines above where ob_sinterned 
> is used:
> 	/*
> 	** If we have interning find_module takes care of interning all
> 	** sys.path components. We then keep a record of all sys.path
> 	** components for which GetFInfo has failed (usually because the
> 	** component in question is a folder), and we don't try opening these
> 	** as resource files again.
> 	*/
> 
> This code gives a considerable speedup for module searches. The reason 
> it's mac-specific is that MacPython allows files on sys.path as well as 
> directories (and these files are searched for PYC resources).

I guess the clean solution would be to add a PyString_CheckInterned macro.

	Oren



From tismer@tismer.com  Mon Jul  1 15:02:22 2002
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 01 Jul 2002 16:02:22 +0200
Subject: [Python-Dev] Ann: Stackless 2.2.1 on PowerPC
Message-ID: <3D2060EE.7090101@tismer.com>

Announcement:

Stackless Python Works on PowerPC.

The PPC support was much simpler to implement than expected.
It was helpful to look into the PPC switch implementation
of the ICON language.
Thanks to Just van Rossum for giving me access to his machine.
Thanks to Armin Rigo for showing me the tricks for x86-unix.

There is still no installer available, this is at alpha
level. In case you want to build your own Stackless,
check out the module stackless from
:pserver:anonymous@tismer.com:/home/cvs

Updated news can be found at http://www.stackless.com/

have fun - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
       whom do you want to sponsor today?   http://www.stackless.com/




From fredrik@pythonware.com  Mon Jul  1 15:57:00 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 1 Jul 2002 16:57:00 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> <3D20111E.2090203@lemburg.com>
Message-ID: <02e001c2210f$922ca2a0$ced241d5@hagrid>

mal wrote:

> > does anyone have any real-life use cases?  I've never been
> > able to use it for anything, and cannot recall ever seeing it
> > being used by anyone else...
 
> I use it in real-life applications to wrap binary data.

can you elaborate?  how do you use it?  could it be replaced
by something simpler, and still work in your application?

would something like this work?

    class buffer(object):
        def __len__(...)
        def __getitem__(...)
        def __getslice__(...)

    class basestring(buffer):
        ...

    class string(basestring):
        ...

    class unicode(basestring):
        ...

</F>




From pinard@iro.umontreal.ca  Mon Jul  1 16:25:14 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 01 Jul 2002 11:25:14 -0400
Subject: [Python-Dev] Re: Infinie recursion in Pickle
In-Reply-To: <05a601c22101$741ac2f0$0900a8c0@spiff>
References: <20020701002535.A1510@phd.pp.ru>
 <LNBBLJKPBEHFEDALKOLCIECEABAB.tim.one@comcast.net>
 <20020701143913.A6446@phd.pp.ru>
 <oqadpbsjua.fsf@titan.progiciels-bpi.ca>
 <05a601c22101$741ac2f0$0900a8c0@spiff>
Message-ID: <oq1yan8o85.fsf@titan.progiciels-bpi.ca>

[Fredrik Lundh]

> > I usually praise maintainers as intelligent people. :-)

> so why not do as we tell you, and post bug reports on SF?

Because this is not an efficient way to proceed, and does not always work.
I do not have much spare time, and would hate seeing it spoiled, fighting
with artificial problems coming from tools doomed to be replaced anyway.

> or better, join the roundup team, and make sure it's good enough
> to replace the SF tracker.

I found the SF tracker so unattractive that I've been tempted to do that
indeed.  Yet, thinking more about it, it is nonsense for me to invest
vast amounts of energy merely to acquire the capability of submitting
numerous little things (as documentation nits, for example).

A long while ago, I witnessed that we had to pay real money to machine
constructors, yearly, for having the right of submitting reports to them
in such a way that we could later use their consequent works.  The free
software movement turned the values around, and refreshingly underlined
that reporting a problem is a contribution from the user to the software
maintainer and indirectly, to the community.  For many years now, we have
been living through a progressive swing-back, in which expenditure of money
has been replaced
by all the stunts and sufferings induced by inadequate communication tools,
like bug trackers.  The price to pay is rather high.  If users' contributions
were really welcome, maintainers would not try to force users into this.

It is much easier and comfortable for me to be a mere user, and let others
pay the price.  However, my principles and education strongly tell me that
when something is given to me (like Python), it is only normal and natural
trying to give something back.  As I contributed many thousands of hours for
other projects, I did my share overall, and my own principles are satisfied.
Enough for me to refuse a high price ticket, in free time and irritation,
before I could offer my work or dedication.  Oh, I may come to like bug
trackers.  But surely, I find extremely distasteful being forced into them.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From gmcm@hypernet.com  Mon Jul  1 16:33:39 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 1 Jul 2002 11:33:39 -0400
Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle)
In-Reply-To: <05a601c22101$741ac2f0$0900a8c0@spiff>
Message-ID: <3D203E13.22484.1789FB93@localhost>

On 1 Jul 2002 at 15:15, Fredrik Lundh wrote:

[complaints about SF bug tracker]

> or better, join the roundup team, and make sure
> it's good enough to replace the SF tracker.  (as far
> as I know, it already is -- but someone still needs
> to set it up, write a script that pulls all data out
> of the old system, etc). 

Someone is setting it up, and has written those
scripts. A working demo (populated with PythonLabs
history) should soon be available.

I have to say that my "good enough" has been
focussed on functionality more than usability.
I've never noticed any particular[1] difficulty
in using the SF tracker to enter or post info on
a bug, so François probably has a different "good
enough" :-).

-- Gordon
http://www.mcmillan-inc.com/

[1] Writing an app as a set of cgi's is a fast way
to write a mediocre GUI. Writing a *good* GUI
in this environment is extremely difficult and ugly,
because you end up with lots of just-barely-portable-
by-any-definition-thereof javascript, and Aahz is
left out in the cold. Oh well, at least no one had
a requirement that it be usable through their
cell phone...



From fredrik@pythonware.com  Mon Jul  1 16:59:07 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 1 Jul 2002 17:59:07 +0200
Subject: [Python-Dev] Re: Bug tracking
References: <3D203E13.22484.1789FB93@localhost>
Message-ID: <03e601c22118$3ddbec70$ced241d5@hagrid>

gordon wrote:

> Someone is setting it up, and has written those
> scripts. A working demo (populated with PythonLabs
> history) should soon be available.

+1 on adding gordon to the standard library, and
+1 on deprecating the whinewhinewhine module.

</F>




From aahz@pythoncraft.com  Mon Jul  1 17:02:58 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 1 Jul 2002 12:02:58 -0400
Subject: [Python-Dev] Re: Bug tracking (was: Re: Infinie recursion in Pickle)
In-Reply-To: <3D203E13.22484.1789FB93@localhost>
References: <05a601c22101$741ac2f0$0900a8c0@spiff> <3D203E13.22484.1789FB93@localhost>
Message-ID: <20020701160258.GA26325@panix.com>

On Mon, Jul 01, 2002, Gordon McMillan wrote:
>
> I have to say that my "good enough" has been focussed on functionality
> more than usability.  I've never noticed any particular[1] difficulty
> in using the SF tracker to enter or post info on a bug, so François
> probably has a different "good enough" :-).
>
> [1] Writing an app as a set of cgi's is a fast way to write a mediocre
> GUI. Writing a *good* GUI in this environment is extremely difficult
> and ugly, because you end up with lots of just-barely-portable-
> by-any-definition-thereof javascript, and Aahz is left out in the
> cold. Oh well, at least no one had a requirement that it be usable
> through their cell phone...

If it's usable in Lynx, it should be usable on a cell phone.

Anyway, I find it difficult to believe that you're having a lot of
trouble writing a good GUI with plain HTML, at least by the standards of
people here -- a clean, accessible, and functional interface *is* a good
GUI.  If you've got an URL for testing, I'd be glad to give feedback
(and I'll even be willing to fire up Konqueror to cross-check).

I'm curious whether you think that Google Groups Advanced Search is a
good GUI.  What about the Google Groups thread view?

Finally, it takes some effort, but it's not *that* hard to use
JavaScript that degrades gracefully when JavaScript isn't available.
For more info, see http://www.rahul.net/aahz/javascript.html
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tim.one@comcast.net  Mon Jul  1 18:14:09 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 13:14:09 -0400
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <20020701064639.A1003@unpythonic.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFCABAB.tim.one@comcast.net>

Bug reports are off-topic on Python-Dev, unless the Python developers need
this list to collaborate on an implementation change.

Please keep bug reports on SourceForge.  Like it or not, putting information
in an SF bug report is the only way your bug has a chance to get addressed.




From tim.one@comcast.net  Mon Jul  1 18:33:20 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 13:33:20 -0400
Subject: [Python-Dev] Re: Infinie recursion in Pickle
In-Reply-To: <oq1yan8o85.fsf@titan.progiciels-bpi.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFDABAB.tim.one@comcast.net>

François, I swear you spend more time complaining about SF than it would
take to just use it.  You're not going to prevail on this, so please save
everyone's time (yours and ours) by skipping the repetition.

> ...
> The free software movement turned the values around, and refreshingly
> underlined that reporting a problem is a contribution from the user to
> the software maintainer and indirectly, to the community.

Problem reports are certainly appreciated.  Problem reports via one-to-one
email works great for a new open source project with users numbering in the
dozens, but it doesn't scale.  Python has hundreds of thousands of users
now, and more reports than the sum total of developers can handle.  Any
project of this size has to change how it works.  Guido held on to his
Guido's-inbox bug reporting system for a year after it totally broke down,
and it took a lot of extra work to recover from the chaos it fell into.
Despite its flaws, the SF-based trackers work at least a thousand times
better than that did in the end.  We can't go back.





From tim.one@comcast.net  Mon Jul  1 18:40:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 13:40:16 -0400
Subject: [Python-Dev] Infinie recursion in Pickle
In-Reply-To: <20020701143913.A6446@phd.pp.ru>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFEABAB.tim.one@comcast.net>

[Oleg Broytmann]
>    Ok. From today I have a lot of spare time and very good almost free
> Internet connection, so I can investigate things.

Great!

> I can post the results of my investigation to the developers list or to
> the c.l.py, if anyone is interested.

It's off-topic on Python-Dev, and it will get ignored on c.l.py (you already
tried that -- what changed since the last time you got ignored <wink>?).
Attach info to a bug report -- that's where it belongs.

> ...
>    Still, life is too short to spend it in the SF slooow interface :(

That's a feature:  it encourages people to add only focused comments of real
value <0.9 wink>.




From pinard@iro.umontreal.ca  Mon Jul  1 19:13:41 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 01 Jul 2002 14:13:41 -0400
Subject: [Python-Dev] Re: Infinie recursion in Pickle
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEFDABAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCGEFDABAB.tim.one@comcast.net>
Message-ID: <oqsn338gfe.fsf@titan.progiciels-bpi.ca>

[Tim Peters]

> François, I swear you spend more time complaining about SF than it would
> take to just use it.

So far, it's kif-kif.  (I'm not sure it is English: I mean that I spent
about the same time complaining as I spent trying to use the beast).
The balance would probably break if I was trying to use SF more often! :-)

> You're not going to prevail on this, so please save everyone's time
> (yours and ours) by skipping the repetition.

I'm not at all trying to prevail, I do not have such needs.  However, it
is worth underlining that there are other ways to communicate, and that
the Python community might disserve itself by asserting there is only one.

> Despite its flaws, the SF-based trackers work at least a thousand times
> better than that did in the end.  We can't go back.

There is only one choice left, then, and that's going forward!  I intend to
give `roundup' an honest try, while understanding it is still in the works.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From mal@lemburg.com  Mon Jul  1 20:59:29 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 01 Jul 2002 21:59:29 +0200
Subject: [Python-Dev] Silent Deprecation Candidate -- buffer()
References: <001f01c21ed9$873f3c00$06ea7ad1@othello> <004c01c21f5c$6dcbf2d0$ced241d5@hagrid> <3D20111E.2090203@lemburg.com> <02e001c2210f$922ca2a0$ced241d5@hagrid>
Message-ID: <3D20B4A1.6000702@lemburg.com>

Fredrik Lundh wrote:
> mal wrote:
> 
> 
>>>does anyone have any real-life use cases?  I've never been
>>>able to use it for anything, and cannot recall ever seeing it
>>>being used by anyone else...
>>
>  
> 
>>I use it in real-life applications to wrap binary data.
> 
> 
> can you elaborate?  how do you use it? 

As I said, I wrap binary data in buffer objects; these can
be memory-mapped files, strings containing binary data or
any other Python object implementing the buffer interface.

IMHO, buffer() is the only way to signify non-string data
while maintaining a string-like interface.

> could it be replaced
> by something simpler, and still work in your application?
> 
> would something like this work?
> 
>     class buffer(object):
>         def __len__(...)
>         def __getitem__(...)
>         def __getslice__(...)

Provided these return buffer objects, yes.

>     class basestring(buffer):
>         ...
> 
>     class string(basestring):
>         ...
> 
>     class unicode(basestring):
>         ...

I don't see the simplification, though ;-)
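
For readers on later Python versions: the use case described above (marking
data as binary while keeping a sequence-like interface whose slices stay
"buffer-ish") is roughly what memoryview offers in Python 3.  The following is
only a sketch of that analogue, not the buffer() API discussed in this thread,
and the sample data is made up.

```python
# Sketch only: Python 3's memoryview as a rough analogue of buffer().
data = bytearray(b"\x00\x01binary payload\xff")

view = memoryview(data)      # wraps the bytes without copying them
chunk = view[2:8]            # slicing returns another memoryview, not bytes

# The view shares memory with the wrapped object, so a change made
# through the bytearray is visible through the slice as well.
data[2] = ord("B")
first_byte = chunk[0]
```

Slices of a memoryview are themselves memoryview objects, which matches the
"provided these return buffer objects" condition stated above.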

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From oren-py-d@hishome.net  Mon Jul  1 21:18:41 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 1 Jul 2002 16:18:41 -0400
Subject: [Python-Dev] Alternative implementation of string interning
Message-ID: <20020701201841.GA52320@hishome.net>

http://python.org/sf/576101 

Interning is done using a flag instead of a pointer (3 bytes less). The 
ob_sinterned pointer was most of the time either NULL or pointing to the 
same object.  Cases where it pointed to another object were rare and the 
code that was checking for this case was not effective.

Interned strings are no longer immortal.  They die when their refcnt
reaches 0 just like any other object.  The reference from the interned dict
will not keep them alive longer than necessary.

Can anyone explain why they were implemented with a pointer in the first
place? Barry?

	Oren
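
A rough Python model of the "mortal" behaviour described above, assuming a
table that holds only weak references.  The names Name and intern_name are
hypothetical, and this is not how the patch works in C; plain str objects
cannot be weakly referenced, hence the subclass.

```python
import gc
import weakref


class Name(str):
    """str subclass; unlike plain str it can be weakly referenced."""


# Maps text to the one canonical Name object, but does not keep that
# object alive on its own.
_table = weakref.WeakValueDictionary()


def intern_name(text):
    """Return the canonical Name for text, creating it if needed."""
    obj = _table.get(text)
    if obj is None:
        obj = Name(text)
        _table[text] = obj
    return obj


a = intern_name("spam")
b = intern_name("sp" + "am")
same_object = a is b          # one canonical object while it is alive

del a, b
gc.collect()                  # not needed on CPython, harmless elsewhere
entries_left = len(_table)    # the entry went away together with the object
```

While any reference to the canonical object exists, every lookup returns that
same object; once the last reference is dropped, the table entry disappears
with it.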



From tim.one@comcast.net  Mon Jul  1 22:12:31 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 17:12:31 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <20020701201841.GA52320@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net>

[Oren Tirosh, on <http://python.org/sf/576101>]
> ...
> Interned strings are no longer immortal.  They die when their refcnt
> reaches 0 just like any other object.

This may be a problem.  Code now can rely on that id(some_interned_string)
stays the same across the life of a run.

> ...
> Can anyone explain why they were implemented with a pointer in the first
> place? Barry?

It will have to be Guido.  He made a plausible case to me once about why the
indirection is there, but it may be an optimization that's no longer
important.  At the time interned strings were introduced, extension modules
had mountains of code of the form:

    /* at module init time, in one or more modules */
    static PyObject *spam_str = PyString_FromString("spam");

    /* in various module routines */
    PyObject_SetAttr(someobject, spam_str, user_supplied_value);

and PyObject_SetAttr() was changed to make spam_str what you called an
"indirectly interned" string by magic.  This was (or at least Guido thought
it was <wink>) an important optimization at the time.

Extension modules written after interned strings were introduced can exploit
interning directly, a la

    /* at module init time, in one or more modules */
    static PyObject *spam_str = PyString_InternFromString("spam");

and the core was reworked to do that too (note that this optimization wasn't
directed at the core -- it could well be that core code never creates an
indirectly interned string).  I don't know how many extension modules still
implicitly rely on indirect interning for a speed boost.  Zope doesn't, and
that's all that really matters <wink>.
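
The effect interning buys can be seen from Python itself.  A small sketch
using sys.intern (the Python 3 spelling of the intern() builtin of that era);
the strings are built at run time so the compiler cannot merge them
beforehand, and the variable names are arbitrary.

```python
import sys

parts = ["sp", "am"]
first = "".join(parts)       # built at run time
second = "".join(parts)      # equal value, built separately

equal_before = first == second

# Interning maps every equal string to one canonical object, so later
# comparisons (for example during attribute lookup) can start with a
# cheap identity check.
canon_first = sys.intern(first)
canon_second = sys.intern(second)
identical_after = canon_first is canon_second
```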




From gmcm@hypernet.com  Mon Jul  1 23:16:30 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 1 Jul 2002 18:16:30 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net>
References: <20020701201841.GA52320@hishome.net>
Message-ID: <3D209C7E.2597.18FACFEA@localhost>

On 1 Jul 2002 at 17:12, Tim Peters wrote:

> ...  I don't know how many
> extension modules still implicitly rely on indirect
> interning for a speed boost.  

I bet most extension authors have been completely
ignorant of it, which makes the answer "most of
them" <wink>.

-- Gordon
http://www.mcmillan-inc.com/




From bsder@mail.allcaps.org  Tue Jul  2 01:20:07 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Mon, 1 Jul 2002 17:20:07 -0700 (PDT)
Subject: [Python-Dev] XML module causes profiler to throw
In-Reply-To: <20020701010104.L6236-100000@mail.allcaps.org>
Message-ID: <20020701170702.H290-200000@mail.allcaps.org>


--0-1139718406-1025569154=:290
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-ID: <20020701171925.N290@mail.allcaps.org>

Okay, I created a tiny, self-sufficient file which demonstrates the
problem without requiring external files or anything else silly like that.
I tried to attach it to SourceForge bug 534864, but I'm apparently too
dumb to figure out how to do it.

-a

Here's the log, the source file is attached:

Python 2.2.1 (#1, May 27 2002, 16:42:22)
[GCC 2.95.3 20010315 (release) [FreeBSD]] on freebsd4
Type "help", "copyright", "credits" or "license" for more information.
>>> import profile
>>> import xmltest
>>> profile.run('xmltest.main()')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.2/profile.py", line 71, in run
    prof = prof.run(statement)
  File "/usr/local/lib/python2.2/profile.py", line 404, in run
    return self.runctx(cmd, dict, dict)
  File "/usr/local/lib/python2.2/profile.py", line 410, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in ?
  File "xmltest.py", line 24, in main
    xml.sax.parseString(testxml, chand)
  File "/usr/local/lib/python2.2/xml/sax/__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "/usr/local/lib/python2.2/xml/sax/expatreader.py", line 90, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local/lib/python2.2/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local/lib/python2.2/xml/sax/expatreader.py", line 143, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/local/lib/python2.2/xml/sax/expatreader.py", line 216, in start_element
    def start_element(self, name, attrs):
  File "/usr/local/lib/python2.2/profile.py", line 214, in trace_dispatch_i
    if self.dispatch[event](self, frame,t):
  File "/usr/local/lib/python2.2/profile.py", line 260, in trace_dispatch_call
    assert rframe.f_back is frame.f_back, ("Bad call", rfn,
AssertionError: ('Bad call', ('/usr/local/lib/python2.2/xml/sax/expatreader.py', 132, 'feed'), <frame object at 0x81a880c>, <frame object at 0x81a840c>, <frame object at 0x81bb60c>, <frame object at 0x81a8a0c>)
>>> xmltest.main()
start element
end element



--0-1139718406-1025569154=:290
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; NAME="xmltest.py"
Content-ID: <20020701171914.A290@mail.allcaps.org>
Content-Description: 
Content-Disposition: ATTACHMENT; FILENAME="xmltest.py"

#!/usr/bin/env python

import xml.sax

testxml = \
"""
<html />
"""

class ContentHandler(xml.sax.ContentHandler):
    """ Handle callbacks from the SAX XML parser. """

    def __init__(self):
        pass

    def startElement(self, name, attrs):
        print "start element"

    def endElement(self, name):
        print "end element"

def main():
    chand = ContentHandler()
    xml.sax.parseString(testxml, chand)

if __name__ == "__main__":
    main()
--0-1139718406-1025569154=:290--



From tim.one@comcast.net  Tue Jul  2 02:23:15 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 21:23:15 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <3D209C7E.2597.18FACFEA@localhost>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHBABAB.tim.one@comcast.net>

[Gordon, on extension modules implicitly relying on indirect interning]
> I bet most extension authors have been completely
> ignorant of it, which makes the answer "most of
> them" <wink>.

Could be!  I don't know how much of a speed boost they get, though.  While
the magical interning is done for PyObject_SetAttr(), it's not done for the
has-to-be-more-frequently-called PyObject_GetAttr(), as people call that
with all sorts of garbage strings.  For some reason interning is done for
PyObject_GetAttrString(), although the caller of that can't profit from
indirect interning (it takes a char*, not a PyObject*).

Like I said, maybe this all makes sense to Guido <0.9 wink>.

at-least-we're-not-fighting-over-what-the-comments-mean-ly y'rs  - tim




From tim.one@comcast.net  Tue Jul  2 02:31:09 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 01 Jul 2002 21:31:09 -0400
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <m3y9cw9cz5.fsf@mira.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHCABAB.tim.one@comcast.net>

[MvL]
> Do you think this should be backported to 2.2.2 as well?

I already backported the part that's arguably "a bugfix" (the part that
would have solved Kevin's severe speed problem, had 2.2.1 not had a
different bug that prevented the dramatic slowdown he saw in CVS Python).

All that remains is new optimizations, and those can't be sold as a bugfix.
Large (in the sense of lines of code) changes to gc really need to go thru
lots of testing too.  The optimizations involve some new algorithms, not
just tweaking the former ones.

So, no.  If it were-- or evolves into --a dramatic speedup, maybe.




From bsder@mail.allcaps.org  Tue Jul  2 03:50:19 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Mon, 1 Jul 2002 19:50:19 -0700 (PDT)
Subject: [Python-Dev] Performance question about math operations
Message-ID: <20020701193609.O547-100000@mail.allcaps.org>

I have a VLSI layout editor written in Python.  At its core, it has to
redraw a lot of polygons.  This requires a lot of coordinate conversion
mathematics.  Essentially the following loop:

#! /usr/bin/env python

def main():
    i = 0

    while i < 1000000:
        i = i + 1

        (1-678)*3.589
        -((1-456)*3.589)

if __name__=="__main__":
    main()

Now, I understand that looping in Python has overhead.  It turns out that
the loop without the math operations takes about .5 seconds.  Fine.
However, each line of math operations adds .75 seconds to the total loop
time for a total run time of about 2 seconds.  This is with -O enabled
(even though it doesn't seem to have any effect).

This same loop in C++ (with classes, indirection, copy construction, etc)
takes about .05 seconds.

That's about a factor of 30 (1.5 / .05) difference even if I cancel out
the loop overhead.  I could handle factor of 2 or 4, but 30 seems a bit
high.

What is eating all that time?  And can I do anything about it?

-a
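
To separate loop overhead from the arithmetic, the two can be timed
independently with the standard timeit module.  This is a hedged sketch for
current Python 3, not a rerun of the measurement above: the function names and
iteration count are arbitrary, and the operands are passed in as variables
because newer compilers fold constant expressions such as (1-678)*3.589 at
compile time.

```python
import timeit

N = 100_000


def empty_loop(n=N):
    """Loop overhead only."""
    i = 0
    while i < n:
        i = i + 1


def loop_with_math(n=N, a=1, b=678, c=456, scale=3.589):
    """Same loop plus the two coordinate-style expressions."""
    i = 0
    x = y = 0.0
    while i < n:
        i = i + 1
        x = (a - b) * scale
        y = -((a - c) * scale)
    return x, y


t_loop = timeit.timeit(empty_loop, number=5)
t_math = timeit.timeit(loop_with_math, number=5)
print("loop only: %.3fs  loop + math: %.3fs" % (t_loop, t_math))
```

The difference between the two timings approximates the cost of the
arithmetic alone; absolute numbers depend on the machine and interpreter
version.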




From aahz@pythoncraft.com  Tue Jul  2 04:17:57 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 1 Jul 2002 23:17:57 -0400
Subject: [Python-Dev] Performance question about math operations
In-Reply-To: <20020701193609.O547-100000@mail.allcaps.org>
References: <20020701193609.O547-100000@mail.allcaps.org>
Message-ID: <20020702031757.GA15825@panix.com>

On Mon, Jul 01, 2002, Andrew P. Lentvorski wrote:
>
> I have a VLSI layout editor written in Python.  At its core, it has to
> redraw a lot of polygons.  This requires a lot of coordinate conversion
> mathematics.  

Please post this question to comp.lang.python; it is not appropriate for
python-dev.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From oren-py-d@hishome.net  Tue Jul  2 06:27:38 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 2 Jul 2002 08:27:38 +0300
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Jul 01, 2002 at 05:12:31PM -0400
References: <20020701201841.GA52320@hishome.net> <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net>
Message-ID: <20020702082738.A26155@hishome.net>

On Mon, Jul 01, 2002 at 05:12:31PM -0400, Tim Peters wrote:
> [Oren Tirosh, on <http://python.org/sf/576101>]
> > ...
> > Interned strings are no longer immortal.  They die when their refcnt
> > reaches 0 just like any other object.
> 
> This may be a problem.  Code now can rely on that id(some_interned_string)
> stays the same across the life of a run.

This requires code that stores the id of an object without keeping a 
reference to the actual object.  It also requires that no other piece of 
Python or C code keep a reference to that object and yet for its identity to 
be somehow still significant.  I find that extremely hard to imagine.

> > Can anyone explain why they were implemented with a pointer in the first
> > place? Barry?
...
> and PyObject_SetAttr() was changed to make spam_str what you called an
> "indirectly interned" string by magic.  This was (or at least Guido thought
> it was <wink>) an important optimization at the time.

I see.  As far as I can tell, it isn't any more.


Now for something a bit more radical:

Why not make interned strings a type?  <type 'istr'> could be an 
un-subclassable subclass of string.  intern would just be an alias for this 
type.  No two istr instances are equal unless they are identical.  I guess 
PyString_CheckExact would need to be changed to accept either String or 
InternedString.

	Oren
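
One way to picture the proposal in Python, purely as an illustration: istr
below is a hypothetical class, not the type from any patch, and it keeps its
instances alive forever in a plain dict.  Python has no direct way to mark a
class as final, so __init_subclass__ is used here to approximate
"un-subclassable".

```python
class istr(str):
    """Interned string: at most one instance exists per distinct value."""

    _instances = {}

    def __new__(cls, value=""):
        text = str(value)
        try:
            return cls._instances[text]
        except KeyError:
            obj = super().__new__(cls, text)
            cls._instances[text] = obj
            return obj

    def __init_subclass__(cls, **kwargs):
        raise TypeError("istr cannot be subclassed")

    # Equal values always map to the same instance, so comparing two
    # istr objects reduces to an identity check.
    def __eq__(self, other):
        if isinstance(other, istr):
            return self is other
        return str.__eq__(self, other)

    __hash__ = str.__hash__


a = istr("spam")
b = istr("".join(["sp", "am"]))
```

Here a and b are the same object, and comparison against an ordinary string
still falls back to the usual character-by-character test.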



From martin@v.loewis.de  Tue Jul  2 08:10:31 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 02 Jul 2002 09:10:31 +0200
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHCABAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEHCABAB.tim.one@comcast.net>
Message-ID: <m3ofdqfvvc.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> All that remains is new optimizations, and those can't be sold as a bugfix.

I understand that it is not a requirement anymore that changes to
Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2
evolves and continues to grow new features, as long as they are
"strictly backwards compatible".

For any user-visible feature, it is normally debatable whether it is
"strictly backwards compatible", since it is, by nature, a change in
observable behaviour.

This specific case is not in that category (i.e. has no
user-observable behaviour change), so I think it qualifies for 2.2 -
provided there is enough trust in its correctness.

> Large (in the sense of lines of code) changes to gc really need to go thru
> lots of testing too.  The optimizations involve some new algorithms, not
> just tweaking the former ones.

I'm concerned that backporting more changes to Python 2.2 will become
difficult in that area, if the GC implementations vary significantly.

Maybe this can be reconsidered when there actually is another change
to backport.

Regards,
Martin




From Jack.Jansen@cwi.nl  Tue Jul  2 10:17:15 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Tue, 2 Jul 2002 11:17:15 +0200
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net>
Message-ID: <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl>

On Monday, July 1, 2002, at 11:12 , Tim Peters wrote:

> [Oren Tirosh, on <http://python.org/sf/576101>]
>> ...
>> Interned strings are no longer immortal.  They die when their refcnt
>> reaches 0 just like any other object.
>
> This may be a problem.  Code now can rely on that 
> id(some_interned_string)
> stays the same across the life of a run.

The macimport code relies on the ids remaining the same. But it is easy 
to fix (just add an incref). I'll also change it to use 
PyString_CheckInterned.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -




From oren-py-d@hishome.net  Tue Jul  2 11:37:48 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 2 Jul 2002 06:37:48 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl>
References: <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net> <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl>
Message-ID: <20020702103748.GA68536@hishome.net>

On Tue, Jul 02, 2002 at 11:17:15AM +0200, Jack Jansen wrote:
> 
> On Monday, July 1, 2002, at 11:12 , Tim Peters wrote:
> 
> >[Oren Tirosh, on <http://python.org/sf/576101>]
> >>...
> >>Interned strings are no longer immortal.  They die when their refcnt
> >>reaches 0 just like any other object.
> >
> >This may be a problem.  Code now can rely on that 
> >id(some_interned_string)
> >stays the same across the life of a run.
> 
> The macimport code relies on the ids remaining the same. But it is easy 
> to fix (just add an incref). I'll also change it to use 
> PyString_CheckInterned.

No, an incref there would leak references.  Nothing needs to be changed.

Any code with correct reference counting will not notice any difference
with this patch.  The only problem that could occur is if Python code uses 
the id function, stores the integer result but doesn't keep an actual 
reference to the string object and no other code does, either. Even this is 
not a problem yet unless the code also expects that if the same string is 
ever interned again it will get the same integer id and breaks if it
doesn't.  I can't believe anyone is stupid enough to do that.  Using the id 
function this way is equivalent to an uncounted reference. 

BTW, my patch already takes care of PyString_CheckInterned in macimport.c

	Oren



From fredrik@pythonware.com  Tue Jul  2 12:18:31 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 2 Jul 2002 13:18:31 +0200
Subject: [Python-Dev] Alternative implementation of string interning
References: <LNBBLJKPBEHFEDALKOLCAEGCABAB.tim.one@comcast.net> <7FD2EA20-8D9C-11D6-A0C5-0030655234CE@cwi.nl> <20020702103748.GA68536@hishome.net>
Message-ID: <012501c221ba$34014f90$0900a8c0@spiff>

Oren Tirosh wrote:

> Even this is not a problem yet unless the code also expects that if the
> same string is ever interned again it will get the same integer id and breaks
> if it doesn't.  I can't believe anyone is stupid enough to do that.

do what?  trust the documentation?

    intern(string)
  Enter string in the table of ``interned'' strings and return the
  interned string - which is string itself or a copy. /.../ Interned
  strings are immortal (never get garbage collected).
</F>




From Jack.Jansen@cwi.nl  Tue Jul  2 14:25:20 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Tue, 2 Jul 2002 15:25:20 +0200
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <20020702103748.GA68536@hishome.net>
Message-ID: <280FD4D5-8DBF-11D6-AE49-0030655234CE@cwi.nl>

On Tuesday, Jul 2, 2002, at 12:37 Europe/Amsterdam, Oren Tirosh wrote:
>> The macimport code relies on the ids remaining the same. But it is easy
>> to fix (just add an incref). I'll also change it to use
>> PyString_CheckInterned.
>
> No, an incref there would leak references.  Nothing needs to be changed.
>
Uhm... I'm confused: macimport stores a pointer to the object if it's
interned (the object in question is one of the strings in sys.path). It
didn't INCREF the object, and that wasn't needed up until now because
interned objects can never go away. However, if they can go away I would
think that storing a pointer would definitely call for an INCREF...




From gward@python.net  Tue Jul  2 14:53:25 2002
From: gward@python.net (Greg Ward)
Date: Tue, 2 Jul 2002 09:53:25 -0400
Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle)
In-Reply-To: <3D203E13.22484.1789FB93@localhost>
References: <05a601c22101$741ac2f0$0900a8c0@spiff> <3D203E13.22484.1789FB93@localhost>
Message-ID: <20020702135325.GA5085@gerg.ca>

On 01 July 2002, Gordon McMillan said:
> [1] Writing an app as a set of cgi's is a fast way
> to write a mediocre GUI.

Rumour has it that there are several fine web application frameworks
available for Python.  I'm partial to Quixote [1] myself, but I've also
heard good things about WebWare.

        Greg

[1] http://www.mems-exchange.org/software/quixote/

-- 
Greg Ward - Unix weenie                                 gward@python.net
http://starship.python.net/~gward/
All programmers are playwrights and all computers are lousy actors.



From oren-py-d@hishome.net  Tue Jul  2 14:57:56 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 2 Jul 2002 09:57:56 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <280FD4D5-8DBF-11D6-AE49-0030655234CE@cwi.nl>
References: <20020702103748.GA68536@hishome.net> <280FD4D5-8DBF-11D6-AE49-0030655234CE@cwi.nl>
Message-ID: <20020702135756.GA95955@hishome.net>

On Tue, Jul 02, 2002 at 03:25:20PM +0200, Jack Jansen wrote:
> 
> On Tuesday, Jul 2, 2002, at 12:37 Europe/Amsterdam, Oren Tirosh wrote:
> >>The macimport code relies on the ids remaining the same. But it is easy
> >>to fix (just add an incref). I'll also change it to use
> >>PyString_CheckInterned.
> >
> >No, an incref there would leak references.  Nothing needs to be changed.
> >
> Uhm... I'm confused: macimport stores a pointer to the object if it's 
> interned (the object in question is one of the strings in sys.path). It 
> didn't INCREF the object, and that wasn't needed up until now because 
> interned objects can never go away. However, if they can go away I would 
> think that storing a pointer would definitely call for an INCREF...

Are you saying that this code is not following reference counting rules
and got away with it only because interned strings are immortal?

I don't see how adding only an incref could be correct - there must be a
corresponding decref somewhere.

	Oren



From jacobs@penguin.theopalgroup.com  Tue Jul  2 15:12:30 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 2 Jul 2002 10:12:30 -0400 (EDT)
Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle)
In-Reply-To: <20020702135325.GA5085@gerg.ca>
Message-ID: <Pine.LNX.4.44.0207021001340.31716-100000@penguin.theopalgroup.com>

On Tue, 2 Jul 2002, Greg Ward wrote:
> On 01 July 2002, Gordon McMillan said:
> > [1] Writing an app as a set of cgi's is a fast way
> > to write a mediocre GUI.
> 
> Rumour has it that there are several fine web application frameworks
> available for Python.  I'm partial to Quixote [1] myself, but I've also
> heard good things about WebWare.

I think the point was that web-based GUIs tend to be rather mediocre,
regardless of which toolkit is used.  To some degree I have to agree -- you
typically end up with a very clunky GUI with lots of high latency hits to a
server for updates, or a very complex frontend implemented with a large and
difficult to maintain body of Javascript.

Some progress has been made to improve the situation, although the
state-of-the-art is far from ideal.

We can take this discussion off python-dev if anyone wants to know more
about my thoughts on this matter.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From Jack.Jansen@cwi.nl  Tue Jul  2 15:28:49 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Tue, 2 Jul 2002 16:28:49 +0200
Subject: [Python-Dev] HUGE_VAL and INFINITY
Message-ID: <0651D5E2-8DC8-11D6-9F20-0030655234CE@cwi.nl>

I think there is a problem with the way pyport.h treats HUGE_VAL and 
INFINITY. But as this whole area is a great can of worms I'd like 
someone with more knowledge of C standards and floating point and such 
to ponder it, please.

If both INFINITY and HUGE_VAL are defined then INFINITY takes 
precedence. However, all references I've seen to INFINITY seem to 
indicate that this is a float value, not a double value, according to 
the C99 standard. And I've now come across a platform where 
HUGE_VAL==1e500 and INFINITY==HUGE_VALF==1e50, and these latter values 
are not infinite for doubles (I assume they are infinite for floats, but 
I haven't checked).

I have a patch that will fix this problem for my specific case, but I 
have the feeling that it may be the pyport.h logic that is at fault 
here. If no-one jumps in I'll commit my fix in a few days time.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -




From Jack.Jansen@cwi.nl  Tue Jul  2 15:37:42 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Tue, 2 Jul 2002 16:37:42 +0200
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <20020702135756.GA95955@hishome.net>
Message-ID: <4401F110-8DC9-11D6-9F20-0030655234CE@cwi.nl>

On Tuesday, July 2, 2002, at 03:57 , Oren Tirosh wrote:
>> Uhm... I'm confused: macimport stores a pointer to the object if it's
>> interned (the object in question is one of the strings in sys.path). It
>> didn't INCREF the object, and that wasn't needed up until now because
>> interned objects can never go away. However, if they can go away I would
>> think that storing a pointer would definitely call for an INCREF...
>
> Are you saying that this code is not following reference counting rules
> and got away with it only because interned strings are immortal?

I'm afraid so. Or, actually, "afraid so" sounds too apologetic:-):
interned strings were specifically defined to be immortal.

> I don't see how adding only an incref could be correct - there must be a
> corresponding decref somewhere.

No, there isn't, because this list of pointers is never cleared. Which
was never needed, because they were borrowed references.

Again, it isn't rocket science to fix this: _PyImport_Fini() will need
to call out to a new routine _PyMacImport_Fini() that DECREFs the stored
pointers.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -




From marklists@mceahern.com  Tue Jul  2 15:39:16 2002
From: marklists@mceahern.com (Mark McEahern)
Date: Tue, 2 Jul 2002 09:39:16 -0500
Subject: Bug tracking (was: [Python-Dev] Re: Infinie recursion in Pickle)
In-Reply-To: <Pine.LNX.4.44.0207021001340.31716-100000@penguin.theopalgroup.com>
Message-ID: <JHEOKEOOLIGLDHCMAHMOCEIGCPAA.marklists@mceahern.com>

[Kevin Jacobs]
> I think the point was that web-based GUIs tend to be rather mediocre,
> regardless of which toolkit is used.  To some degree I have to agree -- you
> typically end up with a very clunky GUI with lots of high latency hits to a
> server for updates, or a very complex frontend implemented with a large and
> difficult to maintain body of Javascript.
>
> Some progress has been made to improve the situation, although the
> state-of-the-art is far from ideal.
>
> We can take this discussion off python-dev if anyone wants to know more
> about my thoughts on this matter.

I'd be interested to hear more.  There's a current discussion on
comp.lang.python where you may want to post your thoughts:

Here are two different pointers to the beginning of today's installments:


http://groups.google.com/groups?selm=3d21259f%240%2428006%24afc38c87%40news.optusnet.com.au

  http://mail.python.org/pipermail/python-list/2002-July/111256.html

Cheers,

// mark

-




From tim.one@comcast.net  Tue Jul  2 16:28:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 02 Jul 2002 11:28:08 -0400
Subject: [Python-Dev] HUGE_VAL and INFINITY
In-Reply-To: <0651D5E2-8DC8-11D6-9F20-0030655234CE@cwi.nl>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAENADFAA.tim.one@comcast.net>

[Jack Jansen]
> I think there is a problem with the way pyport.h treats HUGE_VAL and
> INFINITY.

I'm afraid there are necessarily problems there, since this stuff is
insufficiently standardized.

> But as this whole area is a great can of worms I'd like someone with
> more knowledge of C standards and floating point and such to ponder it,
> please.

Let's look at the code:

"""
/* According to
 * http://www.cray.com/swpubs/manuals/SN-2194_2.0/html-SN-2194_2.0/x3138.htm
 * on some Cray systems HUGE_VAL is incorrectly (according to the C std)
 * defined to be the largest positive finite rather than infinity.  We
 * need the std-conforming infinity meaning (provided the platform has
 * one!).
 *
 * Then, according to a bug report on SourceForge, defining Py_HUGE_VAL as
 * INFINITY caused internal compiler errors under BeOS using some version
 * of gcc.  Explicitly casting INFINITY to double made that problem go
 * away.
 */
#ifdef INFINITY
#define Py_HUGE_VAL ((double)INFINITY)
#else
#define Py_HUGE_VAL HUGE_VAL
#endif
"""

> If both INFINITY and HUGE_VAL are defined then INFINITY takes
> precedence.

Right, and the comment explains why (a broken Cray system).

> However, all references I've seen to INFINITY seem to indicate that
> this is a float value, not a double value, according to the C99 standard.

It is a float value, but is explicitly cast to double in the above.

> And I've now come across a platform where HUGE_VAL==1e500 and
> INFINITY==HUGE_VALF==1e50, and these latter values are not infinite for
> doubles (I assume they are infinite for floats, but I haven't checked).

The platform's header files are braindead.  That doesn't mean we shouldn't
try to survive despite them, but you should file a bug report with whoever
supplies this C.  If (double)INFINITY isn't a double-precision infinity,
their definition of INFINITY is hosed (the C89 std doesn't say anything
useful about this, it's a matter of respecting the spirit of IEEE-754 and
that C didn't bother to define a double-precision version of the INFINITY
macro -- that means a *useful* float INFINITY has to be defined in such a
way that it can do double-duty).

> I have a patch that will fix this problem for my specific case, but I
> have the feeling that it may be the pyport.h logic that is at fault
> here. If no-one jumps in I'll commit my fix in a few days time.

Don't check in a change here without review.  Why are you keeping "the fix"
secret?  At this point, I'd be happy to drop the hack-around for the broken
Cray, and reduce the whole mess to:

#ifndef Py_HUGE_VAL
#define Py_HUGE_VAL HUGE_VAL
#endif

Then someone on a broken box can #define their own Py_HUGE_VAL in their own
stinkin' config file.




From oren-py-d@hishome.net  Tue Jul  2 18:55:07 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 2 Jul 2002 20:55:07 +0300
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <4401F110-8DC9-11D6-9F20-0030655234CE@cwi.nl>; from Jack.Jansen@cwi.nl on Tue, Jul 02, 2002 at 04:37:42PM +0200
References: <20020702135756.GA95955@hishome.net> <4401F110-8DC9-11D6-9F20-0030655234CE@cwi.nl>
Message-ID: <20020702205507.A32734@hishome.net>

On Tue, Jul 02, 2002 at 04:37:42PM +0200, Jack Jansen wrote:
> 
> On Tuesday, July 2, 2002, at 03:57 , Oren Tirosh wrote:
> >> Uhm... I'm confused: macimport stores a pointer to the object if it's
> >> interned (the object in question is one of the strings in sys.path). It
> >> didn't INCREF the object, and that wasn't needed up until now because
> >> interned objects can never go away. However, if they can go away I would
> >> think that storing a pointer would definitely call for an INCREF...
> >
> > Are you saying that this code is not following reference counting rules
> > and got away with it only because interned strings are immortal?
> 
> I'm afraid so. Or, actually, "afraid so" sounds too apologetic:-):
> interned strings were specifically defined to be immortal.

I know it says so in the doc, but I always tended to look at it as an 
implementation limitation rather than a feature...

	Oren



From David Abrahams" <david.abrahams@rcn.com  Tue Jul  2 19:09:42 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 2 Jul 2002 14:09:42 -0400
Subject: [Python-Dev] weakref (or doc) bug?
Message-ID: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com>

>>> help(weakref.ref)
Help on built-in function ref:

ref(...)
    new(object[, callback]) -- create a weak reference to 'object';
    when 'object' is finalized, 'callback' will be called and passed
    a reference to 'object'.
    ^^^^^^^^^^^^^^^^^^^^^^^

This appears to be a lie, or at least misleadingly phrased, in Python
2.2.1:

>>> class Z: pass
...
>>> def dying(x): print x, 'is dying'
...
>>> z = Z()
>>> r = weakref.ref(z, dying)
>>> z = 1
<weakref at 9826f8; dead> is dying

It appears that it's a reference to the weakref object that's passed, not
the dying object itself.

What's the intention?

TIA,
Dave

+---------------------------------------------------------------+
                  David Abrahams
      C++ Booster (http://www.boost.org)               O__  ==
      Pythonista (http://www.python.org)              c/ /'_ ==
  resume: http://users.rcn.com/abrahams/resume.html  (*) \(*) ==
          email: david.abrahams@rcn.com
+---------------------------------------------------------------+




From tim@zope.com  Tue Jul  2 20:06:03 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 2 Jul 2002 15:06:03 -0400
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <m3ofdqfvvc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>

[martin@v.loewis.de]
> I understand that it is not a requirement anymore that changes to
> Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2
> evolves and continues to grow new features, as long as they are
> "strictly backwards compatible".

Alex made a case here for "new features", but the Python Business Forum
hasn't shown interest in that.  Like most businessfolk, I expect they'll
ignore such issues until someone discovers that the lack of a new feature is
putting them out of business <0.8 wink>.

> For any user-visible feature, it is normally debatable whether it is
> "strictly backwards compatible", since it is, by nature, a change in
> observable behaviour.
>
> This specific case is not in that category (i.e. has no
> user-observable behaviour change), so I think it qualifies for 2.2 -
> provided there is enough trust in its correctness.

The "bugfix part" of these changes certainly had user-visible aspects, in
that before it was possible for objects in older generations to get yanked
back into younger generations.  This can affect when objects get collected,
and so throw off over-tuned programs slinging gc.enable() and disable() "at
exactly the best time(s)".

> ...
> I'm concerned that backporting more changes to Python 2.2 will become
> difficult in that area, if the GC implementations vary significantly.

Maintaining multiple branches is always a PITA.

> Maybe this can be reconsidered when there actually is another change
> to backport.

Anyone who is so inclined is welcome to reconsider it non-stop <wink>.




From tim.one@comcast.net  Tue Jul  2 20:31:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 02 Jul 2002 15:31:03 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <20020702082738.A26155@hishome.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCENPDFAA.tim.one@comcast.net>

[Tim]
> This may be a problem.  Code now can rely on that
> id(some_interned_string) stays the same across the life of a run.

[Oren Tirosh]
> This requires code that stores the id of an object without keeping a
> reference to the actual object.  It also requires that no other piece of
> Python or C code keep a reference to that object and yet for its
> identity to be somehow still significant.  I find that extremely hard
> to imagine.

I would have guessed you had a more vivid imagination <wink>.  It's
precisely because the id has been guaranteed that a program may not care to
save a reference to an interned string.  For example,

"""
_ids = map(id, map(intern, "if then elif else".split()))
TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5)
id2token = dict(zip(_ids, range(4)))
del _ids

def tokenvector(s):
    return [id2token.get(id(intern(word)), TOKEN_NAME)
            for word in s.split()]

print tokenvector("if this is the example, then what's the question?")
"""

This works reliably today to classify tokens.  I'm not certain I'd care if
it broke, but we have to consider that it hasn't been difficult to write
code that would break.
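
For comparison, a sketch of the same classifier written so that it does not
depend on immortality, assuming Python 3 (sys.intern instead of the built-in
intern).  The name _keywords is an assumption for illustration.  The only
real change is that the interned keyword strings are kept alive in a list,
so their ids stay valid whether or not interned strings can die:

```python
import sys

# Keep the interned keyword strings alive for the life of the module,
# so their ids stay valid even if interned strings can be collected.
_keywords = [sys.intern(w) for w in "if then elif else".split()]
TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5)
id2token = {id(w): tok for tok, w in enumerate(_keywords)}

def tokenvector(s):
    # Hold the interned words in a list while their ids are looked up.
    words = [sys.intern(word) for word in s.split()]
    return [id2token.get(id(w), TOKEN_NAME) for w in words]

result = tokenvector("if this is the example, then what's the question?")
```

Here "if" and "then" map to TOKEN_IF and TOKEN_THEN and every other word
falls through to TOKEN_NAME.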

>> This was (or at least Guido thought it was <wink>) an important
>> optimization at the time.

> I see.  As far as I can tell, it isn't any more.

Which extension modules have you investigated?  The claim is too vague to
carry weight.  Zope's C code uses the interned-string C API directly, so it
doesn't matter to Zope code.  That's all I've looked at.  Making a case that
the optimization is no longer important requires investigating code.

> Now for something a bit more radical:
>
> Why not make interned strings a type?  <type 'istr'> could be an
> un-subclassable subclass of string.  intern would just be an
> alias for this type.  No two istr instances are equal unless they are
> identical.  I guess PyString_CheckExact would need to be changed to
> accept either String or InternedString.

What would the point be?  That is, instead of "why not?", why?  As to "why
not?", there's something about elevating what's basically an optimization
hack to a type that makes me squirm.




From niemeyer@conectiva.com  Tue Jul  2 20:54:00 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 2 Jul 2002 16:54:00 -0300
Subject: [Python-Dev] weakref (or doc) bug?
In-Reply-To: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com>
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com>
Message-ID: <20020702165400.B25194@ibook.distro.conectiva>

> >>> help(weakref.ref)
> Help on built-in function ref:
> 
> ref(...)
>     new(object[, callback]) -- create a weak reference to 'object';
>     when 'object' is finalized, 'callback' will be called and passed
>     a reference to 'object'.
>     ^^^^^^^^^^^^^^^^^^^^^^^
[...]
> It appears that it's a reference to the weakref object that's passed, not
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I'm not sure if that's what was intended, but the documentation seems 
compliant with the current behavior.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From David Abrahams" <david.abrahams@rcn.com  Tue Jul  2 21:01:15 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 2 Jul 2002 16:01:15 -0400
Subject: [Python-Dev] weakref (or doc) bug?
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com> <20020702165400.B25194@ibook.distro.conectiva>
Message-ID: <155201c22203$39f132f0$6501a8c0@boostconsulting.com>

From: "Gustavo Niemeyer" <niemeyer@conectiva.com>


> > >>> help(weakref.ref)
> > Help on built-in function ref:
> >
> > ref(...)
> >     new(object[, callback]) -- create a weak reference to 'object';
> >     when 'object' is finalized, 'callback' will be called and passed
> >     a reference to 'object'.
> >     ^^^^^^^^^^^^^^^^^^^^^^^
> [...]
> > It appears that it's a reference to the weakref object that's passed, not
>                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I'm not sure if that's what was intended, but the documentation seems
> compliant with the current behavior.

Only if you take the unqualified term "reference" to mean a weakref.ref
object. I read it as being a regular reference. You might think I have
Java-on-the-brain, but I've never programmed a line of that foul black
sludge in my life. I'm sure other people will read it the way I do.

-Dave





From niemeyer@conectiva.com  Tue Jul  2 21:14:17 2002
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Tue, 2 Jul 2002 17:14:17 -0300
Subject: [Python-Dev] weakref (or doc) bug?
In-Reply-To: <155201c22203$39f132f0$6501a8c0@boostconsulting.com>
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com> <20020702165400.B25194@ibook.distro.conectiva> <155201c22203$39f132f0$6501a8c0@boostconsulting.com>
Message-ID: <20020702171416.A25592@ibook.distro.conectiva>

> Only if you take the unqualified term "reference" to mean a weakref.ref
> object. I read it as being a regular reference. You might think I have
> Java-on-the-brain, but I've never programmed a line of that foul black
> sludge in my life. I'm sure other people will read it the way I do.

Maybe the documentation should be clarified then. Giving a usage example
for the callback would help as well.
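
A usage sketch along those lines, assuming Python 3 syntax; the names Z,
dying and calls are only for illustration.  It shows that the callback
receives the weakref object itself and that the referent is already gone
by the time the callback runs:

```python
import gc
import weakref

class Z:
    pass

calls = []

def dying(ref):
    # The callback is passed the weakref object, not the dying object.
    # The referent is already gone here, so calling the weakref gives None.
    calls.append((ref, ref()))

z = Z()
r = weakref.ref(z, dying)
del z          # drop the last strong reference
gc.collect()   # not needed under CPython's refcounting, harmless elsewhere
```

After this runs, calls holds one entry whose first item is r and whose
second item is None.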

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]



From mal@lemburg.com  Tue Jul  2 21:18:14 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 02 Jul 2002 22:18:14 +0200
Subject: [Python-Dev] Some dull gc stats
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
Message-ID: <3D220A86.5070003@lemburg.com>

Tim Peters wrote:
> [martin@v.loewis.de]
> 
>>I understand that it is not a requirement anymore that changes to
>>Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2
>>evolves and continues to grow new features, as long as they are
>>"strictly backwards compatible".
> 
> 
> Alex made a case here for "new features", but the Python Business Forum
> hasn't shown interest in that.  Like most businessfolk, I expect they'll
> ignore such issues until someone discovers that the lack of a new feature is
> putting them out of business <0.8 wink>.

Patch level releases should *never* include new features (unless
these are essential to fix a serious bug or a simple byproduct
of a fix). I don't know where you got the impression that Python
should move back to the 1.5 branch development process where patch
levels added new features.

W/r to the PBF: at EuroPython we did a poll to see which version
to base the PBF's activities on. The result was that a majority
voted for Python 2.2 as first target.

Patch levels are there to stabilize a release, not make it
more powerful.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From Jack.Jansen@oratrix.com  Tue Jul  2 21:32:33 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Tue, 2 Jul 2002 22:32:33 +0200
Subject: [Python-Dev] HUGE_VAL and INFINITY
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAENADFAA.tim.one@comcast.net>
Message-ID: <D6E0E784-8DFA-11D6-A45D-003065517236@oratrix.com>

On dinsdag, juli 2, 2002, at 05:28 , Tim Peters wrote:

> [Jack Jansen]
>> I think there is a problem with the way pyport.h treats HUGE_VAL and
>> INFINITY.
>
> I'm afraid there are necessarily problems there, since this stuff is
> insufficiently standardized.

I found a couple of references to INFINITY being a float. Here's 
one (no idea as to its status, though, that's why I asked for
help of a standards guru): 
http://www.opengroup.org/onlinepubs/007904975/basedefs/math.h.html
(googling for "INFINITY math.h" will find many more).

> #define Py_HUGE_VAL ((double)INFINITY)

Is the intention of this define that it would first convert the 
constant "1e50" to an IEEE float "Infinity", and that this float 
would then be promoted to a double "Infinity"? If it is indeed 
stated somewhere in the C standard that this is the course of 
action to take then the compiler is wrong, because what it 
actually seems to be doing is parsing the "1e50" as a double 
because of the cast (speculating here, but this is consistent 
with the results).
If you happen to have a reference then I can post a bug report.

>> I have a patch that will fix this problem for my specific case, but I
>> have the feeling that it may be the pyport.h logic that is at fault
>> here. If no-one jumps in I'll commit my fix in a few days time.
>
> Don't check in a change here without review.

The patch is a simple
#ifdef __APPLE__
#undef INFINITY
#endif

I'll post a sourceforge bug tomorrow and assign it to you. Feel 
free to completely ignore it and do the config magic to handle 
the Cray case specially, though.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -




From David Abrahams" <david.abrahams@rcn.com  Tue Jul  2 22:03:19 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 2 Jul 2002 17:03:19 -0400
Subject: [Python-Dev] weakref (or doc) bug?
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com> <20020702165400.B25194@ibook.distro.conectiva> <155201c22203$39f132f0$6501a8c0@boostconsulting.com>
Message-ID: <159b01c2220c$5b7b62c0$6501a8c0@boostconsulting.com>

From: "David Abrahams" <david.abrahams@rcn.com>
> Java-on-the-brain, but I've never programmed a line of that foul black
> sludge in my life.

Sorry everyone, that remark was in poor taste.
-Dave




From tim@zope.com  Tue Jul  2 22:13:08 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 2 Jul 2002 17:13:08 -0400
Subject: [Python-Dev] HUGE_VAL and INFINITY
In-Reply-To: <D6E0E784-8DFA-11D6-A45D-003065517236@oratrix.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEOJDFAA.tim@zope.com>

[Jack Jansen]
> I found a couple of references to INFINITY being a float. ...

Yes, INFINITY must expand to a constant expression of type float, although
your compiler isn't doing that (see below).  The header files you're using
are still braindead for the reasons I explained last time regardless.

>> #define Py_HUGE_VAL ((double)INFINITY)

> Is the intention of this define that it would first convert the
> constant "1e50" to an IEEE float "Infinity", and that this float
> would then be promoted to a double "Infinity"?

No.  As the comments before this code said, the explicit cast to double was
for the benefit of some other broken compiler.

The literal "1e50" has type double in C, so if they're really #define'ing
INFINITY as 1e50 then they're violating that INFINITY must expand to an
expression of type float.  They could have made it a float literal by
appending "f" or "F", but then it wouldn't be a legal float literal.
They're screwed either way -- they're doing this part incorrectly no matter
how you cut it.  They can look at any other compiler for a correct way to do
it <wink>.

> If it is indeed stated somewhere in the C standard that this is the
> course of action to take then the compiler is wrong,

The compiler is wrong, but for other reasons.

> because what it actually seems to be doing is parsing the "1e50" as a
> double because of the cast (speculating here, but this is consistent
> with the results).

1e50 is a double with or without the cast.

> The patch is a simple
> #ifdef __APPLE__
> #undef INFINITY
> #endif

Bleech.  I'm going to remove all this crap.  If some Crays still have broken
HUGE_VAL definitions, tough -- someone on a Cray can fix it.  Putting this
junk in the core just ensures it will always stay broken.




From barry@zope.com  Tue Jul  2 22:36:49 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 2 Jul 2002 17:36:49 -0400
Subject: [Python-Dev] weakref (or doc) bug?
References: <14c401c221f4$51c97810$6501a8c0@boostconsulting.com>
 <20020702165400.B25194@ibook.distro.conectiva>
 <155201c22203$39f132f0$6501a8c0@boostconsulting.com>
 <159b01c2220c$5b7b62c0$6501a8c0@boostconsulting.com>
Message-ID: <15650.7409.971862.910898@anthem.wooz.org>

>>>>> "DA" == David Abrahams <david.abrahams@rcn.com> writes:

    >> Java-on-the-brain, but I've never programmed a line of that
    >> foul black sludge in my life.

    | Sorry everyone, that remark was in poor taste.
-----------------------------------------^^^^^^^^^^

Oh, now I get it!
-Barry



From tim@zope.com  Wed Jul  3 00:06:06 2002
From: tim@zope.com (Tim Peters)
Date: Tue, 2 Jul 2002 19:06:06 -0400
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <3D220A86.5070003@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEPCDFAA.tim@zope.com>

[MaL, replying to me, but presumably bonding with Martin again <wink>]
> Patch level releases should *never* include new features (unless
> these are essential to fix a serious bug or a simple byproduct
> of a fix). I don't know where you got the impression that Python
> should move back to the 1.5 branch development process where patch
> levels added new features.

The pre-PBF Patch Czars generally took a hard "no new features!" stance, but
it seems to be up in the air now.

> W/r to the PBF: at EuroPython we did a poll to see which version
> to base the PBF's activities on. The result was that a majority
> voted for Python 2.2 as first target.

Cool!  Good choice.

> Patch levels are there to stabilize a release, not make it
> more powerful.

This is one popular view, although there's plenty of wiggle room in what
"stabilize" means (e.g., is it "stabilizing" to port Python to a new
platform?  to speed a bottleneck?  to add a new encoding?  etc).




From lalo@laranja.org  Wed Jul  3 00:17:15 2002
From: lalo@laranja.org (Lalo Martins)
Date: Tue, 2 Jul 2002 20:17:15 -0300
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <20020702222813.8990118EC22@grendel.zope.com>
References: <20020702222813.8990118EC22@grendel.zope.com>
Message-ID: <20020702231715.GG25927@laranja.org>

On Tue, Jul 02, 2002 at 06:28:13PM -0400, Fred L. Drake wrote:
> The development version of the documentation has been updated:
> 
>     http://www.python.org/dev/doc/devel/
> 
> Many updates and corrections to the documentation, including docs for the
> new textwrap module.

Re: textwrap.TextWrapper.fix_sentence_endings

} ... Furthermore, since it relies on string.lowercase ... it is specific to
} English-language texts.

Well, actually the convention of separating sentences by two spaces is also
specific to the English language, so I don't see that as a problem.
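
A small sketch of the option under discussion, using textwrap.fill with
fix_sentence_endings.  The sample sentences are made up, and the remark
about detection reflects my reading of the documented behaviour: a sentence
end is recognized only when a lowercase letter precedes the punctuation.

```python
import textwrap

text = "Hello there. How are you?"

# With the option on, a detected sentence end is followed by two spaces.
fixed = textwrap.fill(text, fix_sentence_endings=True)

# With the default settings the single space is left alone.
plain = textwrap.fill(text)

# "NASA." ends in an uppercase letter, so it is not treated as a sentence end.
acronym = textwrap.fill("We use NASA. Next item.", fix_sentence_endings=True)
```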

[]s,
                                               |alo
                                               +----
--
  It doesn't bother me that people say things like
   "you'll never get anywhere with this attitude".
   In a few decades, it will make a good paragraph
      in my biography. You know, for a laugh.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Eu jogo RPG! (I play RPG)         http://www.eujogorpg.com.br/
Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From tim.one@comcast.net  Wed Jul  3 03:10:36 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 02 Jul 2002 22:10:36 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
Message-ID: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>

I don't consider cyclic gc to be an experiment anymore.  It's proved to be
very solid code, and it hasn't become orphaned either <wink>.

What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3?
They're irritating, the code base without cyclic gc is never tested, the
touchy trashcan mechanism works in a radically different way when cyclic gc
isn't compiled in, and if cyclic gc is compiled in it's easy to turn it off
at will (gc.disable()).  It does cost memory for the gc header on
containers, but since we never test without it the ability to compile it out
isn't much of "a feature".
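
A minimal sketch of the run-time switch mentioned above, assuming a
current Python 3 and its gc module.  The Node class and the function
name collect_one_cycle are hypothetical, made up for the illustration.

```python
import gc


class Node:
    """An object that refers to itself, so refcounting alone cannot free it."""

    def __init__(self):
        self.me = self


def collect_one_cycle():
    """Create a garbage cycle with the collector off, then collect by hand."""
    was_enabled = gc.isenabled()
    gc.collect()                  # start from a clean slate
    gc.disable()                  # turn automatic cyclic collection off
    try:
        disabled = not gc.isenabled()
        Node()                    # unreachable at once, but still a cycle
        found = gc.collect()      # explicit collection works while disabled
    finally:
        if was_enabled:
            gc.enable()           # restore the previous state
    return disabled, found


disabled, found = collect_one_cycle()
print(disabled, found >= 1)
```

Disabling only stops the automatic passes; the per-object gc header is
still there, which is the memory cost mentioned above.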

+1 from me <ahem>.




From tim.one@comcast.net  Wed Jul  3 03:53:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 02 Jul 2002 22:53:44 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <20020702205507.A32734@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELFABAB.tim.one@comcast.net>

[Jack Jansen]
> I'm afraid so. Or, actually, "afraid so" sounds too apologetic:-):
> interned strings were specifically defined to be immortal.

[Oren Tirosh]
> I know it says so in the doc, but I always tended to look at it as an
> implementation limitation rather than a feature...

Me too:  I always read it as a warning not to use interning "too much".
However, you can see how far common sense goes once users get ahold of a
thing <wink>.




From oren-py-d@hishome.net  Wed Jul  3 05:52:11 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 3 Jul 2002 00:52:11 -0400
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCENPDFAA.tim.one@comcast.net>
References: <20020702082738.A26155@hishome.net> <BIEJKCLHCIOIHAGOKOLHCENPDFAA.tim.one@comcast.net>
Message-ID: <20020703045211.GA7978@hishome.net>

On Tue, Jul 02, 2002 at 03:31:03PM -0400, Tim Peters wrote:
> I would have guessed you had a more vivid imagination <wink>.  It's
> precisely because the id has been guaranteed that a program may not care to
> save a reference to an interned string.  For example,
> 
> """
> _ids = map(id, map(intern, "if then elif else".split()))
> TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5)
> id2token = dict(zip(_ids, range(4)))
> del _ids
> 
> def tokenvector(s):
>     return [id2token.get(id(intern(word)), TOKEN_NAME)
>             for word in s.split()]
> 
> print tokenvector("if this is the example, then what's the question?")
> """
> 
> This works reliably today to classify tokens.  I'm not certain I'd care if
> it broke, but we have to consider that it hasn't been difficult to write
> code that would break.

Ironically, this code is actually slower than using the strings themselves as 
keys (interned or not).  But I get the point.
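
A sketch of the string-keyed variant, assuming Python 3.  The token
constants are Tim's; KEYWORDS and token_by_word are names invented here.

```python
KEYWORDS = "if then elif else".split()
TOKEN_IF, TOKEN_THEN, TOKEN_ELIF, TOKEN_ELSE, TOKEN_NAME = range(5)

# Map each keyword string directly to its token number; no id() or
# interning is involved, so nothing depends on strings being immortal.
token_by_word = dict(zip(KEYWORDS, range(4)))


def tokenvector(s):
    """Classify each whitespace-separated word as a keyword or a name."""
    return [token_by_word.get(word, TOKEN_NAME) for word in s.split()]


result = tokenvector("if this is the example, then what's the question?")
print(result)  # [0, 4, 4, 4, 4, 1, 4, 4, 4]
```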

> > Now for something a bit more radical:
> >
> > Why not make interned strings a type?  <type 'istr'> could be an
> > un-subclassable subclass of string.  intern would just be an
> > alias for this type.  No two istr instances are equal unless they are
> > identical.  I guess PyString_CheckExact would need to be changed to
> > accept either String or InternedString.
> 
> What would the point be?  That is, instead of "why not?", why?  As to "why
> not?", there's something about elevating what's basically an optimization
> hack to a type that makes me squirm.

Change the name from 'istr' to 'symbol' and add a mild case of language envy
and you'll see why ;-)

	Oren



From fdrake@acm.org  Wed Jul  3 06:10:00 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 3 Jul 2002 01:10:00 -0400
Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates]
In-Reply-To: <20020702231715.GG25927@laranja.org>
References: <20020702222813.8990118EC22@grendel.zope.com>
 <20020702231715.GG25927@laranja.org>
Message-ID: <15650.34600.410233.510315@grendel.zope.com>

Lalo Martins writes:
 > Re: textwrap.TextWrapper.fix_sentence_endings
...
 > Well, actually the convention of separating sentences by two spaces is also
 > specific to the English language, so I don't see that as a problem.

Insidious, isn't it?  I've tried to clarify the matter further in the
documentation; please let me know if you think more is needed.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From greg@cosc.canterbury.ac.nz  Wed Jul  3 06:23:23 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 03 Jul 2002 17:23:23 +1200 (NZST)
Subject: [Python-Dev] Alternative implementation of string interning
In-Reply-To: <20020703045211.GA7978@hishome.net>
Message-ID: <200207030523.g635NNC18943@oma.cosc.canterbury.ac.nz>

Oren Tirosh <oren-py-d@hishome.net>:

> Tim Peters:
> 
> > What would the point be?  That is, instead of "why not?", why?  As to "why
> > not?", there's something about elevating what's basically an optimization
> > hack to a type that makes me squirm.
> 
> Change the name from 'istr' to 'symbol' and add a mild case of language envy
> and you'll see why ;-)

But in Lisp, symbols and strings really are completely
separate types. That's not the case in Python, and you
still haven't really given a reason why they should be.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From lalo@laranja.org  Wed Jul  3 06:28:32 2002
From: lalo@laranja.org (Lalo Martins)
Date: Wed, 3 Jul 2002 02:28:32 -0300
Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates]
In-Reply-To: <15650.34600.410233.510315@grendel.zope.com>
References: <20020702222813.8990118EC22@grendel.zope.com> <20020702231715.GG25927@laranja.org> <15650.34600.410233.510315@grendel.zope.com>
Message-ID: <20020703052832.GA5023@laranja.org>

On Wed, Jul 03, 2002 at 01:10:00AM -0400, Fred L. Drake, Jr. wrote:
> 
> Lalo Martins writes:
>  > Re: textwrap.TextWrapper.fix_sentence_endings
> ...
>  > Well, actually the convention of separating sentences by two spaces is also
>  > specific to the English language, so I don't see that as a problem.
> 
> Insidious, isn't it?  I've tried to clarify the matter further in the
> documentation; please let me know if you think more is needed.

Seems fine for my particular taste now.

thanks,
                                               |alo
                                               +----
--
  It doesn't bother me that people say things like
   "you'll never get anywhere with this attitude".
   In a few decades, it will make a good paragraph
      in my biography. You know, for a laugh.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Eu jogo RPG! (I play RPG)         http://www.eujogorpg.com.br/
Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From ping@zesty.ca  Wed Jul  3 07:07:14 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Tue, 2 Jul 2002 23:07:14 -0700 (PDT)
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCENPDFAA.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207022259020.5227-100000@ziggy>

Oren Tirosh wrote:
> Why not make interned strings a type?  <type 'istr'> could be an
> un-subclassable subclass of string.  intern would just be an
> alias for this type.  No two istr instances are equal unless they are
> identical.  I guess PyString_CheckExact would need to be changed to
> accept either String or InternedString.

The possibility of people starting to write code that depended on
whether strings were 'string' or 'istr', and all the breakage and
incompatibility that would result, seems much too ugly to contemplate.
Pass an 'istr' into a routine that expects strings, and it would
appear to be a string right up until someone tried to == it, whereupon
all hell would break loose.

The acid test for subtyping is substitutability: type 'istr' would not
fulfill the contract of 'string', and neither would 'string' fulfill the
contract of 'istr'.  Therefore, if you really wanted to do this, your
new type (let's call it 'symbol') would have to be completely independent
from both strings *and* interned strings.  There's no subclass relationship.


-- ?!ng




From martin@v.loewis.de  Wed Jul  3 07:22:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 03 Jul 2002 08:22:08 +0200
Subject: [Python-Dev] Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <3D220A86.5070003@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
Message-ID: <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Patch level releases should *never* include new features (unless
> these are essential to fix a serious bug or a simple byproduct
> of a fix). I don't know where you got the impression that Python
> should move back to the 1.5 branch development process where patch
> levels added new features.

>From discussions on python-dev...

> Patch levels are there to stabilize a release, not make it
> more powerful.

What precisely does that mean?

Specific case in question: xml.dom.minidom.toxml does not support the
specification of an encoding of the resulting XML document. Instead,
if there are non-ASCII characters in the output document, it returns a
Unicode object that starts with u"<?xml version='1.0' ?>". People
cannot write this to a file as-is, and they cannot encode it in
anything but UTF-8 (because the document would then be incorrect).

So I added an optional encoding= argument to .toxml, for 2.3. The
question now is: should that argument also be made available for
2.2.2?
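
For illustration, a sketch of what the encoding argument looks like in
use.  It assumes a current Python 3, where toxml returns a str without
the argument and bytes with it; the element name and the text are made
up.

```python
from xml.dom import minidom

# Build a tiny document containing one non-ASCII character.
doc = minidom.Document()
root = doc.createElement("root")
root.appendChild(doc.createTextNode("caf\u00e9"))
doc.appendChild(root)

# Without an encoding the result is text and the XML declaration
# names no encoding.
as_text = doc.toxml()

# With an encoding the result is encoded bytes and the declaration
# names that encoding, so the output can be written to a file as-is.
as_latin1 = doc.toxml(encoding="iso-8859-1")

print(as_text)
print(as_latin1)
```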

Regards,
Martin




From martin@v.loewis.de  Wed Jul  3 07:23:46 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 03 Jul 2002 08:23:46 +0200
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
Message-ID: <m33cv1jpn1.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3?

Good idea.

Martin



From oren-py-d@hishome.net  Wed Jul  3 08:06:17 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 3 Jul 2002 03:06:17 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <Pine.LNX.4.44.0207022259020.5227-100000@ziggy>
References: <BIEJKCLHCIOIHAGOKOLHCENPDFAA.tim.one@comcast.net> <Pine.LNX.4.44.0207022259020.5227-100000@ziggy>
Message-ID: <20020703070617.GA25449@hishome.net>

On Tue, Jul 02, 2002 at 11:07:14PM -0700, Ka-Ping Yee wrote:
> Oren Tirosh wrote:
> > Why not make interned strings a type?  <type 'istr'> could be an
> > un-subclassable subclass of string.  intern would just be an
> > alias for this type.  No two istr instances are equal unless they are
> > identical.  I guess PyString_CheckExact would need to be changed to
> > accept either String or InternedString.
> 
> The possibility of people starting to write code that depended on
> whether strings were 'string' or 'istr', and all the breakage and
> incompatibility that would result, seems much too ugly to contemplate.
> Pass an 'istr' into a routine that expects strings, and it would
> appear to be a string right up until someone tried to == it, whereupon
> all hell would break loose.

I don't understand your assumptions.  What kind of hell?  Are you assuming
that == would be equivalent to 'is' for istrs?  The == operator should work 
exactly the same, just possibly a little faster when comparing two istrs.

> The acid test for subtyping is substitutability: type 'istr' would not
> fulfill the contract of 'string', and neither would 'string' fulfill the
> contract of 'istr'.  

Can you be more specific?  As I see it, an istr would be completely compatible
with str, with the exception of being non-subclassable.

It has the additional property that

  (type(s) is istr and type(t) is istr and s == t) implies (s is t).

But that doesn't break anything.
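
There is no istr type to try this with, but the same implication can be
sketched with interned strings.  This assumes Python 3, where intern
lives in the sys module; the sample strings are arbitrary.

```python
import sys

# Build two equal strings at run time, from different pieces, and intern both.
s = sys.intern("".join(["py", "thon"]))
t = sys.intern("".join(["pyt", "hon"]))

same_value = (s == t)
same_object = (s is t)   # interning maps equal strings to one object

print(same_value, same_object)  # True True
```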

	Oren



From tdelaney@avaya.com  Wed Jul  3 08:17:54 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Wed, 3 Jul 2002 17:17:54 +1000
Subject: [Python-Dev] Re: Alternative implementation of string interning
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A3E7@natasha.auslabs.avaya.com>

> From: Oren Tirosh [mailto:oren-py-d@hishome.net]
> 
> > Oren Tirosh wrote:
> > > alias for this type.  No two istr instances are equal unless they are
> > > identical.  I guess PyString_CheckExact would need to be
> 
> you assuming
> that == would be equivalent to 'is' for istrs?  The == operator should work
> exactly the same, just possibly a little faster when comparing two istrs.
> 
>   (type(s) is istr and type(t) is istr and s == t) implies (s is t).

Do you mean that comparing two instances of istr would use *is*, but
comparing an istr with any other instance would use the normal str compare?
Because that is not how it has come across.

My first thought when I saw this proposal was "neat". My second was "yuk".

The #1 most important consideration here is backwards compatibility IMO.
Whilst I would be personally unaffected by this change (allowing interned
strings to be collected), we've already had examples of people and code that
would be.

Tim Delaney



From mal@lemburg.com  Wed Jul  3 08:37:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jul 2002 09:37:26 +0200
Subject: [Python-Dev] Death to WITH_CYCLE_GC
References: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
Message-ID: <3D22A9B6.1050208@lemburg.com>

Tim Peters wrote:
> I don't consider cyclic gc to be an experiment anymore.  It's proved to be
> very solid code, and it hasn't become orphaned either <wink>.
> 
> What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3?
> They're irritating, the code base without cyclic gc is never tested, the
> touchy trashcan mechanism works in a radically different way when cyclic gc
> isn't compiled in, and if cyclic gc is compiled in it's easy to turn it off
> at will (gc.disable()).  It does cost memory for the gc header on
> containers, but since we never test without it the ability to compile it out
> isn't much of "a feature".
> 
> +1 from me <ahem>.

Hmm, isn't the idea of having compile-time options to give
people a chance to eliminate the feature altogether?

I'm thinking in terms of memory footprint of the running
interpreter and its binary. Platforms like e.g. Palm
or Pocket PC are very touchy about this. Embedded devices
even more.

How much additional memory footprint would removing the #ifdefs
cost on average?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Wed Jul  3 08:55:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jul 2002 09:55:05 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
Message-ID: <3D22ADD9.1030901@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> 
>>Patch level releases should *never* include new features (unless
>>these are essential to fix a serious bug or a simple byproduct
>>of a fix). I don't know where you got the impression that Python
>>should move back to the 1.5 branch development process where patch
>>levels added new features.
> 
> 
>>From discussions on python-dev...
> 
> 
>>Patch levels are there to stabilize a release, not make it
>>more powerful.
> 
> 
> What precisely does that mean?

Mainly that only bugs should be fixed. Adding new features doesn't
help in fixing bugs since you can't expect that existing code for
a particular Python branch will get changed to make use of it.

Stabilizing means that code using the existing features in
a branch runs more stably, i.e. there are fewer situations where
a program can trigger a bug hiding in the Python release.

> Specific case in question: xml.dom.minidom.toxml does not support the
> specification of an encoding of the resulting XML document. Instead,
> if there are non-ASCII characters in the output document, it returns a
> Unicode object that starts with u"<?xml version='1.0' ?>". People
> cannot write this to a file as-is, and they cannot encode it in
> anything but UTF-8 (because the document would then be incorrect).
> 
> So I added an optional encoding= argument to .toxml, for 2.3. The
> question now is: should that argument also be made available for
> 2.2.2?

Adding the argument would only help applications which would
make use of it. An application written for Python 2.2 couldn't
do this since the optional argument wouldn't be available.

BTW, the above is trying to fix an application bug rather
than a Python one: if the application cannot deal with Unicode,
it is not non-ASCII compatible.

-- 
Marc-Andre Lemburg




From mal@lemburg.com  Wed Jul  3 09:02:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jul 2002 10:02:35 +0200
Subject: [Python-Dev] Some dull gc stats
References: <BIEJKCLHCIOIHAGOKOLHOEPCDFAA.tim@zope.com>
Message-ID: <3D22AF9B.5030104@lemburg.com>

Tim Peters wrote:
> [MaL, replying to me, but presumably bonding with Martin again <wink>]
> 
>>Patch level releases should *never* include new features (unless
>>these are essential to fix a serious bug or a simple byproduct
>>of a fix). I don't know where you got the impression that Python
>>should move back to the 1.5 branch development process where patch
>>levels added new features.
> 
> 
> The pre-PBF Patch Czars generally took a hard "no new features!" stance, but
> it seems to be up in the air now.

I wonder why... just because Fossett can't get back to solid
ground doesn't mean we have to follow him ;-)

>>W/r to the PBF: at EuroPython we did a poll to see which version
>>to base the PBF's activities on. The result was that a majority
>>voted for Python 2.2 as first target.
> 
> 
> Cool!  Good choice.
> 
> 
>>Patch levels are there to stabilize a release, not make it
>>more powerful.
> 
> 
> This is one popular view, although there's plenty of wiggle room in what
> "stabilize" means (e.g., is it "stabilizing" to port Python to a new
> platform?  to speed a bottleneck?  to add a new encoding?  etc).

"Stabilize" should mean to make triggering bugs in a Python release
less likely.

I don't think that porting to a new platform falls under
this definition.  A new encoding might (but then only if the encoding
is so popular that people consider its absence a bug).  Performance
tweaks are probably within range if they are in the micro-optimization
area and hidden within the interpreter.

-- 
Marc-Andre Lemburg




From gherron@islandtraining.com  Wed Jul  3 09:25:37 2002
From: gherron@islandtraining.com (Gary Herron)
Date: Wed, 3 Jul 2002 01:25:37 -0700
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <20020702222813.8990118EC22@grendel.zope.com>
References: <20020702222813.8990118EC22@grendel.zope.com>
Message-ID: <200207030125.38106.gherron@islandtraining.com>

Fred,

Reading the "What's New in Python 2.3" section, I find the following
sentence in "5 Extended Slices":

  Ever since Python 1.4 the slice syntax has supported a third
  ``Stride'' argument, but the builtin sequence types have not
  supported this feature (it was initially included at the behest of
  the developers of the Numerical Python package). This changes with
  Python 2.3.

This is ambiguous.  Exactly *HOW* does it change with Python 2.3?
Does the stride argument go away, or do builtin sequence types now
support the stride argument?  If I'd followed this newsgroup more
carefully, I'd probably know the answer.

The paragraph about PendingDeprecationWarning, which follows the above
quote, probably provides a clue, but it seems out of place, having
nothing to do with slices.

Gary Herron
gherron@islandtraining.com




From ping@zesty.ca  Wed Jul  3 10:33:24 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 3 Jul 2002 02:33:24 -0700 (PDT)
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <20020703070617.GA25449@hishome.net>
Message-ID: <Pine.LNX.4.44.0207030222090.5227-100000@ziggy>

On Wed, 3 Jul 2002, Oren Tirosh wrote:
> On Tue, Jul 02, 2002 at 11:07:14PM -0700, Ka-Ping Yee wrote:
> > Oren Tirosh wrote:
> > > No two istr instances are equal unless they are
> > > identical.  I guess PyString_CheckExact would need to be changed to
> > > accept either String or InternedString.
[...]
> > Pass an 'istr' into a routine that expects strings, and it would
> > appear to be a string right up until someone tried to == it, whereupon
> > all hell would break loose.
>
> I don't understand your assumptions.

I just went on what you wrote: "No two istr instances are equal unless
they are identical."  I read that to mean that == would be implemented
with pointer comparison, which would break contracts the way i described.
I see now that is not what you meant.

It appears that what you are proposing is what interned string
comparison already does (since == checks for pointer equality first).
So, the only observable effect of the change would be to break all
code that tests for type(s) == str.
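
A small sketch of that breakage, assuming Python 3.  The istr class here
is only a hypothetical stand-in (an ordinary str subclass), not the
proposed type.

```python
class istr(str):
    """Hypothetical stand-in for the proposed interned-string subclass."""


s = istr("spam")

exact_check = (type(s) == str)        # exact type tests reject the subclass
subclass_check = isinstance(s, str)   # isinstance accepts it
still_equal = (s == "spam")           # comparison behaves like str

print(exact_check, subclass_check, still_equal)  # False True True
```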


-- ?!ng




From oren-py-d@hishome.net  Wed Jul  3 10:59:15 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 3 Jul 2002 05:59:15 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <Pine.LNX.4.44.0207030222090.5227-100000@ziggy>
References: <20020703070617.GA25449@hishome.net> <Pine.LNX.4.44.0207030222090.5227-100000@ziggy>
Message-ID: <20020703095915.GA43336@hishome.net>

On Wed, Jul 03, 2002 at 02:33:24AM -0700, Ka-Ping Yee wrote:
> I just went on what you wrote: "No two istr instances are equal unless
> they are identical."  I read that to mean that == would be implemented
> with pointer comparison, which would break contracts the way i described.
> I see now that is not what you meant.

If all Dutchmen like Monty Python it doesn't mean that anyone who likes
Monty Python is a Dutchman.

> It appears that what you are proposing is what interned string
> comparison already does (since == checks for pointer equality first).

But INequality checking may still require strcmp.  Inverse logic again.

> So, the only observable effect of the change would be to break all
> code that tests for type(s) == str.

Yes, that's certainly a problem.  

This thought experiment is part of a strange fantasy I have that Python 
might one day use only interned strings to represent names. There are 
relatively few places where a string may be converted to a name (getattr, 
hasattr, etc) and these could be interned at the interface if interned 
strings are not immortal. I expect that nothing will ever come out of this, 
but it's fun to think about it anyway...

	Oren



From ping@zesty.ca  Wed Jul  3 11:14:33 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 3 Jul 2002 03:14:33 -0700 (PDT)
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <20020703095915.GA43336@hishome.net>
Message-ID: <Pine.LNX.4.44.0207030309320.5227-100000@ziggy>

On Wed, 3 Jul 2002, Oren Tirosh wrote:
> > It appears that what you are proposing is what interned string
> > comparison already does (since == checks for pointer equality first).
>
> But INequality checking may still require strcmp.  Inverse logic again.

I never claimed it wouldn't.  All i'm saying is that string comparison
already does this: compare pointers, then if not equal, compare strings.
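
The order of checks can be sketched in Python.  This only illustrates
the logic described above and is not the interpreter's actual C code;
the function name str_equal is made up.

```python
def str_equal(a, b):
    """Compare two strings: identity first, then contents."""
    if a is b:
        return True       # same object, so no need to look at the characters
    if len(a) != len(b):
        return False      # different lengths can never be equal
    # Distinct objects of equal length: the contents must be examined.
    return all(x == y for x, y in zip(a, b))


word = "interned"
print(str_equal(word, word))      # True, via the identity shortcut
print(str_equal("abc", "abd"))    # False, found by comparing contents
```

The shortcut only helps when the answer is "equal"; two distinct objects
still need the full comparison, which is Oren's point about inequality.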

> > So, the only observable effect of the change would be to break all
> > code that tests for type(s) == str.
>
> Yes, that's certainly a problem.

But you haven't responded to my point.  Would there be *any* effect
other than breakage?


-- ?!ng




From oren-py-d@hishome.net  Wed Jul  3 12:07:35 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 3 Jul 2002 07:07:35 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <Pine.LNX.4.44.0207030309320.5227-100000@ziggy>
References: <20020703095915.GA43336@hishome.net> <Pine.LNX.4.44.0207030309320.5227-100000@ziggy>
Message-ID: <20020703110735.GA50268@hishome.net>

On Wed, Jul 03, 2002 at 03:14:33AM -0700, Ka-Ping Yee wrote:
> On Wed, 3 Jul 2002, Oren Tirosh wrote:
> > > It appears that what you are proposing is what interned string
> > > comparison already does (since == checks for pointer equality first).
> >
> > But INequality checking may still require strcmp.  Inverse logic again.
> 
> I never claimed it wouldn't.  All i'm saying is that string comparison
> already does this: compare pointers, then if not equal, compare strings.
> 
> > > So, the only observable effect of the change would be to break all
> > > code that tests for type(s) == str.
> >
> > Yes, that's certainly a problem.
> 
> But you haven't responded to my point.  Would there be *any* effect
> other than breakage?

The warm fuzzy feeling that you have a real symbol type :-)

Just for the record: I am not a LISP zealot.

	Oren



From mwh@python.net  Wed Jul  3 13:03:49 2002
From: mwh@python.net (Michael Hudson)
Date: 03 Jul 2002 13:03:49 +0100
Subject: [Python-Dev] [development doc updates]
In-Reply-To: Gary Herron's message of "Wed, 3 Jul 2002 01:25:37 -0700"
References: <20020702222813.8990118EC22@grendel.zope.com> <200207030125.38106.gherron@islandtraining.com>
Message-ID: <2m1yalf26y.fsf@starship.python.net>

Gary Herron <gherron@islandtraining.com> writes:

> Fred,
> 
> Reading the "What's New in Python 2.3" section, I find the following
> sentence in "5 Extended Slices":
> 
>   Ever since Python 1.4 the slice syntax has supported a third
>   ``Stride'' argument, but the builtin sequence types have not
>   supported this feature (it was initially included at the behest of
>   the developers of the Numerical Python package). This changes with
>   Python 2.3.
> 
> This is ambiguous.  

Unfinished is closer to the truth.

> Exactly *HOW* does it change with Python 2.3?  Does the stride
> argument go away, 

No.

> or do builtin sequence types now support the
> stride argument? 

Yes.

> If I'd followed this newsgroup more carefully, I'd probably know the
> answer.

The section will be suitably fleshed out by the time of the first 2.3
alpha (I sincerely hope).

> The paragraph about PendingDeprecationWarning, which follows the above
> quote, probably provides a clue,

Nope.

> but it seems out of place, having nothing to do with slices.

This is because there's a commented out section break in the source.
I'll uncomment it.  There probably needs to be some editorial work
done on the whole document wrt. section ordering, whether things count
as sections or subsections, etc.  But not by me.

Cheers,
M.

-- 
  Those who have deviant punctuation desires should take care of their
  own perverted needs.                  -- Erik Naggum, comp.lang.lisp



From fdrake@acm.org  Wed Jul  3 13:04:30 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 3 Jul 2002 08:04:30 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <200207030125.38106.gherron@islandtraining.com>
References: <20020702222813.8990118EC22@grendel.zope.com>
 <200207030125.38106.gherron@islandtraining.com>
Message-ID: <15650.59470.76955.48172@grendel.zope.com>

Gary Herron writes:
 > Reading the "What's New in Python 2.3" section, I find the following
 > sentence in "5 Extended Slices":
...
 > This is ambiguous.  Exactly *HOW* does it change with Python 2.3?
 > Does the stride argument go away, or do builtin sequence types now
 > support the stride argument?  If I'd followed this newsgroup more
 > carefully, I'd probably know the answer.

The built-in types now support stride.  Thanks for pointing this
ambiguity out; I've changed the explanation in the document so that
this is clear.
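
A short sketch of the stride on built-in sequence types, written for a
current Python 3; the sample values are arbitrary.

```python
digits = list(range(10))

evens = digits[::2]                  # every second item of a list
backwards = "stride"[::-1]           # a negative stride reverses a string
every_third = tuple(digits)[1::3]    # start at index 1, step by 3

del digits[::2]                      # extended slices also work with del

print(evens)        # [0, 2, 4, 6, 8]
print(backwards)    # edirts
print(every_third)  # (1, 4, 7)
print(digits)       # [1, 3, 5, 7, 9]
```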

 > The paragraph about PendingDeprecationWarning, which follows the above
 > quote, probably provides a clue, but it seems out of place, having
 > nothing to do with slices.

There was a section heading that was commented out in the document
source; I've uncommented the heading.  More material will be added to
the new section as we have time to complete the material.

Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From barry@zope.com  Wed Jul  3 14:12:42 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 3 Jul 2002 09:12:42 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
References: <B43D149A9AB2D411971300B0D03D7E8BF0A3E7@natasha.auslabs.avaya.com>
Message-ID: <15650.63562.102139.566843@anthem.wooz.org>

>>>>> "TD" == Timothy Delaney <tdelaney@avaya.com> writes:

    TD> The #1 most important consideration here is backwards
    TD> compatibility IMO.  Whilst I would be personally unaffected by
    TD> this change (allowing interned strings to be collected), we've
    TD> already had examples of people and code that would be.

I still think most applications don't care about interned strings, and
they really don't care whether they're immortal or not.  Long running
apps probably do care, but for them, I'd rather see the application
writers have to take an explicit action to free the interned strings.
Only they are going to know whether they're depending on immortal
interns, and when it's "safe" and prudent to reclaim them.

-Barry



From barry@zope.com  Wed Jul  3 14:26:15 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 3 Jul 2002 09:26:15 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
Message-ID: <15650.64375.162977.160780@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Adding the argument would only help applications which would
    MAL> make use of it. An application written for Python 2.2
    MAL> couldn't do this since the optional argument wouldn't be
    MAL> available.

Ok, here's another question.  When I updated the email package in
Python 2.3, Guido wanted me to backport it to Python 2.2.x.  I did
that once, but there's been a lot of changes since then, both bug
fixes, API "fixes", and new functionality.

The email package can be installed separately as a distutils package,
and it is compatible all the way back to Python 2.1.x.  Which means
someone /could/ install the latest version in their site-packages and
have the new functionality in any of the last 3 versions of Python,
although it would be tricky for Python 2.2.1.

So does it make sense to backport the latest email package to Python
2.2.2?  That's what Guido wanted, and I could argue that doing so
improves stability of that branch, because while it adds a lot of new
stuff, the old stuff was fairly well broken.  E.g. you can't properly
encode RFC 2047 headers in Python 2.2.1's email package.  Backporting
allows application writers to fix their code so that it works
compatibly and correctly across more versions of Python than if we
didn't backport.  It also makes no sense to maintain two different
code bases (especially now that that's been reduced from 3! :).

OTOH, it definitely adds new features.  Maybe email is special because
it was so new in Python 2.2, and so I took a more naive approach to
some issues that a wider use uncovered.

it-ain't-always-simple-ly y'rs,
-Barry



From barry@zope.com  Wed Jul  3 14:34:12 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 3 Jul 2002 09:34:12 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
References: <20020703070617.GA25449@hishome.net>
 <Pine.LNX.4.44.0207030222090.5227-100000@ziggy>
Message-ID: <15650.64852.728915.275239@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping@zesty.ca> writes:

    KY> It appears that what you are proposing is what interned string
    KY> comparison already does (since == checks for pointer equality
    KY> first).  So, the only observable effect of the change would be
    KY> to break all code that tests for type(s) == str.

Shouldn't those already be written as isinstance(s, str)?  Maybe with
StringType for str?
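
A minimal sketch of the difference, written for present-day Python
rather than the 2.2-era interpreter under discussion.  TaggedStr is a
hypothetical stand-in for any additional string type; it is not part
of the proposal itself.

```python
class TaggedStr(str):
    """Hypothetical extra string type, e.g. a separate interned-string class."""


def is_string_exact(s):
    # Exact type test: fails for any subclass of str.
    return type(s) == str


def is_string(s):
    # Subclass-aware test: accepts str and anything derived from it.
    return isinstance(s, str)


s = TaggedStr("spam")
exact = is_string_exact(s)
loose = is_string(s)
```

Code that uses the exact test stops recognizing such values as strings,
while the isinstance() form keeps working.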

Even so, I'm not much in favor of adding more string types to the
language.  I think we should be /collapsing/ string types not
proliferating them (i.e. removing the distinction between str and
unicode -- Jython seems to get by just fine that way).

-Barry



From tim@zope.com  Wed Jul  3 17:56:17 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 3 Jul 2002 12:56:17 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <3D22A9B6.1050208@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEBEDGAA.tim@zope.com>

[M.-A. Lemburg]
> Hmm, isn't the idea of having compile time options to give
> people a chance to eliminate the feature altogether ?

That may be the idea for some symbols; e.g., I suppose HAVE_UNICODE is of that
nature, although PythonLabs never tests with that disabled either.

WITH_CYCLE_GC wasn't of that nature.  Like pymalloc before it, cyclic gc was
*thought* to be such a large change that it would be prudent to leave cyclic
gc off for a release, but give adventurous people a symbol they could use to
try it.  WITH_CYCLE_GC was enabled by default in the first alpha release to
get it some exercise.  That didn't turn up any significant problems, so we
left it on in the next alpha release too.  Still no problems, so we left it
on for all the alpha releases.  Still no problems, so we left it on for all
the beta releases.  Still no problems, so we concluded "screw this, let's
leave it enabled for the final release too".  So the purpose for which
WITH_CYCLE_GC was introduced went away before anyone had a chance to use it
for that purpose.

> I'm thinking in terms of memory footprint of the running
> interpreter and its binary. Platforms like e.g. Palm
> or Pocket PC are very touchy about this. Embedded devices
> even more.

I don't buy this.  I don't work on embedded devices in this incarnation, and
from what I've seen the people who do aren't helped at all by people who
don't guessing about what they might need.  If people on embedded devices
need help in the core, they can speak for themselves, and get the help they
*really* need.

> How much memory footprint would removing the #ifdefs
> cause on average ?

6, give or take.




From skip@pobox.com  Wed Jul  3 17:59:37 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 3 Jul 2002 11:59:37 -0500
Subject: [Python-Dev] Idle thoughts about array objects and xmlrpclib
Message-ID: <15651.11641.32411.963918@12-248-8-148.client.attbi.com>

I installed the latest version of MySQLdb yesterday and got mildly bitten by
a change Andy Dustman made.  He began returning BLOB fields as array objects
created with a 'c' typecode.  Since MySQL doesn't distinguish between TEXT
and BLOB fields, I was temporarily unable to pass SQL results back through
my XML-RPC interface (I use TEXT, but not BLOB).  Andy and I discussed it
and he decided to back out this change to MySQLdb.

That got me to thinking.  Perhaps xmlrpclib should do the obvious thing with
array objects.  For all numeric typecodes it should marshal them to lists.
For 'c' and 'u' typecodes it's a bit more problematic.  Should they be lists
or strings (or Unicode strings)?
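
A rough sketch of the numeric case only, assuming present-day Python,
where xmlrpclib lives on as xmlrpc.client and the 'c' typecode no
longer exists.  The helper name marshal_array is hypothetical; the
library does not do this conversion itself here, the caller does.

```python
import array
import xmlrpc.client


def marshal_array(values):
    """Convert a numeric array.array to a plain list, then marshal it."""
    return xmlrpc.client.dumps((values.tolist(),))


payload = marshal_array(array.array("i", [1, 2, 3]))

# Round-trip to confirm the receiver sees an ordinary list.
params, method = xmlrpc.client.loads(payload)
```

The 'c' and 'u' question raised above is left open; this sketch does not
pick between lists and strings for character arrays.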

Fredrik, have you considered whether xmlrpclib could or should support array
objects?

Skip



From jeremy@zope.com  Wed Jul  3 18:49:28 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 3 Jul 2002 13:49:28 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <m33cv1jpn1.fsf@mira.informatik.hu-berlin.de>
References: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
 <m33cv1jpn1.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15651.14632.499800.910362@slothrop.zope.com>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

  MvL> Tim Peters <tim.one@comcast.net> writes:
  >> What say ye to nuking the #ifdefs conditionalizing it in the core
  >> for 2.3?

  MvL> Good idea.

+1

Jeremy




From jeremy@zope.com  Wed Jul  3 19:17:12 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 3 Jul 2002 14:17:12 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <20020703095915.GA43336@hishome.net>
References: <20020703070617.GA25449@hishome.net>
 <Pine.LNX.4.44.0207030222090.5227-100000@ziggy>
 <20020703095915.GA43336@hishome.net>
Message-ID: <15651.16296.855975.476641@slothrop.zope.com>

>>>>> "OT" == Oren Tirosh <oren-py-d@hishome.net> writes:

  OT> This thought experiment is part of a strange fantasy I have that
  OT> Python might one day use only interned strings to represent
  OT> names. There are relatively few places where a string may be
  OT> converted to a name (getattr, hasattr, etc) and these could be
  OT> interned at the interface if interned strings are not
  OT> immortal. I expect that nothing will ever come out of this, but
  OT> it's fun to think about it anyway...

two responses:

What do you mean by "represent names"?  Code objects already use
interned strings for names.  Did you have something else in mind?

You might have mentioned this thought experiment / strange fantasy at
the outset of the thread <0.2 wink>.  There was a lot of email
thrashing on this subject, but none of it appears to have been
necessary.

Jeremy




From fredrik@pythonware.com  Wed Jul  3 19:20:39 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 3 Jul 2002 20:20:39 +0200
Subject: [Python-Dev] Death to WITH_CYCLE_GC
References: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
Message-ID: <017f01c222be$5792e310$ced241d5@hagrid>

tim wrote:

> What say ye to nuking the #ifdefs conditionalizing it in the core for 2.3?

+1.  go ahead.

</F>




From martin@v.loewis.de  Wed Jul  3 19:10:57 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 03 Jul 2002 20:10:57 +0200
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <3D22AF9B.5030104@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHOEPCDFAA.tim@zope.com>
 <3D22AF9B.5030104@lemburg.com>
Message-ID: <m3fzz0fzri.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> I don't think that porting to a new platform falls under
> this definition, a new encoding might (but then only if the encoding
> is so popular that people consider its absence a bug)

Here we go. If "many people consider absence of foo a bug" is enough
to allow for a change, I can backport any change if I only find enough
people to testify that absence of that change is a bug...

Regards,
Martin



From faassen@vet.uu.nl  Wed Jul  3 19:26:57 2002
From: faassen@vet.uu.nl (Martijn Faassen)
Date: Wed, 3 Jul 2002 20:26:57 +0200
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEELBABAB.tim.one@comcast.net>
Message-ID: <20020703182657.GA20472@vet.uu.nl>

Tim Peters wrote:
> but since we never test without it the ability to compile it out
> isn't much of "a feature".

If the problem is you don't have time to test it, what about talking to
the Snake Farm people of the Python Business Forum?

http://www.lysator.liu.se/~sfarmer/

They may be able and willing to help.

Regards,

Martijn




From mal@lemburg.com  Wed Jul  3 19:39:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jul 2002 20:39:11 +0200
Subject: [Python-Dev] Some dull gc stats
References: <BIEJKCLHCIOIHAGOKOLHOEPCDFAA.tim@zope.com>	<3D22AF9B.5030104@lemburg.com> <m3fzz0fzri.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D2344CF.1@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> 
>>I don't think that porting to a new platform falls under
>>this definition, a new encoding might (but then only if the encoding
>>is so popular that people consider its absence a bug)
> 
> 
> Here we go. If "many people consider absence of foo a bug" is enough
> to allow for a change, I can backport any change if I only find enough
> people to testify that absence of that change is a bug...

No, I was not talking about a missing foo; the comment
was specifically about an encoding. Adding a new encoding
would not need applications to be changed since the encoding
information is part of the processed data.

Anyway, if this confuses too much, simply go for the more
restrictive: no new features at all paradigm.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim@zope.com  Wed Jul  3 19:39:59 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 3 Jul 2002 14:39:59 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <20020703182657.GA20472@vet.uu.nl>
Message-ID: <BIEJKCLHCIOIHAGOKOLHMECADGAA.tim@zope.com>

[Tim Peters]
> but since we never test without it the ability to compile it out
> isn't much of "a feature".

[Martijn Faassen]
> If the problem is you don't have time to test it,

That's not a problem for me.  In context, it was just one more reason why
keeping WITH_CYCLE_GC has become a poor idea at best.

> what about talking to the Snake Farm people of the Python Business
> Forum?
>
> http://www.lysator.liu.se/~sfarmer/
>
> They may be able and willing to help.

This would be a good idea for an "optional feature" somebody actually wants.
For example, is HAVE_UNICODE actually turned off out in the world?  If that
possibility is important to the PBF, then they should arrange to test it.
We certainly don't.  We shouldn't be the ones telling the PBF what's
important to them, either -- they need to figure that out following their
own lights.

We don't test any variations beyond debug vs release build, and it appears
that the debug build isn't tested much except on Windows (although I expect
the debugging memory allocator in 2.3 will suck more Linux developers into
running debug builds).




From jeremy@zope.com  Wed Jul  3 19:46:58 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Wed, 3 Jul 2002 14:46:58 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHMECADGAA.tim@zope.com>
References: <20020703182657.GA20472@vet.uu.nl>
 <BIEJKCLHCIOIHAGOKOLHMECADGAA.tim@zope.com>
Message-ID: <15651.18082.124379.188458@slothrop.zope.com>

>>>>> "TP" == Tim Peters <tim@zope.com> writes:

  TP> We don't test any variations beyond debug vs release build, and
  TP> it appears that the debug build isn't tested much except on
  TP> Windows (although I expect the debugging memory allocator in 2.3
  TP> will suck more Linux developers into running debug builds).

It was good enough to suck me in.  What's more, it was so helpful that
it motivated me to fix Zope so that it runs under a debug build.

Jeremy




From barry@zope.com  Wed Jul  3 19:48:41 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 3 Jul 2002 14:48:41 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
References: <20020703182657.GA20472@vet.uu.nl>
 <BIEJKCLHCIOIHAGOKOLHMECADGAA.tim@zope.com>
Message-ID: <15651.18185.961668.124297@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim@zope.com> writes:

    TP> We don't test any variations beyond debug vs release build,
    TP> and it appears that the debug build isn't tested much except
    TP> on Windows (although I expect the debugging memory allocator
    TP> in 2.3 will suck more Linux developers into running debug
    TP> builds).

Yep, I typically run Python2.3cvs --with-pydebug.

-Barry



From oren-py-d@hishome.net  Wed Jul  3 20:21:34 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Wed, 3 Jul 2002 15:21:34 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <15651.16296.855975.476641@slothrop.zope.com>
References: <20020703070617.GA25449@hishome.net> <Pine.LNX.4.44.0207030222090.5227-100000@ziggy> <20020703095915.GA43336@hishome.net> <15651.16296.855975.476641@slothrop.zope.com>
Message-ID: <20020703192134.GA2206@hishome.net>

On Wed, Jul 03, 2002 at 02:17:12PM -0400, Jeremy Hylton wrote:
> >>>>> "OT" == Oren Tirosh <oren-py-d@hishome.net> writes:
> 
>   OT> This thought experiment is part of a strange fantasy I have that
>   OT> Python might one day use only interned strings to represent
>   OT> names. There are relatively few places where a string may be
>   OT> converted to a name (getattr, hasattr, etc) and these could be
>   OT> interned at the interface if interned strings are not
>   OT> immortal. I expect that nothing will ever come out of this, but
>   OT> it's fun to think about it anyway...
> 
> two responses:
> 
> What do you mean by "represent names"?  Code objects already use
> interned strings for names.  Did you have something else in mind?

Not something else - just more of the same.  Interned names in co_names 
tuples are a good start but there are tons of places where literal C-strings 
are used such as in descriptors. These names are converted to temporary
Python strings on demand.  My humble goal is for any name that has a 
predefined meaning in Python to appear exactly once in the executable and 
that instance will be in the form of a static preinitialized Python string 
object, not a C string literal.

Here's how it might work: to use the name 'foo' you just refer to the C
name PYSYMfoo.  During build a helper program scans all C sources for names 
starting with PYSYM and automatically generates a .c file where each of 
these names appears once as a pre-initialized string object and an .h file 
included by Python.h. On startup all these string objects are interned, of
course.

So any name used from C is resolved by the linker to point to the interned
single instance.  Any name appearing unquoted in Python code is interned 
when it's compiled or loaded from the .pyc file.  There are some cases
where a string becomes a name such as the arguments to functions like 
getattr and hasattr. These would need to be interned before reaching the 
100% interned core of the language. I guess this could be done by a new
PyArg_ParseTuple format char.  This obviously requires interned strings to
be non-immortal.

For example:

	if (strcmp(sname, "__class__") == 0) 
becomes
	if (sname == PYSYM__class__)

This is a pretty trivial example but I have other ideas for optimizations
and cleanups that this would enable. These might lead to significant
improvements in code size and performance.
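
The same identity-comparison idea can be sketched at the Python level.
This assumes present-day Python, where the intern() builtin of this
thread is spelled sys.intern(); the helper is_class_attr is
hypothetical and only illustrates interning "at the interface".

```python
import sys

# A name as it would appear in source, and the same name built at runtime.
name_from_source = sys.intern("__class__")
name_built_at_runtime = sys.intern("".join(["__cl", "ass__"]))

# Both calls return the single interned instance, so identity suffices.
same_object = name_from_source is name_built_at_runtime


def is_class_attr(name):
    # Intern the incoming string once, then compare by identity
    # instead of comparing character by character.
    return sys.intern(name) is name_from_source
```

Once every name entering the core has been interned, the comparison is
a pointer test, which is what the C fragment above relies on.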

Well, that's my fantasy. There are still some "minor" problems like totally 
breaking the C API.

	Oren




From tim@zope.com  Wed Jul  3 21:22:36 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 3 Jul 2002 16:22:36 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <15650.64375.162977.160780@anthem.wooz.org>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEECMDGAA.tim@zope.com>

[people ask assorted hypothetical questions about backporting]

Note that if the PBF is a success (and I sure hope that it is!), backporting
stuff to the py-in-a-tie release line is supposed to become its job, not
Python-Dev's.  They'll backport whatever they see fit, and it won't matter
what even Guido thinks then.

In the meantime, I suggest *we* stick to backporting unarguable bugfixes.
How can you tell whether something is unarguable?  If in doubt, backport it,
and if someone complains, tell them to revert it <0.8 wink>.




From mal@lemburg.com  Wed Jul  3 21:36:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jul 2002 22:36:27 +0200
Subject: [Python-Dev] Death to WITH_CYCLE_GC
References: <BIEJKCLHCIOIHAGOKOLHCEBEDGAA.tim@zope.com>
Message-ID: <3D23604B.4080408@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>Hmm, isn't the idea of having compile time options to give
>>people a chance to eliminate the feature altogether ?
> 
> 
> That may be the idea for some symbols; e.g., I suppose HAVE_UNICODE is of that
> nature, although PythonLabs never tests with that disabled either.
> 
> WITH_CYCLE_GC wasn't of that nature.  Like pymalloc before it, cyclic gc was
> *thought* to be such a large change that it would be prudent to leave cyclic
> gc off for a release, but give adventurous people a symbol they could use to
> try it.  WITH_CYCLE_GC was enabled by default in the first alpha release to
> get it some exercise.  That didn't turn up any significant problems, so we
> left it on in the next alpha release too.  Still no problems, so we left it
> on for all the alpha releases.  Still no problems, so we left it on for all
> the beta releases.  Still no problems, so we concluded "screw this, let's
> leave it enabled for the final release too".  So the purpose for which
> WITH_CYCLE_GC was introduced went away before anyone had a chance to use it
> for that purpose.

Fine.

>>I'm thinking in terms of memory footprint of the running
>>interpreter and its binary. Platforms like e.g. Palm
>>or Pocket PC are very touchy about this. Embedded devices
>>even more.
> 
> 
> I don't buy this.  I don't work on embedded devices in this incarnation, and
> from what I've seen the people who do aren't helped at all by people who
> don't guessing about what they might need.  If people on embedded devices
> need help in the core, they can speak for themselves, and get the help they
> *really* need.

Then why do we have a switch to optionally remove the Unicode
support ? or for disabling interning of strings ? or for
caching small integers ?

>>How much memory footprint would removing the #ifdefs
>>cause on average ?
> 
> 
> 6, give or take.

6 what ? snakes, rabbits, swallows ?

I'm missing a concise concept here :-)

If you want to make life hard for people who want to customize
the interpreter, then you should remove *all* such #ifdefs. If
not, then having the #ifdefs adds important meta-information
to the code.

-- 
Marc-Andre Lemburg




From tim@zope.com  Wed Jul  3 22:05:18 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 3 Jul 2002 17:05:18 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <3D23604B.4080408@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEDBDGAA.tim@zope.com>

[MAL]
> Then why do we have a switch to optionally remove the Unicode
> support ?

I don't know, although I've asked that question myself.  People used to be
frightened of the Unicode database sizes; I'm not sure they are anymore.

> or for disabling interning of strings ?

There is no such switch now.  There used to be one.  Ditto for whether to
cache hash values.  Ditto whether to "cache-align" hash table entries.  At
the time those were nuked, Guido also said he wanted COUNT_ALLOCS to
disappear (and to act as if it were always #define'd in a Py_TRACE_REFS
build), but nobody has gotten to that yet.

> or for caching small integers ?

There isn't a switch for that either, although there are two undocumented
symbols you can #define such that if their sum is <= 0, small ints waste
*more* memory than if you leave the code alone.  There's no way to disable
the unbounded and immortal int free list, and never was.

>>> How much memory footprint would removing the #ifdefs
>>> cause on average ?

>> 6, give or take.

> 6 what ? snakes, rabbits, swallows ?

You asked an unanswerable (not to mention unparseable) question, I gave a
useless yet accurate answer -- if you can rephrase your question in a way
that can be answered, attach whatever units you need to make 6 exactly
correct <wink>.  Although note that since WITH_CYCLE_GC has been #define'd
by default since it was introduced, removing its #ifdefery would have no
effect on default builds.

> I'm missing a concise concept here :-)
>
> If you want to make life hard for people who want to customize
> the interpreter, then you should remove *all* such #ifdefs. If
> not, then having the #ifdefs adds important meta-information
> to the code.

If you don't personally use a specific preprocessor symbol routinely, I
won't accept your bare assertion that it makes life easier for anyone.
Against that, every preprocessor symbol certainly makes it-- a little to a
lot --harder to maintain the code.  We almost never hear from anyone that
these little nightmares are being used; when we do hear about them, it's
almost always from a dabbler who "just tried it" and then complains because
Python no longer works (from won't compile to segfaults).  Fixing unused
code is a waste of time; I won't do it anymore, but I will devote time to
getting rid of unused code.




From neal@metaslash.com  Wed Jul  3 22:15:51 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Wed, 03 Jul 2002 17:15:51 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
References: <BIEJKCLHCIOIHAGOKOLHEEDBDGAA.tim@zope.com>
Message-ID: <3D236987.B9BB266E@metaslash.com>

Tim Peters wrote:

> Fixing unused code is a waste of time; I won't do it anymore, 
> but I will devote time to getting rid of unused code.

Amen.



From tim.one@comcast.net  Thu Jul  4 07:40:41 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 04 Jul 2002 02:40:41 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <15650.63562.102139.566843@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAFACAB.tim.one@comcast.net>

[Timothy Delaney]
> The #1 most important consideration here is backwards
> compatibility IMO.  Whilst I would be personally unaffected by
> this change (allowing interned strings to be collected), we've
> already had examples of people and code that would be.

Have we?  I posted an example I made up -- I've written and seen code
*close* to that, but not close enough to actually break if interned strings
were to get collected.  I also saw Jack's interned-string refcount abuse in
an isolated part of the core Mac support code, but breaking core code never
counts because we have 100% control over the core (if interned strings were
to get collected, we'd fiddle the Mac code for the same release, and nobody
would be the wiser).  I don't recall hearing about anything else here, and I
don't know of anything else.

Any subsystem that can waste an unbounded amount of memory is a potential
cause of user headaches.  I don't like immortal interned strings, and I
don't like the unbounded int or float free lists either.  It's also not good
that pymalloc never returns arenas to the system, although at least that was
carefully designed so that arenas not in use can become and stay paged out
(e.g., it doesn't periodically "tickle" them as part of general
bookkeeping -- when they're unused by the user, they're also untouched by
pymalloc).

So far, I don't know of any real loss that would occur as a result of
reclaiming unreferenced interned strings.
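
A toy model of "mortal" interning, as an illustration only and not the
implementation proposed in this thread.  It assumes present-day
CPython; Sym and mortal_intern are hypothetical names, and a str
subclass is used because plain str objects cannot be weakly
referenced.

```python
import gc
import weakref


class Sym(str):
    """Hypothetical str subclass that supports weak references."""


# Table entries vanish once nothing else refers to the value.
_table = weakref.WeakValueDictionary()


def mortal_intern(text):
    """Return the canonical Sym for text, creating it if needed."""
    sym = _table.get(text)
    if sym is None:
        sym = Sym(text)
        _table[text] = sym
    return sym


a = mortal_intern("spam")
b = mortal_intern("sp" + "am")
same = a is b          # one shared instance while references exist
live = len(_table)

del a, b               # drop the last outside references
gc.collect()
remaining = len(_table)
```

While any reference is alive, identity comparison works as with
immortal interning; once the last one goes, the table no longer pins
the string in memory.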




From mal@lemburg.com  Thu Jul  4 10:05:54 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jul 2002 11:05:54 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
Message-ID: <3D240FF2.3060708@lemburg.com>

I noticed yesterday that the xmlrpclib.py version in CVS
is incompatible with the version in Python 2.2: all the
.dump_XXX() interfaces changed and now include a third
argument.

Since the Marshaller can be subclassed, this breaks all
existing application space subclasses extending or changing
the default xmlrpclib behaviour.
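
A reduced sketch of the failure mode.  The classes below are
hypothetical stand-ins, not the real xmlrpclib code; they only mirror
the shape of the change described above (a dump method gaining a third
argument).

```python
class NewMarshaller:
    """Stand-in base class after the change: dump methods get a write callable."""

    def dumps(self, value):
        out = []
        self.dump_string(value, out.append)  # the extra argument is new
        return "".join(out)

    def dump_string(self, value, write):
        write("<string>%s</string>" % value)


class LegacySubclass(NewMarshaller):
    """Written against the old interface: (self, value), output via self.write."""

    def dump_string(self, value):
        self.write("<string>%s</string>" % value.upper())


try:
    LegacySubclass().dumps("spam")
    broke = False
except TypeError:
    # The base class passes one argument more than the override accepts.
    broke = True
```

The override never runs: the call fails with a TypeError before the
subclass code is reached.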

I'd opt for moving back to the previous style of calling the
write method via self.write.

-- 
Marc-Andre Lemburg




From mal@lemburg.com  Thu Jul  4 11:54:27 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jul 2002 12:54:27 +0200
Subject: [Python-Dev] Re: Alternative implementation of string interning
References: <LNBBLJKPBEHFEDALKOLCIEAFACAB.tim.one@comcast.net>
Message-ID: <3D242963.4000204@lemburg.com>

Tim Peters wrote:
> So far, I don't know of any real loss that would occur as a result of
> reclaiming unreferenced interned strings.

Has anybody ever checked how many such strings live in the
intern dict with ref count 1 in real life apps ?

E.g. say you have Zope running on a standard web-site
for 2 days -- how many such strings do you find in the
interned dict ?

Speaking for myself, I would have a problem with removing
automatic interning of constant strings in Python source
code since I rely on that "feature" for fast switching
on values (if..elif..elif.......else). Since code objects
usually don't go away while the interpreter is running,
these would not be affected by the proposed strategy.

-- 
Marc-Andre Lemburg




From mal@lemburg.com  Thu Jul  4 12:38:33 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jul 2002 13:38:33 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com>	<m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>	<3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org>
Message-ID: <3D2433B9.9080102@lemburg.com>

Barry A. Warsaw wrote:
> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> Adding the argument would only help applications which would
>     MAL> make use of it. An application written for Python 2.2
>     MAL> couldn't do this since the optional argument wouldn't be
>     MAL> available.
> 
> Ok, here's another question.  When I updated the email package in
> Python 2.3, Guido wanted me to backport it to Python 2.2.x.  I did
> that once, but there's been a lot of changes since then, both bug
> fixes, API "fixes", and new functionality.
> 
> The email package can be installed separately as a distutils package,
> and it is compatible all the way back to Python 2.1.x.  Which means
> someone /could/ install the latest version in their site-packages and
> have the new functionality in any of the last 3 versions of Python,
> although it would be tricky for Python 2.2.1.
> 
> So does it make sense to backport the latest email package to Python
> 2.2.2?  That's what Guido wanted, and I could argue that doing so
> improves stability of that branch, because while it adds a lot of new
> stuff, the old stuff was fairly well broken.  E.g. you can't properly
> encode RFC 2047 headers in Python 2.2.1's email package.  Backporting
> allows application writers to fix their code so that it works
> compatibly and correctly across more versions of Python than if we
> didn't backport.  It also makes no sense to maintain two different
> code bases (especially now that that's been reduced from 3! :).
> 
> OTOH, it definitely adds new features.  Maybe email is special because
> it was so new in Python 2.2, and so I took a more naive approach to
> some issues that a wider use uncovered.
 >
> it-ain't-always-simple-ly y'rs,

Never said it was... :-)

For cases like the email package or distutils, I think it's
perfectly OK to only provide the updates for older Python
releases as separate download. Both have their own way of
life, so IMHO this is acceptable.

-- 
Marc-Andre Lemburg




From mal@lemburg.com  Thu Jul  4 12:34:07 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jul 2002 13:34:07 +0200
Subject: [Python-Dev] Death to WITH_CYCLE_GC
References: <BIEJKCLHCIOIHAGOKOLHEEDBDGAA.tim@zope.com>
Message-ID: <3D2432AF.8010000@lemburg.com>

Tim Peters wrote:
> [MAL]
> 
>>Then why do we have a switch to optionally remove the Unicode
>>support ?
> 
> 
> I don't know, although I've asked that question myself.  People used to be
> frightened of the Unicode database sizes; I'm not sure they are anymore.
 >
 >
>>or for disabling interning of strings ?
> 
> 
> There is no such switch now.  There used to be one.  Ditto for whether to
> cache hash values.  Ditto whether to "cache-align" hash table entries.  At
> the time those were nuked, Guido also said he wanted COUNT_ALLOCS to
> disappear (and to act as if it were always #define'd in a Py_TRACE_REFS
> build), but nobody has gotten to that yet.

Interesting. I don't recall any discussions about this...

>>or for caching small integers ?
> 
> 
> There isn't a switch for that either, although there are two undocumented
> symbols you can #define such that if their sum is <= 0, small ints waste
> *more* memory than if you leave the code alone.  There's no way to disable
> the unbounded and immortal int free list, and never was.

I was talking about NSMALLNEGINTS and NSMALLPOSINTS.

>>>>How much memory footprint would removing the #ifdefs
>>>>cause on average ?
>>>
> 
>>>6, give or take.
>>
> 
>>6 what ? snakes, rabbits, swallows ?
> 
> 
> You asked an unanswerable (not to mention unparseable) question, I gave a
> useless yet accurate answer -- if you can rephrase your question in a way
> that can be answered, attach whatever units you need to make 6 exactly
> correct <wink>.  Although note that since WITH_CYCLE_GC has been #define'd
> by default since it was introduced, removing its #ifdefery would have no
> effect on default builds.

Ok, let's make it parseable then:

a) When removing the GC code from the code base by #undef'ing
    WITH_CYCLE_GC, how much smaller is the Python interpreter ?

b) ..., how is pybench affected by this (speedup/slowdown/
    unnoticeable) ?

c) ..., how many bytes per object are saved for container objects
    which are GC aware ?

If we're talking about just a few kB in interpreter size
and only a few kB worth of list and tuples, then removing
is fine. If we're talking about 100kBs, then you ought to
reconsider the move.

>>I'm missing a concise concept here :-)
>>
>>If you want to make life hard for people who want to customize
>>the interpreter, then you should remove *all* such #ifdefs. If
>>not, then having the #ifdefs adds important meta-information
>>to the code.
> 
> 
> If you don't personally use a specific preprocessor symbol routinely, I
> won't accept your bare assertion that it makes life easier for anyone.

I personally know that developers who have tried to create
a trimmed down version of the interpreter did like the #ifdefs
for removing certain parts like e.g. the complex numbers very
much. I'm just lobbying for them.

After all, someone has to give you a hard time ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mwh@python.net  Thu Jul  4 13:18:00 2002
From: mwh@python.net (Michael Hudson)
Date: 04 Jul 2002 13:18:00 +0100
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: "Tim Peters"'s message of "Wed, 3 Jul 2002 14:39:59 -0400"
References: <BIEJKCLHCIOIHAGOKOLHMECADGAA.tim@zope.com>
Message-ID: <2m65zvlm9z.fsf@starship.python.net>

"Tim Peters" <tim@zope.com> writes:

> [Tim Peters]
> > but since we never test without it the ability to compile it out
> > isn't much of "a feature".
> 
> [Martijn Faassen]
> > If the problem is you don't have time to test it,
> 
> That's not a problem for me.  In context, it was just one more reason why
> keeping WITH_CYCLE_GC has become a poor idea at best.
> 
> > what about talking to the Snake Farm people of the Python Business
> > Forum?
> >
> > http://www.lysator.liu.se/~sfarmer/
> >
> > They may be able and willing to help.
> 
> This would be a good idea for an "optional feature" somebody actually wants.
> For example, is HAVE_UNICODE actually turned off out in the world?

I build 3 debug builds (ucs2, ucs4 and without unicode) and run the
test suites every night on linux/x86.  test_unicode still fails in
ucs4 builds...

Cheers,
M.

-- 
  For every complex problem, there is a solution that is simple,
  neat, and wrong.                                    -- H. L. Mencken



From martin@v.loewis.de  Thu Jul  4 21:17:00 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 04 Jul 2002 22:17:00 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <3D2433B9.9080102@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
Message-ID: <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> For cases like the email package or distutils, I think it's
> perfectly OK to only provide the updates for older Python
> releases as separate download. Both have their own way of
> life, so IMHO this is acceptable.

In neither case, this is really possible: Once you have the package in
the Python core, a separate installation in site-packages cannot
override the core implementation.

I believe that was the motivation for Barry to consider backporting
large amounts of changes. The same holds for distutils, except that
there aren't that many major changes.

Regards,
Martin



From aleax@aleax.it  Fri Jul  5 06:30:16 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 5 Jul 2002 07:30:16 +0200
Subject: [Python-Dev] Some dull gc stats
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
Message-ID: <02070507301605.27343@arthur>

On Tuesday 02 July 2002 21:06, Tim Peters wrote:
> [martin@v.loewis.de]
>
> > I understand that it is not a requirement anymore that changes to
> > Python 2.2 are "pure bugfixes". Instead, people expect that Python 2.2
> > evolves and continues to grow new features, as long as they are
> > "strictly backwards compatible".
>
> Alex made a case here for "new features", but the Python Business Forum
> hasn't shown interest in that.

As a Python Business Forum member and board member, I think I can
state that if a (business) case is indeed made, the PBF interest is there.

> Like most businessfolk, I expect they'll
> ignore such issues until someone discovers that the lack of a new feature
> is putting them out of business <0.8 wink>.

I suspect instead that a businessperson clever enough to pick Python
rather than heavily-hyped Java or widely-popular Perl or PHP is most
likely to be an unusually clever businessperson, with some level of
perception of what programming productivity is worth and how to get it.


> > For any user-visible feature, it is normally debatable whether it is
> > "strictly backwards compatible", since it is, by nature, a change in
> > observable behaviour.
> >
> > This specific case is not in that category (i.e. has no
> > user-observable behaviour change), so I think it qualifies for 2.2 -
> > provided there is enough trust in its correctness.
>
> The "bugfix part" of these changes certainly had user-visible aspects, in
> that before it was possible for objects in older generations to get
> yanked back into younger generations.  This can affect when objects get
> collected, and so throw off over-tuned programs slinging gc.enable() and
> disable() "at exactly the best time(s)".

Performance change is not quite the same thing as behavior change.
I agree with Martin that, assuming a performance-oriented change is
'known' to be correct (no change in the inputs-to-outputs behavior
of programs), the criterion should be one of overall benefit rather than
one of Pareto optima.

> > I'm concerned that backporting more changes to Python 2.2 will become
> > difficult in that area, if the GC implementations vary significantly.
>
> Maintaining multiple branches is always a PITA.

Yes, but the degree of pain varies with the branches' separation.


> > Maybe this can be reconsidered when there actually is another change
> > to backport.
>
> Anyone who is so inclined is welcome to reconsider it non-stop <wink>.

I suspect we'll indeed reconsider it.  Whether we do something about
it after the reconsideration will depend on cost-benefit analysis...


Alex



From aleax@aleax.it  Fri Jul  5 07:08:16 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 5 Jul 2002 08:08:16 +0200
Subject: [Python-Dev] List comprehensions
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEFOAAAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIEFOAAAB.tim.one@comcast.net>
Message-ID: <02070508081606.27343@arthur>

On Thursday 27 June 2002 01:29, Tim Peters wrote:
> [Gerald S. Williams, on listcomp (non)scopes]
>
> > No problem. As long as it was decided that there's a use for
> > the current behavior, I won't question it.
>
> I'm not sure there's a use for it, but I am sure I'd shoot any coworker
> who found one and relied on it <wink>. 

The real (and no doubt PSU-intended) use of list comprehensions is, of 
course, to finesse Python's _apparent_ lack of assignment-in-expression.

Instead of coding a vulgar, typo-prone, hoi-polloi oriented:
    while x = bluh():
        whatever(x)
you get to code an elegant, refined, hoi-oligoi reserved:
    while [ x for x in [bluh()] if x ] :
        whatever(x)

(I'm told Beretta makes excellent small arms, should you need one...).
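
A note for anyone trying this on a current interpreter: the trick relies
on the loop variable of a list comprehension leaking into the enclosing
scope.  That holds for the Python 2 series but not for Python 3, where a
comprehension has its own scope.  Below is a minimal sketch of the same
loop written with the assignment expression that Python 3.8 eventually
gained (PEP 572).  `bluh` and `whatever` are the stand-in names from the
example above; the queue contents are made up for illustration.

```python
from collections import deque

# Hypothetical data source: hands back items until it runs dry, then None.
pending = deque(["spam", "eggs", "ham"])

def bluh():
    return pending.popleft() if pending else None

seen = []

def whatever(item):
    seen.append(item.upper())

# Assignment expression (Python 3.8+): bind x and test it in one step.
while (x := bluh()):
    whatever(x)

print(seen)
```

The loop stops at the first false value, exactly as the list
comprehension version would.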


Alex



From gerhard.haering@gmx.de  Fri Jul  5 07:47:09 2002
From: gerhard.haering@gmx.de (=?ISO-8859-1?Q?Gerhard=20H=E4ring?=)
Date: Fri, 5 Jul 2002 08:47:09 +0200 (Central Europe Daylight Time)
Subject: [Python-Dev] Re[3]: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020705064452.A05DA3F1@gargamel.hqd-internal>

"Martin v. Loewis" <martin@v.loewis.de> wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > For cases like the email package or distutils, I think it's
> > perfectly OK to only provide the updates for older Python
> > releases as separate download. Both have their own way of
> > life, so IMHO this is acceptable.
> 
> In neither case, this is really possible: Once you have the package in
> the Python core, a separate installation in site-packages cannot
> override the core implementation.

This might sound clueless, but wouldn't it be a good idea to change that?
So that site-packages comes before Lib/ in sys.path?

Gerhard
--
mail:   gerhard.haering@gmx.de              registered Linux user #64239
web:    http://www.cs.fhm.edu/~ifw00065/    OpenPGP public key id 86AB43C0
public key fingerprint: DEC1 1D02 5743 1159 CD20  A4B6 7B22 6575 86AB 43C0
reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))




From mal@lemburg.com  Fri Jul  5 09:45:36 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 05 Jul 2002 10:45:36 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com>	<m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>	<3D22ADD9.1030901@lemburg.com>	<15650.64375.162977.160780@anthem.wooz.org>	<3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D255CB0.9080502@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> 
>>For cases like the email package or distutils, I think it's
>>perfectly OK to only provide the updates for older Python
>>releases as separate download. Both have their own way of
>>life, so IMHO this is acceptable.
> 
> 
> In neither case, this is really possible: Once you have the package in
> the Python core, a separate installation in site-packages cannot
> override the core implementation.

True, but it is easily possible to install those packages
in a directory which is scanned before the standard lib, thus
overriding the distribution versions:

python setup.py install --install-lib=~/lib

> I believe that was the motivation for Barry to consider backporting
> large amounts of changes. The same holds for distutils, except that
> there aren't that many major changes.

If that's the case, then we probably ought to make it easier
for user installed Python add-ons to override builtin packages.

This would help to get rid of the hacks which the PyXML
distribution has to use in order to achieve the same.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From martin@v.loewis.de  Fri Jul  5 17:03:30 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 05 Jul 2002 18:03:30 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <3D255CB0.9080502@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <3D255CB0.9080502@lemburg.com>
Message-ID: <m3fzyygo19.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> True, but it is easily possible to install those packages
> in a directory which is scanned before the standard lib, thus
> overriding the distribution versions:
> 
> python setup.py install --install-lib=~/lib

Why is ~/lib scanned before the standard lib?

Regards,
Martin



From martin@v.loewis.de  Fri Jul  5 17:01:37 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 05 Jul 2002 18:01:37 +0200
Subject: [Python-Dev] Re[3]: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <20020705064452.A05DA3F1@gargamel.hqd-internal>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <20020705064452.A05DA3F1@gargamel.hqd-internal>
Message-ID: <m3k7oago4e.fsf@mira.informatik.hu-berlin.de>

Gerhard Häring <gerhard.haering@gmx.de> writes:

> This might sound clueless, but wouldn't it be a good idea to change that?
> So that site-packages comes before Lib/ in sys.path?

No, this is by design, to prevent people from overriding the standard
library. Essentially, all module names in the standard library are
reserved; this procedure enforces that (somewhat, you can always
insert things in the beginning of sys.path).

Regards,
Martin



From mal@lemburg.com  Fri Jul  5 18:17:47 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 05 Jul 2002 19:17:47 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com>	<m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>	<3D22ADD9.1030901@lemburg.com>	<15650.64375.162977.160780@anthem.wooz.org>	<3D2433B9.9080102@lemburg.com>	<m3znx79rk3.fsf@mira.informatik.hu-berlin.de>	<3D255CB0.9080502@lemburg.com> <m3fzyygo19.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D25D4BB.6070407@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> 
>>True, but it is easily possible to install those packages
>>in a directory which is scanned before the standard lib, thus
>>overriding the distribution versions:
>>
>>python setup.py install --install-lib=~/lib
> 
> 
> Why is ~/lib scanned before the standard lib?

Because I have it defined in PYTHONPATH :-)

As I said, perhaps we need to make it easier to override
std lib packages...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Fri Jul  5 18:19:03 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 05 Jul 2002 19:19:03 +0200
Subject: [Python-Dev] Re[3]: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com>	<m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>	<3D22ADD9.1030901@lemburg.com>	<15650.64375.162977.160780@anthem.wooz.org>	<3D2433B9.9080102@lemburg.com>	<m3znx79rk3.fsf@mira.informatik.hu-berlin.de>	<20020705064452.A05DA3F1@gargamel.hqd-internal> <m3k7oago4e.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D25D507.50805@lemburg.com>


Martin v. Loewis wrote:
> Gerhard Häring <gerhard.haering@gmx.de> writes:
> 
> 
>>This might sound clueless, but wouldn't it be a good idea to change that?
>>So that site-packages comes before Lib/ in sys.path?
> 
> 
> No, this is by design, to prevent people from overriding the standard
> library. Essentially, all module names in the standard library are
> reserved; this procedure enforces that (somewhat, you can always
> insert things in the beginning of sys.path).

Uhm, just for the record: all paths defined in PYTHONPATH
are inserted before the std lib dirs in sys.path on startup,
so the "restriction" is not really all that restrictive.
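
The effect of search order can be checked without touching a real
installation.  The following is only a sketch: it simulates "a directory
scanned earlier" by inserting two temporary directories at the front of
sys.path rather than by setting PYTHONPATH, and the module name
`shadow_demo` and the helper names are invented for the demonstration.

```python
import importlib
import os
import sys
import tempfile

def make_module(directory, module_name, origin):
    """Write a one-line module that records which directory it came from."""
    with open(os.path.join(directory, module_name + ".py"), "w") as fh:
        fh.write("ORIGIN = %r\n" % origin)

def which_wins(module_name, first_dir, second_dir):
    """Import module_name with first_dir ahead of second_dir on sys.path."""
    saved = list(sys.path)
    sys.path[:0] = [first_dir, second_dir]
    try:
        importlib.invalidate_caches()
        sys.modules.pop(module_name, None)
        return importlib.import_module(module_name).ORIGIN
    finally:
        # Restore the interpreter state so the demo leaves no trace.
        sys.path[:] = saved
        sys.modules.pop(module_name, None)

with tempfile.TemporaryDirectory() as early, tempfile.TemporaryDirectory() as late:
    make_module(early, "shadow_demo", "early")
    make_module(late, "shadow_demo", "late")
    winner = which_wins("shadow_demo", early, late)

print(winner)
```

Whichever directory comes first on sys.path supplies the module, which is
why entries placed ahead of the standard library can shadow it.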

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim.one@comcast.net  Fri Jul  5 18:48:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 05 Jul 2002 13:48:11 -0400
Subject: [Python-Dev] List comprehensions
In-Reply-To: <02070508081606.27343@arthur>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEFACAB.tim.one@comcast.net>

[Alex Martelli]
> The real (and no doubt PSU-intended) use of list comprehensions is, of
> course, to finesse Python's _apparent_ lack of assignment-in-expression.
>
> Instead of coding a vulgar, typo-prone, hoi-polloi oriented:
>     while x = bluh():
>         whatever(x)
> you get to code an elegant, refined, hoi-oligoi reserved:
>     while [ x for x in [bluh()] if x ] :
>         whatever(x)

Excellent!  Guido uses

    while [x for x in [bluh()]][0]:
        whatever(x)

because he thinks it's "more elegant" (whatever that means to a Dutch guy),
but either way it's major relief from the obscurity of embedded assignment.

> (I'm told Beretta makes excellent small arms, should you need one...).

Thanks for the suggestion!  Some Americans consider it rude to shoot
coworkers, so I'm always looking for ways to get across to them that it's
more a matter of defending good taste than of killing people.  Using a piece
with sleek Italian design should go a long way toward helping to make this
point.




From tismer@tismer.com  Sat Jul  6 00:01:43 2002
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 05 Jul 2002 23:01:43 +0000
Subject: [Python-Dev] GC bug with __slots__ ?
Message-ID: <3D262557.4000502@tismer.com>

Hi Guido,

I haven't been able to search lists since
my laptop was stolen, so maybe this is a known issue:

When I create a cyclic reference in a class with
slots, it will not be detected by gc.

#This one works fine:

import gc

class a(int): pass
x=a(7)
x.x=x
del x
gc.collect() # frees cycle

#This one doesn't:

class a(int): __slots__=["x"]
x=a(7)
x.x=x
del x
gc.collect() # does not free the cycle

ciao - chris  (greetings from iceland)

[yes there is no .sig, was stolen, too :-]





From tim.one@comcast.net  Sat Jul  6 00:09:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 05 Jul 2002 19:09:03 -0400
Subject: [Python-Dev] GC bug with __slots__ ?
In-Reply-To: <3D262557.4000502@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFEACAB.tim.one@comcast.net>

This is already fixed in CVS Python.  From CVS NEWS:

- Classes using __slots__ are now properly garbage collected.
  [SF bug 519621]

I suspect your laptop may be hiding in the repository too!
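
A way to check the fixed behaviour on a given interpreter, offered as a
minimal sketch and not as the test that went into CVS: it uses a plain
slotted class (the class name and helper are invented here), and lists
`__weakref__` among the slots only so that a weak reference can watch
the object disappear.

```python
import gc
import weakref

class Slotted(object):
    # "__weakref__" is listed only so the demo can observe the object's fate.
    __slots__ = ["x", "__weakref__"]

def cycle_is_collected():
    """Build a self-referencing slotted instance; report whether gc frees it."""
    was_enabled = gc.isenabled()
    gc.disable()              # keep automatic collections out of the picture
    try:
        obj = Slotted()
        obj.x = obj           # reference cycle through a slot
        probe = weakref.ref(obj)
        del obj               # reference counting alone cannot free the cycle
        alive_before = probe() is not None
        gc.collect()          # the parentheses matter: this runs a collection
        return alive_before and probe() is None
    finally:
        if was_enabled:
            gc.enable()

print(cycle_is_collected())
```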




From tim.one@comcast.net  Sat Jul  6 00:21:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 05 Jul 2002 19:21:12 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <3D242963.4000204@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFFACAB.tim.one@comcast.net>

[M.-A. Lemburg]
> Has anybody ever checked how many such strings live in the
> intern dict with ref count 1 in real life apps ?
>
> E.g. say you have Zope running on a standard web-site
> for 2 days -- how many such strings do you find in the
> interned dict ?

I don't know.  Jim Fulton has raised it as a Zope issue in the past, and my
recollection is that each time this comes up we go through a dance like:

    OK, we'll turn off interning in that path.
    ...
    Oops!  It looks like we already did!
    ...
    Oops!  I guess we didn't on *that* path.
    ...
    *Which* paths does Zope use again?
    ...
    Ah, OK, no, we already turned off interning in those paths.
    ...
    Or at least we did in Python version i.j.k.  *Which* versions
    are we worried about again?
    ...
    Does anyone remember which paths we're worried about?
    ...

It fizzles out then due to terminal boredom <wink>.

> Speaking for myself, I would have a problem with removing
> automatic interning of constant strings in Python source
> code

I don't believe anyone has suggested doing so.  Note that we don't
automatically intern all constant strings in Python source, we only intern
constant strings that "look like" identifiers.  This is from fear of the
immortality of interned strings.
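
To make the identity point concrete, here is a small sketch.  It uses
`sys.intern`, the Python 3 spelling of the `intern()` builtin of the
time; the helper `build` and the sample strings are invented for the
example.

```python
import sys

def build(parts):
    """Assemble a string at run time so the compiler cannot share it."""
    return "".join(parts)

a = build(["with_", "cycle_", "gc"])
b = build(["with_", "cycle_", "gc"])

# Equal by value; whether they are the same object is an implementation detail.
equal = (a == b)

# After explicit interning both names refer to one shared object, so an
# identity test (a pointer comparison) is enough.
ia = sys.intern(a)
ib = sys.intern(b)
same_object = ia is ib

print(equal, same_object)
```

This is the property that makes comparing interned strings cheap, and
also why an interned string that nothing else references is pure cost.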

> since I rely on that "feature" for fast switching
> on values (if..elif..elif.......else). Since code objects
> usually don't go away while the interpreter is running,
> these would not be affected by the proposed strategy.




From tim.one@comcast.net  Sat Jul  6 00:39:53 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 05 Jul 2002 19:39:53 -0400
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <3D2432AF.8010000@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFFACAB.tim.one@comcast.net>

[Tim]
>> There is no such switch now.  There used to be one. ...

[MAL]
> Interesting. I don't recall any discussions about this...

It was a topic on Python-Dev at the time, very brief, possibly consisting of
no more than a "how about this?" msg, and a "yes, and that too" reply from
Guido.

>> ... although there are two undocumented symbols you can #define such
>> that if their sum is <= 0, small ints waste *more* memory than if you
>> leave the code alone.

> I was talking about NSMALLNEGINTS and NSMALLPOSINTS.

Me too.  They're not there to turn off caching, they're there to tune it.
If you #define them both to 0, you will, as I said, end up using more
memory, not less.  If you #undef them, it will have no effect -- the code
will #define them back to their defaults then.

> Ok, let's make it parseable then:
>
> a) When removing the GC code from the code base by #undef'ing
>     WITH_CYCLE_GC, how much smaller is the Python interpreter ?

I don't know.

> b) ..., how is pybench affected by this (speedup/slowdown/
>     unnoticeable) ?

Ditto.

> c) ..., how many bytes per object are saved for container objects
>     which are GC aware ?

That's platform-dependent.  It's 16 bytes on Win32 using MSVC 6.

> If we're talking about just a few kB in interpreter size
> and only a few kB worth of list and tuples, then removing
> is fine. If we're talking about 100kBs, then you ought to
> reconsider the move.

Why?  Python-Dev is for Python developers, and if nobody here *uses* the
non-feature of being able to compile out cyclic gc, and the hypothetical
people who do use it aren't serious enough about Python to participate here,
there's no paying audience for this continued unused complexity.  If
somebody wants it, they can step up and volunteer to (a) maintain this code,
and (b) test it.  Short of those two happening, it's history.

>> If you don't personally use a specific preprocessor symbol routinely, I
>> won't accept your bare assertion that it makes life easier for anyone.

> I personally know that developers who have tried to create
> a trimmed down version of the interpreter did like the #ifdefs
> for removing certain parts like e.g. the complex numbers very
> much. I'm just lobbying for them.

They can lobby for themselves, provided they still exist.  BTW,
WITHOUT_COMPLEX is the only preprocessor symbol I can think of that was
deliberately intended to make life easier on small platforms (HAVE_UNICODE
may or may not be in that boat -- I don't know why it's there).  Given the
comparatively trivial savings WITHOUT_COMPLEX affords, it hardly seems worth
the bother.  People who have written up the results of serious Python ports
to tiny platforms report needing *major* surgery, far beyond anything these
goofy #ifdefs provide (tiny platforms are, from a std C plus std POSIX view,
deeply broken in many ways).  The best such effort I knew of used to live
here

    http://www.abo.fi/~iporres/python

but that link is dead now, and a Google search doesn't suggest the project
has moved somewhere else.

> After all, someone has to give you a hard time ;-)

Very true, and I thank you for playing along <wink>.




From pinard@iro.umontreal.ca  Sat Jul  6 17:41:40 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 06 Jul 2002 12:41:40 -0400
Subject: [Python-Dev] Re: Priority queue (binary heap) python code
In-Reply-To: <20020628134254.GA14414@panix.com>
References: <LNBBLJKPBEHFEDALKOLCIEKJAAAB.tim.one@comcast.net>
 <036101c21e68$8abed730$6501a8c0@boostconsulting.com>
 <20020628134254.GA14414@panix.com>
Message-ID: <oqznx4g663.fsf@titan.progiciels-bpi.ca>

[Aahz]

> Thank Fredrik for a brilliant job of re-implementing Perl's regex syntax
> into something that I assume is maintainable (haven't looked at the code
> myself) *and* Unicode compliant.

Yes, this seems to be a very good thing for Python.

Speedy regexp engines are notoriously hard to maintain cleanly, at least,
so told me a few successive maintainers of GNU regexp.  Difficult points
are deterministic matching (avoiding backtracking) and POSIX compliance,
and the longest match criterion in particular.

For one, I'm pretty happy with Python regexp implementation, even if it
avoids the above points.  It has other virtues that are well worth the
trade, at least from the experience I have of it so far!

So, in a word, thanks too! :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From pinard@iro.umontreal.ca  Sat Jul  6 17:56:06 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 06 Jul 2002 12:56:06 -0400
Subject: [Python-Dev] Re: Priority queue (binary heap) python code
In-Reply-To: <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID: <oqvg7sg5i1.fsf@titan.progiciels-bpi.ca>


[Kevin O'Connor]

> I often find myself needing priority queues in python, and I've finally
> broken down and written a simple implementation.  [...]  Any chance
> something like this could make it into the standard python library?

Two years ago, I (too!) wrote one (appended below) and I offered it to Guido.
He replied he was not feeling like adding into the Python standard library
each and every interesting algorithm on this earth.  So I did not insist :-).
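
For a quick feel of the structure described in the attachment, here is a
minimal sketch.  It does not use the attached class: it relies on
`heapq`, a module that joined the standard library later (Python 2.3),
and the helper names and sample events are invented for illustration.

```python
import heapq

def is_heap(a):
    """Check a[k] <= a[2*k+1] and a[k] <= a[2*k+2] wherever those exist."""
    n = len(a)
    return all(
        a[k] <= a[child]
        for k in range(n)
        for child in (2 * k + 1, 2 * k + 2)
        if child < n
    )

def heap_sort(items):
    """Push everything, then pop the smallest item repeatedly: O(n log n)."""
    heap = []
    for item in items:
        heapq.heappush(heap, item)
    assert is_heap(heap)  # the invariant holds after every push
    return [heapq.heappop(heap) for _ in range(len(heap))]

# Scheduler-style use: (time, action) pairs come out in time order.
events = [(5, "note off"), (1, "note on"), (3, "pitch bend"), (2, "pedal")]
ordered = heap_sort(events)
print(ordered[0])
```

Index 0 of the list always holds the smallest item, so a scheduler only
ever has to look at the front.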


[Attachment: heap.py]

#!/usr/bin/env python
# Copyright © 2000, 2002 Progiciels Bourbeau-Pinard inc.
# François Pinard <pinard@iro.umontreal.ca>, 2000.

"""\
Handle priority heaps.

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for all k,
counting elements from 0.  For the sake of comparison, non-existent elements
are considered to be infinite.  The interesting property of a heap is that
a[0] is always its smallest element.

The strange invariant above is meant to be an efficient memory representation
for a tournament.  The numbers below are `k', not a[k]:

                                   0

                  1                                 2

          3               4                5               6

      7       8       9       10      11      12      13      14

    15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30


In the tree above, each cell `k' is topping `2*k+1' and `2*k+2'.  In a
usual binary tournament we see in sports, each cell is the winner over
the two cells it tops, and we can trace the winner down the tree to see
all opponents s/he had.  However, in many computer applications of such
tournaments, we do not need to trace the history of a winner.  To be
more memory efficient, when a winner is promoted, we try to replace it by
something else at a lower level, and the rule becomes that a cell and the
two cells it tops contain three different items, but the top cell "wins"
over the two topped cells.

If this heap invariant is protected at all times, index 0 is clearly the
overall winner.  The simplest algorithmic way to remove it and find the
"next" winner is to move some loser (let's say cell 30 in the diagram
above) into the 0 position, and then percolate this new 0 down the tree,
exchanging values, until the invariant is re-established.  This is clearly
logarithmic in the total number of items in the tree.  By iterating over
all items, you get an O(n ln n) sort.

A nice feature of this sort is that you can efficiently insert new items
while the sort is going on, provided that the inserted items are not
"better" than the last 0'th element you extracted.  This is especially
useful in simulation contexts, where the tree holds all incoming events,
and the "win" condition means the smallest scheduled time.  When an event
schedules other events for execution, they are scheduled into the future,
so they can easily go into the heap.  So, a heap is a good structure for
implementing schedulers (this is what I used for my MIDI sequencer :-).

Various structures for implementing schedulers have been extensively
studied, and heaps are good for this, as they are reasonably speedy,
the speed is almost constant, and the worst case is not much different
than the average case.  However, there are other representations which
are more efficient overall, yet the worst cases might be terrible.

Heaps are also very useful in big disk sorts.  You most probably all know
that a big sort implies producing "runs" (which are pre-sorted sequences,
whose size is usually related to the amount of CPU memory), followed by
merging passes over these runs, and this merging is often very cleverly
organised[1].  It is very important that the initial sort produces the
longest runs possible.  Tournaments are a good way to achieve that.  If,
using all the memory available to hold a tournament, you replace and
percolate items that happen to fit the current run, you'll produce runs
which are twice the size of the memory for random input, and much better
for input fuzzily ordered.

Moreover, if you output the 0'th item on disk and get an input which may
not fit in the current tournament (because the value "wins" over the last
output value), it cannot fit in the heap, so the size of the heap decreases.
The freed memory could be cleverly reused immediately for progressively
building a second heap, which grows at exactly the same rate the first
heap is melting.  When the first heap completely vanishes, you switch
heaps and start a new run.  Clever and quite effective!

In a word, heaps are useful memory structures to know.  I use them in a
few applications, and I think it is good to keep a `heap' module around. :-)

--------------------
[1] The disk balancing algorithms which are current, nowadays, are more
annoying than clever, and this is a consequence of the seeking capabilities
of the disks.  On devices which cannot seek, like big tape drives, the
story was quite different, and one had to be very clever to ensure (far
in advance) that each tape movement would be the most effective possible
(that is, would best participate in "progressing" the merge).  Some tapes
were even able to read backwards, and this was also used to avoid the
rewinding time.  Believe me, really good tape sorts were quite spectacular
to watch!  Sorting has always been a Great Art! :-)
"""

class Heap:

    def __init__(self, compare=cmp):
        """\
Set up a new heap.  If COMPARE is given, use it instead of built-in comparison.

COMPARE, given two items, should return negative, zero or positive depending
on whether the first item compares smaller than, equal to or greater than
the second item.
"""
        self.compare = compare
        self.array = []

    def __call__(self):
        """\
A heap instance, when called as a function, returns all its items.
"""
        return self.array

    def __len__(self):
        """\
Return the number of items in the current heap instance.
"""
        return len(self.array)

    def __getitem__(self, index):
        """\
Return the INDEX-th item from the heap instance.  INDEX is usually zero.
"""
        return self.array[index]

    def push(self, item):
        """\
Add ITEM to the current heap instance.
"""
        array = self.array
        compare = self.compare
        array.append(item)
        high = len(array) - 1
        while high > 0:
            low = (high-1)//2  # parent index; // keeps it an integer
            if compare(array[low], array[high]) <= 0:
                break
            array[low], array[high] = array[high], array[low]
            high = low

    def pop(self):
        """\
Remove and return the smallest item from the current heap instance.
"""
        array = self.array
        item = array[0]
        if len(array) == 1:
            del array[0]
        else:
            compare = self.compare
            array[0] = array.pop()
            low, high = 0, 1
            while high < len(array):
                if ((high+1 < len(array)
                     and compare(array[high], array[high+1]) > 0)):
                    high = high+1
                if compare(array[low], array[high]) <= 0:
                    break
                array[low], array[high] = array[high], array[low]
                low, high = high, 2*high+1
        return item

def test(n=2000):
    heap = Heap()
    for k in range(n-1, -1, -1):
        heap.push(k)
    for k in range(n):
        assert k+len(heap) == n
        assert k == heap.pop()

--=-=-=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit


-- 
François Pinard   http://www.iro.umontreal.ca/~pinard

--=-=-=--
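
A note on the run-generation scheme described in the module docstring
above: the following is only a minimal sketch, not part of the attached
module.  It is written for a current Python 3 and assumes the standard
`heapq' module rather than the Heap class above; the names
`replacement_selection' and `memory_size' are invented for illustration.
Items that still fit the current run go back into the first heap, and
items that are too small are collected in a second heap that becomes the
next run.

```python
import heapq
from itertools import islice

def replacement_selection(items, memory_size):
    """Yield sorted runs from ITEMS, holding at most MEMORY_SIZE items."""
    if memory_size < 1:
        raise ValueError("memory_size must be at least 1")
    source = iter(items)
    current = list(islice(source, memory_size))  # heap feeding the current run
    heapq.heapify(current)
    pending = []   # second heap: items that must wait for the next run
    run = []
    while current:
        smallest = heapq.heappop(current)
        run.append(smallest)
        for item in islice(source, 1):          # read at most one new item
            if item >= smallest:
                heapq.heappush(current, item)   # still fits the current run
            else:
                heapq.heappush(pending, item)   # too small: next run
        if not current:
            yield run
            run = []
            current, pending = pending, []      # switch heaps, new run

if __name__ == "__main__":
    for run in replacement_selection([5, 1, 9, 3, 7, 2, 8, 6, 4, 0], 3):
        print(run)
```

The two heaps together never hold more than `memory_size' items, which is
the point of the scheme: the second heap grows exactly as the first shrinks.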



From pinard@iro.umontreal.ca  Sat Jul  6 18:04:05 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 06 Jul 2002 13:04:05 -0400
Subject: [Python-Dev] Re: Priority queue (binary heap) python code
In-Reply-To: <20020625065203.GA27183@hishome.net>
References: <20020624213318.A5740@arizona.localdomain>
 <20020625065203.GA27183@hishome.net>
Message-ID: <oqr8igg54q.fsf@titan.progiciels-bpi.ca>

[Oren Tirosh]

> A sorted list is a much more general-purpose data structure than a priority
> queue and can be used to implement a priority queue.  [...]  The only
> advantage of a heap is O(1) peek which doesn't seem so critical.  [...]
> the internal order of a heap-based priority queue is very non-intuitive and
> quite useless for other purposes while a sorted list is, umm..., sorted!

It has surely occurred to many of us to sort a file (or any set of data)
from the most interesting entry to the least interesting entry, look at
the first 5% to 10%, and drop all the rest.

A heap is a good way to retain the first few percent of the items, without
going to the length of fully sorting all the rest.  By comparison,
it would not be efficient to use `.sort()' then truncate.
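
To make this concrete, here is a small sketch under some assumptions: it
is written for a current Python 3, it uses the standard `heapq' module,
and `keep_best' is a name made up for the example.  It keeps a heap
bounded to COUNT items, so the rest of the input is never sorted.

```python
import heapq

def keep_best(items, count):
    """Return the COUNT largest items, best first, without a full sort."""
    if count <= 0:
        return []
    best = []                      # min-heap of the best items seen so far
    for item in items:
        if len(best) < count:
            heapq.heappush(best, item)
        elif item > best[0]:       # beats the weakest item currently kept
            heapq.heapreplace(best, item)
    return sorted(best, reverse=True)

if __name__ == "__main__":
    print(keep_best([3, 1, 4, 1, 5, 9, 2, 6], 3))
```

Each item costs at most O(log COUNT), and only the few retained items are
sorted at the end; `heapq.nlargest' offers the same service ready-made.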

Within a simulation, future events are scheduled while current events
are being processed, so we do not have all the events to `.sort()' first.
It is likely that heaps would beat insertion after binary search, given
of course that both are implemented with the same care, speed-wise.
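
As an illustration only, a toy scheduler along these lines might look as
follows.  It assumes a current Python 3 and the standard `heapq' module;
the `Scheduler' class, its methods and the `tick' action are all invented
for the example.

```python
import heapq
from itertools import count

class Scheduler:
    """Toy discrete-event scheduler; pending events live in a binary heap."""

    def __init__(self):
        self.now = 0
        self._events = []          # heap of (time, sequence, action)
        self._sequence = count()   # tie-breaker so actions are never compared

    def schedule(self, delay, action):
        """Run ACTION(scheduler) DELAY time units from now (DELAY >= 0)."""
        heapq.heappush(self._events,
                       (self.now + delay, next(self._sequence), action))

    def run(self):
        while self._events:
            self.now, _, action = heapq.heappop(self._events)
            action(self)

log = []

def tick(scheduler):
    log.append(scheduler.now)
    if scheduler.now < 3:
        scheduler.schedule(1, tick)   # events schedule further events

scheduler = Scheduler()
scheduler.schedule(2, lambda s: log.append("late"))
scheduler.schedule(0, tick)
scheduler.run()
print(log)
```

New events arrive while older ones are being processed, so there is never
a complete list to sort up front; each push and pop costs O(log n).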

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From pinard@iro.umontreal.ca  Sat Jul  6 23:43:13 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 06 Jul 2002 18:43:13 -0400
Subject: [Python-Dev] Re: The OOM-Killer vs. Python
In-Reply-To: <m3zo0xip76.fsf@mira.informatik.hu-berlin.de>
References: <3c9e2b5f.9993062@news.t-online.de>
 <m3zo0xip76.fsf@mira.informatik.hu-berlin.de>
Message-ID: <oqk7o8tr3y.fsf@titan.progiciels-bpi.ca>

[Martin v. Loewis]

> If you don't create any cyclic garbage, you can find all container
> objects with gc.get_objects.  If you find that gc.get_objects does not
> grow longer over time, but your process still consumes more memory,
> one of your C extensions has a refcounting bug.

Out of curiosity, I checked with the latest HTML documentation (2.3a0)
and did not find documentation for `gc.get_objects'.  Should it be there?

P.S. - Looking at http://www.python.org/dev/doc/devel/lib/module-gc.html.
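
For readers wanting to try the quoted advice, a minimal sketch of such a
check follows, assuming a current Python 3.  Only `gc.collect' and
`gc.get_objects' come from the gc module; `tracked_object_count' and the
deliberately retained `leak' list are made up for the example.

```python
import gc

def tracked_object_count():
    """Number of container objects currently tracked by the collector."""
    gc.collect()                  # drop collectable cycles first
    return len(gc.get_objects())

before = tracked_object_count()
leak = [[] for _ in range(1000)]  # deliberately keep 1000 new lists alive
after = tracked_object_count()
print(after - before)             # growth in tracked container objects
```

If this count stays flat over time while the process keeps growing, the
memory is presumably being held somewhere the collector cannot see.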

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From martin@v.loewis.de  Sun Jul  7 08:50:36 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 07 Jul 2002 09:50:36 +0200
Subject: [Python-Dev] Re: The OOM-Killer vs. Python
In-Reply-To: <oqk7o8tr3y.fsf@titan.progiciels-bpi.ca>
References: <3c9e2b5f.9993062@news.t-online.de>
 <m3zo0xip76.fsf@mira.informatik.hu-berlin.de>
 <oqk7o8tr3y.fsf@titan.progiciels-bpi.ca>
Message-ID: <m33cuwm0xf.fsf@mira.informatik.hu-berlin.de>

pinard@iro.umontreal.ca (François Pinard) writes:

> Out of curiosity, I checked with the latest HTML documentation (2.3a0)
> and did not find documentation for `gc.get_objects'.  Should it be there?

I think so, yes. I filed bug #578308.

Regards,
Martin



From vinay_sajip@red-dove.com  Mon Jul  8 02:16:32 2002
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Mon, 8 Jul 2002 02:16:32 +0100
Subject: [Python-Dev] PEP 282 Implementation
Message-ID: <00e001c2261d$19bfc320$652b6992@alpha>

I've uploaded my logging module, the proposed implementation for PEP 282,
for committer review, to the SourceForge patch manager:

http://sourceforge.net/tracker/index.php?func=detail&aid=578494&group_id=5470&atid=305470

I've assigned it to Mark Hammond as (a) he had posted some comments to Trent
Mick's original PEP posting, and (b) Barry Warsaw advised not assigning to
PythonLabs people on account of their current workload.

The file logging.py is (apart from some test scripts) all that's supposed to
go into Python 2.3. The file logging-0.4.6.tar.gz contains the module, an
updated version of the PEP (which I mailed to Barry Warsaw on 26th June),
numerous test/example scripts, TeX documentation etc. You can also refer to

http://www.red-dove.com/python_logging.html

Here's hoping for a speedy review :-)

Regards,


Vinay Sajip




From barry@zope.com  Mon Jul  8 14:58:30 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 8 Jul 2002 09:58:30 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15657.39558.325764.651122@anthem.wooz.org>

>>>>> "MvL" == Martin v Loewis <martin@v.loewis.de> writes:

    >> For cases like the email package or distutils, I think it's
    >> perfectly OK to only provide the updates for older Python
    >> releases as separate download. Both have their own way of life,
    >> so IMHO this is acceptable.

    MvL> In neither case, this is really possible: Once you have the
    MvL> package in the Python core, a separate installation in
    MvL> site-packages cannot override the core implementation.

    MvL> I believe that was the motivation for Barry to consider
    MvL> backporting large amounts of changes. The same holds for
    MvL> distutils, except that there aren't that many major changes.

Exactly.  For my own purposes (e.g. Mailman) it's no problem; I
provide my own email package and arrange for MM to use it before the
Python standard one.  I actually think it's the right thing for a
normal distutils install to not override the standard version.
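
A rough sketch of why search order matters here, assuming a current
Python 3: the two temporary directories `standard' and `bundled' and the
module name `demo_email_pkg' are hypothetical stand-ins, and nothing in
the example touches the real email package.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
for dirname, version in (("standard", "1.0"), ("bundled", "2.0")):
    path = os.path.join(root, dirname)
    os.mkdir(path)
    with open(os.path.join(path, "demo_email_pkg.py"), "w") as f:
        f.write("VERSION = %r\n" % version)

sys.path.append(os.path.join(root, "standard"))    # found only as a fallback
sys.path.insert(0, os.path.join(root, "bundled"))  # searched first
importlib.invalidate_caches()

import demo_email_pkg
print(demo_email_pkg.VERSION)
```

The first directory on the search path that provides the module wins,
which is why a copy installed later on the path cannot override one that
is found earlier.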

But I also think there is a use case for allowing a standard package
to be separately upgraded for a particular Python installation.  As
more and more standard Python libraries are packagized, they will
probably have life-cycles separate from the Python core themselves
(this will only be more true once we evolve toward a CPAN-like
arrangement).  So I think we will eventually need a way to upgrade
(not override :) a standard library package.

My suggestion would be to prepend a new directory on the standard
search path, let's call it site-upgrade for now.  A normal "python
setup.py install" would still install to site-packages, but we'd add a
"python setup.py upgrade" command that would install to site-upgrade.

-Barry



From barry@zope.com  Mon Jul  8 15:01:13 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 8 Jul 2002 10:01:13 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <3D255CB0.9080502@lemburg.com>
Message-ID: <15657.39721.348050.614837@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> True, but it is easily possible to install those packages in
    MAL> a directory which is scanned before the standard lib, thus
    MAL> overriding the distribution versions:

    MAL> python setup.py install install-lib=~/lib

A general solution that requires users to set environment variables
isn't acceptable IMO.

    >> I believe that was the motivation for Barry to consider
    >> backporting large amounts of changes. The same holds for
    >> distutils, except that there aren't that many major changes.

    MAL> If that's the case, then we probably ought to make it easier
    MAL> for user installed Python add-ons to override builtin
    MAL> packages.

+1

    MAL> This would help to get rid of the hacks which the PyXML
    MAL> distribution has to use in order to achieve the same.

Yup, see my previous response.
-Barry



From mal@lemburg.com  Mon Jul  8 15:14:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jul 2002 16:14:26 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com>	<m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>	<3D22ADD9.1030901@lemburg.com>	<15650.64375.162977.160780@anthem.wooz.org>	<3D2433B9.9080102@lemburg.com>	<m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org>
Message-ID: <3D299E42.70200@lemburg.com>

Barry A. Warsaw wrote:
> But I also think there is a use case for allowing a standard package
> to be separately upgraded for a particular Python installation.  As
> more and more standard Python libraries are packagized, they will
> probably have life-cycles separate from the Python core themselves
> (this will only be more true once we evolve toward a CPAN-like
> arrangement).  So I think we will eventually need a way to upgrade
> (not override :) a standard library package.

+1

> My suggestion would be to prepend a new directory on the standard
> search path, let's call it site-upgrade for now.  A normal "python
> setup.py install" would still install to site-packages, but we'd add a
> "python setup.py upgrade" command that would install to site-upgrade.

+1 (maybe with s/site-upgrade/system-packages)

Not sure whether it's already possible or not, but I'd prefer
to keep the install command and have the package provide this
information (site-packages vs. system-packages) as part of the
setup.py or setup.cfg file.

Perhaps we could have some kind of category for distutils
packages which marks them as system add-ons vs. site add-ons.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From martin@v.loewis.de  Mon Jul  8 17:24:28 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 08 Jul 2002 18:24:28 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <3D299E42.70200@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <15657.39558.325764.651122@anthem.wooz.org>
 <3D299E42.70200@lemburg.com>
Message-ID: <m3n0t2i3wj.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Perhaps we could have some kind of category for distutils
> packages which marks them as system add-ons vs. site add-ons.

One approach would be for distutils to have a list of system packages
built-in, depending on the Python release. That list would cover PyXML
and email; perhaps others.

Of course, taking the _xmlplus hack out of PyXML will cause backwards
compatibility problems (regardless of what the alternative hook is).

Regards,
Martin



From gmcm@hypernet.com  Mon Jul  8 17:39:55 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 8 Jul 2002 12:39:55 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <m3n0t2i3wj.fsf@mira.informatik.hu-berlin.de>
References: <3D299E42.70200@lemburg.com>
Message-ID: <3D29881B.30590.6BF1A7E@localhost>

On 8 Jul 2002 at 18:24, Martin v. Loewis wrote:

> Of couse, taking the _xmlplus hack out of PyXML
> will cause backwards compatibility problems
> (regardless what the alternative hook is). 

How? As long as "import xml" gets them _xmlplus, I
can't see how it would break anything.

I'd say it's broken already, since code written for
_xmlplus assumes a different contract, and that
is completely implicit.

-- Gordon
http://www.mcmillan-inc.com/




From skip@pobox.com  Sat Jul  6 17:17:30 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 6 Jul 2002 11:17:30 -0500
Subject: [Python-Dev] Death to WITH_CYCLE_GC
In-Reply-To: <3D2432AF.8010000@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHEEDBDGAA.tim@zope.com>
 <3D2432AF.8010000@lemburg.com>
Message-ID: <15655.6170.182746.75875@localhost.localdomain>

    mal> I personally know that developers which have tried to create a
    mal> trimmed down version of the interpreter did like the #ifdefs for
    mal> removing certain parts like e.g. the complex numbers very much. I'm
    mal> just lobbying for them.

I think complex numbers are a bit different.  They had the #ifdef from the start
precisely because it was expected they wouldn't be needed in some cases
where memory footprint mattered.  WITH_CYCLE_GC was just a debugging
#ifdef.

Skip



From barry@zope.com  Mon Jul  8 17:51:58 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 8 Jul 2002 12:51:58 -0400
Subject: [Python-Dev] PEP 282 Implementation
References: <00e001c2261d$19bfc320$652b6992@alpha>
Message-ID: <15657.49966.835748.511346@anthem.wooz.org>

>>>>> "VS" == Vinay Sajip <vinay_sajip@red-dove.com> writes:

    VS> The file logging.py is (apart from some test scripts) all
    VS> that's supposed to go into Python 2.3. The file
    VS> logging-0.4.6.tar.gz contains the module, an updated version
    VS> of the PEP (which I mailed to Barry Warsaw on 26th June),
    VS> numerous test/example scripts, TeX documentation etc. You can
    VS> also refer to

PEP 282 update has been installed.

One comment about the PEP: where `lvl' is used as an argument to
methods and functions, I think we shouldn't be so cute.  Please spell
it out as `level'.

-Barry



From martin@v.loewis.de  Mon Jul  8 18:03:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 08 Jul 2002 19:03:44 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <3D29881B.30590.6BF1A7E@localhost>
References: <3D299E42.70200@lemburg.com> <3D29881B.30590.6BF1A7E@localhost>
Message-ID: <m365zqi233.fsf@mira.informatik.hu-berlin.de>

"Gordon McMillan" <gmcm@hypernet.com> writes:

> > Of couse, taking the _xmlplus hack out of PyXML
> > will cause backwards compatibility problems
> > (regardless what the alternative hook is). 
> 
> How? As long as "import xml" gets them _xmlplus, I
> can't see how it would break anything.

Of course, once the hack is taken out of PyXML, there won't be
any _xmlplus anymore.

I was thinking about applications that package Python applications,
like freeze or Installer. People might have taken into account that
they have to look inside _xmlplus as well. If the hack changes, they
have to take into account that they need to look somewhere else,
instead.

Regards,
Martin



From gmcm@hypernet.com  Mon Jul  8 18:20:02 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Mon, 8 Jul 2002 13:20:02 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <m365zqi233.fsf@mira.informatik.hu-berlin.de>
References: <3D29881B.30590.6BF1A7E@localhost>
Message-ID: <3D299182.4516.6E3D582@localhost>

On 8 Jul 2002 at 19:03, Martin v. Loewis wrote:

> "Gordon McMillan" <gmcm@hypernet.com> writes:
> 
> > > Of couse, taking the _xmlplus hack out of PyXML
> > > will cause backwards compatibility problems
> > > (regardless what the alternative hook is). 
> > 
> > How? As long as "import xml" gets them _xmlplus, I
> > can't see how it would break anything.
> 
> Of course, once the hack that is taken out of PyXML,
> there won't be any _xmlplus anymore.
> 
> I was thinking about applications that package
> Python applications, like freeze or Installer.
> People might have taken into account that they have
> to look inside _xmlplus as well. If the hack
> changes, they have to take into account that they
> need to look somewhere else, instead. 

py2exe doesn't do _xmlplus (unless that's changed
recently) - Thomas has people overlay xml with
_xmlplus.

Installer does do it, but it's a horrid hack (one bad
hack deserves another) and I'd be delighted to
remove it.

-- Gordon
http://www.mcmillan-inc.com/




From barry@zope.com  Mon Jul  8 18:23:35 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 8 Jul 2002 13:23:35 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <15657.39558.325764.651122@anthem.wooz.org>
 <3D299E42.70200@lemburg.com>
Message-ID: <15657.51863.523283.977726@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    >> My suggestion would be to prepend a new directory on the
    >> standard search path, let's call it site-upgrade for now.  A
    >> normal "python setup.py install" would still install to
    >> site-packages, but we'd add a "python setup.py upgrade" command
    >> that would install to site-upgrade.

    MAL> +1 (maybe with s/site-upgrade/system-packages)

I like that: system-packages.

    MAL> Not sure whether it's already possible or not, but I'd prefer
    MAL> to keep the install command and have the package provide this
    MAL> information (site-packages vs. system-packages) as part of
    MAL> the setup.py or setup.cfg file.

Ok, yeah.  I think it would be a good idea for the package to somehow
register itself as an upgrade to an existing system package.  I still
want the install command to install to site-packages, but whether the
upgrade happens as an upgrade command or "python setup.py install -U"
or some other mechanism is up for grabs.

-Barry



From David Abrahams" <david.abrahams@rcn.com  Mon Jul  8 19:21:04 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 8 Jul 2002 14:21:04 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>

I keep running into the problem that there is no reliable way to introspect
about whether a type supports multi-pass iterability (in the sense that an
input stream might support only a single pass, but a list supports multiple
passes). I suppose you could check for __getitem__, but that wouldn't cover
linked lists, for example.

Can anyone channel Guido's intent for me? Is this an oversight or a
deliberate design decision? Is there an interface for checking
multi-pass-ability that I've missed?

TIA,
Dave





From jacobs@penguin.theopalgroup.com  Mon Jul  8 19:30:33 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 8 Jul 2002 14:30:33 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
Message-ID: <Pine.LNX.4.44.0207081429190.21098-100000@penguin.theopalgroup.com>

On Mon, 8 Jul 2002, David Abrahams wrote:
> I keep running into the problem that there is no reliable way to introspect
> about whether a type supports multi-pass iterability (in the sense that an
> input stream might support only a single pass, but a list supports multiple
> passes). I suppose you could check for __getitem__, but that wouldn't cover
> linked lists, for example.
> 
> Can anyone channel Guido's intent for me? Is this an oversight or a
> deliberate design decision? Is there an interface for checking
> multi-pass-ability that I've missed?

As far as I can tell, there is no published Python mechanism that
distinguishes "input iterators" from "forward iterators" (using the C++
parlance).

-Kevin


--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From tim.one@comcast.net  Mon Jul  8 19:44:25 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 08 Jul 2002 14:44:25 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net>

[David Abrahams]
> I keep running into the problem that there is no reliable way to
> introspect about whether a type supports multi-pass iterability (in the
> sense that an input stream might support only a single pass, but a list
> supports multiple passes). I suppose you could check for __getitem__, but
> that wouldn't cover linked lists, for example.
>
> Can anyone channel Guido's intent for me? Is this an oversight or a
> deliberate design decision? Is there an interface for checking
> multi-pass-ability that I've missed?

The language makes no such distinctions.  If an app wants to make them, it's
up to the app to implement them.  Likewise for a way to tell a multipass
iterator to "start over again".  The Python iteration protocol has only two
methods, .next() to get "the next" item, and .__iter__() to return self; given a
random iterator, those are the only things you can rely on.
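
One way an app might implement such a distinction is sketched below, for
a current Python 3 (where the .next() method is spelled `__next__').  The
helper name `is_single_pass' is invented, and the test it applies is only
a convention that well-behaved iterators follow, not a guarantee made by
the language.

```python
def is_single_pass(obj):
    """Heuristic: an iterator returns itself from iter(), whereas a
    re-iterable container hands out a fresh iterator each time."""
    return iter(obj) is obj

numbers = [1, 2, 3]
stream = (n for n in numbers)    # generator: consumable only once

print(is_single_pass(numbers), is_single_pass(stream))

first, second = list(stream), list(stream)
print(first, second)             # the second pass finds nothing left
```

A class whose __iter__ returns self but which can somehow be rewound would
fool this check, so it remains up to the app to decide what it relies on.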




From barry@zope.com  Mon Jul  8 21:03:58 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 8 Jul 2002 16:03:58 -0400
Subject: [Python-Dev] New Persistence SIG created
Message-ID: <15657.61486.222685.665859@anthem.wooz.org>

As recently discussed on meta-sig@python.org, we have created a new
SIG focussed on producing a common persistence and transactional
framework for Python programs.  This SIG is called
persistence-sig@python.org.

For more information on the SIG, its mission, and deadlines see

    http://www.python.org/sigs/persistence-sig/

To join the mailing list see

    http://mail.python.org/mailman-21/listinfo/persistence-sig

-Barry



From jacobs@penguin.theopalgroup.com  Mon Jul  8 21:48:37 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 8 Jul 2002 16:48:37 -0400 (EDT)
Subject: [Python-Dev] Are we having https/ssl problems?
Message-ID: <Pine.LNX.4.44.0207081615150.32139-100000@penguin.theopalgroup.com>

Hi all,

This is not a bug report.  It is more of a query to find out if there are
known problems with the current Python 2.3 CVS regarding SSL, httplib w/
https, or urllib w/ https.  I seem to remember tuning out some discussions
on timeout sockets and SSL of late, so I thought I would ask.  Here is code
that has worked previously, but does not in the current CVS:

  import urllib

  def get(url):
    u = urllib.urlopen(url)
    junk = ''
    while 1:
      chunk = u.read()
      if not chunk:
        break
      junk += chunk
    return junk

  exlen = len(get('https://dbserv2.theopalgroup.com/mediumfile'))
  aclen = len(get('https://dbserv2.theopalgroup.com/mediumfile'))

  print "HTTP len = %d, HTTPS len = %d" % (exlen,aclen)

> python2.0 testhttps.py
HTTP len = 37140, HTTPS len = 37140

> python2.1 testhttps.py
HTTP len = 37140, HTTPS len = 37140

> python2.2 testhttps.py
HTTP len = 37140, HTTPS len = 37140

> python2.3 testhttps.py
HTTP len = 37140, HTTPS len = 0

If this doesn't ring a bell with anyone, I will battle SourceForge once more
and file a bug report.  The interesting thing is that the problem is
sensitive to the size of the file requested.  Here is what happens when I
use 'smallfile' instead of 'mediumfile':

> python2.3 testhttps.py
HTTP len = 3713, HTTPS len = 3713

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From jacobs@penguin.theopalgroup.com  Mon Jul  8 21:51:16 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 8 Jul 2002 16:51:16 -0400 (EDT)
Subject: [Python-Dev] Are we having https/ssl problems?
In-Reply-To: <Pine.LNX.4.44.0207081615150.32139-100000@penguin.theopalgroup.com>
Message-ID: <Pine.LNX.4.44.0207081650210.32645-100000@penguin.theopalgroup.com>

On Mon, 8 Jul 2002, Kevin Jacobs wrote:
>   exlen = len(get('https://dbserv2.theopalgroup.com/mediumfile'))
                     ^^^^^
Oops.  Obviously, this should be http.

The trials of cut-n-paste,
-Kevin


--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From mal@lemburg.com  Mon Jul  8 21:59:59 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jul 2002 22:59:59 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>	<3D220A86.5070003@lemburg.com>	<m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>	<3D22ADD9.1030901@lemburg.com>	<15650.64375.162977.160780@anthem.wooz.org>	<3D2433B9.9080102@lemburg.com>	<m3znx79rk3.fsf@mira.informatik.hu-berlin.de>	<15657.39558.325764.651122@anthem.wooz.org>	<3D299E42.70200@lemburg.com> <15657.51863.523283.977726@anthem.wooz.org>
Message-ID: <3D29FD4F.4060607@lemburg.com>

Barry A. Warsaw wrote:
>>>>>>"MAL" == M  <mal@lemburg.com> writes:
>>>>>
> 
>     >> My suggestion would be to prepend a new directory on the
>     >> standard search path, let's call it site-upgrade for now.  A
>     >> normal "python setup.py install" would still install to
>     >> site-packages, but we'd add a "python setup.py upgrade" command
>     >> that would install to site-upgrade.
> 
>     MAL> +1 (maybe with s/site-upgrade/system-packages)
> 
> I like that: system-packages.
> 
>     MAL> Not sure whether it's already possible or not, but I'd prefer
>     MAL> to keep the install command and have the package provide this
>     MAL> information (site-packages vs. system-packages) as part of
>     MAL> the setup.py or setup.cfg file.
> 
> Ok, yeah.  I think it would be a good idea for the package to somehow
> register itself as an upgrade to an existing system package.  I still
> want the install command to install to site-packages, but whether the
> upgrade happens as an upgrade command or "python setup.py install -U"
> or some other mechanism is up for grabs.

Hmm, maybe I wasn't clear enough: I think that a distutils
package should have a flag in its setup.py which lets distutils
tell whether it's a site package or a system package, e.g.

setup(... pkgtype='site-package' ...)
vs.
setup(... pkgtype='system-package' ...)

(with pkgtype='site-package' as default value if not given)

The user would in both cases type 'python setup.py install'
but the install command would automatically choose the
right target subdir (site-packages/ or system-packages/).
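
Roughly, the selection could work like the sketch below, written for a
current Python 3.  Everything in it is hypothetical: distutils has no
`pkgtype' option and no system-packages directory, and `install_dir' and
`TARGET_SUBDIRS' are names made up to illustrate the idea.

```python
import os

# Hypothetical mapping from the proposed pkgtype flag to a target subdir.
TARGET_SUBDIRS = {
    "site-package": "site-packages",
    "system-package": "system-packages",
}

def install_dir(lib_root, pkgtype="site-package"):
    """Return the directory the install command would copy files into."""
    try:
        return os.path.join(lib_root, TARGET_SUBDIRS[pkgtype])
    except KeyError:
        raise ValueError("unknown pkgtype: %r" % (pkgtype,))

if __name__ == "__main__":
    print(install_dir("/usr/lib/python2.3"))
    print(install_dir("/usr/lib/python2.3", "system-package"))
```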

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From andymac@bullseye.apana.org.au  Mon Jul  8 13:53:01 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Mon, 8 Jul 2002 23:53:01 +1100 (edt)
Subject: [Python-Dev] test_socket failure on FreeBSD
In-Reply-To: <200206192037.g5JKbSj03086@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.OS2.4.32.0207082338440.28004-400000@tenring.andymac.org>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

---888574994-8578-1026132781=:28004
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Wed, 19 Jun 2002, Guido van Rossum wrote:

> There are probably some differences in the socket semantics.  I'd
> appreciate it if you could provide a patch or at least a clue!

I've not read enough Stevens to grok sockets code (yet) :-(

However, I hope that the instrumented verbose output of test_socket might
give you a clue....

I've attached the diff from the version of test_socket (vs recent CVS)
that I used, as well as output from test_socket on FreeBSD 4.4 and
OS/2+EMX.  Getting the FreeBSD issues sorted is a higher priority for me
than getting OS/2+EMX working (though that would be nice too).

Please let me know if there's more testing/debugging I can do.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia

---888574994-8578-1026132781=:28004
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.py.diff"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0207082353010.28004@tenring.andymac.org>
Content-Description: test_socket.py.diff
Content-Disposition: attachment; filename="test_socket.py.diff"

KioqIHRlc3Rfc29ja2V0LnB5Lm9yaWcJU3VuIEp1biAzMCAyMjoyOTo1NyAy
MDAyDQotLS0gdGVzdF9zb2NrZXQucHkJTW9uIEp1bCAgOCAyMzoxNTo0MSAy
MDAyDQoqKioqKioqKioqKioqKioNCioqKiA4LDEzICoqKioNCi0tLSA4LDE0
IC0tLS0NCiAgaW1wb3J0IHRpbWUNCiAgaW1wb3J0IHRocmVhZCwgdGhyZWFk
aW5nDQogIGltcG9ydCBRdWV1ZQ0KKyBpbXBvcnQgdHJhY2ViYWNrDQogIA0K
ICBQT1JUID0gNTAwMDcNCiAgSE9TVCA9ICdsb2NhbGhvc3QnDQoqKioqKioq
KioqKioqKioNCioqKiAzNDQsMzQ5ICoqKioNCi0tLSAzNDUsMzUxIC0tLS0N
CiAgICAgIGRlZiB0ZXN0UmVjdkZyb20oc2VsZik6DQogICAgICAgICAgIiIi
VGVzdGluZyBsYXJnZSByZWN2ZnJvbSgpIG92ZXIgVENQLiIiIg0KICAgICAg
ICAgIG1zZywgYWRkciA9IHNlbGYuY2xpX2Nvbm4ucmVjdmZyb20oMTAyNCkN
CisgICAgICAgICBwcmludCAiXG5tc2c9JyVzJywgYWRkcj0nJXMnIiAlICht
c2csIHJlcHIoYWRkcikpDQogICAgICAgICAgaG9zdG5hbWUsIHBvcnQgPSBh
ZGRyDQogICAgICAgICAgIyNzZWxmLmFzc2VydEVxdWFsKGhvc3RuYW1lLCBz
b2NrZXQuZ2V0aG9zdGJ5bmFtZSgnbG9jYWxob3N0JykpDQogICAgICAgICAg
c2VsZi5hc3NlcnRFcXVhbChtc2csIE1TRykNCioqKioqKioqKioqKioqKg0K
KioqIDM1NCwzNjEgKioqKg0KLS0tIDM1NiwzNjUgLS0tLQ0KICAgICAgZGVm
IHRlc3RPdmVyRmxvd1JlY3ZGcm9tKHNlbGYpOg0KICAgICAgICAgICIiIlRl
c3RpbmcgcmVjdmZyb20oKSBpbiBjaHVua3Mgb3ZlciBUQ1AuIiIiDQogICAg
ICAgICAgc2VnMSwgYWRkciA9IHNlbGYuY2xpX2Nvbm4ucmVjdmZyb20obGVu
KE1TRyktMykNCisgICAgICAgICBwcmludCAiXG5zZWcxPSclcycsIGFkZHI9
JyVzJyIgJSAoc2VnMSwgcmVwcihhZGRyKSkNCiAgICAgICAgICBzZWcyLCBh
ZGRyID0gc2VsZi5jbGlfY29ubi5yZWN2ZnJvbSgxMDI0KQ0KICAgICAgICAg
IG1zZyA9IHNlZzEgKyBzZWcyDQorICAgICAgICAgcHJpbnQgInNlZzI9JyVz
JywgYWRkcj0nJXMnIiAlIChzZWcyLCByZXByKGFkZHIpKQ0KICAgICAgICAg
IGhvc3RuYW1lLCBwb3J0ID0gYWRkcg0KICAgICAgICAgICMjc2VsZi5hc3Nl
cnRFcXVhbChob3N0bmFtZSwgc29ja2V0LmdldGhvc3RieW5hbWUoJ2xvY2Fs
aG9zdCcpKQ0KICAgICAgICAgIHNlbGYuYXNzZXJ0RXF1YWwobXNnLCBNU0cp
DQoqKioqKioqKioqKioqKioNCioqKiA0NDgsNDUzICoqKioNCi0tLSA0NTIs
NDU4IC0tLS0NCiAgICAgICAgICBleGNlcHQgc29ja2V0LmVycm9yOg0KICAg
ICAgICAgICAgICBwYXNzDQogICAgICAgICAgZWxzZToNCisgICAgICAgICAg
ICAgcHJpbnQgIlxuY29ubj0iICsgcmVwcihjb25uKSArICJcbmFkZHI9IiAr
IHJlcHIoYWRkcikNCiAgICAgICAgICAgICAgc2VsZi5mYWlsKCJFcnJvciB0
cnlpbmcgdG8gZG8gbm9uLWJsb2NraW5nIGFjY2VwdC4iKQ0KICAgICAgICAg
IHJlYWQsIHdyaXRlLCBlcnIgPSBzZWxlY3Quc2VsZWN0KFtzZWxmLnNlcnZd
LCBbXSwgW10pDQogICAgICAgICAgaWYgc2VsZi5zZXJ2IGluIHJlYWQ6DQoq
KioqKioqKioqKioqKioNCioqKiA0NzUsNDgwICoqKioNCi0tLSA0ODAsNDg2
IC0tLS0NCiAgICAgICAgICBleGNlcHQgc29ja2V0LmVycm9yOg0KICAgICAg
ICAgICAgICBwYXNzDQogICAgICAgICAgZWxzZToNCisgICAgICAgICAgICAg
cHJpbnQgIlxuY29ubj0iICsgcmVwcihjb25uKSArICJcbmFkZHI9IiArIHJl
cHIoYWRkcikNCiAgICAgICAgICAgICAgc2VsZi5mYWlsKCJFcnJvciB0cnlp
bmcgdG8gZG8gbm9uLWJsb2NraW5nIHJlY3YuIikNCiAgICAgICAgICByZWFk
LCB3cml0ZSwgZXJyID0gc2VsZWN0LnNlbGVjdChbY29ubl0sIFtdLCBbXSkN
CiAgICAgICAgICBpZiBjb25uIGluIHJlYWQ6DQoqKioqKioqKioqKioqKioN
CioqKiA1NDQsNTUwICoqKioNCiAgICAgICAgICBzZWxmLmNsaV9maWxlLndy
aXRlKE1TRykNCiAgICAgICAgICBzZWxmLmNsaV9maWxlLmZsdXNoKCkNCiAg
DQohIGRlZiBtYWluKCk6DQogICAgICBzdWl0ZSA9IHVuaXR0ZXN0LlRlc3RT
dWl0ZSgpDQogICAgICBzdWl0ZS5hZGRUZXN0KHVuaXR0ZXN0Lm1ha2VTdWl0
ZShHZW5lcmFsTW9kdWxlVGVzdHMpKQ0KICAgICAgc3VpdGUuYWRkVGVzdCh1
bml0dGVzdC5tYWtlU3VpdGUoQmFzaWNUQ1BUZXN0KSkNCi0tLSA1NTAsNTU2
IC0tLS0NCiAgICAgICAgICBzZWxmLmNsaV9maWxlLndyaXRlKE1TRykNCiAg
ICAgICAgICBzZWxmLmNsaV9maWxlLmZsdXNoKCkNCiAgDQohIGRlZiB0ZXN0
X21haW4oKToNCiAgICAgIHN1aXRlID0gdW5pdHRlc3QuVGVzdFN1aXRlKCkN
CiAgICAgIHN1aXRlLmFkZFRlc3QodW5pdHRlc3QubWFrZVN1aXRlKEdlbmVy
YWxNb2R1bGVUZXN0cykpDQogICAgICBzdWl0ZS5hZGRUZXN0KHVuaXR0ZXN0
Lm1ha2VTdWl0ZShCYXNpY1RDUFRlc3QpKQ0KKioqKioqKioqKioqKioqDQoq
KiogNTU0LDU1NyAqKioqDQogICAgICB0ZXN0X3N1cHBvcnQucnVuX3N1aXRl
KHN1aXRlKQ0KICANCiAgaWYgX19uYW1lX18gPT0gIl9fbWFpbl9fIjoNCiEg
ICAgIG1haW4oKQ0KLS0tIDU2MCw1NjMgLS0tLQ0KICAgICAgdGVzdF9zdXBw
b3J0LnJ1bl9zdWl0ZShzdWl0ZSkNCiAgDQogIGlmIF9fbmFtZV9fID09ICJf
X21haW5fXyI6DQohICAgICB0ZXN0X21haW4oKQ0K
---888574994-8578-1026132781=:28004
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.log.fbsd44"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0207082353011.28004@tenring.andymac.org>
Content-Description: test_socket.log.fbsd44
Content-Disposition: attachment; filename="test_socket.log.fbsd44"

dGVzdF9zb2NrZXQNClRlc3RpbmcgZm9yIG1pc3Npb24gY3JpdGljYWwgY29u
c3RhbnRzLiAuLi4gb2sNClRlc3RpbmcgZ2V0c2VydmJ5bmFtZSgpLiAuLi4g
b2sNClRlc3RpbmcgZ2V0c29ja29wdCgpLiAuLi4gb2sNClRlc3RpbmcgaG9z
dG5hbWUgcmVzb2x1dGlvbiBtZWNoYW5pc21zLiAuLi4gb2sNCk1ha2luZyBz
dXJlIGdldG5hbWVpbmZvIGRvZXNuJ3QgY3Jhc2ggdGhlIGludGVycHJldGVy
LiAuLi4gb2sNClRlc3RpbmcgZm9yIGV4aXN0YW5jZSBvZiBub24tY3J1Y2lh
bCBjb25zdGFudHMuIC4uLiBvaw0KVGVzdGluZyByZWZlcmVuY2UgY291bnQg
Zm9yIGdldG5hbWVpbmZvLiAuLi4gb2sNClRlc3Rpbmcgc2V0c29ja29wdCgp
LiAuLi4gb2sNClRlc3RpbmcgZ2V0c29ja25hbWUoKS4gLi4uIG9rDQpUZXN0
aW5nIHRoYXQgc29ja2V0IG1vZHVsZSBleGNlcHRpb25zLiAuLi4gb2sNClRl
c3RpbmcgZnJvbWZkKCkuIC4uLiBvaw0KVGVzdGluZyByZWNlaXZlIGluIGNo
dW5rcyBvdmVyIFRDUC4gLi4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgaW4g
Y2h1bmtzIG92ZXIgVENQLiAuLi4gDQpzZWcxPSdNaWNoYWVsIEdpbGZpeCB3
YXMgaGUnLCBhZGRyPSdOb25lJw0Kc2VnMj0ncmUNCicsIGFkZHI9J05vbmUn
DQpFUlJPUg0KVGVzdGluZyBsYXJnZSByZWNlaXZlIG92ZXIgVENQLiAuLi4g
b2sNClRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBvdmVyIFRDUC4gLi4uIA0K
bXNnPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGVyZQ0KJywgYWRkcj0nTm9uZScN
CkVSUk9SDQpUZXN0aW5nIHNlbmRhbGwoKSB3aXRoIGEgMjA0OCBieXRlIHN0
cmluZyBvdmVyIFRDUC4gLi4uIG9rDQpUZXN0aW5nIHNodXRkb3duKCkuIC4u
LiBvaw0KVGVzdGluZyByZWN2ZnJvbSgpIG92ZXIgVURQLiAuLi4gb2sNClRl
c3Rpbmcgc2VuZHRvKCkgYW5kIFJlY3YoKSBvdmVyIFVEUC4gLi4uIG9rDQpU
ZXN0aW5nIG5vbi1ibG9ja2luZyBhY2NlcHQuIC4uLiANCmNvbm49PHNvY2tl
dCBvYmplY3QsIGZkPTgsIGZhbWlseT0yLCB0eXBlPTEsIHByb3RvY29sPTA+
DQphZGRyPSgnMTI3LjAuMC4xJywgMzE0NCkNCkZBSUwNClRlc3Rpbmcgbm9u
LWJsb2NraW5nIGNvbm5lY3QuIC4uLiBvaw0KVGVzdGluZyBub24tYmxvY2tp
bmcgcmVjdi4gLi4uIA0KY29ubj08c29ja2V0IG9iamVjdCwgZmQ9OCwgZmFt
aWx5PTIsIHR5cGU9MSwgcHJvdG9jb2w9MD4NCmFkZHI9KCcxMjcuMC4wLjEn
LCAzMTQ2KQ0KRkFJTA0KVGVzdGluZyB3aGV0aGVyIHNldCBibG9ja2luZyB3
b3Jrcy4gLi4uIG9rDQpQZXJmb3JtaW5nIGZpbGUgcmVhZGxpbmUgdGVzdC4g
Li4uIG9rDQpQZXJmb3JtaW5nIHNtYWxsIGZpbGUgcmVhZCB0ZXN0LiAuLi4g
b2sNClBlcmZvcm1pbmcgdW5idWZmZXJlZCBmaWxlIHJlYWQgdGVzdC4gLi4u
IG9rDQoNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCkVSUk9SOiBUZXN0
aW5nIHJlY3Zmcm9tKCkgaW4gY2h1bmtzIG92ZXIgVENQLg0KLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLQ0KVHJhY2ViYWNrIChtb3N0IHJlY2VudCBjYWxs
IGxhc3QpOg0KICBGaWxlICJMaWIvdGVzdC90ZXN0X3NvY2tldC5weSIsIGxp
bmUgMzYzLCBpbiB0ZXN0T3ZlckZsb3dSZWN2RnJvbQ0KICAgIGhvc3RuYW1l
LCBwb3J0ID0gYWRkcg0KVHlwZUVycm9yOiB1bnBhY2sgbm9uLXNlcXVlbmNl
DQoNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCkVSUk9SOiBUZXN0aW5n
IGxhcmdlIHJlY3Zmcm9tKCkgb3ZlciBUQ1AuDQotLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6
DQogIEZpbGUgIkxpYi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGluZSAzNDks
IGluIHRlc3RSZWN2RnJvbQ0KICAgIGhvc3RuYW1lLCBwb3J0ID0gYWRkcg0K
VHlwZUVycm9yOiB1bnBhY2sgbm9uLXNlcXVlbmNlDQoNCj09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT0NCkZBSUw6IFRlc3Rpbmcgbm9uLWJsb2NraW5nIGFj
Y2VwdC4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NClRyYWNlYmFjayAo
bW9zdCByZWNlbnQgY2FsbCBsYXN0KToNCiAgRmlsZSAiTGliL3Rlc3QvdGVz
dF9zb2NrZXQucHkiLCBsaW5lIDQ1NiwgaW4gdGVzdEFjY2VwdA0KICAgIHNl
bGYuZmFpbCgiRXJyb3IgdHJ5aW5nIHRvIGRvIG5vbi1ibG9ja2luZyBhY2Nl
cHQuIikNCiAgRmlsZSAiL2hvbWUvYW5keW1hYy9jdnMvcHl0aG9uL3B5dGhv
bi10ZXN0L0xpYi91bml0dGVzdC5weSIsIGxpbmUgMjU0LCBpbiBmYWlsDQog
ICAgcmFpc2Ugc2VsZi5mYWlsdXJlRXhjZXB0aW9uLCBtc2cNCkFzc2VydGlv
bkVycm9yOiBFcnJvciB0cnlpbmcgdG8gZG8gbm9uLWJsb2NraW5nIGFjY2Vw
dC4NCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRkFJTDogVGVzdGlu
ZyBub24tYmxvY2tpbmcgcmVjdi4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0NClRyYWNlYmFjayAobW9zdCByZWNlbnQgY2FsbCBsYXN0KToNCiAgRmls
ZSAiTGliL3Rlc3QvdGVzdF9zb2NrZXQucHkiLCBsaW5lIDQ4NCwgaW4gdGVz
dFJlY3YNCiAgICBzZWxmLmZhaWwoIkVycm9yIHRyeWluZyB0byBkbyBub24t
YmxvY2tpbmcgcmVjdi4iKQ0KICBGaWxlICIvaG9tZS9hbmR5bWFjL2N2cy9w
eXRob24vcHl0aG9uLXRlc3QvTGliL3VuaXR0ZXN0LnB5IiwgbGluZSAyNTQs
IGluIGZhaWwNCiAgICByYWlzZSBzZWxmLmZhaWx1cmVFeGNlcHRpb24sIG1z
Zw0KQXNzZXJ0aW9uRXJyb3I6IEVycm9yIHRyeWluZyB0byBkbyBub24tYmxv
Y2tpbmcgcmVjdi4NCg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ0KUmFu
IDI2IHRlc3RzIGluIDAuMzMzcw0KDQpGQUlMRUQgKGZhaWx1cmVzPTIsIGVy
cm9ycz0yKQ0KdGVzdCB0ZXN0X3NvY2tldCBmYWlsZWQgLS0gZXJyb3JzIG9j
Y3VycmVkOyBydW4gaW4gdmVyYm9zZSBtb2RlIGZvciBkZXRhaWxzDQoxIHRl
c3QgZmFpbGVkOg0KICAgIHRlc3Rfc29ja2V0DQoa
---888574994-8578-1026132781=:28004
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.log.os2emx"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0207082353012.28004@tenring.andymac.org>
Content-Description: test_socket.log.os2emx
Content-Disposition: attachment; filename="test_socket.log.os2emx"

dGVzdF9zb2NrZXQNClRlc3RpbmcgZm9yIG1pc3Npb24gY3JpdGljYWwgY29u
c3RhbnRzLiAuLi4gb2sNClRlc3RpbmcgZ2V0c2VydmJ5bmFtZSgpLiAuLi4g
b2sNClRlc3RpbmcgZ2V0c29ja29wdCgpLiAuLi4gb2sNClRlc3RpbmcgaG9z
dG5hbWUgcmVzb2x1dGlvbiBtZWNoYW5pc21zLiAuLi4gb2sNCk1ha2luZyBz
dXJlIGdldG5hbWVpbmZvIGRvZXNuJ3QgY3Jhc2ggdGhlIGludGVycHJldGVy
LiAuLi4gb2sNClRlc3RpbmcgZm9yIGV4aXN0YW5jZSBvZiBub24tY3J1Y2lh
bCBjb25zdGFudHMuIC4uLiBvaw0KVGVzdGluZyByZWZlcmVuY2UgY291bnQg
Zm9yIGdldG5hbWVpbmZvLiAuLi4gb2sNClRlc3Rpbmcgc2V0c29ja29wdCgp
LiAuLi4gb2sNClRlc3RpbmcgZ2V0c29ja25hbWUoKS4gLi4uIG9rDQpUZXN0
aW5nIHRoYXQgc29ja2V0IG1vZHVsZSBleGNlcHRpb25zLiAuLi4gb2sNClRl
c3RpbmcgZnJvbWZkKCkuIC4uLiBvaw0KVGVzdGluZyByZWNlaXZlIGluIGNo
dW5rcyBvdmVyIFRDUC4gLi4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgaW4g
Y2h1bmtzIG92ZXIgVENQLiAuLi4gDQpzZWcxPSdNaWNoYWVsIEdpbGZpeCB3
YXMgaGUnLCBhZGRyPSdOb25lJw0Kc2VnMj0ncmUNCicsIGFkZHI9J05vbmUn
DQpFUlJPUg0KVGVzdGluZyBsYXJnZSByZWNlaXZlIG92ZXIgVENQLiAuLi4g
b2sNClRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBvdmVyIFRDUC4gLi4uIA0K
bXNnPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGVyZQ0KJywgYWRkcj0nTm9uZScN
CkVSUk9SDQpUZXN0aW5nIHNlbmRhbGwoKSB3aXRoIGEgMjA0OCBieXRlIHN0
cmluZyBvdmVyIFRDUC4gLi4uIEZBSUwNClRlc3Rpbmcgc2h1dGRvd24oKS4g
Li4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgb3ZlciBVRFAuIC4uLiBvaw0K
VGVzdGluZyBzZW5kdG8oKSBhbmQgUmVjdigpIG92ZXIgVURQLiAuLi4gb2sN
ClRlc3Rpbmcgbm9uLWJsb2NraW5nIGFjY2VwdC4gLi4uIA0KY29ubj08c29j
a2V0IG9iamVjdCwgZmQ9MTMsIGZhbWlseT0yLCB0eXBlPTEsIHByb3RvY29s
PTA+DQphZGRyPSgnMTI3LjAuMC4xJywgMzQ0MykNCkZBSUwNClRlc3Rpbmcg
bm9uLWJsb2NraW5nIGNvbm5lY3QuIC4uLiBFUlJPUg0KVGVzdGluZyBub24t
YmxvY2tpbmcgcmVjdi4gLi4uIG9rDQpUZXN0aW5nIHdoZXRoZXIgc2V0IGJs
b2NraW5nIHdvcmtzLiAuLi4gb2sNClBlcmZvcm1pbmcgZmlsZSByZWFkbGlu
ZSB0ZXN0LiAuLi4gb2sNClBlcmZvcm1pbmcgc21hbGwgZmlsZSByZWFkIHRl
c3QuIC4uLiBvaw0KUGVyZm9ybWluZyB1bmJ1ZmZlcmVkIGZpbGUgcmVhZCB0
ZXN0LiAuLi4gb2sNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRVJS
T1I6IFRlc3RpbmcgcmVjdmZyb20oKSBpbiBjaHVua3Mgb3ZlciBUQ1AuDQot
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVj
ZW50IGNhbGwgbGFzdCk6DQogIEZpbGUgIi4uLy4uL0xpYi90ZXN0L3Rlc3Rf
c29ja2V0LnB5IiwgbGluZSAzNjMsIGluIHRlc3RPdmVyRmxvd1JlY3ZGcm9t
DQogICAgaG9zdG5hbWUsIHBvcnQgPSBhZGRyDQpUeXBlRXJyb3I6IHVucGFj
ayBub24tc2VxdWVuY2UNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0K
RVJST1I6IFRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBvdmVyIFRDUC4NCi0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NClRyYWNlYmFjayAobW9zdCByZWNl
bnQgY2FsbCBsYXN0KToNCiAgRmlsZSAiLi4vLi4vTGliL3Rlc3QvdGVzdF9z
b2NrZXQucHkiLCBsaW5lIDM0OSwgaW4gdGVzdFJlY3ZGcm9tDQogICAgaG9z
dG5hbWUsIHBvcnQgPSBhZGRyDQpUeXBlRXJyb3I6IHVucGFjayBub24tc2Vx
dWVuY2UNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRVJST1I6IFRl
c3Rpbmcgbm9uLWJsb2NraW5nIGNvbm5lY3QuDQotLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6
DQogIEZpbGUgIi4uLy4uL0xpYi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGlu
ZSAxMTcsIGluIF90ZWFyRG93bg0KICAgIHNlbGYuZmFpbChtc2cpDQogIEZp
bGUgIkY6L0RFVi9DVlNfVEVTVC9QWVRIT04tVEVTVC9MaWIvdW5pdHRlc3Qu
cHkiLCBsaW5lIDI1NCwgaW4gZmFpbA0KICAgIHJhaXNlIHNlbGYuZmFpbHVy
ZUV4Y2VwdGlvbiwgbXNnDQpBc3NlcnRpb25FcnJvcjogKDU2LCAnU29ja2V0
IGlzIGFscmVhZHkgY29ubmVjdGVkJykNCg0KPT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PQ0KRkFJTDogVGVzdGluZyBzZW5kYWxsKCkgd2l0aCBhIDIwNDgg
Ynl0ZSBzdHJpbmcgb3ZlciBUQ1AuDQotLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tDQpUcmFjZWJhY2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6DQogIEZp
bGUgIi4uLy4uL0xpYi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGluZSAzNzYs
IGluIHRlc3RTZW5kQWxsDQogICAgc2VsZi5hc3NlcnRfKGxlbihyZWFkKSA9
PSAxMDI0LCAiRXJyb3IgcGVyZm9ybWluZyBzZW5kYWxsLiIpDQogIEZpbGUg
IkY6L0RFVi9DVlNfVEVTVC9QWVRIT04tVEVTVC9MaWIvdW5pdHRlc3QucHki
LCBsaW5lIDI2MiwgaW4gZmFpbFVubGVzcw0KICAgIGlmIG5vdCBleHByOiBy
YWlzZSBzZWxmLmZhaWx1cmVFeGNlcHRpb24sIG1zZw0KQXNzZXJ0aW9uRXJy
b3I6IEVycm9yIHBlcmZvcm1pbmcgc2VuZGFsbC4NCg0KPT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PQ0KRkFJTDogVGVzdGluZyBub24tYmxvY2tpbmcgYWNj
ZXB0Lg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ0KVHJhY2ViYWNrICht
b3N0IHJlY2VudCBjYWxsIGxhc3QpOg0KICBGaWxlICIuLi8uLi9MaWIvdGVz
dC90ZXN0X3NvY2tldC5weSIsIGxpbmUgNDU2LCBpbiB0ZXN0QWNjZXB0DQog
ICAgc2VsZi5mYWlsKCJFcnJvciB0cnlpbmcgdG8gZG8gbm9uLWJsb2NraW5n
IGFjY2VwdC4iKQ0KICBGaWxlICJGOi9ERVYvQ1ZTX1RFU1QvUFlUSE9OLVRF
U1QvTGliL3VuaXR0ZXN0LnB5IiwgbGluZSAyNTQsIGluIGZhaWwNCiAgICBy
YWlzZSBzZWxmLmZhaWx1cmVFeGNlcHRpb24sIG1zZw0KQXNzZXJ0aW9uRXJy
b3I6IEVycm9yIHRyeWluZyB0byBkbyBub24tYmxvY2tpbmcgYWNjZXB0Lg0K
DQotLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQpSYW4gMjYgdGVzdHMgaW4g
MC4xMzBzDQoNCkZBSUxFRCAoZmFpbHVyZXM9MiwgZXJyb3JzPTMpDQp0ZXN0
IHRlc3Rfc29ja2V0IGZhaWxlZCAtLSBlcnJvcnMgb2NjdXJyZWQ7IHJ1biBp
biB2ZXJib3NlIG1vZGUgZm9yIGRldGFpbHMNCjEgdGVzdCBmYWlsZWQ6DQog
ICAgdGVzdF9zb2NrZXQNCg==
---888574994-8578-1026132781=:28004--



From gward@python.net  Tue Jul  9 02:20:56 2002
From: gward@python.net (Greg Ward)
Date: Mon, 8 Jul 2002 21:20:56 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: <3D299E42.70200@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com> <3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com>
Message-ID: <20020709012056.GA2526@cthulhu.gerg.ca>

On 08 July 2002, M.-A. Lemburg said:
> Not sure whether it's already possible or not, but I'd prefer
> to keep the install command and have the package provide this
> information (site-packages vs. system-packages) as part of the
> setup.py or setup.cfg file.
> 
> Perhaps we could have some kind of category for distutils
> packages which marks them as system add-ons vs. site add-ons.

+1 -- this should definitely be up to the package author/packager, not
the local admin.  I once tried to convince Guido that the ability to
occasionally upgrade standard library modules/packages would be a good
thing, but he wasn't having it.  Any change of heart, O Mighty BDFL?

        Greg
-- 
Greg Ward - Python bigot                                gward@python.net
http://starship.python.net/~gward/
What the hell, go ahead and put all your eggs in one basket.



From tdelaney@avaya.com  Tue Jul  9 02:30:51 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Tue, 9 Jul 2002 11:30:51 +1000
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A400@natasha.auslabs.avaya.com>

> From: martin@v.loewis.de [mailto:martin@v.loewis.de]
> 
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > Perhaps we could have some kind of category for distutils
> > packages which marks them as system add-ons vs. site add-ons.
> 
> One approach would be for distutils to have a list of system packages
> built-in, depending on the Python release.

+1

Arbitrary package authors shouldn't be able to state that their package is a
system package - that should be up to the core team.

Of course, this would require that distutils can be updated (to allow new
system packages). I don't see much point in putting any more security in
place than that though ... make it a bit difficult, so people don't bother
trying to circumvent it. If someone wants to modify distutils themselves, then
there isn't going to be much anyone can do about it.

Tim Delaney



From jeremy@zope.com  Mon Jul  8 23:41:27 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Mon, 8 Jul 2002 18:41:27 -0400
Subject: [Python-Dev] Are we having https/ssl problems?
In-Reply-To: <Pine.LNX.4.44.0207081615150.32139-100000@penguin.theopalgroup.com>
References: <Pine.LNX.4.44.0207081615150.32139-100000@penguin.theopalgroup.com>
Message-ID: <15658.5399.500893.599454@slothrop.zope.com>

>>>>> "KJ" == Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:

  KJ> Hi all, This is not a bug report.  It is more of a query to find
  KJ> out if there are known problems with the current Python 2.3 CVS
  KJ> regarding SSL, httplib w/ https, or urllib w/ https.  I seem to
  KJ> remember tuning out some discussions on timeout sockets and SSL
  KJ> of late, so I thought I would ask.  Here is code that has worked
  KJ> previously, but does not in the current CVS:

It sounds like a bug to me.

  KJ> If this doesn't ring a bell with anyone, I will battle
  KJ> SourceForge once more and file a bug report.

I hope you can get away with at worst a minor skirmish.

I've made several changes to httplib recently to fix other SSL related
problems.  It appears the new code has some bugs.

Since you're using CVS, I'll mention that it provides many ways to
look for changes -- e.g. cvs log / annotate of individual files.  The
SF pages show all files, last revision, & mod time.  You can sort by
any of those fields.

Jeremy





From oren-py-d@hishome.net  Tue Jul  9 06:18:33 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 9 Jul 2002 01:18:33 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net>
Message-ID: <20020709051833.GA32041@hishome.net>

On Mon, Jul 08, 2002 at 02:44:25PM -0400, Tim Peters wrote:
> [David Abrahams]
> > I keep running into the problem that there is no reliable way to
> > introspect about whether a type supports multi-pass iterability (in the
> > sense that an input stream might support only a single pass, but a list
> > supports multiple passes). I suppose you could check for __getitem__, but
> > that wouldn't cover linked lists, for example.
> >
> > Can anyone channel Guido's intent for me? Is this an oversight or a
> > deliberate design decision? Is there an interface for checking
> > multi-pass-ability that I've missed?
> 
> The language makes no such distinctions.  If an app wants to make them, it's
> up to the app to implement them.  Likewise for a way to tell a multipass
> iterator to "start over again".  The Python iteration protocol has only two
> methods, .next() to get "the next" item, and .__iter__() to return self; given a
> random iterator, those are the only things you can rely on.

I believe that when David was talking about multi-pass iterability he
wasn't referring to an iterator that can be told to "start over again" but
to an iterable object that can produce multiple independent iterators of
itself, each one good for a single iteration.

The language does make a distinction between an *iterable* object that may
have only an __iter__ method and an *iterator* that has a next method. This
distinction is blurred a bit by the fact that iterators also have an
__iter__ method that also makes them appear as one-shot iterables.
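
A minimal sketch of that distinction, assuming modern Python 3 spelling
(the next() method discussed in this thread is __next__() today, normally
reached through the builtin next()). The list and variable names are
illustrative only:

```python
# An iterable such as a list has __iter__ but no __next__.
numbers = [1, 2, 3]

first = iter(numbers)        # each call builds a fresh iterator
second = iter(numbers)
assert first is not second

next(first)                  # advancing one iterator ...
assert next(second) == 1     # ... does not disturb the other

# An iterator is also "iterable", but only in the one-shot sense:
# its __iter__ hands back the same object, position and all.
assert iter(first) is first
assert list(first) == [2, 3]
assert list(first) == []     # already exhausted
```

The last three lines are the blurring in question: for-loops accept the
iterator happily, but a second pass over it finds nothing.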

Imagine an alternative universe where a South African programmer called
Rossu van Guidom writes a wonderful language called Mamba and in that
language iterator semantics are defined like this:

* Objects that wish to be iterable define an __iter__() method returning an
iterator.

* An iterator is an object with a next() method. That's all.

* The for statement checks if an object has an __iter__ method.  If it
does, it calls it and uses the returned iterator.  If it doesn't, it tries 
to use the object itself.  If it doesn't have .next either it will fail
and report that the object is not iterable.
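
The three rules above can be sketched as a plain function. This is only
an illustration of the hypothetical Mamba semantics, written in Python 3
spelling (__next__ rather than next); the names mamba_for and Countdown
are made up here, and ending the loop on StopIteration is borrowed from
Python, since the rules do not say how a Mamba iterator signals the end:

```python
def mamba_for(obj, body):
    """Hypothetical 'Mamba' for statement, following the three rules."""
    if hasattr(obj, "__iter__"):
        iterator = obj.__iter__()    # iterable: ask it for an iterator
    else:
        iterator = obj               # otherwise try the object itself
    if not hasattr(iterator, "__next__"):
        raise TypeError("object is not iterable")
    while True:
        try:
            item = iterator.__next__()
        except StopIteration:        # assumption: Python's end-of-data signal
            break
        body(item)


class Countdown:
    """A Mamba-style iterator: a next method and nothing else."""
    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


seen = []
mamba_for(Countdown(3), seen.append)   # bare iterator, no __iter__ needed
mamba_for([10, 20], seen.append)       # ordinary iterable
assert seen == [3, 2, 1, 10, 20]
```

Python's real for statement would reject Countdown, because it has
neither __iter__ nor __getitem__.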

A Mamba programmer called Nero Hsorit has speculated in a mamba-dev posting 
that in an alternative universe in a language called 'Cobra' people kept 
getting confused between iterators and iterables :-)

	Oren




From tim.one@comcast.net  Tue Jul  9 06:47:13 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 09 Jul 2002 01:47:13 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020709051833.GA32041@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEAHADAB.tim.one@comcast.net>

[Oren Tirosh]
> I believe that when David was talking about multi-pass iterability he
> wasn't referring to an iterator that can be told to "start over again" but
> to an iterable object that can produce multiple independent iterators of
> itself, each one good for a single iteration.

To an excellent first approximation, it makes no difference.  I read David
the same as you, and that's where my "no such distinctions" came from.  The
"likewise" in "Likewise for a way to tell a multipass iterator to 'start
over again'" means "and in addition to what you asked about, not that
either" (which is something other people have asked about, and more often
than what David asked about).

> The language does make a distinction between an *iterable* object that
> may have only an __iter__ method and an *iterator* that has a next
> method.

Sure.  At the wrapper-level David works at, all Python supplies here is
PyObject_GetIter(x), which returns an iterator or a NULL, and in the former
case the only useful thing he can do with it is call PyIter_Next() on it.
There's simply no way for him to know whether calling PyObject_GetIter(x)
again will yield an iterator that produces the same sequence of values, or
even whether it will yield an iterator again at all.  He could hardcode
knowledge about a few types, like, e.g., the builtin list type, but that
wouldn't even extend to subclasses of list; similarly a subclass of file may
well fiddle its iterator to be multi-pass despite that the builtin file
doesn't.
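
A small sketch of why no generic probe works, in Python 3 spelling. The
looks_multipass heuristic and the Stream class are hypothetical, invented
for this illustration; Stream stands in for something file-like that
hands out new iterator objects which all drain one shared source:

```python
def looks_multipass(x):
    """Heuristic only: an object that is its own iterator is one-shot."""
    return iter(x) is not x


class Stream:
    """Hypothetical one-shot iterable that is not an iterator itself."""
    def __init__(self, items):
        self._source = iter(items)

    def __iter__(self):
        # A new generator each time, but always over the same source.
        return (item for item in self._source)


stream = Stream("abc")
assert looks_multipass([1, 2, 3])        # right answer
assert not looks_multipass(iter([1]))    # right answer
assert looks_multipass(stream)           # wrong answer, as shown next
assert list(stream) == ["a", "b", "c"]
assert list(stream) == []                # the second pass finds nothing
```

The heuristic catches plain iterators but is fooled by Stream, which is
the same position a caller of PyObject_GetIter() is in.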

> ...
> A Mamba programmer called Nero Hsorit has speculated in a
> mamba-dev posting that in an alternative universe in a language
> called 'Cobra' people kept getting confused between iterators and
> iterables :-)

David can't get there from here with or without confusion <wink>.




From greg@cosc.canterbury.ac.nz  Tue Jul  9 06:51:42 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 09 Jul 2002 17:51:42 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020709051833.GA32041@hishome.net>
Message-ID: <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>

> Imagine an alternative universe where a South African programmer called
> Rossu van Guidom writes a wonderful language called Mamba and in that
> language iterator semantics are defined like this:
> 
> * Objects that wish to be iterable define an __iter__() method returning an
> iterator.
> 
> * An iterator is an object with a next() method. That's all.

But that doesn't allow for things like file objects, which,
although not iterators themselves, are capable of producing
iterators of different sorts which iterate over them in
different ways -- and yet they can only be iterated over
once.

In other words, there are such things as one-shot
iterables, even if iterables and iterators are kept
separate.

Maybe a one-shot iterable should raise an exception
if you try to obtain a second iterator from it?
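
One way that suggestion might look, as a sketch only. OneShot is a
hypothetical wrapper written in Python 3 spelling, and raising
RuntimeError is an arbitrary choice made for the example:

```python
class OneShot:
    """Hypothetical wrapper that hands out at most one iterator."""
    def __init__(self, source):
        self._source = iter(source)
        self._taken = False

    def __iter__(self):
        if self._taken:
            raise RuntimeError("one-shot iterable already iterated")
        self._taken = True
        return self._source


lines = OneShot(["spam\n", "eggs\n"])
assert list(lines) == ["spam\n", "eggs\n"]

try:
    iter(lines)                  # second request is refused loudly
except RuntimeError:
    second_pass_refused = True
else:
    second_pass_refused = False
assert second_pass_refused
```

The failure then shows up at the second iter() call instead of as a
silently empty loop.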

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From mal@lemburg.com  Tue Jul  9 08:33:45 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jul 2002 09:33:45 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <B43D149A9AB2D411971300B0D03D7E8BF0A400@natasha.auslabs.avaya.com>
Message-ID: <3D2A91D9.3080301@lemburg.com>

Delaney, Timothy wrote:
>>From: martin@v.loewis.de [mailto:martin@v.loewis.de]
>>
>>"M.-A. Lemburg" <mal@lemburg.com> writes:
>>
>>
>>>Perhaps we could have some kind of category for distutils
>>>packages which marks them as system add-ons vs. site add-ons.
>>
>>One approach would be for distutils to have a list of system packages
>>built-in, depending on the Python release.
> 
> 
> +1
> 
> Arbitrary package authors shouldn't be able to state that their package is a
> system package - that should be up to the core team.

Hmm, I don't really see the need to make this more complicated.

Package authors should be sensible enough to not create
system packages unless these are actually part of the
core or understood as optional but standard add-on (e.g. the
Japanese codecs could be such an add-on).

Besides, it's easy enough to achieve the same effect by
subclassing the install command in your setup.py, so there is
not much gained security there.

The only advantage I see in Martin's approach is that it
would seem backwards compatible, but then: installing a
system package in a pre-2.3 system would not have the
desired effect at all, so the gained backwards compatibility
can't really be put to use.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From david.abrahams@rcn.com  Tue Jul  9 09:43:25 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Tue, 9 Jul 2002 04:43:25 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net> <20020709051833.GA32041@hishome.net>
Message-ID: <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com>

From: "Oren Tirosh" <oren-py-d@hishome.net>
>
> I believe that when David was talking about multi-pass iterability he
> wasn't referring to an iterator that can be told to "start over again" but
> to an iterable object that can produce multiple independent iterators of
> itself, each one good for a single iteration.

That's right.

> The language does make a distinction between an *iterable* object that may
> have only an __iter__ method and an *iterator* that has a next method. This
> distinction is blurred a bit by the fact that iterators also have an
> __iter__ method that also makes them appear as one-shot iterables.

Yep. [Part of the reason I want to know whether I've got a one-shot
sequence is that inspecting that sequence then becomes an
information-destroying operation -- only being able to touch it once
changes how you have to handle it]

I was thinking one potentially nice way to introspect about
multi-pass-ability might be to get an iterator and to see whether it was
copyable. Currently even most multi-pass iterators can't be copied with
copy.copy().
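
A rough sketch of that probe, in Python 3 spelling. can_copy_iterator,
Counter and one_shot are names invented for the example; Counter opts in
by defining __copy__, which is one assumed way an iterator could
advertise copyability, not an established protocol:

```python
import copy


def can_copy_iterator(it):
    """Return True if copy.copy() accepts the iterator."""
    try:
        copy.copy(it)
    except TypeError:
        return False
    return True


class Counter:
    """Hypothetical iterator that supports copying via __copy__."""
    def __init__(self, stop, current=0):
        self.stop = stop
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        self.current += 1
        return self.current - 1

    def __copy__(self):
        return Counter(self.stop, self.current)


def one_shot():
    yield from "abc"


assert can_copy_iterator(Counter(3))
assert not can_copy_iterator(one_shot())   # generators refuse to be copied
```

A successful copy is only a hint: a shallow copy could still share state
with the original, so the probe cannot prove the two are independent.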

-Dave





From Paul.Moore@atosorigin.com  Tue Jul  9 09:57:35 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 9 Jul 2002 09:57:35 +0100
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>

IIRC from earlier discussions on the list, iterators "by design" do not
expose this information. In C++ terms, all Python iterators are forward
iterators (I could argue here that it's the C++ usage of the word "iterator"
for something that "points" and often does more than just "iterate" that is
misleading, but that's off topic).

If you need to know more than that, I think that the design intent is that
you pass the *container* around, and get at the iterator via
iter(container). Of course, this sort of begs the question as to how you can
introspect a container, to determine what properties its iterators can have
(but let's not go there - I can see Alex Martelli popping up to claim that
the adaption PEP will let you do that :-)). But you do have a better chance,
by requiring that the container support a richer interface, or just by type
testing.

Paul.



From aleax@aleax.it  Tue Jul  9 10:09:19 2002
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 9 Jul 2002 11:09:19 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <E17Rqzy-0006zb-00@mail.python.org>

On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote:
> IIRC from earlier discussions on the list, iterators "by design" do not
> expose this information. In C++ terms, all Python iterators are forward
> iterators

I think they're _input_ iterators -- you can only "get" items through the
iterator, not "set" them (as you can with forward, but not input,
iterators in C++).

> can introspect a container, to determine what properties its iterators can
> have (but let's not go there - I can see Alex Martelli popping up to claim
> that the adaption PEP will let you do that :-)). But you do have a better

The adaptation (not "adaption") PEP 246 would just obviate the need to
invent yet one more infrastructure/plumbing ad-hoc "solution" here, but
would not by itself alone solve the need to design and designate one or
more protocols for "containers that yield augmented-iterators of kind X"
or for augmented-iterators themselves ("iterator able to replicate itself",
"iterator able to 'rewind'", "iterator to which you can write an item", etc).

The first step in studying such a need is whether it IS in fact a need.
Sure, "rich iterators" might come in handy, but do we NEED them...?
If so, then what kinds of rich-iterators do we in fact need?  How to get
at them seems a third-order problem at best (and here, of course, I
would suggest that adaptation IS good for this tertiary problem:-).

> chance, by requiring that the container support a richer interface, or just
> by type testing.

*Shudder*.  You're advocating MORE type-testing...?


Alex



From Paul.Moore@atosorigin.com  Tue Jul  9 10:31:01 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 9 Jul 2002 10:31:01 +0100
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B42B@UKRUX002.rundc.uk.origin-it.com>

From: Alex Martelli [mailto:aleax@aleax.it]

>On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote:
>
>> IIRC from earlier discussions on the list, iterators "by
>> design" do not expose this information. In C++ terms, all
>> Python iterators are forward iterators
>
>I think they're _input_ iterators -- you can only "get"
>items through the iterator, not "set" them (as you can with
>forward, but not input, iterators in C++).

Rats, you're right. I can never get the C++ terminology correct...

>> can introspect a container, to determine what properties
>> its iterators can have (but let's not go there - I can see
>> Alex Martelli popping up to claim that the adaption PEP
>> will let you do that :-)). But you do have a better
>
>The adaptation (not "adaption") PEP 246 would just obviate
>the need to invent yet one more infrastructure/plumbing
>ad-hoc "solution" here, but would not by itself alone solve
>the need to design and designate one or more protocols
>for "containers that yield augmented-iterators of kind X"
>or for augmented-iterators themselves ("iterator able to
>replicate itself", "iterator able to 'rewind'", "iterator
>to which you can write an item", etc).

Oh, I agree. Sorry, that was just an offhand comment without enough detail
to make sense on its own. In a way, I was expressing mild support for the
PEP as a general solution to "issues like this".

>The first step in studying such a need is whether it IS
>in fact a need. Sure, "rich iterators" might come in
>handy, but do we NEED them...? If so, then what kinds of
>rich-iterators do we in fact need? How to get at them
>seems a third-order problem at best (and here, of course,
>I would suggest that adaptation IS good for this tertiary
>problem:-).

Yes. I was taking as a given that if the original question had been asked,
then there was at least a perceived need. And refining that need into a
protocol is David's problem (should he want to go down that route). Of
course, David has since clarified his original question - what he's really
concerned about is telling whether calling next() on an iterator destroys
information (as it does for a file iterator). That's a valid concern, but as
I pointed out it's a property of the container, not of the iterator [and
querying the container as to whether its iterators *have* that property is
back to where we started].

I think a key issue here is that Python iterators are real objects, not
"concepts" as they are in C++. But my brain isn't up to understanding *why*
that issue is key...:-)

[I knew I shouldn't have started this].

>> chance, by requiring that the container support a richer
>> interface, or just by type testing.
>
>*Shudder*. You're advocating MORE type-testing...?

Definitely not. I was trying to point out that there may be a hole, if type
testing *is* the only answer. But the hole could easily be in my ability to
think of a better solution (quite possible, as I don't have the problem
myself).

Paul.



From David Abrahams" <david.abrahams@rcn.com  Tue Jul  9 10:37:47 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 9 Jul 2002 05:37:47 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Rqzq-0003eW-00@mx05.mrf.mail.rcn.net>
Message-ID: <075a01c2272c$4acae390$6601a8c0@boostconsulting.com>

From: "Alex Martelli" <aleax@aleax.it>


> On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote:
> > IIRC from earlier discussions on the list, iterators "by design" do not
> > expose this information. In C++ terms, all Python iterators are forward
> > iterators
>
> I think they're _input_ iterators -- you can only "get" items through the
> iterator, not "set" them (as you can with forward, but not input,
> iterators in C++).

C++ also has forward constant iterators which are not writable. Take the
const_iterator type of your favorite singly-linked-list implementation for
example.

Unfortunately, C++ iterators mix up a bunch of concepts which ought to be
orthogonal, like single-vs-multipass, whether they iterate over lvalues or
must use a proxy, direction of iterability. Excellent paper on the topic at
http://groups.yahoo.com/group/boost/files/iterator-categories.html.


> The first step in studying such a need is whether it IS in fact a need.
> Sure, "rich iterators" might come in handy, but do we NEED them...?
> If so, then what kinds of rich-iterators do we in fact need?  How to get
> at them seems a third-order problem at best (and here, of course, I
> would suggest that adaptation IS good for this tertiary problem:-).


I don't know if we need them, but I'm certainly finding that not having
some more information is difficult for me. If I need to make multiple
passes over the information in a generalized iterable object, the only
solution AFAICT is to unconditionally copy all the information into a list
first.

-Dave







From aleax@aleax.it  Tue Jul  9 11:03:40 2002
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 9 Jul 2002 12:03:40 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <075a01c2272c$4acae390$6601a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Rqzq-0003eW-00@mx05.mrf.mail.rcn.net> <075a01c2272c$4acae390$6601a8c0@boostconsulting.com>
Message-ID: <E17Rrqm-0000aQ-00@mail.python.org>

On Tuesday 09 July 2002 11:37 am, David Abrahams wrote:
> From: "Alex Martelli" <aleax@aleax.it>
>
> > On Tuesday 09 July 2002 10:57 am, Moore, Paul wrote:
> > > IIRC from earlier discussions on the list, iterators "by design" do not
> > > expose this information. In C++ terms, all Python iterators are forward
> > > iterators
> >
> > I think they're _input_ iterators -- you can only "get" items through the
> > iterator, not "set" them (as you can with forward, but not input,
> > iterators in C++).
>
> C++ also has forward constant iterators which are not writable. Take the
> const_iterator type of your favorite singly-linked-list implementation for
> example.

Right, but then you can at least get the current item as many times as
you want before advancing -- input iterators and Python's iterators have
in common that get-and-advance is inseparable.


> I don't know if we need them, but I'm certainly finding that not having
> some more information is difficult for me. If I need to make multiple
> passes over the information in a generalized iterable object, the only
> solution AFAICT is to unconditionally copy all the information into a list
> first.

Yes, I can see that making such a copy willy-nilly could be a pity from
the performance viewpoint when, theoretically, one could otherwise
guarantee that the information is inalterable.  But is it all that frequent
that one can make such guarantees, e.g. that the underlying list or
dictionary (if any) cannot possibly be altered (e.g. by other threads)
during multiple iterations over it?  I.e. it might not be enough to know
that you can iterate again if needed -- you might also need some
guarantee that further iterations yield identical information, and that,
in turn, might prove more problematic in many cases (although maybe
not in yours -- I don't know enough details to tell!-).

So maybe using a "snapshot" strategy for the general case, and then
maybe specialcasing and optimizing a very few performance hotspots
where information CAN be guaranteed to be unchangeable and
multiply iterable (if you can locate any such hotspots) isn't quite as
bad as all that.  Just musing...
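
A minimal sketch of that "snapshot by default, special-case the known-safe
inputs" strategy, written in present-day Python 3 rather than the 2.2 of this
thread. The names as_multipass and mean_and_spread are hypothetical, and the
list of "known-safe" types is only an assumption for illustration:

```python
def as_multipass(iterable):
    """Return an object that can safely be iterated more than once."""
    # Special case: these built-in types are immutable and re-iterable,
    # so no copy is needed (assumed whitelist, extend with care).
    if isinstance(iterable, (tuple, str, bytes, frozenset, range)):
        return iterable
    # General case: take a snapshot, which also protects against the
    # underlying data changing between passes.
    return list(iterable)


def mean_and_spread(data):
    """Two-pass computation that works even on one-shot iterators."""
    data = as_multipass(data)
    mean = sum(data) / len(data)                 # first pass
    spread = max(abs(x - mean) for x in data)    # second pass
    return mean, spread
```

The copy costs memory for large inputs, which is the performance concern
raised above; the whitelist is where a measured hotspot would be added.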


Alex




From oren-py-d@hishome.net  Tue Jul  9 12:21:36 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 9 Jul 2002 07:21:36 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com>
Message-ID: <20020709112136.GA73672@hishome.net>

On Tue, Jul 09, 2002 at 04:43:25AM -0400, David Abrahams wrote:
> Yep. [Part of the reason I want to know whether I've got a one-shot
> sequence is that inspecting that sequence then becomes an
> information-destroying operation -- only being able to touch it once
> changes how you have to handle it]
> 
> I was thinking one potentially nice way to introspect about
> multi-pass-ability might be to get an iterator and to see whether it was
> copyable. Currently even most multi-pass iterators can't be copied with
> copy.copy().

I wouldn't call it a one-shot sequence - it's just an iterator.  The name
iterator is enough to suggest that it is disposable and good for just one
pass through the container.

If the object has an __iter__ method but no next it's not an iterator and
therefore most likely re-iterable.  One notable exception is a file object.
File iterators affect the current position of the file.  If you think about 
it you'll see that file objects aren't really containers - they are already 
iterators.  The real container is the file on the disk and the file object 
represents a pointer to a position in this container used for scanning it
which is a pretty good definition of an iterator.  The difference is cosmetic:
the next method is called readline and it returns an empty string instead of 
raising StopIteration.

class ifile(file):
    def __iter__(self):
        return self

    def next(self):
        s = self.readline()
        if s:
            return s
        raise StopIteration

class xfile:
    def __init__(self, filename):
        self.filename = filename

    def __iter__(self):
        return ifile(self.filename)

This pair of objects has a proper container/iterator relationship. The
xfile (stands for eXternal file, nothing to do with Mulder and Scully)
represents the file on the disk and each call to iter(xfileobject) returns
a new and independent iterator of the same container.
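
For comparison, a rough equivalent in present-day Python 3, where the object
returned by open() is already an iterator (it has __next__), so no ifile
wrapper is needed. XFile and probably_reiterable are hypothetical names, and
the "has __iter__ but no __next__" test is only the heuristic described above,
not a guarantee:

```python
import os
import tempfile


class XFile:
    """Re-iterable view of a file on disk (named after `xfile` above)."""

    def __init__(self, filename):
        self.filename = filename

    def __iter__(self):
        # Each call opens the file afresh, so every iterator keeps
        # its own independent position.
        return open(self.filename)


def probably_reiterable(obj):
    """Heuristic: iterable, but not itself an iterator."""
    return hasattr(obj, "__iter__") and not hasattr(obj, "__next__")


def demo():
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as f:
        f.write("one\ntwo\n")
    try:
        lines = XFile(path)
        with iter(lines) as a, iter(lines) as b:
            first_a = next(a)
            first_b = next(b)   # unaffected by the read through `a`
            return (first_a, first_b,
                    probably_reiterable(lines), probably_reiterable(a))
    finally:
        os.remove(path)
```

One caveat of this sketch: an iterator obtained implicitly by a for loop is
only closed when it is garbage-collected, so long-running code would want to
manage the file handles explicitly.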

	Oren



From David Abrahams" <david.abrahams@rcn.com  Tue Jul  9 12:53:03 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Tue, 9 Jul 2002 07:53:03 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> <20020709112136.GA73672@hishome.net>
Message-ID: <07bb01c2273f$2ed8ac90$6601a8c0@boostconsulting.com>

From: "Oren Tirosh" <oren-py-d@hishome.net>


> On Tue, Jul 09, 2002 at 04:43:25AM -0400, David Abrahams wrote:
> > Yep. [Part of the reason I want to know whether I've got a one-shot
> > sequence is that inspecting that sequence then becomes an
> > information-destroying operation -- only being able to touch it once
> > changes how you have to handle it]
> >
> > I was thinking one potentially nice way to introspect about
> > multi-pass-ability might be to get an iterator and to see whether it was
> > copyable. Currently even most multi-pass iterators can't be copied with
> > copy.copy().
>
> I wouldn't call it a one-shot sequence - it's just an iterator.  The name
> iterator is enough to suggest that it is disposable and good for just one
> pass through the container.
>
> If the object has an __iter__ method but no next it's not an iterator and
> therefore most likely re-iterable.  One notable exception is a file object.
> File iterators affect the current position of the file.

No kidding, that's the problem I'm talking about. It does me no good to
have a criterion for determining re-iterability which fails for the case
I'm most concerned with ;-)

> If you think about
> it you'll see that file objects aren't really containers - they are already
> iterators.  The real container is the file on the disk

There might not be a "real container" -- if it's an input pipe the data
disappears as you iterate it.

-Dave





From pinard@iro.umontreal.ca  Tue Jul  9 13:14:38 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 09 Jul 2002 08:14:38 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <20020709112136.GA73672@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net>
 <20020709051833.GA32041@hishome.net>
 <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com>
 <20020709112136.GA73672@hishome.net>
Message-ID: <oqk7o5t7wx.fsf@titan.progiciels-bpi.ca>

[Oren Tirosh]

> class ifile(file):
>     def __iter__(self):
>         return self

>     def next(self):
>         s = self.readline()
>         if s:
>             return s
>         raise StopIteration

> class xfile:
>     def __init__(self, filename):
>         self.filename = filename

>     def __iter__(self):
>         return ifile(self.filename)

> This pair of objects has a proper container/iterator relationship.

This is all clear to me, except for one little thing.  I wonder why class
`ifile' has an `__iter__' method itself.  I know it is said to be the
"iterator protocol", and I wonder why it has to be.

My understanding is that `__iter__' returns an iterator all ready to be
enquired a number of times through `.next()' calls, and I presume that
if any re-initialisation has to take place, it is within `__iter__'.
However, as the iterator maintains its own progressive state, I do not see
the intent and purpose of the iterator having an `__iter__' method itself.
Would it make sense using the iterator `__iter__' as the preferred place
where it re-initialises itself?

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From oren-py-d@hishome.net  Tue Jul  9 13:27:19 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 9 Jul 2002 08:27:19 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <oqk7o5t7wx.fsf@titan.progiciels-bpi.ca>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net> <20020709051833.GA32041@hishome.net> <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com> <20020709112136.GA73672@hishome.net> <oqk7o5t7wx.fsf@titan.progiciels-bpi.ca>
Message-ID: <20020709122718.GA85236@hishome.net>

On Tue, Jul 09, 2002 at 08:14:38AM -0400, François Pinard wrote:
> This is all clear to me, except for one little thing.  I wonder why class
> `ifile' has an `__iter__' method itself.  I know it is said to be the
> "iterator protocol", and I wonder why it has to be.

I don't like it either.  In my previous message about the language 'Mamba' 
in an alternative universe I have an example of an alternative: if object 
has a tp_iter it is called, otherwise the object must have a tp_next.

> My understanding is that `__iter__' returns an iterator all ready to be
> enquired a number of times through `.next()' calls, and I presume that
> if any re-initialisation has to take place, it is within `__iter__'.
> However, as the iterator maintains its own progressive state, I do not see
> the intent and purpose of the iterator having an `__iter__' method itself.
> Would it make sense using the iterator `__iter__' as the preferred place
> where it re-initialises itself?

As far as I can tell this was done so that `for` could iterate over both
iterables and iterators.  I just don't see why it has to be done by all
iterators instead of in just one place, adding much confusion between
iterators and iterables in the process.
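
A small sketch of what that buys in practice, in present-day Python 3
spelling (__next__ instead of next). Countdown is a hypothetical class made
up for illustration. Because __iter__ returns self, a partly consumed
iterator can be handed to a for loop or comprehension, which simply
continues from the current position:

```python
class Countdown:
    """Iterator that yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator is its own iterator: no re-initialisation here.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
first = next(it)            # consume one item by hand
rest = [n for n in it]      # the loop picks up where we left off
```

This also shows why __iter__ on an iterator is a poor place to reset state:
the loop above would then start over instead of continuing.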

	Oren



From jacobs@penguin.theopalgroup.com  Tue Jul  9 13:49:13 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 9 Jul 2002 08:49:13 -0400 (EDT)
Subject: [Python-Dev] Are we having https/ssl problems?
In-Reply-To: <15658.5399.500893.599454@slothrop.zope.com>
Message-ID: <Pine.LNX.4.44.0207090846240.3202-100000@penguin.theopalgroup.com>

On Mon, 8 Jul 2002, Jeremy Hylton wrote:
> >>>>> "KJ" == Kevin Jacobs <jacobs@penguin.theopalgroup.com> writes:
> 
>   KJ> Hi all, This is not a bug report.  It is more of a query to find
>   KJ> out if there are known problems with the current Python 2.3 CVS
>   KJ> regarding SSL, httplib w/ https, or urllib w/ https.  I seem to
>   KJ> remember tuning out some discussions on timeout sockets and SSL
>   KJ> of late, so I thought I would ask.  Here is code that has worked
>   KJ> previously, but does not in the current CVS:
> 
> It sounds like a bug to me.

Now it is:

  http://sourceforge.net/tracker/index.php?func=detail&aid=579107&group_id=5470&atid=105470

> I've made several changes to httplib recently to fix other SSL related
> problems.  It appears the new code has some bugs.

It looks like your change to rework the fake SSL file exposed this one.  It
has something to do with the non-trivial way that httplib closes
connections.  In several places it just looks wrong.  I've attached a patch
that fixes the problem, but may break other things, and points out some
other potential problems.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From barry@zope.com  Tue Jul  9 14:10:24 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 9 Jul 2002 09:10:24 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <15657.39558.325764.651122@anthem.wooz.org>
 <3D299E42.70200@lemburg.com>
 <15657.51863.523283.977726@anthem.wooz.org>
 <3D29FD4F.4060607@lemburg.com>
Message-ID: <15658.57536.133296.126976@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Hmm, maybe I wasn't clear enough: I think that a distutils
    MAL> package should have a flag in its setup.py which lets
    MAL> distutils tell whether it's a site package or a system
    MAL> package, e.g.

    | setup(... pkgtype='site-package' ...)
    | vs.
    | setup(... pkgtype='system-package' ...)

    MAL> (with pkgtype='site-package' as default value if not given)

    MAL> The user would in both cases type 'python setup.py install'
    MAL> but the install command would automatically choose the
    MAL> right target subdir (site-packages/ or system-packages/).

Except you can't always tell if it's a system package or an add-on.
email for example is an add-on for Python 2.1, but a system package
for Python 2.2.

Ignoring this specific example for now (since none of this will exist
until Py2.3 anyway), it seems to me that there will be future packages
for which this is true too.  In that case, hardwiring site vs. system
in the package's setup.py isn't the right approach.

-Barry



From mal@lemburg.com  Tue Jul  9 14:28:52 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jul 2002 15:28:52 +0200
Subject: [Python-Dev] AtExit Functions
Message-ID: <3D2AE514.2050909@lemburg.com>

While working with mxTextTools 2.1.0b2, Mike Fletcher found that
he gets a fatal error when the interpreter exits.

Some tracing indicates that the cause is the at exit function
of mxTextTools which clears the cache of tag tables used by
the Tagging Engine in mxTextTools.

If these tables include references to (callable) Python instances,
Python can't properly clean them up when decref'ing them at
AtExit time.

Would it be safe to simply move the call_dll_exitfunc()
call just before the "clear threat" code in Py_Finalize() ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From fredrik@pythonware.com  Tue Jul  9 14:50:26 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 9 Jul 2002 15:50:26 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com><3D220A86.5070003@lemburg.com><m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de><3D22ADD9.1030901@lemburg.com><15650.64375.162977.160780@anthem.wooz.org><3D2433B9.9080102@lemburg.com><m3znx79rk3.fsf@mira.informatik.hu-berlin.de><15657.39558.325764.651122@anthem.wooz.org><3D299E42.70200@lemburg.com><15657.51863.523283.977726@anthem.wooz.org><3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org>
Message-ID: <006301c2274f$95cfabf0$0900a8c0@spiff>

barry wrote:

>     MAL> The user would in both cases type 'python setup.py install'
>     MAL> but the install command would automatically choose the
>     MAL> right target subdir (site-packages/ or system-packages/).
>
> Except you can't always tell if it's a system package or an add-on.
> email for example is an add-on for Python 2.1, but a system package
> for Python 2.2.

assuming that the package maintainer is informed when a package
is added to the standard library, that packages won't move in and
out too much, and/or that most users probably don't want to down-
grade to an older package version, you could of course write:

    if sys.version_info >= (2, 2):
        pkgtype = "system-package"
    else:
        pkgtype = "site-package"

    setup(... pkgtype=pkgtype ...)

in your setup.py file, once your package has been added.

</F>




From mal@lemburg.com  Tue Jul  9 15:15:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jul 2002 16:15:15 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com><3D220A86.5070003@lemburg.com><m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de><3D22ADD9.1030901@lemburg.com><15650.64375.162977.160780@anthem.wooz.org><3D2433B9.9080102@lemburg.com><m3znx79rk3.fsf@mira.informatik.hu-berlin.de><15657.39558.325764.651122@anthem.wooz.org><3D299E42.70200@lemburg.com><15657.51863.523283.977726@anthem.wooz.org><3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org> <006301c2274f$95cfabf0$0900a8c0@spiff>
Message-ID: <3D2AEFF3.5000600@lemburg.com>

Fredrik Lundh wrote:
> barry wrote:
> 
> 
>>    MAL> The user would in both cases type 'python setup.py install'
>>    MAL> but the install command would automatically choose the
>>    MAL> right target subdir (site-packages/ or system-packages/).
>>
>>Except you can't always tell if it's a system package or an add-on.
>>email for example is an add-on for Python 2.1, but a system package
>>for Python 2.2.
> 
> 
> assuming that the package maintainer is informed when a package
> is added to the standard library, that packages won't move in and
> out too much, and/or that most users probably don't want to down-
> grade to an older package version, you could of course write:
> 
>     if sys.version_info >= (2, 2):
>         pkgtype = "system-package"
>     else:
>         pkgtype = "site-package"
> 
>     setup(... pkgtype=pkgtype ...)
> 
> in your setup.py file, once your package has been added.

Right.

A package author whose package moves into the core would have
to do this anyway, if s/he wants to maintain backwards compatibility
with older Python versions, since the distutils package in those
versions would not accept the new keyword.

Anyway, regardless of how we do it, we need to add the
'system-packages' dir to just before the '.../lib/pythonX.X'
entry in sys.path. If there's consent about this, I'd suggest
to move ahead in this direction as first step.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Tue Jul  9 15:15:45 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 09 Jul 2002 10:15:45 -0400
Subject: [Python-Dev] AtExit Functions
In-Reply-To: Your message of "Tue, 09 Jul 2002 15:28:52 +0200."
 <3D2AE514.2050909@lemburg.com>
References: <3D2AE514.2050909@lemburg.com>
Message-ID: <200207091415.g69EFjT01619@odiug.zope.com>

> While working with mxTextTools 2.1.0b2, Mike Fletcher found that
> he gets a fatal error when the interpreter exits.
> 
> Some tracing indicates that the cause is the at exit function
> of mxTextTools which clears the cache of tag tables used by
> the Tagging Engine in mxTextTools.
> 
> If these tables include references to (callable) Python instances,
> Python can't properly clean them up when decref'ing them at
> AtExit time.
> 
> Would it be safe to simply move the call_dll_exitfunc()
> call just before the "clear threat" code in Py_Finalize() ?

You mean call_ll_exitfuncs(). :-)

I think you may be making a wrong use of Py_AtExit().  The docs state
(since 1998):

  Since Python's internal finalization will have completed before the
  cleanup function, no Python APIs should be called by *func*.

I don't think it's safe to move the call forward.  (I don't know which
line you are referring to with ``"clear threat" code'' so I don't know
how far back you want to move it, but I think the intention is very
clear that this should be done at the very last.)

You may want to use the atexit.py module instead to schedule your
module's cleanup action; these exit functions are called much earlier.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh@python.net  Tue Jul  9 15:37:38 2002
From: mwh@python.net (Michael Hudson)
Date: 09 Jul 2002 15:37:38 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
Message-ID: <2mr8id0xxp.fsf@starship.python.net>

I'm curious about this bit of code from Lib/distutils/sysconfig.py:

,-----------------------------------------------------------------------
| # python_build: (Boolean) if true, we're either building Python or
| # building an extension with an un-installed Python, so we use
| # different (hard-wired) directories.
|
| argv0_path = os.path.dirname(os.path.abspath(sys.executable))
| landmark = os.path.join(argv0_path, "Modules", "Setup")
| if not os.path.isfile(landmark):
|     python_build = 0
| elif os.path.isfile(os.path.join(argv0_path, "Lib", "os.py")):
|     python_build = 1
| else:
|     python_build = os.path.isfile(os.path.join(os.path.dirname(argv0_path),
|                                                "Lib", "os.py"))
| del argv0_path, landmark
`-----------------------------------------------------------------------

Well, curious is a bit weak.  It's broken, and breaks (eg) the snake
farm's builds (because that is set up to build python in a directory
far away and over the hills from the source directory).

Why isn't it just

,-----------------------------------------------------------------------
| # python_build: (Boolean) if true, we're either building Python or
| # building an extension with an un-installed Python, so we use
| # different (hard-wired) directories.
|
| argv0_path = os.path.dirname(os.path.abspath(sys.executable))
| landmark = os.path.join(argv0_path, "Modules", "Setup")
| 
| python_build = os.path.isfile(landmark)
| 
| del argv0_path, landmark
`-----------------------------------------------------------------------

?  What cases does that get wrong?  I'd have changed it already, but I
have this feeling I must be missing something.

Cheers,
M.

-- 
  The meaning of "brunch" is as yet undefined.
                                             -- Simon Booth, ucam.chat



From mal@lemburg.com  Tue Jul  9 16:00:39 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jul 2002 17:00:39 +0200
Subject: [Python-Dev] AtExit Functions
References: <3D2AE514.2050909@lemburg.com> <200207091415.g69EFjT01619@odiug.zope.com>
Message-ID: <3D2AFA97.2030402@lemburg.com>

Guido van Rossum wrote:
>>While working with mxTextTools 2.1.0b2, Mike Fletcher found that
>>he gets a fatal error when the interpreter exits.
>>
>>Some tracing indicates that the cause is the at exit function
>>of mxTextTools which clears the cache of tag tables used by
>>the Tagging Engine in mxTextTools.
>>
>>If these tables include references to (callable) Python instances,
>>Python can't properly clean them up when decref'ing them at
>>AtExit time.
>>
>>Would it be safe to simply move the call_dll_exitfunc()
>>call just before the "clear threat" code in Py_Finalize() ?
> 
> 
> You mean call_ll_exitfuncs(). :-)

Yeah... today's my typo day :-)

> I think you may be making a wrong use of Py_AtExit().  The docs state
> (since 1998):
> 
>   Since Python's internal finalization will have completed before the
>   cleanup function, no Python APIs should be called by *func*.

Hmm, and that includes Py_DECREF() and PyObject_Del() ?

In that case, I have a problem since I'm using those
two to clean up caches and free lists in the mx tools.

> I don't think it's safe to move the call forward.  (I don't know which
> line you are referring to with ``"clear threat" code'' so I don't know
> how far back you want to move it, but I think the intention is very
> clear that this should be done at the very last.)

Here's the snippet:

void
Py_Finalize(void)
{
	PyInterpreterState *interp;
	PyThreadState *tstate;

	if (!initialized)
		return;

	/* The interpreter is still entirely intact at this point, and the
	 * exit funcs may be relying on that.  In particular, if some thread
	 * or exit func is still waiting to do an import, the import machinery
	 * expects Py_IsInitialized() to return true.  So don't say the
	 * interpreter is uninitialized until after the exit funcs have run.
	 * Note that Threading.py uses an exit func to do a join on all the
	 * threads created thru it, so this also protects pending imports in
	 * the threads created via Threading.
	 */
	call_sys_exitfunc();
	initialized = 0;

	/* Get current thread state and interpreter pointer */
	tstate = PyThreadState_Get();
	interp = tstate->interp;

	/* Disable signal handling */
	PyOS_FiniInterrupts();

	/* Cleanup Codec registry */
	_PyCodecRegistry_Fini();

	/* Destroy all modules */
	PyImport_Cleanup();

	/* Destroy the database used by _PyImport_{Fixup,Find}Extension */
	_PyImport_Fini();

----------------------------------
move call_ll_exitfuncs() here
----------------------------------

	/* Debugging stuff */
#ifdef COUNT_ALLOCS
	dump_counts();
#endif

#ifdef Py_REF_DEBUG
	fprintf(stderr, "[%ld refs]\n", _Py_RefTotal);
#endif

#ifdef Py_TRACE_REFS
	if (Py_GETENV("PYTHONDUMPREFS")) {
		_Py_PrintReferences(stderr);
	}
#endif /* Py_TRACE_REFS */

	/* Now we decref the exception classes.  After this point nothing
	   can raise an exception.  That's okay, because each Fini() method
	   below has been checked to make sure no exceptions are ever
	   raised.
	*/
	_PyExc_Fini();

	/* Delete current thread */
	PyInterpreterState_Clear(interp);
	PyThreadState_Swap(NULL);
	PyInterpreterState_Delete(interp);

	PyMethod_Fini();
	PyFrame_Fini();
	PyCFunction_Fini();
	PyTuple_Fini();
	PyString_Fini();
	PyInt_Fini();
	PyFloat_Fini();

#ifdef Py_USING_UNICODE
	/* Cleanup Unicode implementation */
	_PyUnicode_Fini();
#endif

	/* XXX Still allocated:
	   - various static ad-hoc pointers to interned strings
	   - int and float free list blocks
	   - whatever various modules and libraries allocate
	*/

	PyGrammar_RemoveAccelerators(&_PyParser_Grammar);

#ifdef PYMALLOC_DEBUG
	if (Py_GETENV("PYTHONMALLOCSTATS"))
		_PyObject_DebugMallocStats();
#endif

--------------------------------
	call_ll_exitfuncs();
--------------------------------

#ifdef Py_TRACE_REFS
	_Py_ResetReferences();
#endif /* Py_TRACE_REFS */
}

> You may want to use the atexit.py module instead to schedule your
> module's cleanup action; these exit functions are called much earlier.

That's difficult to get right since I have to register such a
function from C. Also, atexit.py is not present in
Python 1.5.2.

I could probably use a hack in the module dictionary which
then triggers calling a cleanup function when the dictionary
gets cleared, but there's a problem with this: clearing
the module is easily possible for a user as well and doing
so would cause seg faults if the user continues to call
API on the module (maybe unknowingly through destructors).

Looks like the only way to "solve" the problem is by simply
leaking memory :-(

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Tue Jul  9 16:08:20 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jul 2002 17:08:20 +0200
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
References: <2mr8id0xxp.fsf@starship.python.net>
Message-ID: <3D2AFC64.3030400@lemburg.com>

Michael Hudson wrote:
> I'm curious about this bit of code from Lib/distutils/sysconfig.py:
> 
> ,-----------------------------------------------------------------------
> | # python_build: (Boolean) if true, we're either building Python or
> | # building an extension with an un-installed Python, so we use
> | # different (hard-wired) directories.
> |
> | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> | landmark = os.path.join(argv0_path, "Modules", "Setup")
> | if not os.path.isfile(landmark):
> |     python_build = 0
> | elif os.path.isfile(os.path.join(argv0_path, "Lib", "os.py")):
> |     python_build = 1
> | else:
> |     python_build = os.path.isfile(os.path.join(os.path.dirname(argv0_path),
> |                                                "Lib", "os.py"))
> | del argv0_path, landmark
> `-----------------------------------------------------------------------
> 
> Well, curious is a bit weak.  It's broken, and breaks (eg) the snake
> farm's builds (because that is set up to build python in a directory
> far away and over the hills from the source directory).
> 
> Why isn't it just
> 
> ,-----------------------------------------------------------------------
> | # python_build: (Boolean) if true, we're either building Python or
> | # building an extension with an un-installed Python, so we use
> | # different (hard-wired) directories.
> |
> | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> | landmark = os.path.join(argv0_path, "Modules", "Setup")
> | 
> | python_build = os.path.isfile(landmark)
> | 
> | del argv0_path, landmark
> `-----------------------------------------------------------------------
> 
> ?  What cases does that get wrong?  I'd have changed it already, but I
> have this feeling I must be missing something.

Is Modules/Setup a landmark on all Python build platforms,
e.g. on Macs, Windows and other non-Unix platforms as well ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mwh@python.net  Tue Jul  9 16:14:13 2002
From: mwh@python.net (Michael Hudson)
Date: 09 Jul 2002 16:14:13 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: "M.-A. Lemburg"'s message of "Tue, 09 Jul 2002 17:08:20 +0200"
References: <2mr8id0xxp.fsf@starship.python.net> <3D2AFC64.3030400@lemburg.com>
Message-ID: <2mu1n9ndbu.fsf@starship.python.net>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Michael Hudson wrote:
> > I'm curious about this bit of code from Lib/distutils/sysconfig.py:
> > [...]
> > ?  What cases does that get wrong?  I'd have changed it already, but I
> > have this feeling I must be missing something.
> 
> Is Modules/Setup a landmark on all Python build platforms,
> e.g. on Macs, Windows and other non-Unix platforms as well ?

Given that distutils isn't used for building Python's own extension
modules on non-Unix platforms, does it matter?

Cheers,
M.

-- 
  ... Windows proponents tell you that it will solve things that
  your Unix system people keep telling you are hard.  The Unix 
  people are right: they are hard, and Windows does not solve 
  them, ...                            -- Tim Bradshaw, comp.lang.lisp



From fdrake@acm.org  Tue Jul  9 19:02:35 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 9 Jul 2002 14:02:35 -0400
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <3D2AFC64.3030400@lemburg.com>
References: <2mr8id0xxp.fsf@starship.python.net>
 <3D2AFC64.3030400@lemburg.com>
Message-ID: <15659.9531.709381.202975@grendel.zope.com>

M.-A. Lemburg writes:
 > Is Modules/Setup a landmark on all Python build platforms,
 > e.g. on Macs, Windows and other non-Unix platforms as well ?

It's only on Unix as far as I'm aware.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From martin@v.loewis.de  Tue Jul  9 19:42:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 09 Jul 2002 20:42:08 +0200
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <2mr8id0xxp.fsf@starship.python.net>
References: <2mr8id0xxp.fsf@starship.python.net>
Message-ID: <m3ptxwyc8v.fsf@mira.informatik.hu-berlin.de>

Michael Hudson <mwh@python.net> writes:

> ?  What cases does that get wrong?  I'd have changed it already, but I
> have this feeling I must be missing something.

This looks overly complex to me, too, but you may want to ask the
author specifically:

<quote>
revision 1.46
date: 2002/06/04 15:28:21;  author: fdrake;  state: Exp;  lines: +23 -15
When using a Python that has not been installed to build 3rd-party
modules, distutils does not understand that the build version of the
source tree is needed.

This patch fixes distutils.sysconfig to understand that the running
Python is part of the build tree and needs to use the appropriate
"shape" of the tree. This does not assume anything about the current
directory, so can be used to build 3rd-party modules using Python's
build tree as well.

This is useful since it allows us to use a non-installed debug-mode
Python with 3rd-party modules for testing. It has the side-effect that
set_python_build() is no longer needed (the hack which was added to
allow distutils to be used to build the "standard" extension modules).

</quote>

Regards,
Martin



From guido@python.org  Tue Jul  9 21:04:13 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 09 Jul 2002 16:04:13 -0400
Subject: [Python-Dev] Re: [Python-checkins] Using string methods in stdlib
In-Reply-To: Your message of "Wed, 26 Jun 2002 19:05:03 EDT."
 <3D1A489F.93B8942B@metaslash.com>
References: <E17N6x9-0004ZU-00@usw-pr-cvs1.sourceforge.net> <3D1A362B.F3D2D585@metaslash.com> <003901c21d5f$3970af20$ced241d5@hagrid>
 <3D1A489F.93B8942B@metaslash.com>
Message-ID: <200207092004.g69K4Do03854@odiug.zope.com>

> (moved to python-dev and changed title)

The move didn't succeed.  But I'm moving this response there.

[Ping]
> > > >
> > > > Update of /cvsroot/python/python/dist/src/Lib
> > > >
> > > > Modified Files:
> > > >         cgitb.py
> > > >
> > > > +             if name[:1] == '_': continue

[Neal]
> > > Any reason not to use:
> > >
> > > if name.startswith('_'): continue
> > >
> > > ?

[Fredrik]
> > tried benchmarking?

[Neal again]
> I wasn't asking because of speed.  I don't know
> which version is faster and I couldn't care less.
> I think using the method is clearer.

startswith() was added because it was observed that there were
(relatively) frequent bugs involving tests for s[:I] == C where len(C)
!= I, either due to miscounting or due to an edit of C without a
matching edit of I.  startswith() avoids all that.
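
A small sketch of the kind of bug meant here; the header string and the slice length are invented for the example.

```python
line = "Content-Type: text/plain"

# The slice length 7 was right while the constant was "Content";
# the constant was later edited without a matching edit of the length.
broken = line[:7] == "Content-Type"      # compares "Content" with "Content-Type"

# startswith() carries no separate length to keep in sync.
fixed = line.startswith("Content-Type")

print(broken, fixed)
```

The slice version fails silently: it does not raise, it just never matches.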

> > and figuring out that "_" is exactly one character long isn't
> > that hard, really.
> 
> I agree that for a single character either way is clear.

Agreed too.  The startswith() use case is for strings long enough that
you don't "see" the length immediately.  Probably that means anything
longer than 4.  But in order to create good habits I think it's fine
to use it in all cases.

> > (can we please cut this python newspeak enforcement crap
> > now, btw.  even if slicing hadn't been much faster, there's
> > nothing wrong with using an idiom that has worked perfectly
> > fine for the last decade...)

Maybe Neal is showing a bit too much of youthful enthusiasm for the
new way.  But I don't see it as enforcement crap.  When I see Python
code I wrote 10 years ago that works fine, I usually still think, "um,
I wouldn't have written it that way now."  If I think that about code
that I feel is important as an example for later generations I like to
fix it.

We're trying to stay out of the modules that need to remain 1.5.2
compatible.

> I thought the stdlib used startswith/endswith.  But I did 
> a simple grep just to find where startswith could be used and 
> was surprised to find about 150 cases.  Many are 1 char,
> but there are many others of 5+ chars which make it harder
> to determine immediately if the code is correct.
> 
> I also see several cases of code like this in mimify:
> 
> 	line[:len(prefix)] == prefix
> 
> and other places where the length is calculated elsewhere, 
> (rlcompleter) making it even harder to verify correctness.
> 
> Part of the reason to prefer the methods is for defensive programming.
> There is duplicate information by using slicing (str & length) and 
> it's possible to change half the information and not the other, 
> leading to bugs.  That's not possible with the methods.
> 
> I don't think the stdlib should use every new feature.  But I
> do think it should reflect the best programming practices and
> should be programmed defensively in order to try to avoid future bugs.

I agree for new code, but I think we should be conservative in
migrating existing code to use new idioms.  It's better only to do
that as part of a general overhaul of a module.  As I've remarked
before, I'm no big fan of "peephole" changes, where lots of modules
are changed to implement one particular style change (e.g. string
methods).  Historically, such peephole changes have always introduced
bugs because it's 99% boring work, and then you start making mistakes.
Also, it leads to anachronisms where ancient code suddenly makes use
of a modern feature but otherwise still looks ancient.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Tue Jul  9 22:54:40 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 09 Jul 2002 17:54:40 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <20020709122718.GA85236@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <LNBBLJKPBEHFEDALKOLCEENKACAB.tim.one@comcast.net>
 <20020709051833.GA32041@hishome.net>
 <06ec01c22724$bf1d2760$6601a8c0@boostconsulting.com>
 <20020709112136.GA73672@hishome.net>
 <oqk7o5t7wx.fsf@titan.progiciels-bpi.ca>
 <20020709122718.GA85236@hishome.net>
Message-ID: <oq7kk4tvmn.fsf@titan.progiciels-bpi.ca>

[Oren Tirosh]
> [François Pinard]

> > However, as the iterator maintains its own progressive state, I do not see
> > the intent and purpose of the iterator having an `__iter__' method itself.

> As far as I can tell this was done so that for could iterate over both
> iterables and iterators.

That is, that an iterator is always itself an iterable.  I guess the real
question is: could we have the guarantee that if an iterable returns an
iterator through the iterable's __iter__, the iterator's __iter__ method
will never be called from looping over the iterable?  If we do not have
that guarantee, then, when (and why) will the iterator's __iter__ be called?

I found an answer to these questions in neither the Reference Manual
nor the PEP, yet I confess that the exposition of the C API might hold an
answer I could not understand.  Could it be explained without referring to C?

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From greg@cosc.canterbury.ac.nz  Wed Jul 10 00:54:25 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 10 Jul 2002 11:54:25 +1200 (NZST)
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <oq7kk4tvmn.fsf@titan.progiciels-bpi.ca>
Message-ID: <200207092354.g69NsP726034@oma.cosc.canterbury.ac.nz>

pinard@iro.umontreal.ca:

> could we have the guarantee that if an iterable returns an
> iterator through the iterable's __iter__, the iterator's __iter__ method
> will never be called from looping over the iterable?

[...pause while Greg's brain parses that sentence...]

Yes, I believe that's true.
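
One way to check this without reading C is a counting experiment like the sketch below. It uses the present-day spelling __next__ (the method was called next() when this was written), the class names are made up, and it only shows what one CPython does, not a language guarantee.

```python
class CountdownIterator:
    """Iterator that records how often its own __iter__ is called."""

    iter_calls = 0

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        CountdownIterator.iter_calls += 1
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


class Countdown:
    """Iterable that hands out a fresh iterator on every loop."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


seen = []
for value in Countdown(3):   # calls Countdown.__iter__ once
    seen.append(value)

print(seen, CountdownIterator.iter_calls)
```

Looping over the iterable leaves the counter at zero; it only goes up when the iterator itself is used where an iterable is expected, e.g. in a for statement.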

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From skip@pobox.com  Wed Jul 10 00:57:56 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 9 Jul 2002 18:57:56 -0500
Subject: [Python-Dev] anydbm/whichdb fix?
Message-ID: <15659.30852.807030.896503@gargle.gargle.HOWL>

Folks,

Jack has been having trouble with dbm stuff on Mac OS X since my recent
changes to setup.py and configure, and I am supposed to be shepherding a
test case for the whichdb module.  The two sort of go hand-in-hand.  I've
seen the same problem as Jack under certain circumstances on Linux.  I
reimplemented Greg Ball's whichdb.py patch and would appreciate some
feedback from others who've crossed this bit of dirt in the past before I
check in the two files (Jack, Guido, Barry, I seem to recall all of you
having anydbm/whichdb problems at one point).

The patch in question is at

    http://python.org/sf/541694

Skip



From pinard@iro.umontreal.ca  Wed Jul 10 02:05:58 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 09 Jul 2002 21:05:58 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207092354.g69NsP726034@oma.cosc.canterbury.ac.nz>
References: <200207092354.g69NsP726034@oma.cosc.canterbury.ac.nz>
Message-ID: <oqit3os87d.fsf@titan.progiciels-bpi.ca>

[Greg Ewing]

> pinard@iro.umontreal.ca:

> > could we have the guarantee that if an iterable returns an
> > iterator through the iterable's __iter__, the iterator's __iter__ method
> > will never be called from looping over the iterable?

> [...pause while Greg's brain parses that sentence...]

(Sorry for my bad English.)

> Yes, I believe that's true.

If yes, then, the Library Reference is misleading, at the page:

   http://www.python.org/dev/doc/devel/lib/typeiter.html

when it strongly says that any iterator's __iter__ method is "required".
I guess this is a possible source of confusion.

The context does not make it clear that the iterator's __iter__ method is
*only* required whenever one *also* wants to use an iterator as an iterable.
Better would be to describe __iter__ only once, the first time through,
saying everything there that has to be said, and only retain for iterators
the requirement of having a `next()' method.  We should describe the truth.

P.S. - Also, I do not understand the tiny bit about the `in' statement in the
above page.  Has `in' ever been a statement?  If it refers to the comparison
operator `in', then has it any special properties when used with iterators?
I'm unsuccessful at seeing any hint about this from the documentation:

   http://www.python.org/dev/doc/devel/ref/comparisons.html

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From tim_one@email.msn.com  Wed Jul 10 04:48:33 2002
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 9 Jul 2002 23:48:33 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <oqit3os87d.fsf@titan.progiciels-bpi.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFOADAB.tim_one@email.msn.com>

[François Pinard]
> If yes, then, the Library Reference is misleading, at the page:
>
>    http://www.python.org/dev/doc/devel/lib/typeiter.html
>
> when it strongly says that any iterator's __iter__ method is "required".
> I guess this is a possible source of confusion.

This is how Python the language is defined.  The current C Python
implementation doesn't enforce it, and you may be able to get away with
defining an iterator that doesn't supply an __iter__ method in some contexts
under the current C implementation.  If you do, though, you're breaking the
rules, and there's no guarantee your code will continue to work.

> The context does not make it clear that the iterator's __iter__ method is
> *only* required whenever one *also* wants to use an iterator as an
> iterable.

That's not how the iteration protocol is defined, and isn't how it should be
defined either.  Requiring *some* method with a reserved name is an aid to
introspection, lest it become impossible to distinguish, say, an iterator
from an instance of a doubly-linked list node class that just happens to
supply methods named .prev() and .next() for an unrelated purpose.

> Better would be to describe __iter__ only once, the first time through,
> saying everything there that has to be said, and only retain for iterators
> the requirement of having a `next()' method.  We should describe
> the truth.

Except that iterators are required to have an __iter__ method:  this is a
matter of definition, not just of reverse-engineering the minimum you can
get away with under the current implementation in assorted contexts.  You'll
discover this the hard way the first time you try to pass an iterator without an
__iter__ method to a routine you didn't write that says it accepts any
iterable object as an argument.  Such a routine is entitled-- by the
documented requirements --to rely on its argument responding sensibly to an
__iter__ message.
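
A sketch of that failure, in present-day Python where the method is spelled __next__; BareIterator and total are hypothetical names for the example.

```python
class BareIterator:
    """Has __next__ but deliberately omits __iter__."""

    def __init__(self, values):
        self._values = list(values)

    def __next__(self):
        if not self._values:
            raise StopIteration
        return self._values.pop(0)


def total(iterable):
    """A routine documented as accepting any iterable object."""
    result = 0
    for value in iterable:   # relies on the argument supporting __iter__
        result += value
    return result


# Calling next() directly works, since only __next__ is needed for that.
first = next(BareIterator([1, 2, 3]))

# Handing the same kind of object to a routine that loops over it fails.
try:
    total(BareIterator([1, 2, 3]))
    outcome = "worked"
except TypeError:
    outcome = "TypeError"

print(first, outcome)
```

Adding an __iter__ method that returns self is enough to make total() accept the object.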

> P.S. - Also, I do not understand the tiny bit about the `in'
> statement in the above page.  Has `in' ever been a statement?

I figure you're talking about this:

    ... to be used with the for and in statements

The tail end of that is indeed worded poorly; 'in' isn't a statement.

> If it refers to the comparison operator `in',

Yes, that's the intent.

> then has it any special properties when used with iterators?

In

    x in y

y can be any iterable object.  As an extreme example,

    if "error\n" in file('msgs'):
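
A runnable variant of that membership test, as a sketch only: io.StringIO stands in for the 'msgs' file, and the file contents are invented.

```python
import io

# In-memory stand-in for the 'msgs' file.
msgs = io.StringIO("starting\nerror\nshutting down\n")

# 'in' iterates over the lines until one compares equal.
found = "error\n" in msgs

# The search consumed everything up to and including the match.
rest = msgs.read()

print(found, repr(rest))
```

The test works on any iterable, but on a one-pass object such as a file it uses up the items it has looked at.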





From barry@zope.com  Wed Jul 10 05:46:19 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 10 Jul 2002 00:46:19 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com>
 <3D220A86.5070003@lemburg.com>
 <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de>
 <3D22ADD9.1030901@lemburg.com>
 <15650.64375.162977.160780@anthem.wooz.org>
 <3D2433B9.9080102@lemburg.com>
 <m3znx79rk3.fsf@mira.informatik.hu-berlin.de>
 <15657.39558.325764.651122@anthem.wooz.org>
 <3D299E42.70200@lemburg.com>
 <15657.51863.523283.977726@anthem.wooz.org>
 <3D29FD4F.4060607@lemburg.com>
 <15658.57536.133296.126976@anthem.wooz.org>
 <006301c2274f$95cfabf0$0900a8c0@spiff>
 <3D2AEFF3.5000600@lemburg.com>
Message-ID: <15659.48155.786445.996056@anthem.wooz.org>

>>>>> "MAL" == M  <mal@lemburg.com> writes:

    MAL> Anyway, regardless of how we do it, we need to add the
    MAL> 'system-packages' dir to just before the '.../lib/pythonX.X'
    MAL> entry in sys.path. If there's consent about this, I'd suggest
    MAL> to move ahead in this direction as first step.

+1

Perhaps also backport to 2.2 and (maybe? maybe not?) 2.1.
-Barry



From Jack.Jansen@cwi.nl  Wed Jul 10 09:58:35 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Wed, 10 Jul 2002 10:58:35 +0200
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <2mr8id0xxp.fsf@starship.python.net>
Message-ID: <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>

On Tuesday, July 9, 2002, at 04:37 , Michael Hudson wrote:

> Why isn't it just
>
> ,-----------------------------------------------------------------------
> | # python_build: (Boolean) if true, we're either building Python or
> | # building an extension with an un-installed Python, so we use
> | # different (hard-wired) directories.
> |
> | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> | landmark = os.path.join(argv0_path, "Modules", "Setup")
> |
> | python_build = os.path.isfile(landmark)
> |
> | del argv0_path, landmark
> `-----------------------------------------------------------------------

This won't work for one of the standard use cases: having multiple 
"build" subdirectories of the source directory (where you build for 
different platforms or some such).

And on the other question: as of a week ago setup.py is also being used 
to build at least some of the MacPython extension modules. But as for 
MacPython the build tree and the install tree are one and the same there 
is no problem.

And as to a general solution to the problem: how about parsing the 
Makefile that sits beside the interpreter? In all use cases (I think 
also in your example of build directories very far away over the hills) 
the Makefile will sit in the same directory as the interpreter. And the 
Makefile will have the srcdir variable that points to the source 
directory. And we have a makefile parser in distutils.
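
Roughly what I have in mind, as a sketch: read_srcdir is a hand-rolled stand-in written for this example (it is not the distutils parser and does no variable expansion), and the Makefile contents are invented.

```python
import os
import re
import tempfile


def read_srcdir(makefile_path):
    """Return the value of the 'srcdir' variable, or None if it is absent."""
    pattern = re.compile(r"^srcdir\s*=\s*(.*?)\s*$")
    with open(makefile_path) as makefile:
        for line in makefile:
            match = pattern.match(line)
            if match:
                return match.group(1)
    return None


# Build a throwaway "build directory" with a Makefile beside the interpreter.
with tempfile.TemporaryDirectory() as build_dir:
    path = os.path.join(build_dir, "Makefile")
    with open(path, "w") as f:
        f.write("VERSION=\t2.3\nsrcdir=\t\t../src\nCC=\t\tgcc\n")
    srcdir = read_srcdir(path)

print(srcdir)
```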
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -




From mwh@python.net  Wed Jul 10 10:56:03 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 10:56:03 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: Jack Jansen's message of "Wed, 10 Jul 2002 10:58:35 +0200"
References: <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>
Message-ID: <2mznwzsy8c.fsf@starship.python.net>

Jack Jansen <Jack.Jansen@cwi.nl> writes:

> On Tuesday, July 9, 2002, at 04:37 , Michael Hudson wrote:
> 
> > Why isn't it just
> >
> > ,-----------------------------------------------------------------------
> > | # python_build: (Boolean) if true, we're either building Python or
> > | # building an extension with an un-installed Python, so we use
> > | # different (hard-wired) directories.
> > |
> > | argv0_path = os.path.dirname(os.path.abspath(sys.executable))
> > | landmark = os.path.join(argv0_path, "Modules", "Setup")
> > |
> > | python_build = os.path.isfile(landmark)
> > |
> > | del argv0_path, landmark
> > `-----------------------------------------------------------------------
> 
> This won't work for one of the standard use cases: having multiple 
> "build" subdirectories of the source directory (where you build for 
> different platforms or some such).

How so?  It worked for my cron jobs last night, which build in this
fashion.

> And on the other question: as of a week ago setup.py is also being used 
> to build at least some of the MacPython extension modules.

Is this MacPython as built by CodeWarrior?  I'm counting MacOS X as
unix when it's convenient to do so :)

> But as for MacPython the build tree and the install tree are one and
> the same there is no problem.

Don't understand, sorry.

> And as to a general solution to the problem: how about parsing the 
> Makefile that sits beside the interpreter? 

If there's a Modules/Setup file, that's what we do.

> In all use cases (I think also in your example of build directories
> very far away over the hills) the Makefile will sit in the same
> directory as the interpreter.

So you're suggesting that we use the Makefile as the landmark instead
of Modules/Setup?

> And the Makefile will have the srcdir variable that points to the
> source directory. And we have a makefile parser in distutils.

That's in effect what happens now.

Cheers,
M.

-- 
  I hate leaving Windows95 boxes publically accessible, so shifting
  even to NT is a blessing in some ways.  At least I can reboot them
  remotely in a sane manner, rather than having to send them malformed
  packets.      -- http://bofhcam.org/journal/journal.html, 20/06/2000



From mwh@python.net  Wed Jul 10 10:56:38 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 10:56:38 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: martin@v.loewis.de's message of "09 Jul 2002 20:42:08 +0200"
References: <2mr8id0xxp.fsf@starship.python.net> <m3ptxwyc8v.fsf@mira.informatik.hu-berlin.de>
Message-ID: <2mwus3sy7d.fsf@starship.python.net>

martin@v.loewis.de (Martin v. Loewis) writes:

> Michael Hudson <mwh@python.net> writes:
> 
> > ?  What cases does that get wrong?  I'd have changed it already, but I
> > have this feeling I must be missing something.
> 
> This looks overly complex to me, too, but you may want to ask the
> author specifically:

I already did; didn't you see the Cc: line in my first post?

Cheers,
M.

-- 
  Counting lines is probably a good idea if you want to print it out
  and are short on paper, but I fail to see the purpose otherwise.
                                        -- Erik Naggum, comp.lang.lisp



From mwh@python.net  Wed Jul 10 10:58:36 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 10:58:36 +0100
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: "Fred L. Drake, Jr."'s message of "Tue, 9 Jul 2002 14:02:35 -0400"
References: <2mr8id0xxp.fsf@starship.python.net> <3D2AFC64.3030400@lemburg.com> <15659.9531.709381.202975@grendel.zope.com>
Message-ID: <2mu1n7sy43.fsf@starship.python.net>

"Fred L. Drake, Jr." <fdrake@acm.org> writes:

> M.-A. Lemburg writes:
>  > Is Modules/Setup a landmark on all Python build platforms,
>  > e.g. on Macs, Windows and other non-Unix platforms as well ?
> 
> It's only on Unix as far as I'm aware.

Now how about answering the initial question?  It's your code.

pesteringly-ly y'rs,
M.

-- 
  Sufficiently advanced political correctness is indistinguishable
  from irony.                                           -- Erik Naggum



From guido@python.org  Wed Jul 10 11:23:36 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 06:23:36 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: Your message of "Wed, 10 Jul 2002 00:46:19 EDT."
 <15659.48155.786445.996056@anthem.wooz.org>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com> <3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <15657.51863.523283.977726@anthem.wooz.org> <3D29FD4F.4060607@lemburg.com> <15658.57536.133296.126976@anthem.wooz.org> <006301c2274f$95cfabf0$0900a8c0@spiff> <3D2AEFF3.5000600@lemburg.com>
 <15659.48155.786445.996056@anthem.wooz.org>
Message-ID: <200207101023.g6AANaT25347@pcp02138704pcs.reston01.va.comcast.net>

> >>>>> "MAL" == M  <mal@lemburg.com> writes:
> 
>     MAL> Anyway, regardless of how we do it, we need to add the
>     MAL> 'system-packages' dir to just before the '.../lib/pythonX.X'
>     MAL> entry in sys.path. If there's consent about this, I'd suggest
>     MAL> to move ahead in this direction as first step.
> 
> +1
> 
> Perhaps also backport to 2.2 and (maybe? maybe not?) 2.1.

Smells like a new feature to me, so -1 on a 2.2 backport.  I haven't
seen enough of this thread to comment on this for 2.3.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh@python.net  Wed Jul 10 11:27:24 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 11:27:24 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib httplib.py,1.57,1.58
In-Reply-To: jhylton@users.sourceforge.net's message of "Tue, 09 Jul 2002 14:22:38 -0700"
References: <E17S2RS-00070n-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <2mk7o3sws3.fsf@starship.python.net>

jhylton@users.sourceforge.net writes:

> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv26945
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> Fix for SF bug 579107.
> 
> The recent SSL changes resulted in important, but subtle changes to
> close() semantics.  Since builtin socket makefile() is not called for
> SSL connections, we don't get separately closeable fds for connection
> and response.  Comments in the code explain how to restore makefile
> semantics.
> 
> Bug fix candidate.

I have a feeling that it was this checkin that broke test_pyclbr.

Certainly, something did.

Perhaps this module could do with a better test?

Cheers,
M.

-- 
  Get out your salt shakers folks, this one's going to take more
  than one grain.                 -- Ator in an Ars Technica news item



From mwh@python.net  Wed Jul 10 11:36:03 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 11:36:03 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib httplib.py,1.57,1.58
In-Reply-To: Michael Hudson's message of "10 Jul 2002 11:27:24 +0100"
References: <E17S2RS-00070n-00@usw-pr-cvs1.sourceforge.net> <2mk7o3sws3.fsf@starship.python.net>
Message-ID: <2mfzyrswdo.fsf@starship.python.net>

Michael Hudson <mwh@python.net> writes:

> jhylton@users.sourceforge.net writes:
[...] 
> > Modified Files:
> > 	httplib.py 
[...]
> 
> I have a feeling that it was this checkin that broke test_pyclbr.
> 
> Certainly, something did.

Oh, Tim fixed it already.

> Perhaps this module could do with a better test?

This still stands, though.

Cheers,
M.

-- 
  NUTRIMAT:  That drink was individually tailored to meet your
             personal requirements for nutrition and pleasure.
    ARTHUR:  Ah.  So I'm a masochist on a diet am I?
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 9



From fdrake@acm.org  Wed Jul 10 12:55:03 2002
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 10 Jul 2002 07:55:03 -0400
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>
References: <2mr8id0xxp.fsf@starship.python.net>
 <37A35504-93E3-11D6-AC50-0030655234CE@cwi.nl>
Message-ID: <15660.8343.626355.368514@grendel.zope.com>

Jack Jansen writes:
 > This won't work for one of the standard use cases: having multiple 
 > "build" subdirectories of the source directory (where you build for 
 > different platforms or some such).

Actually, it does support this case; Modules/Setup is relative to the
interpreter, and will have been created on Unix.
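
A sketch of the simplified check against a made-up directory layout; looks_like_build_tree and the directory names are hypothetical, and the interpreter paths need not exist for the check to run.

```python
import os
import tempfile


def looks_like_build_tree(executable):
    """True if Modules/Setup exists next to the given interpreter path."""
    exe_dir = os.path.dirname(os.path.abspath(executable))
    return os.path.isfile(os.path.join(exe_dir, "Modules", "Setup"))


with tempfile.TemporaryDirectory() as top:
    # One of several per-platform build subdirectories.
    build_dir = os.path.join(top, "build-linux")
    os.makedirs(os.path.join(build_dir, "Modules"))
    open(os.path.join(build_dir, "Modules", "Setup"), "w").close()

    in_build = looks_like_build_tree(os.path.join(build_dir, "python"))
    installed = looks_like_build_tree(os.path.join(top, "bin", "python"))

print(in_build, installed)
```

Because the landmark is looked up relative to the interpreter, each build subdirectory is recognised on its own, wherever the source tree lives.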

 > And as to a general solution to the problem: how about parsing the 
 > Makefile that sits beside the interpreter? In all use cases (I think 
 > also in your example of build directories very far away over the hills) 
 > the Makefile will sit in the same directory as the interpreter. And the 
 > Makefile will have the srcdir variable that points to the source 
 > directory. And we have a makefile parser in distutils.

Does this work on Windows?  Parsing the Makefile isn't a problem, but
I don't think there is one in the MSVC build.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



From wdraxinger@darkstargames.de  Wed Jul 10 12:59:44 2002
From: wdraxinger@darkstargames.de (Wolfgang Draxinger)
Date: Wed, 10 Jul 2002 13:59:44 +0200
Subject: [Python-Dev] Embedding Python the extreme way
Message-ID: <3D2C21B0.4040108@darkstargames.de>

For my current 3D engine project I decided to use Python as an important
part of the whole design. And it works well.
However, now I want to make Python an integral part of the engine, not
just an external lib. And I not want to statically link it with the
engine. My goal is to discard all modules and builtin functions that I
don't need, e.g. sys. It's intended to control the 3D engine, not to
write complex scripts.

Can anybody give me some advice on that? Embedding Python and writing
extension modules is no problem at all, just "cleaning" the Python
sources. My base is Python 2.2.1.
-- 
+------------------------------------------------+
| +----------------+ WOLFGANG DRAXINGER          |
| | ,-.   DARKSTAR | lead programmer             |
| |(   ) +---------+ wdraxinger@darkstargames.de |
| | `-' / GAMES /                                |
| +----+''''''''     http://www.darkstargames.de |
+------------------------------------------------+




From mwh@python.net  Wed Jul 10 15:17:57 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 15:17:57 +0100
Subject: [Python-Dev] The C API and wide unicode support
Message-ID: <2mr8ibzmy2.fsf@starship.python.net>

It may be best to allow this particular dead horse to go on being
dead, but I thought I'd ask here.  Beats work, anyway.

Picture the situation: you're wrapping a C library that returns a
unicode string (let's say encoded as UCS-2).  You want to return this
as a Python object.  So you'd think you can write

return PyUnicode_Decode(encstr, "ucs-2", NULL);

(or something close to that).  But for reasons that escape me,
PyUnicode_Decode is included in the API renaming in
Include/unicodeobject.h, so if you want to provide binaries you have
to provide two, and you can be sure that users will have no idea which
they need.

So, questions:

(1) am I correct in thinking that PyUnicode_Decode (and a bunch of
    others) could safely be omitted from the renaming?
(2) if so, is it worth omitting those APIs that could be omitted for 2.3?

This train of thinking came about because the version of 2.2 that
comes with Redhat 7.3 is compiled with wide unicode support (which
surprised me), and so the pygame RPMs broke.
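
For readers who want to check which kind of build they are running, here is a minimal sketch. It relies on sys.maxunicode, which is 0xFFFF on a narrow (UCS-2) build and 0x10FFFF on a wide (UCS-4) build. The helper name unicode_build_width is hypothetical and not part of any Python API. On Python 3.3 and later the distinction no longer exists, so this always reports "wide" there.

```python
import sys

def unicode_build_width():
    """Return 'wide' if code points above U+FFFF fit in one unit, else 'narrow'."""
    return "wide" if sys.maxunicode > 0xFFFF else "narrow"

print(unicode_build_width(), hex(sys.maxunicode))
```

A binary extension built for one width would only load into an interpreter for which this check gives the same answer.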

Cheers,
M.

-- 
  Any form of evilness that can be detected without *too* much effort
  is worth it...  I have no idea what kind of evil we're looking for
  here or how to detect it, so I can't answer yes or no.
                                       -- Guido Van Rossum, python-dev



From walter@livinglogic.de  Wed Jul 10 15:57:16 2002
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Wed, 10 Jul 2002 16:57:16 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net>
Message-ID: <3D2C4B4C.6050204@livinglogic.de>

Michael Hudson wrote:

> It may be best to allow this particular dead horse to go on being
> dead, but I thought I'd ask here.  Beats work, anyway.
> 
> Picture the situation: you're wrapping a C library that returns a
> unicode string (let's say encoded as UCS-2).  You want to return this
> as a Python object.  So you'd think you can write
> 
> return PyUnicode_Decode(encstr, "ucs-2", NULL);

There is no "ucs-2" encoding. This should be "utf-16", "utf-16-le"
or "utf-16-be".

> (or something close to that).  But for reasons that escape me,
> PyUnicode_Decode is included in the API renaming in
> Include/unicodeobject.h, so if you want to provide binaries you have
> to provide two, and you can be sure that users will have no idea which
> they need.
> 
> So, questions:
> 
> (1) am I correct in thinking that PyUnicode_Decode (and a bunch of
>     others) could safely be omitted from the renaming?

No, because the unicode objects generated will consist of either
UCS-2 or UCS-4 "characters". This has nothing to do with the
encoding of the byte array which you use to create the unicode object.

Any C function that uses Unicode objects in any way needs name
mangling, because the storage layout of the Unicode objects
changes.

> (2) if so, is it worth omitting those APIs that could be omitted for 2.3?
> 
> This train of thinking came about because the version of 2.2 that
> comes with Redhat 7.3 is compiled with wide unicode support (which
> surprised me), and so the pygame RPMs broke.

I don't know, probably because sizeof(wchar_t)==4 ?

Bye,
    Walter Dörwald




From guido@python.org  Wed Jul 10 16:02:53 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 11:02:53 -0400
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: Your message of "Wed, 10 Jul 2002 16:57:16 +0200."
 <3D2C4B4C.6050204@livinglogic.de>
References: <2mr8ibzmy2.fsf@starship.python.net>
 <3D2C4B4C.6050204@livinglogic.de>
Message-ID: <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net>

> Any C function that uses Unicode objects in any way needs name
> mangling, because the storage layout of the Unicode objects
> changes.

Really?  If I am only using the published APIs and not peeking
directly inside the Unicode object, why should I care about its
internal lay-out?

Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
name-mangled?  What am I missing?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From walter@livinglogic.de  Wed Jul 10 16:25:09 2002
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Wed, 10 Jul 2002 17:25:09 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net>              <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2C51D5.8060000@livinglogic.de>

Guido van Rossum wrote:

>>Any C function that uses Unicode objects in any way needs name
>>mangling, because the storage layout of the Unicode objects
>>changes.
> 
> 
> Really?  If I am only using the published APIs and not peeking
> directly inside the Unicode object, why should I care about its
> internal lay-out?

That's what I meant with "using". Functions that only pass
unicode objects around don't need to know (as long as they pass
the objects only to functions that themselves either "know"
or "don't need to know" the layout).

PyUnicode_Decode creates unicode objects, so I guess it needs
to know.

> Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
> name-mangled?  What am I missing?

What about the functions that use the C macros (PyUnicode_AS_UNICODE
etc.) directly or indirectly? Those functions will rely on the
internal lay-out.

Bye,
    Walter Dörwald




From mwh@python.net  Wed Jul 10 16:29:14 2002
From: mwh@python.net (Michael Hudson)
Date: 10 Jul 2002 16:29:14 +0100
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: =?ISO-8859-15?Q?Walter_D=F6rwald?='s message of "Wed, 10 Jul 2002 17:25:09 +0200"
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de>
Message-ID: <2m1yabiotx.fsf@starship.python.net>

=?ISO-8859-15?Q?Walter_D=F6rwald?= <walter@livinglogic.de> writes:

> Guido van Rossum wrote:
> 
> >>Any C function that uses Unicode objects in any way needs name
> >>mangling, because the storage layout of the Unicode objects
> >>changes.
> > 
> > 
> > Really?  If I am only using the published APIs and not peeking
> > directly inside the Unicode object, why should I care about its
> > internal lay-out?
> 
> That's what I meant with "using". Functions that only pass
> unicode objects around don't need to know (as long as they pass
> the objects only to functions that themselves either "know"
> or "don't need to know" the layout).
> 
> PyUnicode_Decode creates unicode objects, so I guess it needs
> to know.

*It* needs to know, yes.  But surely the caller doesn't?

> > Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
> > name-mangled?  What am I missing?
> 
> What about the functions that use the C macros (PyUnicode_AS_UNICODE
> etc.) directly or indirectly? Those functions will rely on the
> internal lay-out.

They're verboten in extension modules anyway, so I don't care.

Cheers,
M.

-- 
  Like most people, I don't always agree with the BDFL (especially
  when he wants to change things I've just written about in very 
  large books), ... 
         -- Mark Lutz, http://python.oreilly.com/news/python_0501.html



From walter@livinglogic.de  Wed Jul 10 17:00:17 2002
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Wed, 10 Jul 2002 18:00:17 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net>
Message-ID: <3D2C5A11.6020501@livinglogic.de>

Michael Hudson wrote:

> =?ISO-8859-15?Q?Walter_D=F6rwald?= <walter@livinglogic.de> writes:
> 
>>Guido van Rossum wrote:
>>
>>
>>>>Any C function that uses Unicode objects in any way needs name
>>>>mangling, because the storage layout of the Unicode objects
>>>>changes.
>>>
>>>
>>>Really?  If I am only using the published APIs and not peeking
>>>directly inside the Unicode object, why should I care about its
>>>internal lay-out?
>>
>>That's what I meant with "using". Functions that only pass
>>unicode objects around don't need to know (as long as they pass
>>the objects only to functions that themselves either "know"
>>or "don't need to know" the layout).
>>
>>PyUnicode_Decode creates unicode objects, so I guess it needs
>>to know.
> 
> *It* needs to know, yes.  But surely the caller doesn't?

This depends on what the caller does with the result of
PyUnicode_Decode.

>>>Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
>>>name-mangled?  What am I missing?
>>
>>What about the functions that use the C macros (PyUnicode_AS_UNICODE
>>etc.) directly or indirectly? Those functions will rely on the
>>internal lay-out.
> 
> They're verboten in extension modules anyway, so I don't care.

I didn't know that. Neither Include/unicodeobject.h nor
Doc/api/concrete.tex mention it. Is there any other location
where this is mentioned?

I think to forbid the use of the macros is too restrictive.
What if I want to implement a version of
    foo.replace(u"&", u"&amp;")
       .replace(u"<", u"&lt;")
       .replace(u"\"", u"&quot;")
       .replace(u">", u"&gt;")
in C for performance reasons? How is this possible without
using the C macros?
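
As a point of reference, the chained replace above can be written as a single pass at the Python level. This is only a sketch of what such a function computes, not the C implementation asked about, and the names _ESCAPES, escape_markup and escape_markup_chained are made up for illustration. It uses Python 3 string syntax.

```python
# Map each special character's code point to its replacement text.
_ESCAPES = {
    ord("&"): "&amp;",
    ord("<"): "&lt;",
    ord('"'): "&quot;",
    ord(">"): "&gt;",
}

def escape_markup(text):
    """Escape &, <, " and > in a single pass over the string."""
    return text.translate(_ESCAPES)

def escape_markup_chained(text):
    """The chained form from the message, for comparison."""
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace('"', "&quot;")
                .replace(">", "&gt;"))
```

Both functions should give the same result; the chained form depends on replacing "&" first, while the single-pass form has no ordering constraint.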

And if extension modules are not allowed to access the internal
layout of unicode objects, what's the use of name mangling?

Bye,
    Walter Dörwald




From mal@lemburg.com  Wed Jul 10 18:19:12 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jul 2002 19:19:12 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net>
Message-ID: <3D2C6C90.7090006@lemburg.com>

Michael Hudson wrote:
>>>Shouldn't only functions whose signature uses PY_UNICODE_TYPE be
>>>name-mangled?  What am I missing?
>>
>>What about the functions that use the C macros (PyUnicode_AS_UNICODE
>>etc.) directly or indirectly? Those functions will rely on the
>>internal lay-out.
> 
> 
> They're verboten in extension modules anyway, so I don't care.

They are not disallowed in extensions... don't know where you
have that idea from.

Note that the name mangling is done to prevent an extension
which uses Unicode in some way from loading if the interpreter
and extension Unicode "width" doesn't match.

If we would allow this, extensions using the macros would cause
memory corruption since they'd index differently. That's not only
a potential cause for a seg fault, it's also a security risk.

The name mangling does not provide a 100% bullet proof way
of preventing this (an extension might use Py_UNICODE and
the Unicode macros without touching any of the other C APIs),
but it goes a long way in that direction.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From aahz@pythoncraft.com  Wed Jul 10 18:28:57 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 10 Jul 2002 13:28:57 -0400
Subject: [Python-Dev] Embedding Python the extreme way
In-Reply-To: <3D2C21B0.4040108@darkstargames.de>
References: <3D2C21B0.4040108@darkstargames.de>
Message-ID: <20020710172857.GA5093@panix.com>

python-dev is the wrong place for this discussion.  Please post your
message to comp.lang.python or look on www.python.org for other
resources if you think that's not suitable.

On Wed, Jul 10, 2002, Wolfgang Draxinger wrote:
>
> For my current 3D engine project I decided to use Python as an important
> part of the whole design. And it works well.
> However, now I want to make Python an integral part of the engine, not
> just an external lib. And I not want to statically link it with the
> engine. My goal is to discard all modules and builtin functions that I
> don't need, e.g. sys. It's intended to control the 3D engine, not to
> write complex scripts.
> 
> Can anybody give me some advice on that? Embedding Python and writing
> extension modules is no problem at all, just "cleaning" the Python
> sources. My base is Python 2.2.1.

-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tim@zope.com  Wed Jul 10 19:00:42 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 10 Jul 2002 14:00:42 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib httplib.py,1.57,1.58
In-Reply-To: <2mk7o3sws3.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEILADAB.tim@zope.com>

>> Modified Files:
>> 	httplib.py
>> Log Message:
>> Fix for SF bug 579107.

[Michael Hudson]
> ...
> I have a feeling that it was this checkin that broke test_pyclbr.

Yes.

> ...
> Perhaps this module could do with a better test?

"This module" is ambiguous given that two modules are involved, but it's
hard to disagree either way <0.9 wink>.  The change to httplib that broke
test_pyclbr should not, of course, have been checked in in that state
regardless.  Whatever, it's fixed now.




From guido@python.org  Wed Jul 10 19:26:28 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 14:26:28 -0400
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: Your message of "Wed, 10 Jul 2002 19:19:12 +0200."
 <3D2C6C90.7090006@lemburg.com>
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net>
 <3D2C6C90.7090006@lemburg.com>
Message-ID: <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>

> > They're verboten in extension modules anyway, so I don't care.
> 
> They are not disallowed in extensions... don't know where you
> have that idea from.

Maybe because other macros are often disallowed in (3rd party)
extensions, the reason being that the macros dig in the internal
representation which isn't guaranteed to be binary compatible?  It
would make sense that the same rule applies to the Unicode macros in
3rd party extensions.

(I admit that these restrictions may be underdocumented.  Nevertheless
they were intended and I believe they were discussed.)

> Note that the name mangling is done to prevent an extension
> which uses Unicode in some way from loading if the interpreter
> and extension Unicode "width" doesn't match.
> 
> If we would allow this, extensions using the macros would cause
> memory corruption since they'd index differently. That's not only
> a potential cause for a seg fault, it's also a security risk.

If there was a way so that only extensions that use the macros or
APIs whose signature uses PY_UNICODE_TYPE would fail to load, that
would be better.  But I don't know how to enforce that.

> The name mangling does not provide a 100% bullet proof way
> of preventing this (an extension might use Py_UNICODE and
> the Unicode macros without touching any of the other C APIs),
> but it goes a long way in that direction.

Maybe it goes too far.

OTOH, Michael, is this really something you cannot live with?  Or is
it simply a surprise?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jul 10 20:26:55 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 15:26:55 -0400
Subject: [Python-Dev] Provide a Python wrapper for any new C extension
In-Reply-To: Your message of "Fri, 21 Jun 2002 14:18:39 PDT."
 <Pine.SOL.4.44.0206211407280.13283-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206211407280.13283-100000@death.OCF.Berkeley.EDU>
Message-ID: <200207101926.g6AJQtg27630@pcp02138704pcs.reston01.va.comcast.net>

> The only obvious objection I can see to this is a performance hit for
> having to go through the Python stub to call the C extension.  But I just
> did a very simple test of calling strftime('%c') 25,000 times from time
> directly and using a Python stub and it was .470 and .490 secs total
> respectively according to profile.run().

If the Python module does "from _Cmodule import *", there should be
*no* difference in performance, since you get the same object in
either case.
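
A small sketch of why the star-import costs nothing per call: the wrapper's namespace ends up holding the very same function object. Here the standard math module stands in for the hypothetical "_Cmodule", and the wrapper module is built in memory purely for illustration.

```python
import math
import types

# Stand-in for a pure-Python wrapper module; "math" plays the role of
# the hypothetical C module "_Cmodule" from the message.
wrapper = types.ModuleType("wrapper")
exec("from math import *", wrapper.__dict__)

# The wrapper's name is bound to the very same function object.
print(wrapper.sqrt is math.sqrt)
```

A Python-level stub function that forwards to the C function is a different situation, since it adds one extra call per invocation.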

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jul 10 20:25:22 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 15:25:22 -0400
Subject: [Python-Dev] Provide a Python wrapper for any new C extension
In-Reply-To: Your message of "Fri, 21 Jun 2002 20:31:27 BST."
 <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
Message-ID: <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net>

[Hamish Lawson]
> One of the arguments put forward against renaming the existing time
> module to _time (as part of incorporating a pure-Python strptime
> function) is that it could break some builds. Therefore I'd suggest
> that it could be a useful principle for any C extension added in the
> future to the standard library to have an accompanying pure-Python
> wrapper that would be the one that client code would usually import.

There are too many distinct use cases to make this a hard and fast
rule.   The problem with maintaining many builds is best served by
keeping the number of extensions small, period.

[Marc-Andre Lemburg]
> BTW, this reminds me of the old idea to move that standard
> lib into a package, eg. 'python'...
> 
> from python import time.

Maybe in Python 3000.  In 2.x, I think rearranging the standard
library will just cause more upheaval without much benefit.

> We should at least reserve such a name RSN so that we don't
> run into problems later on.

I can guarantee you that that name won't be used as a standard Python
module or package name any time soon.  If someone creates a 3rd party
package or module named 'python' I'd question their sanity. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Jack.Jansen@oratrix.com  Wed Jul 10 21:17:19 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Wed, 10 Jul 2002 22:17:19 +0200
Subject: [Python-Dev] Dodgy code in distutils/sysconfig.py
In-Reply-To: <2mznwzsy8c.fsf@starship.python.net>
Message-ID: <0938EF98-9442-11D6-B6F5-003065517236@oratrix.com>

On Wednesday, July 10, 2002, at 11:56 , Michael Hudson wrote:
>> This won't work for one of the standard use cases: having multiple
>> "build" subdirectories of the source directory (where you build for
>> different platforms or some such).
>
> How so?  It worked for my cron jobs last night, which build in this
> fashion.

You're right, of course. I was confusing this with the sys.path 
initialization code.
I think there's absolutely nothing wrong with your solution.
>
>> And on the other question: as of a week ago setup.py is also being used
>> to build at least some of the MacPython extension modules.
>
> Is this MacPython as built by CodeWarrior?  I'm counting MacOS X as
> unix when it's convenient to do so :)

Correct. Over on the pythonmac-sig the use of "MacPython" is 
(for the time being) reserved to mean the CodeWarrior-built 
Python that will also run on OS9. MachoPython is used for the 
OSX-only unix Python.

And, as I said (or tried to say :-), "don't worry about MacPython 
builds, they'll continue to work."
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -




From martin@v.loewis.de  Wed Jul 10 21:32:11 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Jul 2002 22:32:11 +0200
Subject: [Python-Dev] Embedding Python the extreme way
In-Reply-To: <3D2C21B0.4040108@darkstargames.de>
References: <3D2C21B0.4040108@darkstargames.de>
Message-ID: <m3ptxvjpdg.fsf@mira.informatik.hu-berlin.de>

Wolfgang Draxinger <wdraxinger@darkstargames.de> writes:

> Can anybody give me some advice for that.

Not on this list, which is for the development *of* Python, not for
the development *with* Python.

Regards,
Martin




From martin@v.loewis.de  Wed Jul 10 21:39:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Jul 2002 22:39:55 +0200
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net>
References: <2mr8ibzmy2.fsf@starship.python.net>
 <3D2C4B4C.6050204@livinglogic.de>
 <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3lm8jjp0k.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Really?  If I am only using the published APIs and not peeking
> directly inside the Unicode object, why should I care about its
> internal lay-out?

The safeguard is to tell apart modules that use Unicode objects from
modules which don't. If a module uses Unicode objects, it might be
using PyUnicode_AS_UNICODE. Unfortunately, this does not result in a
symbol reference, so using a module that only uses
PyUnicode_AS_UNICODE would break if it was compiled for the wrong
width of Py_UNICODE.

Mangling all Unicode functions is the best safeguard we could find to
protect against this case. It is still possible to cheat that, but it
is unlikely that somebody breaks the safeguard by accident.

Likewise, it is unlikely that a single platform has builds for two
different Py_UNICODE sizes simultaneously, so the safeguard does not
add additional burden, either.

Regards,
Martin



From martin@v.loewis.de  Wed Jul 10 21:44:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 10 Jul 2002 22:44:03 +0200
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: <2mr8ibzmy2.fsf@starship.python.net>
References: <2mr8ibzmy2.fsf@starship.python.net>
Message-ID: <m3hej7joto.fsf@mira.informatik.hu-berlin.de>

Michael Hudson <mwh@python.net> writes:

> (or something close to that).  But for reasons that escape me,
> PyUnicode_Decode is included in the API renaming in
> Include/unicodeobject.h, so if you want to provide binaries you have
> to provide two, and you can be sure that users will have no idea which
> they need.

That is not true. One option is to provide the sources; if you do so,
you do not need to provide binaries at all (thanks to distutils
(*)). Another option is to provide binaries for the default
installation only, which will be UCS-2. Nobody will notice.

Regards,
Martin

(*) If distutils is unacceptable, it is probably because it requires
users to have a C compiler. In that case, you are probably targeting
Win32. In that case, you can be certain how the binaries have been
built.



From mal@lemburg.com  Wed Jul 10 22:21:30 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jul 2002 23:21:30 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net>              <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2CA55A.6080803@lemburg.com>

Guido van Rossum wrote:
>>>They're verboten in extension modules anyway, so I don't care.
>>
>>They are not disallowed in extensions... don't know where you
>>have that idea from.
> 
> 
> Maybe because other macros are often disallowed in (3rd party)
> extensions, the reason being that the macros dig in the internal
> representation which isn't guaranteed to be binary compatible?  It
> would make sense that the same rule applies to the Unicode macros in
> 3rd party extensions.

Which macros would that be ? I modelled the macros in the
Unicode implementation after those of the string
implementation. And those macros are certainly used in
a lot of 3rd party extensions.

> (I admit that these restrictions may be underdocumented.  Nevertheless
> they were intended and I believe they were discussed.)

I guess, having the macros in the header files without an
explicit warning marks them as public interface. That's how
I have used them in tons of code and I think that I'm not
alone in using this approach.

>>Note that the name mangling is done to prevent an extension
>>which uses Unicode in some way from loading if the interpreter
>>and extension Unicode "width" doesn't match.
>>
>>If we would allow this, extensions using the macros would cause
>>memory corruption since they'd index differently. That's not only
>>a potential cause for a seg fault, it's also a security risk.
>
> If there was a way so that only extensions that use the macros or
> APIs whose signature uses PY_UNICODE_TYPE would fail to load, that
> would be better.  But I don't know how to enforce that.

That's certainly possible for C API, but not for the macros
(without defeating their purpose). You also have a problem
in case the extension defines its own Unicode routines relying
on the Python types and macros, e.g. for extensions which
subclass the Unicode type. These don't necessarily need to
use the APIs; not even the macros... but they do rely on the
binary layout used in the Unicode type.

>>The name mangling does not provide a 100% bullet proof way
>>of preventing this (an extension might use Py_UNICODE and
>>the Unicode macros without touching any of the other C APIs),
>>but it goes a long way in that direction.
> 
> 
> Maybe it goes too far.
> 
> OTOH, Michael, is this really something you cannot live with?  Or is
> it simply a surprise?

I think that the fact that Michael is seeing breakage is
a good thing. Otherwise, he would probably not have noticed
that RedHat chose to use the wide build as default.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Wed Jul 10 22:33:18 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jul 2002 23:33:18 +0200
Subject: [Python-Dev] python package
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2CA81E.6060408@lemburg.com>

Guido van Rossum wrote:
>>BTW, this reminds me of the old idea to move that standard
>>lib into a package, eg. 'python'...
>>
>>from python import time.
> 
> 
> Maybe in Python 3000.  In 2.x, I think rearranging the standard
> library will just cause more upheaval without much benefit.
> 
> 
>>We should at least reserve such a name RSN so that we don't
>>run into problems later on.
> 
> 
> I can guarantee you that that name won't be used as a standard Python
> module or package name any time soon.  If someone creates a 3rd party
> package or module named 'python' I'd question their sanity. :-)

How about adding

python.py:
__path__ = ['.']


This would not only reserve the name in the global namespace,
but also enable applications to start using 'from python import x'
now without much fuss.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Wed Jul 10 23:51:59 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 18:51:59 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Wed, 10 Jul 2002 23:33:18 +0200."
 <3D2CA81E.6060408@lemburg.com>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net>
 <3D2CA81E.6060408@lemburg.com>
Message-ID: <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net>

> How about adding
> 
> python.py:
> __path__ = ['.']
> 
> This would not only reserve the name in the global namespace,
> but also enable applications to start using 'from python import x'
> now without much fuss.

Then I have to ask the question I originally wanted to ask: what
problem would that solve?  And is this the right solution?

Also, it would make *all* standard modules accessible through the
python package -- surely this isn't what we want (not if we use the
Java example at least).

Also, for some modules (that keep some global state) it's a bad idea
if they are imported twice, since their initialization code would be
run twice, and there would be two separate instances of the module.
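
The double-import concern can be reproduced with a small sketch. It assumes a throwaway module file, counter.py, holding some global state; the file name, the load_as helper and the name "python.counter" are all hypothetical. The same source file is executed under two module names, which is roughly what would happen if a module were reachable both directly and through a 'python' package alias.

```python
import importlib.util
import os
import tempfile

def load_as(name, path):
    """Execute the file at `path` as a fresh module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "counter.py")
    with open(path, "w") as fh:
        fh.write("registry = []\n")
    plain = load_as("counter", path)
    aliased = load_as("python.counter", path)

# State changed through one name is invisible through the other.
plain.registry.append("handler")
print(plain is aliased, aliased.registry)
```

The two module objects are distinct, so a handler registered through one name never shows up in the other copy's registry.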

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Thu Jul 11 01:26:36 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 10 Jul 2002 20:26:36 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
References: <20020709051833.GA32041@hishome.net> <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
Message-ID: <20020711002636.GA6958@panix.com>

On Tue, Jul 09, 2002, Greg Ewing wrote:
>
> Maybe a one-shot iterable should raise an exception
> if you try to obtain a second iterator from it?

Then you couldn't do this:

    done = False
    for line in f:
        if not check(line):
            break
        process(line)
    else:
        done = True

    if not done:
        for line in f:
            another_process(line)

-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Thu Jul 11 02:10:18 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jul 2002 21:10:18 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 10 Jul 2002 20:26:36 EDT."
 <20020711002636.GA6958@panix.com>
References: <20020709051833.GA32041@hishome.net> <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz>
 <20020711002636.GA6958@panix.com>
Message-ID: <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>

> Then you couldn't do this:
> 
>     done = False
>     for line in f:
>         if not check(line):
>             break
>         process(line)
>     else:
>         done = True
> 
>     if not done:
>         for line in f:
>             another_process(line)

That's already broken, see SF bug 524804.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Thu Jul 11 07:15:28 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 11 Jul 2002 02:15:28 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>
References: <20020709051833.GA32041@hishome.net> <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz> <20020711002636.GA6958@panix.com> <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020711061528.GA6367@hishome.net>

On Wed, Jul 10, 2002 at 09:10:18PM -0400, Guido van Rossum wrote:
> > Then you couldn't do this:
> > 
> >     done = False
> >     for line in f:
> >         if not check(line):
> >             break
> >         process(line)
> >     else:
> >         done = True
> > 
> >     if not done:
> >         for line in f:
> >             another_process(line)
> 
> That's already broken, see SF bug 524804.

Xreadlines is buffered and therefore leaves the file position of the file 
in an unexpected state.  If you use xreadlines explicitly you should expect 
that. The fact that file.__iter__ returns an xreadlines object implicitly is 
therefore a bit surprising. 
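
A minimal sketch of the surprise, written for a current Python rather
than 2.2.  read_ahead() is a made-up stand-in for an xreadlines-style
buffered iterator (it is not the real xreadlines module), and the chunk
size of 3 is arbitrary:

```python
import io


def read_ahead(f, chunk=3):
    """Hypothetical buffered line iterator: pulls `chunk` lines at a time."""
    while True:
        block = []
        for _ in range(chunk):
            line = f.readline()
            if not line:
                break
            block.append(line)
        if not block:
            return
        # The file position is already past the whole block here.
        yield from block


f = io.StringIO("a\nb\nc\nd\ne\n")

first = []
for line in read_ahead(f):
    first.append(line)
    break  # stop early; "b" and "c" are sitting in the discarded buffer

# A fresh iterator resumes from the file position, not from where we stopped.
second = list(read_ahead(f))
```

Under these assumptions the second loop never sees "b" or "c", which is
the kind of unexpected state described above.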

What's the reason for using xreadlines as a file iterator?  Was it 
performance or was it just the easiest way to implement it using an existing 
object?

"Files support the iterator protocol. Each iteration returns the same
result as file.readline()"

This is not correct. Files support what I call the iterable protocol. Objects 
supporting the iterator protocol have a .next() method, files don't. While 
it's true that each iteration has the same result as readline it doesn't 
have the same side effects.

Proposal: make files really support the iterator protocol. __iter__ would
return self and next() would call readline and raise StopIteration if ''.
If anyone wants the xreadline performance improvement it should be explicit.
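
A rough sketch of what that could look like, assuming a wrapper class
(LineIterFile is a name invented here) around an existing file-like
object.  It is written for a current Python, where the iterator method
is spelled __next__() rather than next():

```python
import io


class LineIterFile:
    """Hypothetical wrapper: the file-like object is its own iterator."""

    def __init__(self, f):
        self._f = f

    def readline(self):
        return self._f.readline()

    def __iter__(self):
        # Iterator protocol: an iterator returns itself.
        return self

    def __next__(self):
        line = self._f.readline()
        if line == "":
            raise StopIteration
        return line


f = LineIterFile(io.StringIO("a\nb\nc\n"))

for line in f:
    break                 # consumed exactly one line, nothing buffered

after_break = f.readline()  # picks up where the loop stopped
rest = list(f)              # and so does a second iteration
```

Because nothing is read ahead, iteration and explicit readline() calls
share one position, at the cost of one readline() call per line.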

definitions: 

iterable := hasattr(obj, '__iter__') 
iterator := hasattr(obj, '__iter__') and hasattr(obj, 'next')

If object is iterable and not an iterator it would be reasonable to expect
that it is also re-iterable.  I don't know if this should be a requirement 
but I think it would be a good idea for all builtin objects to conform to
it anyway.  Currently files are the only builtin that is iterable, not an 
iterator and not re-iterable. 
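
The two definitions above, spelled out as a sketch.  The function names
are invented for illustration, and 'next' is checked as '__next__'
because that is how a current Python spells it:

```python
def is_iterable(obj):
    return hasattr(obj, "__iter__")


def is_iterator(obj):
    return hasattr(obj, "__iter__") and hasattr(obj, "__next__")


def expected_reiterable(obj):
    # The convention suggested above, not a guarantee: iterable but
    # not itself an iterator.
    return is_iterable(obj) and not is_iterator(obj)


def gen():
    yield 1


checks = {
    "list": expected_reiterable([1, 2, 3]),
    "list iterator": expected_reiterable(iter([1, 2, 3])),
    "generator": expected_reiterable(gen()),
}
```

This only tests for the presence of methods, so it cannot tell whether
an object that passes really does restart on a second pass.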

explicit-is-better-than-implicit-ly yours,

      Oren




From martin@v.loewis.de  Thu Jul 11 07:49:42 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 11 Jul 2002 08:49:42 +0200
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>
References: <2mr8ibzmy2.fsf@starship.python.net>
 <3D2C4B4C.6050204@livinglogic.de>
 <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net>
 <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net>
 <3D2C6C90.7090006@lemburg.com>
 <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3sn2qu5bt.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Maybe because other macros are often disallowed in (3rd party)
> extensions, the reason being that the macros dig in the internal
> representation which isn't guaranteed to be binary compatible?  

In this specific case, using a function vs. using a macro makes no
difference: the function exposes implementation details just as the
macro. In theory, using the function would allow rearranging Unicode
objects to have their characters in the same memory block as the
object, which would break applications of the macro - but apparently,
the risk of the result relying too much on implementation details
(i.e. wide or narrow Unicode) is more serious.

> It would make sense that the same rules apply to the Unicode
> macros in 3rd party extensions.

Given the potential change of the layout of Unicode objects, I would
agree that it is good to ban PyUnicode_UPPER_CASE from use in
extension modules.

> If there was a way so that only extensions that use the macros or
> APIs whose signature uses Py_UNICODE_TYPE would fail to load, that
> would be better.  But I don't know how to enforce that.

That is indeed the problem, and the last time we concluded that it
would be best to bind all Unicode functions to the unicode width, to
be on the safe side.

> Maybe it goes too far.
> 
> OTOH, Michael, is this really something you cannot live with?  Or is
> it simply a surprise?

That is the central question here. As I said before, I would expect
this to be a non-issue, in real life.

Regards,
Martin



From mal@lemburg.com  Thu Jul 11 08:43:28 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jul 2002 09:43:28 +0200
Subject: [Python-Dev] python package
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net>              <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2D3720.9040100@lemburg.com>

Guido van Rossum wrote:
>>How about adding
>>
>>python.py:
>>__path__ = ['.']
>>
>>This would not only reserve the name in the global namespace,
>>but also enable applications to start using 'from python import x'
>>now without much fuss.
> 
> 
> Then I have to ask the question I originally wanted to ask: what
> problem would that solve?  And is this the right solution?

It solves the namespace issue.

Every time we add a module or package to the standard lib, there
is a chance that we break someone's code out there by overriding
his/her own module/package (e.g. take the addition of the email
package -- such generic names tend to be used a lot).

Whether it's the right solution depends on how you see it.
IMHO it would be ideal to move the complete std lib under
a single package. You might want to use a more diverse hierarchy
but I don't think that is really needed for the existing
code base. Using a single package also makes the transition
from non-package imports to python-package imports a lot
easier.

> Also, it would make *all* standard modules accessible through the
> python package -- surely this isn't what we want (not if we use the
> Java example at least).

Are you sure that you want to make things complicated ? (see above)

> Also, for some modules (that keep some global state) it's a bad idea
> if they are imported twice, since their initialization code would be
> run twice, and there would be two separate instances of the module.

That's true for the trick I proposed above since the modules
are reachable in two ways with the standard way of writing
'import <stdmod>' being used in tons of code.
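
A small sketch of that double-import effect, assuming the import
machinery of a current Python.  The module names demo_state and
demo_umbrella are made up for the demonstration, and demo_umbrella.py
plays the role of the python.py file above:

```python
import importlib
import os
import sys
import tempfile

# Write two throwaway modules into a temporary directory.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "demo_state.py"), "w") as fh:
    fh.write("registry = []\n")          # module-level (global) state
with open(os.path.join(tmp, "demo_umbrella.py"), "w") as fh:
    fh.write("__path__ = [%r]\n" % tmp)  # the __path__ trick

sys.path.insert(0, tmp)
importlib.invalidate_caches()

import demo_state
import demo_umbrella.demo_state

demo_state.registry.append("seen via the top-level name")

# Same source file, but two separate module objects with separate state.
same_module = demo_state is demo_umbrella.demo_state
umbrella_registry = demo_umbrella.demo_state.registry
```

The source file is executed once per name, so anything appended through
one name is invisible through the other.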

Now there is also a different way to approach this problem,
though: that of directing Python to the right package by
providing stubs for all current standard lib modules.

I have used such a stub for my mx stuff when I moved
everything from top-level to under the 'mx' umbrella:

# Redirect all imports to the corresponding mx package
def _redirect(mx_subpackage):
     global __path__
     import os,mx
     __path__ = [os.path.join(mx.__path__[0],mx_subpackage)]
_redirect('DateTime')

# Now load all important symbols
from mx.DateTime import *

This works great -- it even lets you load pickles which
store the old import names and automagically converts
them to the new names.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From just@letterror.com  Thu Jul 11 08:44:13 2002
From: just@letterror.com (Just van Rossum)
Date: Thu, 11 Jul 2002 09:44:13 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020711061528.GA6367@hishome.net>
Message-ID: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]>

Oren Tirosh wrote:

> Xreadlines is buffered and therefore leaves the file position of the file 
> in an unexpected state.  If you use xreadlines explicitly you should expect 
> that. The fact that file.__iter__ returns an xreadlines object implicitly is 
> therefore a bit surprising. 
> 
> What's the reason for using xreadlines as a file iterator?  Was it 
> performance or was it just the easiest way to implement it using an existing 
> object?

The rationale was something like "the simplest way to iterate over the lines
in a file should be the fastest". I'd agree with that, but not at the expense of
the surprises mentioned in the bug. It would perhaps help if the file object
would cache the xreadlines iterator, that would limit the scope of the problem
to the case where iteration and explicit .read() calls are mixed.
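
A sketch of the caching idea for a current Python.  CachingFile and
read_ahead() are names invented here; read_ahead() is only a stand-in
for a buffered xreadlines-style iterator:

```python
import io


def read_ahead(f, chunk=3):
    """Hypothetical buffered line iterator: pulls `chunk` lines at a time."""
    while True:
        block = []
        for _ in range(chunk):
            line = f.readline()
            if not line:
                break
            block.append(line)
        if not block:
            return
        yield from block


class CachingFile:
    """Hypothetical file wrapper that remembers its read-ahead iterator."""

    def __init__(self, f):
        self._f = f
        self._lines = None

    def __iter__(self):
        # Create the buffered iterator once and hand it out every time.
        if self._lines is None:
            self._lines = read_ahead(self._f)
        return self._lines


f = CachingFile(io.StringIO("a\nb\nc\nd\ne\n"))

first = []
for line in f:
    first.append(line)
    break

second = list(f)  # same iterator, so the buffered lines are not lost
```

A second loop now continues with "b", but a direct read on the
underlying file would still skip whatever is sitting in the buffer.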

> "Files support the iterator protocol. Each iteration returns the same
> result as file.readline()"
> 
> This is not correct. Files support what I call the iterable protocol. Objects 
> supporting the iterator protocol have a .next() method, files don't. While 
> it's true that each iteration has the same result as readline it doesn't 
> have the same side effects.
> 
> Proposal: make files really support the iterator protocol. __iter__ would
> return self and next() would call readline and raise StopIteration if ''.
> If anyone wants the xreadline performance improvement it should be explicit.

+1

(But, since the bug is closed as "won't fix" I doubt this has a big chance of
happening.)

Just



From mwh@python.net  Thu Jul 11 10:05:56 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Jul 2002 10:05:56 +0100
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: Guido van Rossum's message of "Wed, 10 Jul 2002 14:26:28 -0400"
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2my9cihbwr.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> OTOH, Michael, is this really something you cannot live with?  Or is
> it simply a surprise?

Here's where the problem came up.

A user posted to pygame-users saying that when he tried to import
pygame.event, he got an error along the lines of
PyUnicodeUCS2_Unicode undefined.
This obviously made a light go on in my head, and I asked where he'd
got his Python and his pygame.  He'd got his Python from the Redhat
7.3 RPM and his pygame from pygame.org.  I suggested building pygame
from source, which he did and everything worked[*].  

Prediction: this is going to cause pain.  For instance, if this user
decides that he wants to upgrade to 2.2.1, he might download Sean's
RPMs from python.org which are narrow unicode builds -- and then his
extensions will break.  The problem here is that the kind of users
this is going to trouble are exactly the users who will not know
what's going on.

We can't prevent this sort of thing totally, but I think it should be
possible to carry out simple unicode manipulations (like this example
of returning a buffer) without incurring this kind of binary
compatibility worry.  Maybe a "safe" api, plastered with warning signs
in the docs about poking into the internal structure of the objects.

I wonder why Redhat distribute wide unicode builds?  That's the
immediate cause of the problem.  Maybe we could ask them...

Cheers,
M.
[*] actually, I think pygame might break with a wide unicode build.

-- 
  For every complex problem, there is a solution that is simple,
  neat, and wrong.                                    -- H. L. Mencken



From guido@python.org  Thu Jul 11 11:41:45 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 06:41:45 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 11 Jul 2002 02:15:28 EDT."
 <20020711061528.GA6367@hishome.net>
References: <20020709051833.GA32041@hishome.net> <200207090551.g695pgb18970@oma.cosc.canterbury.ac.nz> <20020711002636.GA6958@panix.com> <200207110110.g6B1AIb28525@pcp02138704pcs.reston01.va.comcast.net>
 <20020711061528.GA6367@hishome.net>
Message-ID: <200207111041.g6BAfjg29839@pcp02138704pcs.reston01.va.comcast.net>

> What's the reason for using xreadlines as a file iterator?  Was it
> performance or was it just the easiest way to implement it using an
> existing object?

I thought this was answered adequately by my last entry in the SF bug
report.  The short answer is performance in the common case.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 11 11:47:53 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 06:47:53 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 11 Jul 2002 09:44:13 +0200."
 <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]>
Message-ID: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net>

> > Proposal: make files really support the iterator
> > protocol. __iter__ would return self and next() would call
> > readline and raise StopIteration if ''.  If anyone wants the
> > xreadline performance improvement it should be explicit.

No.  I won't have "for line in file" be slower than attainable.

The only solution I accept is a complete rewrite of the I/O system
without using stdio, so xreadlines can be integrated.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 11 11:56:59 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 06:56:59 -0400
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: Your message of "11 Jul 2002 10:05:56 BST."
 <2my9cihbwr.fsf@starship.python.net>
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>
 <2my9cihbwr.fsf@starship.python.net>
Message-ID: <200207111056.g6BAuxR29969@pcp02138704pcs.reston01.va.comcast.net>

> > OTOH, Michael, is this really something you cannot live with?  Or is
> > it simply a surprise?
> 
> Here's where the problem came up.
> 
> A user posted to pygame-users saying that when he tried to import
> pygame.event, he got an error along the lines of
> PyUnicodeUCS2_Unicode undefined.
> This obviously made a light go on in my head, and I asked where he'd
> got his Python and his pygame.  He'd got his Python from the Redhat
> 7.3 RPM and his pygame from pygame.org.  I suggested building pygame
> from source, which he did and everything worked[*].  
> 
> Prediction: this is going to cause pain.  For instance, if this user
> decides that he wants to upgrade to 2.2.1, he might download Sean's
> RPMs from python.org which are narrow unicode builds -- and then his
> extensions will break.  The problem here is that the kind of users
> this is going to trouble are exactly the users who will not know
> what's going on.
> 
> We can't prevent this sort of thing totally, but I think it should be
> possible to carry out simple unicode manipulations (like this example
> of returning a buffer) without incurring this kind of binary
> compatibility worry.  Maybe a "safe" api, plastered with warning signs
> in the docs about poking into the internal structure of the objects.

That might work.  Or you could call the Python APIs from C. :-)

> I wonder why Redhat distribute wide unicode builds?  That's the
> immediate cause of the problem.  Maybe we could ask them...

I've had little luck trying to communicate with RedHat about their
Python releases.

Anyway, I think it's obvious why they do this: because it's there, and
because they don't want surprises with customers who use wide Unicode
characters.

> Cheers,
> M.
> [*] actually, I think pygame might break with a wide unicode build.

Hm, so maybe you should fix that first before you start complaining.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Thu Jul 11 12:09:14 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 11 Jul 2002 13:09:14 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17SbpW-0002yg-00@mail.python.org>

On Thursday 11 July 2002 12:47 pm, Guido van Rossum wrote:
> > > Proposal: make files really support the iterator
> > > protocol. __iter__ would return self and next() would call
> > > readline and raise StopIteration if ''.  If anyone wants the
> > > xreadline performance improvement it should be explicit.
>
> No.  I won't have "for line in file" be slower than attainable.

+1.  I _intensely_ want to be able to teach beginners to use "for line in 
file" and have it be fast in the common case.  "Nice" behavior for rarer 
cases of prematurely interrupted loops is OK, if feasible, but secondary.  
Having "for line in file" play nicely with other method calls on 'file' has 
no importance to me in this context -- no more than, e.g., having "for item 
in alist" play nicely with calls to mutating methods of object alist.


> The only solution I accept is a complete rewrite of the I/O system
> without using stdio, so xreadlines can be integrated.

I thought Just's suggestion (about having the file object remember
the xreadlines object in use, so that another for loop would continue
right where the first one exited) seemed like a reasonable hack -- a
compromise of reasonably little effort for some small secondary gain.

Guess I must be missing something?  Of course the "complete rewrite"
is an alluring prospect -- for many other reasons, such as enabling
user control of file buffering in cross-platform ways, *yum* -- but it's not 
going to happen in time for 2.3 anyway, is it?


Alex



From mal@lemburg.com  Thu Jul 11 12:46:28 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jul 2002 13:46:28 +0200
Subject: [Python-Dev] The C API and wide unicode support
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net>
Message-ID: <3D2D7014.6050609@lemburg.com>

Michael Hudson wrote:
> Guido van Rossum <guido@python.org> writes:
> 
> 
>>OTOH, Michael, is this really something you cannot live with?  Or is
>>it simply a surprise?
> 
> 
> Here's where the problem came up.
> 
> A user posted to pygame-users saying that when he tried to import
> pygame.event, he got an error along the lines of
> PyUnicodeUCS2_Unicode undefined.
> This obviously made a light go on in my head, and I asked where he'd
> got his Python and his pygame.  He'd got his Python from the Redhat
> 7.3 RPM and his pygame from pygame.org.  I suggested building pygame
> from source, which he did and everything worked[*].  
> 
> Prediction: this is going to cause pain.  For instance, if this user
> decides that he wants to upgrade to 2.2.1, he might download Sean's
> RPMs from python.org which are narrow unicode builds -- and then his
> extensions will break.  The problem here is that the kind of users
> this is going to trouble are exactly the users who will not know
> what's going on.

It's a pain, yes, but still better than having seg faults
due to memory corruption afterwards.

> We can't prevent this sort of thing totally, but I think it should be
> possible to carry out simple unicode manipulations (like this example
> of returning a buffer) without incurring this kind of binary
> compatibility worry.  Maybe a "safe" api, plastered with warning signs
> in the docs about poking into the internal structure of the objects.

Perhaps we need an additional abstract API PyObject_UnicodeEx()
which provides a way to additionally define the encoding to assume
for decoding string objects ? (PyObject_Unicode() always assumes
the default encoding)

> I wonder why Redhat distribute wide unicode builds?  That's the
> immediate cause of the problem.  Maybe we could ask them...
> 
> Cheers,
> M.
> [*] actually, I think pygame might break with a wide unicode build.

Why's that ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mwh@python.net  Thu Jul 11 12:58:11 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Jul 2002 12:58:11 +0100
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: Guido van Rossum's message of "Thu, 11 Jul 2002 06:56:59 -0400"
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net> <200207111056.g6BAuxR29969@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2melea8oj0.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > We can't prevent this sort of thing totally, but I think it should be
> > possible to carry out simple unicode manipulations (like this example
> > of returning a buffer) without incurring this kind of binary
> > compatibility worry.  Maybe a "safe" api, plastered with warning signs
> > in the docs about poking into the internal structure of the objects.
> 
> That might work.  Or you could call the Python APIs from C. :-)

That's what I'm doing for pygame.  It's probably the best option,
really -- complaining ain't gonna get this changed for the 2.2 series,
for one thing.

Better docs would help; I'll put that on my list, and stop moaning
about this.

> > I wonder why Redhat distribute wide unicode builds?  That's the
> > immediate cause of the problem.  Maybe we could ask them...
> 
> I've had little luck trying to communicate with RedHat about their
> Python releases.

There's an email address in the spec file; teg (at) obvious.domain.  I
might ask him.

[...]
> > [*] actually, I think pygame might break with a wide unicode build.
> 
> Hm, so maybe you should fix that first before you start complaining.

Hey, I can do two things at once!  Patches are on their way to Pete.

Cheers,
M.

-- 
  While preceding your entrance with a grenade is a good tactic in
  Quake, it can lead to problems if attempted at work.    -- C Hacking
               -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html



From mwh@python.net  Thu Jul 11 13:01:46 2002
From: mwh@python.net (Michael Hudson)
Date: 11 Jul 2002 13:01:46 +0100
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: "M.-A. Lemburg"'s message of "Thu, 11 Jul 2002 13:46:28 +0200"
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net> <2my9cihbwr.fsf@starship.python.net> <3D2D7014.6050609@lemburg.com>
Message-ID: <2mbs9e8od1.fsf@starship.python.net>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> > Prediction: this is going to cause pain.  For instance, if this user
> > decides that he wants to upgrade to 2.2.1, he might download Sean's
> > RPMs from python.org which are narrow unicode builds -- and then his
> > extensions will break.  The problem here is that the kind of users
> > this is going to trouble are exactly the users who will not know
> > what's going on.
> 
> It's a pain, yes, but still better than having seg faults
> due to memory corruption afterwards.

Probably true.  At least the tracebacks make the problem obvious.

> > We can't prevent this sort of thing totally, but I think it should be
> > possible to carry out simple unicode manipulations (like this example
> > of returning a buffer) without incurring this kind of binary
> > compatibility worry.  Maybe a "safe" api, plastered with warning signs
> > in the docs about poking into the internal structure of the objects.
> 
> Perhaps we need an additional abstract API PyObject_UnicodeEx()
> which provides a way to additionally define the encoding to assume
> for decoding string objects ? (PyObject_Unicode() always assumes
> the default encoding)

That would be nice, yes.  Beats digging "unicode" out of
__builtin__...

> > [*] actually, I think pygame might break with a wide unicode build.
> 
> Why's that ?

Oh, the obvious thing: assuming sizeof(Py_UNICODE) == 2; or rather
assuming that Python's idea of what a unicode buffer is is the same as
SDL's idea (which I can't find written down anywhere, but I assume it's
the same kind of UCS-2 thing narrow builds use).
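
To illustrate the mismatch without touching the C API, here is a rough
Python-level sketch.  It uses the utf-16-le and utf-32-le codecs as
stand-ins for the 2-byte and 4-byte internal buffers, which is an
assumption made for the demonstration, as is the idea that the consumer
reads 16-bit units:

```python
import struct

text = "A\u20ac"  # two characters, both in the Basic Multilingual Plane

narrow = text.encode("utf-16-le")  # 2 bytes per character
wide = text.encode("utf-32-le")    # 4 bytes per character

# What a consumer expecting 16-bit units would see in the wide buffer.
units_seen = struct.unpack("<4H", wide)
```

Read as 16-bit units, the wide buffer has a zero after every character,
so code that assumes 2-byte units would see garbage or stop early.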

So, I retract my complaint, and propose to write some docs on the
subject.

Cheers,
M.

-- 
  Two things I learned for sure during a particularly intense acid
  trip in my own lost youth: (1) everything is a trivial special case
  of something else; and, (2) death is a bunch of blue spheres.
                                             -- Tim Peters, 1 May 1998



From guido@python.org  Thu Jul 11 13:19:31 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 08:19:31 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 11 Jul 2002 13:09:14 +0200."
 <E17SbpT-0002yd-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net>
 <E17SbpT-0002yd-00@mail.python.org>
Message-ID: <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net>

> > No.  I won't have "for line in file" be slower than attainable.
> 
> +1.  I _intensely_ want to be able to teach beginners to use "for line in 
> file" and have it be fast in the common case.  "Nice" behavior for rarer 
> cases of prematurely interrupted loops is OK, if feasible, but secondary.  
> Having "for line in file" play nicely with other method calls on 'file' has 
> no importance to me in this context -- no more than, e.g., having "for item 
> in alist" play nicely with calls to mutating methods of object alist.

Exactly.

> > The only solution I accept is a complete rewrite of the I/O system
> > without using stdio, so xreadlines can be integrated.
> 
> I thought Just's suggestion (about having the file object remember
> the xreadlines object in use, so that another for loop would continue
> right where the first one exited) seemed like a reasonable hack -- a
> compromise of reasonably little effort for some small secondary gain.

Oops, I missed that.  That seems reasonable indeed.

> Guess I must be missing something?  Of course the "complete rewrite"
> is an alluring prospect -- for many other reasons, such as enabling
> user control of file buffering in cross-platform ways, *yum* -- but it's not 
> going to happen in time for 2.3 anyway, is it?

I'm not going to hold up the 2.3 release, but if a patch lands in the
SF patch manager, I'm not going to reject it.  Hint, hint. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ark@research.att.com  Thu Jul 11 16:31:47 2002
From: ark@research.att.com (Andrew Koenig)
Date: 11 Jul 2002 11:31:47 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
Message-ID: <yu99bs9ejn6k.fsf@europa.research.att.com>

David> I keep running into the problem that there is no reliable way
David> to introspect about whether a type supports multi-pass
David> iterability (in the sense that an input stream might support
David> only a single pass, but a list supports multiple passes). I
David> suppose you could check for __getitem__, but that wouldn't
David> cover linked lists, for example.

Here's a suggestion for a long-term strategy for solving this
problem, should it be deemed desirable to do so:

Right now, every iterator, and every object that supports
iteration, must have an __iter__() method.  Suppose we augment
that with the following:

        A new kind of iterator, called a multiple iterator, that
        supports multiple iterations over the same sequence.

        A requirement that every object that supports multiple
        iteration have a __multiter__() method that yields a
        multiple iterator over its sequence, in addition to
        an __iter__() method that yields a (multiple or single)
        iterator (so that every sequence that supports multiple
        iteration also supports single iteration).

        A requirement that every multiple iterator support the
        following methods:

            __multiter__()  yields the iterator object itself
            __iter__()      also yields the iterator object itself
                            (so that every multiple iterator is
                            also an iterator)
            __next__()      return the next item from the container
                            or raise StopIteration
            __copy__()      return a distinct, newly created multiple
                            iterator that iterates over the same
                            sequence as the original, starting from
                            the current element.

Note that when the last multiple iterator has left an element, there
is no possibility of going back to that element again unless the
sequence itself provides a way of doing so.  Therefore, for example,
it might be possible for files to provide multiple iterators without
undue space inefficiency.
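
A sketch of what such a multiple iterator might look like over a plain
list, using the __next__() spelling from the table above.  Note that
__multiter__ is only the name proposed here, not an existing protocol,
and Bag and ListMultiIter are invented for the example:

```python
import copy


class ListMultiIter:
    """Hypothetical multiple iterator over a list."""

    def __init__(self, data, pos=0):
        self._data = data
        self._pos = pos

    def __multiter__(self):
        return self

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._data):
            raise StopIteration
        item = self._data[self._pos]
        self._pos += 1
        return item

    def __copy__(self):
        # A distinct iterator over the same sequence, starting from
        # the current element.
        return ListMultiIter(self._data, self._pos)


class Bag:
    """Hypothetical container that supports multiple iteration."""

    def __init__(self, items):
        self._items = list(items)

    def __multiter__(self):
        return ListMultiIter(self._items)

    def __iter__(self):
        return self.__multiter__()


it = Bag("abcd").__multiter__()
head = next(it)
fork = copy.copy(it)   # copy.copy() dispatches to __copy__()

left = list(it)
right = list(fork)     # unaffected by exhausting `it`
```

Both iterators see the remaining elements independently, which is the
property a multi-pass algorithm would rely on.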

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark



From david.abrahams@rcn.com  Thu Jul 11 17:08:55 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Thu, 11 Jul 2002 12:08:55 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com>
Message-ID: <12e101c228f5$42908020$6601a8c0@boostconsulting.com>

From: "Andrew Koenig" <ark@research.att.com>

> David> I keep running into the problem that there is no reliable way
> David> to introspect about whether a type supports multi-pass
> David> iterability (in the sense that an input stream might support
> David> only a single pass, but a list supports multiple passes). I
> David> suppose you could check for __getitem__, but that wouldn't
> David> cover linked lists, for example.
>
> Here's a suggestion for a long-term strategy for solving this
> problem, should it be deemed desirable to do so:
>
> Right now, every iterator, and every object that supports
> iteration, must have an __iter__() method.  Suppose we augment
> that with the following:
>
>         A new kind of iterator, called a multiple iterator, that
>         supports multiple iterations over the same sequence.
>
>         A requirement that every object that supports multiple
>         iteration have a __multiter__() method that yields a
>         multiple iterator over its sequence, in addition to
>         an __iter__() method that yields a (multiple or single)
>         iterator (so that every sequence that supports multiple
>         iteration also supports single iteration).
>
>         A requirement that every multiple iterator support the
>         following methods:
>
>             __multiter__()  yields the iterator object itself
>             __iter__()      also yields the iterator object itself
>                             (so that every multiple iterator is
>                             also an iterator)
>             __next__()      return the next item from the container
>                             or raise StopIteration
>             __copy__()      return a distinct, newly created multiple
>                             iterator that iterates over the same
>                             sequence as the original, starting from
>                             the current element.

Why bother with __multiter__? If you can distinguish a multiple iterator by
the presence of __copy__, you can always do
hasattr(x.__iter__(), "__copy__") to find out whether something is
multi-iterable.
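
A minimal sketch of that check in present-day Python, where the iterator
method is spelled __next__.  MultiIter, Shelf and is_multi_iterable are
hypothetical names used only for illustration, and the __copy__ convention
is just the one proposed in this thread, not an established protocol.

```python
class MultiIter:
    """Hypothetical multiple iterator over an indexable sequence."""

    def __init__(self, seq, pos=0):
        self._seq = seq
        self._pos = pos

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item

    def __copy__(self):
        # A distinct iterator over the same sequence, starting from
        # the current element.
        return MultiIter(self._seq, self._pos)


class Shelf:
    """Hypothetical container whose iterators are copyable."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return MultiIter(self._items)


def is_multi_iterable(obj):
    # The test suggested above: ask for an iterator, look for __copy__.
    return hasattr(iter(obj), "__copy__")
```

Note that the check has to create an iterator before it can answer, which
is harmless for containers but is the cost the rest of the thread debates.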

-Dave




From ark@research.att.com  Thu Jul 11 17:16:15 2002
From: ark@research.att.com (Andrew Koenig)
Date: Thu, 11 Jul 2002 12:16:15 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <12e101c228f5$42908020$6601a8c0@boostconsulting.com>
 (david.abrahams@rcn.com)
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <12e101c228f5$42908020$6601a8c0@boostconsulting.com>
Message-ID: <200207111616.g6BGGFB16385@europa.research.att.com>

David> Why bother with __multiter__? If you can distinguish a multiple
David> iterator by the presence of __copy__, you can always do
David> hasattr(x.__iter__(), "__copy__") to find out whether something
David> is multi-iterable.

Because explicit is better than implicit :-)

More seriously, I can imagine distinguishing a multiple iterator by
the presence of __copy__, but I can't imagine using the presence of
__copy__ to determine whether a *container* supports multiple
iteration.  For example, there surely exist containers today that
support __copy__ but whose __iter__ methods yield iterators that do
not themselves support __copy__.
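
A small illustration of that mismatch, as a sketch only: Bag is a
hypothetical class, written so that the container is copyable while the
iterator it hands out (a generator) is not.

```python
class Bag:
    """Hypothetical container: copyable itself, but its iterator is not."""

    def __init__(self, items):
        self._items = list(items)

    def __copy__(self):
        return Bag(self._items)

    def __iter__(self):
        # A generator object is a single-pass iterator without __copy__.
        for item in self._items:
            yield item


bag = Bag("abc")
container_copyable = hasattr(bag, "__copy__")       # True
iterator_copyable = hasattr(iter(bag), "__copy__")  # False
```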

Another reason is that I can imagine this idea extended to encompass,
say, ambidextrous iterators that support prev() as well as next(),
and I would want to use __ambiter__ as a marker for those rather
than having to create an iterator and see if it has prev().



From guido@python.org  Thu Jul 11 17:40:12 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 12:40:12 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 09 Jul 2002 05:37:47 EDT."
 <075a01c2272c$4acae390$6601a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Rqzq-0003eW-00@mx05.mrf.mail.rcn.net>
 <075a01c2272c$4acae390$6601a8c0@boostconsulting.com>
Message-ID: <200207111640.g6BGeCb13218@odiug.zope.com>

> I don't know if we need them, but I'm certainly finding that not having
> some more information is difficult for me. If I need to make multiple
> passes over the information in a generalized iterable object, the only
> solution AFAICT is to unconditionally copy all the information into a list
> first.

Or you could just document "this argument must support multiple
independent iterators."

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Thu Jul 11 21:10:28 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 11 Jul 2002 16:10:28 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Rqzq-0003eW-00@mx05.mrf.mail.rcn.net>              <075a01c2272c$4acae390$6601a8c0@boostconsulting.com>  <200207111640.g6BGeCb13218@odiug.zope.com>
Message-ID: <13be01c22917$aa306480$6601a8c0@boostconsulting.com>

The real reason to be able to introspect is so that you can handle both
kinds.  Even if you're willing to destroy the data by examining it, if you
know you have a single-pass sequence, you might need to copy its elements
into a multi-pass sequence (e.g. file.lines()) in order to get your work
done.


From: "Guido van Rossum" <guido@python.org>
> > I don't know if we need them, but I'm certainly finding that not having
> > some more information is difficult for me. If I need to make multiple
> > passes over the information in a generalized iterable object, the only
> > solution AFAICT is to unconditionally copy all the information into a
> > list first.
>
> Or you could just document "this argument must support multiple
> independent iterators."





From guido@python.org  Thu Jul 11 22:48:22 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 17:48:22 -0400
Subject: [Python-Dev] Re: *Simpler* string substitutions
In-Reply-To: Your message of "Sat, 22 Jun 2002 21:36:31 EDT."
 <15637.9759.111784.481102@anthem.wooz.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B3B2@UKRUX002.rundc.uk.origin-it.com>
 <15637.9759.111784.481102@anthem.wooz.org>
Message-ID: <200207112148.g6BLmMw14591@odiug.zope.com>

>     PM> 4. Access to variables is also problematic. Without
>     PM> compile-time support, access to nested scopes is impossible
>     PM> (AIUI).
> 
> Is this really true?  I think it was two IPC's ago that Jeremy and I
> discussed the possibility of adding a method to frame objects that
> would basically yield you the equivalent of globals+freevars+locals.

If f is a function and g is a function nested inside f, only those
locals of f that are also used in g get turned into cells.

So if f has a local variable x that isn't used by g (as far as the
compiler can see), there's no way for g to find f's value for x.
Remember that f may not be on g's call stack at all!
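
A short illustration in present-day CPython, using the names f, g, x and y
from the paragraphs above.  The attribute names __code__ and __closure__
are the modern spellings, and the exact introspection details are an
implementation matter, so treat this as a sketch.

```python
def f():
    x = 1  # never mentioned in g, so it stays an ordinary local of f
    y = 2  # used in g, so the compiler turns it into a cell

    def g():
        return y

    return g


g = f()
free_names = g.__code__.co_freevars
cell_values = [cell.cell_contents for cell in g.__closure__]
```

Here g can still reach y after f has returned, but nothing attached to g
refers to x.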

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Thu Jul 11 22:59:28 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 00:59:28 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Thu, Jul 11, 2002 at 08:19:31AM -0400
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <E17SbpT-0002yd-00@mail.python.org> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712005928.A9833@hishome.net>

On Thu, Jul 11, 2002 at 08:19:31AM -0400, Guido van Rossum wrote:
> > Guess I must be missing something?  Of course the "complete rewrite"
> > is an alluring prospect -- for many other reasons, such as enabling
> > user control of file buffering in cross-platform ways, *yum* -- but it's not 
> > going to happen in time for 2.3 anyway, is it?
> 
> I'm not going to hold up the 2.3 release, but if a patch lands in the
> SF patch manager, I'm not going to reject it.  Hint, hint. :-)

http://www.python.org/sf/580331

No, it's not a complete rewrite of file buffering.  This patch implements 
Just's idea of xreadlines caching in the file object.  It also makes a file 
into an iterator: __iter__ returns self and next calls the next method of
the cached xreadlines object.

See my previous postings for why I think a file should be an iterator.

With this patch any combination of multiple xreadlines and iterator protocol 
operations on a file object is safe. Using xreadlines/iterator followed by 
regular readline has the same buffering problem as before. 
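
For comparison, a sketch of the "file is its own iterator" behaviour as it
looks in present-day Python.  io.StringIO is used here only as an in-memory
stand-in for a real file, and the example does not touch xreadlines or the
buffering question.

```python
import io

f = io.StringIO("one\ntwo\nthree\n")  # in-memory stand-in for a real file

same_object = iter(f) is f   # __iter__ returns the file itself
first = next(f)              # consumes the first line
rest = list(f)               # consumes the remaining lines
second_pass = list(f)        # nothing left: the object is single-pass
```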

	Oren




From oren-py-d@hishome.net  Thu Jul 11 23:07:25 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 01:07:25 +0300
Subject: [Python-Dev] Alternative implementation of interning, take 2
Message-ID: <20020712010725.A10686@hishome.net>

Thanks for all feedback on the previous version.  This one supports both 
immortal interned strings created with PyString_InternInPlace and mortal 
interned strings created with the new function PyString_Intern.

Places that might affect compatibility still use immortals but most interned
strings are now mortal.

This version, like the previous one, does not support indirect interning of
strings.  Is there any evidence that this optimization is still important?
Nothing in the Python distribution itself needs it.

http://www.python.org/sf/576101

	Oren




From martin@v.loewis.de  Thu Jul 11 23:16:29 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Jul 2002 00:16:29 +0200
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <20020712010725.A10686@hishome.net>
References: <20020712010725.A10686@hishome.net>
Message-ID: <m3n0sxuczm.fsf@mira.informatik.hu-berlin.de>

Oren Tirosh <oren-py-d@hishome.net> writes:

> This version, like the previous one, does not support indirect interning of
> strings.  Is there any evidence that this optimization is still important?
> Nothing in the Python distribution itself needs it.

That is still factually incorrect; the code is triggered in a test case.

Regards,
Martin



From oren-py-d@hishome.net  Thu Jul 11 23:20:00 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 01:20:00 +0300
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <m3n0sxuczm.fsf@mira.informatik.hu-berlin.de>; from martin@v.loewis.de on Fri, Jul 12, 2002 at 12:16:29AM +0200
References: <20020712010725.A10686@hishome.net> <m3n0sxuczm.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020712012000.A11330@hishome.net>

On Fri, Jul 12, 2002 at 12:16:29AM +0200, Martin v. Loewis wrote:
> Oren Tirosh <oren-py-d@hishome.net> writes:
> 
> > This version, like the previous one, does not support indirect interning of
> > strings.  Is there any evidence that this optimization is still important?
> > Nothing in the Python distribution itself needs it.
> 
> That is still factually incorrect; the code is triggered in a test case.

A few indirectly interned strings are still created, but I couldn't find any 
case where one was actually used as a key to PyDict_GetItem.

	Oren



From pinard@iro.umontreal.ca  Fri Jul 12 00:56:30 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 11 Jul 2002 19:56:30 -0400
Subject: [Python-Dev] Re: PendingDeprecationWarning
In-Reply-To: <E17D7Gk-0003Mz-00@mail.python.org>
References: <m17CwvB-004K87C@artcom0.artcom-gmbh.de>
 <00e001c2072f$7b8d2460$5d61accf@othello>
 <200205291642.g4TGgaf18754@odiug.zope.com>
 <E17D7Gk-0003Mz-00@mail.python.org>
Message-ID: <oqu1n53jkh.fsf@titan.progiciels-bpi.ca>

[Alex Martelli]

> On Wednesday 29 May 2002 06:42 pm, Guido van Rossum wrote:
> 	...
> > > oct()
> > > hex()
> >
> > Why?  I use these a lot...

> I assume the duplication of oct and hex wrt '%o'% and '%x'% was the
> reason to suggest silently-deprecating the former (trying to have 'just
> one obvious way' and all that).

Hi, people.  I'm revising many accumulated notes, while writing the draft
of a Python style and migration guide (in French) for a small team of
Python programmers, here.  By the way, I thank you all for the richness
of the exchanged ideas in that area, lately.  Also, poking around, I see
even a bit deeper than before how beautiful the Python project is!


Stumbling on the above message, I feel like making a further comment.
When I was learning Python, I found it elegant to discover that Python had
all that is required so one could rewrite the `FORMAT % THINGS' operator,
if one wanted to.

If we deprecate built-ins (like `repr', `hex' and `oct') in favour of
leaving `%' as the only way, we would lose that elegance.  Moreover,
skipping the interpretation of a format string might be faster, and this
might matter in some circumstances.
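
As a rough sketch of that point in present-day Python: tiny_format below is
a hypothetical toy, not a proposal, that rebuilds a small subset of the `%'
operator from the built-ins alone.  It assumes non-negative integers and
supports no widths, flags or precision.

```python
def tiny_format(fmt, *args):
    """Toy subset of FORMAT % THINGS built only from built-ins.

    Supports %s, %r, %d, %x, %o and %% for non-negative integers.
    """
    convert = {
        "s": str,
        "r": repr,
        "d": lambda n: str(int(n)),
        "x": lambda n: hex(n)[2:],  # drop the '0x' prefix
        "o": lambda n: oct(n)[2:],  # drop the '0o' prefix (Python 3)
    }
    args = list(args)
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            code = fmt[i + 1]
            if code == "%":
                out.append("%")
            else:
                out.append(convert[code](args.pop(0)))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```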

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 01:18:20 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 11 Jul 2002 20:18:20 -0400
Subject: [Python-Dev] Re: [Types-sig] Re: [meta-sig] SIG charters
References: <15657.62121.520364.556758@anthem.wooz.org> <200207092050.g69KoxW04101@odiug.zope.com> <31E5E26A.BE7C5DC@home.se> <31E5E3C3.569EBAFA@home.se>
Message-ID: <146001c22939$a3335440$6601a8c0@boostconsulting.com>

From: "Sverker Nilsson" <sverker.is@home.se>


> Sverker Nilsson wrote:
> >
> > Guido van Rossum wrote:
> > > >     types-sig
> > > +1
> >
> > -1. I think there may come interesting discussions on this list,
> > when the time is due and things come up. Why dismiss it? It is
> > a good place to have.
> >
> > Sverker Nilsson

Plus, if you dismiss it I might be tempted to bring my thread about
multimethods/overload resolution back to this list, and that would be
messy... it was so neatly killed by diverting it to types-sig. ;-)

-Dave




From guido@python.org  Fri Jul 12 01:41:20 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 20:41:20 -0400
Subject: [Python-Dev] Re: [Types-sig] Re: [meta-sig] SIG charters
In-Reply-To: Your message of "Thu, 11 Jul 2002 20:18:20 EDT."
 <146001c22939$a3335440$6601a8c0@boostconsulting.com>
References: <15657.62121.520364.556758@anthem.wooz.org> <200207092050.g69KoxW04101@odiug.zope.com> <31E5E26A.BE7C5DC@home.se> <31E5E3C3.569EBAFA@home.se>
 <146001c22939$a3335440$6601a8c0@boostconsulting.com>
Message-ID: <200207120041.g6C0fLi30951@pcp02138704pcs.reston01.va.comcast.net>

> > > > >     types-sig
> > > > +1
> > >
> > > -1. I think there may come interesting discussions on this list,
> > > when the time is due and things come up. Why dismiss it? It is
> > > a good place to have.
> > >
> > > Sverker Nilsson
> 
> Plus, if you dismiss it I might be tempted to bring my thread about
> multimethods/overload resolution back to this list, and that would be
> messy... it was so neatly killed by diverting it to types-sig. ;-)
> 
> -Dave

I apologize for that!  I had expected that some people on the types-sig
would be interested.

But it proves that the types-sig is dead.  It's had its chance.  If
there's really a need to revive it, well, there's a procedure for
reviving a SIG, too.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido@python.org  Fri Jul 12 01:43:57 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jul 2002 20:43:57 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 11 Jul 2002 16:10:28 EDT."
 <13be01c22917$aa306480$6601a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Rqzq-0003eW-00@mx05.mrf.mail.rcn.net> <075a01c2272c$4acae390$6601a8c0@boostconsulting.com> <200207111640.g6BGeCb13218@odiug.zope.com>
 <13be01c22917$aa306480$6601a8c0@boostconsulting.com>
Message-ID: <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net>

> The real reason to be able to introspect is so that you can handle both
> kinds.
> Even if you're willing to destroy the data by examining it, if you know you
> have a single-pass sequence, you might need to copy its elements into a
> multi-pass sequence (e.g. file.lines()) in order to get your work done.

Hm.  I think it's just as good to make it the responsibility of the
caller to pass a multi-iterable.  There could be a standard tool that
takes a single-iterable and produces a multi-iterable.
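
One possible shape for such a tool, as a sketch only: Reiterable is a
hypothetical name, and buffering everything into a list is just the
simplest strategy, not necessarily the one a standard tool would choose.

```python
class Reiterable:
    """Hypothetical wrapper: buffer a single-pass iterable once so that
    it can be iterated any number of times."""

    def __init__(self, iterable):
        self._items = list(iterable)

    def __iter__(self):
        # Each call hands out a fresh, independent iterator.
        return iter(self._items)


def lines():
    # A single-pass source, standing in for a stream.
    yield "a"
    yield "b"


multi = Reiterable(lines())
first_pass = list(multi)
second_pass = list(multi)
```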

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 03:43:13 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Thu, 11 Jul 2002 22:43:13 -0400
Subject: [Python-Dev] long long configuration
Message-ID: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com>

Hi,

I recently came across a nasty configuration conflict between boost and
python.

In LongObject.h we have:

    #ifdef HAVE_LONG_LONG

    /* Hopefully this is portable... */
    #ifndef ULONG_MAX
    #define ULONG_MAX 4294967295U
    #endif
    #ifndef LONGLONG_MAX
    #define LONGLONG_MAX 9223372036854775807LL
    #endif
    #ifndef ULONGLONG_MAX
    #define ULONGLONG_MAX 0xffffffffffffffffULL
    #endif

Well, it turns out that boost detects whether the compiler supports long
long by #including <limits.h> and looking for these macros:

#include <limits.h>
# if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
   && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || defined(ULONGLONG_MAX))
#  define BOOST_HAS_LONG_LONG
#endif

So it turns out that on some platforms, Python's configuration sets
HAVE_LONG_LONG even when limits.h doesn't include definitions of these
macros. For example, there's MSVC6, where Python substitutes __int64 for
long long using its LONG_LONG macro. However, I didn't actually notice the
problem until I tried linking something at LLNL where they're using an
older KCC. Two translation units had different ideas of
BOOST_HAS_LONG_LONG, so linking failed when one of them was looking for the
long long support supposedly provided by another. I'm surprised it wasn't a
worse problem with MSVC6, because after all, it doesn't even supply a type
called "long long".

Is there any chance that something can be done to prevent this sort of
conflict?

Thanks,
Dave





From tim.one@comcast.net  Fri Jul 12 05:54:15 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 00:54:15 -0400
Subject: [Python-Dev] long long configuration
In-Reply-To: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPEADAB.tim.one@comcast.net>

[David Abrahams]
> I recently came across a nasty configuration conflict between boost and
> python.
>
> In LongObject.h we have:
>
>     #ifdef HAVE_LONG_LONG
>
>     /* Hopefully this is portable... */
>     #ifndef ULONG_MAX
>     #define ULONG_MAX 4294967295U
>     #endif
>     #ifndef LONGLONG_MAX
>     #define LONGLONG_MAX 9223372036854775807LL
>     #endif
>     #ifndef ULONGLONG_MAX
>     #define ULONGLONG_MAX 0xffffffffffffffffULL
>     #endif
>
> Well, it turns out that boost detects whether the compiler supports long
> long by #including <limits.h> and looking for these macros:
>
> #include <limits.h>
> # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
>    && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || defined(ULONGLONG_MAX))
> #  define BOOST_HAS_LONG_LONG
> #endif
>
> So it turns out that on some platforms, Python's configuration sets
> HAVE_LONG_LONG even when limits.h doesn't include definitions of these
> macros.

Yes.  Python cares about the conceptual type, not about how a platform
spells it.

> For example, there's MSVC6, where Python substitutes __int64 for
> long long using its LONG_LONG macro. However, I didn't actually notice
> the problem

What problem?

> until I tried linking something at LLNL where they're using an older KCC.
> Two translation units had different ideas of BOOST_HAS_LONG_LONG,

Why was that?  Nothing you showed us accounts for it, unless there's an
implied #include of Python.h before the Boost limits.h block you did show us.

> so linking failed when one of them was looking for the long long support
> supposedly provided by another. I'm surprised it wasn't a worse problem
> with MSVC6, because after all, it doesn't even supply a type called
> "long long".
>
> Is there any chance that something can be done to prevent this sort of
> conflict?

Rather than try to extract a clear question out of this <wink>, let me turn
it around:  would your problem go away if this code in LongObject.h went
away entirely?  Python has no business defining ULONG_MAX anymore (that's
left over from K&R C days), and I'm sure I got rid of all uses of
LONGLONG_MAX and ULONGLONG_MAX in 2.2 (I vaguely recall some; they weren't
really needed, and won't be needed again).




From tim.one@comcast.net  Fri Jul 12 06:18:39 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 01:18:39 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <20020712010725.A10686@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPFADAB.tim.one@comcast.net>

[Oren Tirosh]
> ...
> This version, like the previous one, does not support indirect
> interning of strings.  Is there any evidence that this optimization is
> still important?  Nothing in the Python distribution itself needs it.

We've already been thru the last part at length:  indirect interning wasn't
targeted at the core, so that the core doesn't need it is evidence of no
more than that Guido's implementation worked as he intended it to in this
respect.

It would help if you could get Marc-Andre and /F to pronounce on whether
their code benefits from it -- they're the most prolific extension authors
we've got.




From tim.one@comcast.net  Fri Jul 12 06:39:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 01:39:47 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <13be01c22917$aa306480$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPFADAB.tim.one@comcast.net>

[David Abrahams]
> The real reason to be able to introspect is so that you can handle both
> kinds.  Even if you're willing to destroy the data by examining it, if
> you know you have a single-pass sequence, you might need to copy its
> elements into a multi-pass sequence (e.g. file.lines()) in order to get
> your work done.

Note that Python uses PySequence_Fast() internally in such cases.  This does
whatever it takes to turn an iterable object into something that can be
indexed at random via PySequence_Fast_GET_ITEM(fastseq, int_index).  Under
the covers it leaves lists and tuples alone, and materializes everything
else into a temp tuple.  I haven't felt a need for something fancier than
that in practice; the lack of participation in this thread from other
old-timers suggests they haven't either (piling on more protocols would
allow optimizing some cases, but it's not clear such cases are important
enough in Python Life to be worth the bother).
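
A rough Python-level analogue of the behaviour described above, as a
sketch: sequence_fast and spread are hypothetical helpers, not the C API,
and the choice of a tuple simply follows the description in this message.

```python
def sequence_fast(obj):
    """Return obj itself if it is a list or tuple, else a tuple of its items."""
    if type(obj) in (list, tuple):
        return obj        # already indexable: left alone
    return tuple(obj)     # everything else is materialized once


def spread(values):
    # Needs two passes over the data, so materialize first.
    seq = sequence_fast(values)
    return max(seq) - min(seq)
```

With this, spread() accepts a generator as readily as a list, at the price
of one temporary copy for anything that is not already a list or tuple.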




From oren-py-d@hishome.net  Fri Jul 12 06:43:32 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 01:43:32 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <yu99bs9ejn6k.fsf@europa.research.att.com>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com>
Message-ID: <20020712054332.GA77883@hishome.net>

On Thu, Jul 11, 2002 at 11:31:47AM -0400, Andrew Koenig wrote:
> Right now, every iterator, and every object that supports
> iteration, must have an __iter__() method.  Suppose we augment
> that with the following:
> 
>         A new kind of iterator, called a multiple iterator, that
>         supports multiple iterations over the same sequence.
...
>             __copy__()      return a distinct, newly created multiple
>                             iterator that iterates over the same
>                             sequence as the original, starting from
>                             the current element.


There is no need for a new type of iterator. It's ok that iterators are
disposable.  If I need multiple iterations I don't want to copy the
iterator - I prefer to ask the original iterable object for a new iterator.
All I need is some way to know whether the iterable object (container) can 
produce multiple iterators that generate the same sequence.

  An object is re-iterable if its iterators do not modify its state.

The iterator of an iterator is itself.  Calling the next method, by
definition, modifies the internal state of an object. Therefore anything 
that has a next method is not re-iterable. 

"hasattr(obj,'__iter__') and hasattr(obj, 'next')" is a good signature of
a non re-iterable object.  Unfortunately, the opposite is not true.  One
iterable object in Python produces iterators that modify its state when 
their .next() method is called - the file object.

I have just submitted a patch that makes a file into an iterator (i.e. adds 
a .next method to files).  With this change all Python objects that have
an __iter__ method and no next method produce iterators that do not modify
the container.  Another possibility would be to make file iterators that
use seek or re-open the file to avoid modifying the file position of the
parent file object.  I don't think that would be a good idea because files
can be devices, pipes or sockets which are not seekable. 

I think it may be a good idea to add a note to the documentation pages
about the iterator protocol that the iterators of a container should not
modify the state of the container. If you think they must it's probably 
a good sign that your 'container' is not really a container and maybe it 
should be an iterator rather than produce iterators of itself.
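
The signature above can be sketched in present-day Python, where the
iterator method is spelled __next__.  is_reiterable is a hypothetical
helper, and it is only a heuristic: it reports what an object advertises,
not whether its iterators really leave the container untouched.

```python
import io


def is_reiterable(obj):
    # Iterable, but not itself an iterator (no __next__ method).
    return hasattr(obj, "__iter__") and not hasattr(obj, "__next__")


checks = {
    "list": is_reiterable([1, 2, 3]),
    "list iterator": is_reiterable(iter([1, 2, 3])),
    "generator": is_reiterable(x for x in ()),
    "file-like": is_reiterable(io.StringIO("a\nb\n")),
}
```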

	Oren




From aleax@aleax.it  Fri Jul 12 07:43:38 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 12 Jul 2002 08:43:38 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <13be01c22917$aa306480$6601a8c0@boostconsulting.com> <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17Su9a-0005Fb-00@mail.python.org>

On Friday 12 July 2002 02:43 am, Guido van Rossum wrote:
> > The real reason to be able to introspect is so that you can handle both
> > kinds.
> > Even if you're willing to destroy the data by examining it, if you know
> > you have a single-pass sequence, you might need to copy its elements into
> > a multi-pass sequence (e.g. file.lines()) in order to get your work done.
>
> Hm.  I think it's just as good to make it the responsibility of the
> caller to pass a multi-iterable.  There could be a standard tool that
> takes a single-iterable and produces a multi-iterable.

At the risk of sounding like a broken record -- doesn't protocol adaptation
stand out as a good way to package up such a "standard tool"?  Why
should we keep inventing a variety of different ways to ask for the same
kind of service -- "Here is an object X, please return it or a wrapper on
it in such a way that it satisfies protocol Y, if possible"...?

In this specific case, Y is "a multi-iterable".  Last time the subject came
up in this list, as I recall, Y was "usable as an index on a sequence".

Having protocol-adaptation machinery would not save the work of designing
protocols and adapters, and there would still be the need to decide case
by case "do we want to standardize this specific adaptation".  However, it
would save the work involved in "given that we DO want to standardize
this adaptation, how do we dress it up" -- how do we present the service
to client-code.  The greatest benefits might be to authors of client code (and
aren't we all, from time to time?-) -- reducing the amount of learning 
involved with each protocol-adaptation to "what is the protocol Y I want".

I don't think it's strictly necessary for Y to be "an interface", and thus 
that protocol adaptation must necessarily wait for interfaces to become a
recognized and formalized Python concept.  I think that accepting any type
as "a protocol" would be fine, and pragmatically equivalent to requiring a
protocol to be "an interface".
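
A minimal sketch of what such machinery could look like when a plain type
serves as the protocol.  Every name here (adapt, register_adapter,
MultiIterable, iterable_to_multi) is hypothetical and chosen for
illustration; this is not an existing API.

```python
_adapters = {}  # hypothetical registry: protocol type -> list of adapters


def register_adapter(protocol, adapter):
    _adapters.setdefault(protocol, []).append(adapter)


def adapt(obj, protocol):
    """Return obj, or a wrapper around it, that satisfies protocol."""
    if isinstance(obj, protocol):
        return obj
    for adapter in _adapters.get(protocol, ()):
        wrapped = adapter(obj)
        if wrapped is not None:
            return wrapped
    raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))


class MultiIterable:
    """Hypothetical protocol type: can be iterated more than once."""

    def __init__(self, iterable):
        self._items = list(iterable)

    def __iter__(self):
        return iter(self._items)


def iterable_to_multi(obj):
    # Decline (return None) for objects that are not iterable at all.
    try:
        return MultiIterable(obj)
    except TypeError:
        return None


register_adapter(MultiIterable, iterable_to_multi)
```

Client code then only needs to know the protocol it wants, for example
adapt(stream, MultiIterable), whatever kind of iterable it was given.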


Alex




From martin@v.loewis.de  Fri Jul 12 08:08:36 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Jul 2002 09:08:36 +0200
Subject: [Python-Dev] long long configuration
In-Reply-To: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com>
References: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com>
Message-ID: <m3adoxh18r.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <david.abrahams@rcn.com> writes:

> #include <limits.h>
> # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
>    && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || defined(ULONGLONG_MAX))
> #  define BOOST_HAS_LONG_LONG
> #endif
[...]
> I'm surprised it wasn't a
> worse problem with MSVC6, because after all, it doesn't even supply a type
> called "long long".

Could that have resulted from defining BOOST_MSVC?

Regards,
Martin



From mal@lemburg.com  Fri Jul 12 09:24:10 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 10:24:10 +0200
Subject: [Python-Dev] Alternative implementation of interning, take 2
References: <LNBBLJKPBEHFEDALKOLCCEPFADAB.tim.one@comcast.net>
Message-ID: <3D2E922A.4040005@lemburg.com>

Tim Peters wrote:
> [Oren Tirosh]
> 
>>...
>>This version, like the previous one, does not support indirect
>>interning of strings.  Is there any evidence that this optimization is
>>still important?  Nothing in the Python distribution itself needs it.
> 
> 
> We've already been thru the last part at length:  indirect interning wasn't
> targeted at the core, so that the core doesn't need it is evidence of no
> more than that Guido's implementation worked as he intended it to in this
> respect.
> 
> It would help if you could get Marc-Andre and /F to pronounce on whether
> their code benefits from it -- they're the most prolific extension authors
> we've got.

Gee, thanks :-)

If you could spell out what exactly you mean by "indirect interning"
that would help.

What I do need and rely on is the fact that the
Python compiler interns all constant strings and identifiers in
Python programs. This makes switching like so:

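if a == 'x':
    pass  # handle 'x'
elif a == 'y':
    pass  # handle 'y'
else:
    pass  # handle anything else

also work like this (only faster):

if a is 'x':
    pass  # handle 'x'
elif a is 'y':
    pass  # handle 'y'
else:
    pass  # handle anything else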

provided that 'a' only uses interned strings.

If that's what you mean by "indirect interning" then I do
need this.
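
A small sketch of the identity property being relied on, in present-day
Python, where the intern() built-in is spelled sys.intern.  The strings are
built at run time so that neither is a compile-time constant; whether two
such strings are the same object before interning is an implementation
detail and is deliberately not asserted here.

```python
import sys

# Two equal strings, each assembled at run time.
left = "".join(["sp", "am"])
right = "".join(["s", "pam"])

equal = left == right                                  # value comparison
interned_same = sys.intern(left) is sys.intern(right)  # identity after interning
```

Equality always holds here; identity is only guaranteed once both sides
have been interned.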

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 11:01:18 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 06:01:18 -0400
Subject: [Python-Dev] long long configuration
References: <LNBBLJKPBEHFEDALKOLCCEPEADAB.tim.one@comcast.net>
Message-ID: <14ea01c2298b$b91a59a0$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>


> [David Abrahams]
> > I recently came across a nasty configuration conflict between boost and
> > python.
> >
> > In LongObject.h we have:
> >
> >     #ifdef HAVE_LONG_LONG
> >
> >     /* Hopefully this is portable... */
> >     #ifndef ULONG_MAX
> >     #define ULONG_MAX 4294967295U
> >     #endif
> >     #ifndef LONGLONG_MAX
> >     #define LONGLONG_MAX 9223372036854775807LL
> >     #endif
> >     #ifndef ULONGLONG_MAX
> >     #define ULONGLONG_MAX 0xffffffffffffffffULL
> >     #endif
> >
> > Well, it turns out that boost detects whether the compiler supports
> > long long by #including <limits.h> and looking for these macros:
> >
> > #include <limits.h>
> > # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
> >    && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || defined(ULONGLONG_MAX))
> > #  define BOOST_HAS_LONG_LONG
> > #endif
> >
> > So it turns out that on some platforms, Python's configuration sets
> > HAVE_LONG_LONG even when limits.h doesn't include definitions of these
> > macros.
>
> Yes.  Python cares about the conceptual type, not about how a platform
> spells it.
>
> > For example, there's MSVC6, where Python substitutes __int64 for
> > long long using its LONG_LONG macro. However, I didn't actually notice
> > the problem
>
> What problem?

Uh, sorry. Depending on the order of #includes, Python's headers can
confuse Boost's configuration.

> > until I tried linking something at LLNL where they're using an older
> > KCC.  Two translation units had different ideas of BOOST_HAS_LONG_LONG,
>
> Why was that?  Nothing you showed us accounts for it, unless there's an
> implied #include of Python.h before the Boost limits.h block you did show
> us.

Because one translation unit said (in effect):

#include <Python.h>          // defines ULONGLONG_MAX
#include <boost/config.hpp>  // decides long long is available

and the other said:

#include <boost/config.hpp> // decides long long is unavailable
#include <Python.h>         // defines ULONGLONG_MAX (harmless this time)

> > so linking failed when one of them was looking for the long long
> > support supposedly provided by another. I'm surprised it wasn't a worse
> > problem with MSVC6, because after all, it doesn't even supply a type
> > called "long long".
> >
> > Is there any chance that something can be done to prevent this sort of
> > conflict?
>
> Rather than try to extract a clear question out of this <wink>,

Too late (I hope!)

> let me turn
> it around:  would your problem go away if this code in LongObject.h went
> away entirely?  Python has no business defining ULONG_MAX anymore (that's
> left over from K&R C days), and I'm sure I got rid of all uses of
> LONGLONG_MAX and ULONGLONG_MAX in 2.2 (I vaguely recall some; they
> weren't really needed, and won't be needed again).

Actually, that was the answer I was hoping you'd come up with. I'd also
suggest prefixing HAVE_LONG_LONG with some kind of PYTHON_ grist to keep it
out of the way of more-naive applications, but I don't want to push my luck
<wink> -- I still remember what happened when I suggested that _Py_...
names should be avoided!

-Dave




From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 11:06:34 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 06:06:34 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <LNBBLJKPBEHFEDALKOLCMEPFADAB.tim.one@comcast.net>
Message-ID: <14fb01c2298c$71a145b0$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> Note that Python uses PySequence_Fast() internally in such cases.  This
> does whatever it takes to turn an iterable object into something that can
> be indexed at random via PySequence_Fast_GET_ITEM(fastseq, int_index).
> Under the covers it leaves lists and tuples alone, and materializes everything
> else into a temp tuple.  I haven't felt a need for something fancier than
> that in practice; the lack of participation in this thread from other
> old-timers suggests they haven't either (piling on more protocols would
> allow to optimize some cases, but it's not clear such cases are important
> enough in Python Life to be worth the bother).

Yep, I know about PySequence_Fast(), and we're currently using that.
However I have a bunch of numerics users who will undoubtedly be working
with some kind of array from NumPy or something -- they'll be really
unimpressed with me when PySequence_Fast() copies their huge multi-pass
sequence without individual Python objects for the elements into a tuple
with each double expressed as a separate Python float.
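
A minimal sketch of the cost being described, assuming the standard-library
array module as a stand-in for a NumPy array (the users' real array types are
not shown in this thread). Converting to a tuple, which is roughly what
PySequence_Fast() does for anything that is not already a list or tuple,
boxes every packed double as its own Python float:

```python
import array
import sys

# Stand-in for a large numeric array: packed C doubles, no per-element objects.
packed = array.array('d', (float(i) for i in range(1000)))

# Roughly what materializing into a temporary tuple amounts to:
# every double becomes a separate Python float object.
boxed = tuple(packed)

# Compare the raw payload with the tuple plus its boxed floats.
packed_bytes = packed.itemsize * len(packed)
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(v) for v in boxed)
print(packed_bytes, boxed_bytes)
```

The exact byte counts depend on the interpreter and platform; the point is
only that the tuple form needs one object per element.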

can-you-say-PySequence_SLOW?-ly y'rs,
dave




From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 11:11:33 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 06:11:33 -0400
Subject: [Python-Dev] long long configuration
References: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com> <m3adoxh18r.fsf@mira.informatik.hu-berlin.de>
Message-ID: <151501c2298d$25186b00$6601a8c0@boostconsulting.com>

From: "Martin v. Loewis" <martin@v.loewis.de>


> "David Abrahams" <david.abrahams@rcn.com> writes:
>
> > #include <limits.h>
> > # if !defined(BOOST_MSVC) && !defined(__BORLANDC__) \
> >    && (defined(ULLONG_MAX) || defined(ULONG_LONG_MAX) || defined(ULONGLONG_MAX))
> > #  define BOOST_HAS_LONG_LONG
> > #endif
> [...]
> > I'm surprised it wasn't a
> > worse problem with MSVC6, because after all, it doesn't even supply a
> > type called "long long".
>
> Could that have resulted from defining BOOST_MSVC?

Sorry, I don't understand the question. Could *what* have resulted from
defining BOOST_MSVC?

-Dave




From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 11:22:05 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 06:22:05 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net>
Message-ID: <152601c2298d$f9a71740$6601a8c0@boostconsulting.com>

Oren,

I like the direction this is going in, but I have some reservations about
any protocol which requires users to avoid using a simple method name like
next() on their own multi-pass sequence types unless they intend their
sequence to be treated as single-pass.

One other possibility: if x.__iter__() is x, it's a single-pass sequence. I
realize this involves actually invoking the __iter__ method and conjuring
up a new iterator, but that's generally a lightweight operation...
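
The probe can be sketched in a few lines. The helper name is_single_pass is
hypothetical, and iter(x) is used in place of calling x.__iter__() directly;
this only illustrates the test, it is not a proposed API:

```python
def is_single_pass(x):
    """Return True if asking x for an iterator just hands back x itself."""
    return iter(x) is x


numbers = [1, 2, 3]                  # a container: each iter() call is fresh
squares = (n * n for n in numbers)   # a generator: it is its own iterator

print(is_single_pass(numbers))        # a list is multi-pass
print(is_single_pass(squares))        # a generator is single-pass
print(is_single_pass(iter(numbers)))  # so is a list iterator
```

Note that the probe creates (and discards) an iterator for multi-pass
objects, which is the "active" cost mentioned above.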

-Dave

From: "Oren Tirosh" <oren-py-d@hishome.net>

> There is no need for a new type of iterator. It's ok that iterators are
> disposable.  If I need multiple iterations I don't want to copy the
> iterator - I prefer to ask the original iterable object for a new
> iterator.  All I need is some way to know whether the iterable object
> (container) can produce multiple iterators that generate the same
> sequence.
>
>   An object is re-iterable if its iterators do not modify its state.
>
> The iterator of an iterator is itself.  Calling the next method, by
> definition, modifies the internal state of an object. Therefore anything
> that has a next method is not re-iterable.
>
> "hasattr(obj,'__iter__') and hasattr(obj, 'next')" is a good signature of
> a non re-iterable object.  Unfortunately, the opposite is not true.  One
> iterable object in Python produces iterators that modify its state when
> their .next() method is called - the file object.
>
> I have just submitted a patch that makes a file into an iterator (i.e.
> adds a .next method to files).  With this change all Python objects that
> have an __iter__ method and no next method produce iterators that do not
> modify the container.  Another possibility would be to make file
> iterators that use seek or re-open the file to avoid modifying the file
> position of the parent file object.  I don't think that would be a good
> idea because files can be devices, pipes or sockets which are not
> seekable.
>
> I think it may be a good idea to add a note to the documentation pages
> about the iterator protocol that the iterators of a container should not
> modify the state of the container. If you think they must it's probably
> a good sign that your 'container' is not really a container and maybe it
> should be an iterator rather than produce iterators of itself.
>
> Oren




From oren-py-d@hishome.net  Fri Jul 12 12:01:13 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 07:01:13 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <152601c2298d$f9a71740$6601a8c0@boostconsulting.com>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com>
Message-ID: <20020712110113.GA13526@hishome.net>

On Fri, Jul 12, 2002 at 06:22:05AM -0400, David Abrahams wrote:
> Oren,
> 
> I like the direction this is going in, but I have some reservations about
> any protocol which requires users to avoid using a simple method name like
> next() on their own multi-pass sequence types unless they intend their
> sequence to be treated as single-pass.

I'm not too thrilled about it, either, but I don't think it's too bad. If
you implement an object with an __iter__ method you must be aware of the
iteration protocol and the next method.  If you put a next method on an
iterable you are most probably confusing iterators and iterables and not 
just using the name 'next' for some other innocent purpose.

> One other possibility: if x.__iter__() is x, it's a single-pass sequence. I
> realize this involves actually invoking the __iter__ method and conjuring
> up a new iterator, but that's generally a lightweight operation...

I think it is critical that all protocols should be defined by something
passive like presence of attributes and attributes of attributes and not by
active probing. I don't see how a future typing system could be retrofitted 
to Python otherwise (pssst, don't tell anyone, but I'm working on such a
system...)

	Oren



From oren-py-d@hishome.net  Fri Jul 12 12:15:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 07:15:03 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <3D2E922A.4040005@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCCEPFADAB.tim.one@comcast.net> <3D2E922A.4040005@lemburg.com>
Message-ID: <20020712111503.GA16058@hishome.net>

On Fri, Jul 12, 2002 at 10:24:10AM +0200, M.-A. Lemburg wrote:
> >It would help if you could get Marc-Andre and /F to pronounce on whether
> >their code benefits from it -- they're the most prolific extension authors
> >we've got.
> 
> Gee, thanks :-)
> 
> If you could spell out what exactly you mean by "indirect interning"
> that would help.

That's what I call a string whose ob_sinterned is not NULL but doesn't point
to itself, either.  Such strings are relatively rare. In order to create
one you need to call PyString_InternInPlace on a string that has more than
one reference. The pointer used for the interning is replaced with a "true"
interned string (i.e. s->ob_sinterned == s).  The other references still
point to the original string which is now "indirectly interned".

Indirectly interned strings can't be used to speed up comparisons using
'is' instead of '=='.  Using them as dictionary keys does save an strcmp,
though. If I understand this correctly they are used as an optimization for 
extension modules that use PyString_FromString instead of 
PyString_InternFromString for their string constants using a hack in 
PyDict_SetItem that interns the key it gets. 

	Oren



From David Abrahams" <david.abrahams@rcn.com  Fri Jul 12 12:14:21 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 07:14:21 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> <20020712110113.GA13526@hishome.net>
Message-ID: <157d01c22995$53ae8f50$6601a8c0@boostconsulting.com>

From: "Oren Tirosh" <oren-py-d@hishome.net>

> On Fri, Jul 12, 2002 at 06:22:05AM -0400, David Abrahams wrote:
> > Oren,
> >
> > I like the direction this is going in, but I have some reservations
> > about any protocol which requires users to avoid using a simple method
> > name like next() on their own multi-pass sequence types unless they
> > intend their sequence to be treated as single-pass.
>
> I'm not too thrilled about it, either, but I don't think it's too bad. If
> you implement an object with an __iter__ method you must be aware of the
> iteration protocol and the next method.  If you put a next method on an
> iterable you are most probably confusing iterators and iterables and not
> just using the name 'next' for some other innocent purpose.

People may have already written that innocent code, but I'm not sure the
consequences of misinterpreting such sequences as single-pass are so
terrible. Still, I would prefer if we were looking for "__next__" instead
of next().


> > One other possibility: if x.__iter__() is x, it's a single-pass
> > sequence. I realize this involves actually invoking the __iter__
> > method and conjuring up a new iterator, but that's generally a
> > lightweight operation...
>
> I think it is critical that all protocols should be defined by something
> passive like presence of attributes and attributes of attributes and not
> by active probing.

Isn't that passive/active distinction illusory though? What about
__getattr__ methods?

> I don't see how a future typing system could be retrofitted
> to Python otherwise (pssst, don't tell anyone, but I'm working on such a
> system...)

Nifty! I'd love to get a preview, if possible. Types come into play at the
Python/C++ boundary, and I'm interested in how our systems will interact
(c.f. http://aspn.activestate.com/ASPN/Mail/Message/types-sig/1222793)

-Dave




From oren-py-d@hishome.net  Fri Jul 12 12:50:10 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 14:50:10 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <157d01c22995$53ae8f50$6601a8c0@boostconsulting.com>; from david.abrahams@rcn.com on Fri, Jul 12, 2002 at 07:14:21AM -0400
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com> <20020712110113.GA13526@hishome.net> <157d01c22995$53ae8f50$6601a8c0@boostconsulting.com>
Message-ID: <20020712145010.A29279@hishome.net>

On Fri, Jul 12, 2002 at 07:14:21AM -0400, David Abrahams wrote:
> > I'm not too thrilled about it, either, but I don't think it's too bad. If
> > you implement an object with an __iter__ method you must be aware of the
> > iteration protocol and the next method.  If you put a next method on an
> > iterable you are most probably confusing iterators and iterables and not
> > just using the name 'next' for some other innocent purpose.
> 
> People may have already written that innocent code, but I'm not sure the
> consequences of misinterpreting such sequences as single-pass are so
> terrible. Still, I would prefer if we were looking for "__next__" instead
> of next().

I'm not actually suggesting this as a reliable way to detect re-iterable 
objects, it's more of an observation.  If you want something that can be 
relied upon for optimizations that would probably require a new __magic__ 
attribute. Any suggestions?

> Isn't that passive/active distinction illusory though? What about
> __getattr__ methods?

I can't believe that any static or semi-static typing system will be able to 
handle __getattr__ virtual attributes.  An object simply won't match a type 
predicate if any of the attributes checked by the predicate are virtual.

> > I don't see how a future typing system could be retrofitted
> > to Python otherwise (pssst, don't tell anyone, but I'm working on such a
> > system...)
> 
> Nifty! I'd love to get a preview, if possible. Types come into play at the
> Python/C++ boundary, and I'm interested in how our systems will interact
> (c.f. http://aspn.activestate.com/ASPN/Mail/Message/types-sig/1222793)

I don't know what you're talking about.  :-)

	Oren




From ark@research.att.com  Fri Jul 12 13:27:56 2002
From: ark@research.att.com (Andrew Koenig)
Date: Fri, 12 Jul 2002 08:27:56 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020712054332.GA77883@hishome.net> (message from Oren Tirosh on
 Fri, 12 Jul 2002 01:43:32 -0400)
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net>
Message-ID: <200207121227.g6CCRtI24509@europa.research.att.com>

Oren> There is no need for a new type of iterator. It's ok that
Oren> iterators are disposable.  If I need multiple iterations I don't
Oren> want to copy the iterator - I prefer to ask the original
Oren> iterable object for a new iterator.  All I need is some way to
Oren> know whether the iterable object (container) can produce
Oren> multiple iterators that generate the same sequence.

You are assuming that you still have access to the original iterable
object.  But what if all you have is an iterator?  Then you need to
be able to ask the iterator for a new iterator.

Oren>   An object is re-iterable if its iterators do not modify its state.

Oren> The iterator of an iterator is itself.  Calling the next method,
Oren> by definition, modifies the internal state of an
Oren> object. Therefore anything that has a next method is not
Oren> re-iterable.

That's not the only possible definition of an iterator.

I'm thinking, in part, about how one might translate some of the C++
standard-library algorithms into Python.  If that translation requires
that the user always supply the original container, rather than using
iterators only, then some algorithms become harder to express or less
useful.



From guido@python.org  Fri Jul 12 13:46:26 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 08:46:26 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 01:43:32 EDT."
 <20020712054332.GA77883@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
Message-ID: <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net>

> I have just submitted a patch that makes a file into an iterator
> (i.e. adds a .next method to files).  With this change all Python
> objects that have an __iter__ method and no next method produce
> iterators that do not modify the container.  Another possibility
> would be to make file iterators that use seek or re-open the file to
> avoid modifying the file position of the parent file object.  I
> don't think that would be a good idea because files can be devices,
> pipes or sockets which are not seekable.

Cute trick, but I think it's too fragile.  You don't know about 3rd
party iterables that have the same problem as file.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 13:50:32 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 08:50:32 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 08:43:38 +0200."
 <E17Su9a-0005Fb-00@mail.python.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <13be01c22917$aa306480$6601a8c0@boostconsulting.com> <200207120043.g6C0hvD30978@pcp02138704pcs.reston01.va.comcast.net>
 <E17Su9a-0005Fb-00@mail.python.org>
Message-ID: <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net>

> At the risk of sounding like a broken record -- doesn't protocol
> adaptation stand out as a good way to package up such a "standard
> tool"?  Why should we keep inventing a variety of different ways to
> ask the same kind of service -- "Here is an object X, please return
> it or a wrapper on it in such a way that it satisfies protocol Y, if
> possible"...?

Protocol adaptation sounds like a great reason to be very conservative
in inventing other ways to address such problems.

I don't see protocol adaptation go into Python 2.3.  As Tim channeled
me just after I went on vacation, it's such a tremendous change in how
users will view things that we need to be conservative in introducing
it.

I would encourage experimenting with protocol adaptation though.
Maybe the next steps would be to (a) revise the PEP and (b) produce a
more usable reference implementation as a 3rd party package?

I think Alex is in a great position to become co-author of PEP 246.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 13:52:01 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 08:52:01 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: Your message of "Fri, 12 Jul 2002 10:24:10 +0200."
 <3D2E922A.4040005@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCCEPFADAB.tim.one@comcast.net>
 <3D2E922A.4040005@lemburg.com>
Message-ID: <200207121252.g6CCq1u32115@pcp02138704pcs.reston01.va.comcast.net>

> What I do need and rely on is the fact that the
> Python compiler interns all constant strings and identifiers in
> Python programs. This makes switching like so:
> 
> if a == 'x':
> elif a == 'y':
> else:
> 
> also work like this (only faster):
> 
> if a is 'x':
> elif a is 'y':
> else:
> 
> provided that 'a' only uses interned strings.

Yuck.  This is an implementation detail.  While it's unlikely to go
away in Python 2.0, please don't rely on this in portable Python.
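
A sketch of the portable alternative, assuming a current Python where
explicit interning is spelled sys.intern() (in the Python of this thread it
was the intern() builtin). The function classify and the sample strings are
made up for illustration:

```python
import sys


def classify(a):
    # Portable form: compare by value, never by identity.
    if a == 'x':
        return 'first'
    elif a == 'y':
        return 'second'
    else:
        return 'other'


# Two equal strings built at run time; nothing promises they are one object.
key = ''.join(['al', 'pha'])
other = ''.join(['al', 'pha'])

# Explicit interning is the documented way to get a shared object.
same_after_intern = sys.intern(key) is sys.intern(other)
print(classify('x'), classify('z'), key == other, same_after_intern)
```

Code that needs identity comparisons can intern explicitly; everything else
is safer with ==.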

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 13:57:14 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 08:57:14 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 07:01:13 EDT."
 <20020712110113.GA13526@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <152601c2298d$f9a71740$6601a8c0@boostconsulting.com>
 <20020712110113.GA13526@hishome.net>
Message-ID: <200207121257.g6CCvEQ07265@pcp02138704pcs.reston01.va.comcast.net>

> If you put a next method on an iterable you are most probably
> confusing iterators and iterables and not just using the name 'next'
> for some other innocent purpose.

Quite to the contrary.  You might have a multi-iterable class that was
defined before the iterator protocol existed, and had a "built-in"
iterator that keeps the iteration state in the object itself (a common
design, e.g. BSD db files have this).  This is OK for simple uses.
But with iterators available the class might grow a proper iterator
class that keeps state external from the object.  But for backward
compatibility reasons you cannot remove the next() method on the class
itself.

QED: you have a multi-iterable object that has both an __iter__ method
and a next method.
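
A sketch of such a class. Table and its attributes are hypothetical, and the
method is spelled next to match the iterator protocol of this thread (current
Python spells the iterator method __next__, so here next is just an ordinary
method kept for old callers):

```python
class Table:
    """Hypothetical container that predates the iterator protocol."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._cursor = 0

    def next(self):
        # Legacy API: the iteration state lives in the object itself.
        if self._cursor >= len(self._rows):
            raise StopIteration
        row = self._rows[self._cursor]
        self._cursor += 1
        return row

    def __iter__(self):
        # Newer API: every call returns an independent iterator.
        return iter(self._rows)


t = Table("abc")
first_pass = list(t)
legacy_row = t.next()    # advances only the built-in cursor
second_pass = list(t)    # still sees every row
print(first_pass, legacy_row, second_pass)
```

The object has both __iter__ and next, yet it can be iterated any number of
times, so the attribute signature alone would misclassify it.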

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 14:05:21 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 09:05:21 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 08:27:56 EDT."
 <200207121227.g6CCRtI24509@europa.research.att.com>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
Message-ID: <200207121305.g6CD5Lb07338@pcp02138704pcs.reston01.va.comcast.net>

> I'm thinking, in part, about how one might translate some of the C++
> standard-library algorithms into Python.  If that translation requires
> that the user always supply the original container, rather than using
> iterators only, then some algorithms become harder to express or less
> useful.

Indeed.  There's a whole slew of interesting things you can do with
iterators that means you won't have a container, only an iterator.

For example, you can define "iterator algebra" functions that take
iterators and return iterators.  A simple example is this generator,
which yields alternating elements of a given iterator.

def alternating(it):
    while 1:
        yield it.next()
        it.next()

The nice thing is that you can combine these easily.  For example
alternating(alternating(it)) would yield every 4th element.
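
A sketch of the same generator for a current Python, where the next()
builtin replaces the .next() method and a StopIteration escaping a generator
body is reported as a RuntimeError. To stay clear of that, this variant ends
by falling out of a for loop and skips with a default value:

```python
def alternating(it):
    """Yield every other element of it, starting with the first."""
    it = iter(it)
    for item in it:
        yield item
        next(it, None)  # discard the following element, if there is one


every_second = list(alternating(range(10)))
every_fourth = list(alternating(alternating(range(10))))
print(every_second)
print(every_fourth)
```

Because the function accepts and returns plain iterators, composing it with
itself needs no container at any stage.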

It would be a pity if the results of iterator algebra operations would
not be acceptable to Andrew's proposed algorithm library.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Fri Jul 12 14:04:18 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 16:04:18 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207121227.g6CCRtI24509@europa.research.att.com>; from ark@research.att.com on Fri, Jul 12, 2002 at 08:27:56AM -0400
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com>
Message-ID: <20020712160418.A412@hishome.net>

On Fri, Jul 12, 2002 at 08:27:56AM -0400, Andrew Koenig wrote:
> Oren> There is no need for a new type of iterator. It's ok that
> Oren> iterators are disposable.  If I need multiple iterations I don't
> Oren> want to copy the iterator - I prefer to ask the original
> Oren> iterable object for a new iterator.  All I need is some way to
> Oren> know whether the iterable object (container) can produce
> Oren> multiple iterators that generate the same sequence.
> 
> You are assuming that you still have access to the original iterable
> object.  But what if all you have is an iterator?  Then you need to
> be able to ask the iterator for a new iterator.

Here are two cases I can think of where I don't have access to the iterable
object:

1. There is no iterable object. An iterator object was created directly.
For example, the result of a generator function is an iterator which isn't
the result of some container's __iter__ method.

2. The iterator was received as an argument and the caller sent iter(x) 
instead of x.  In that case I guess it means that the caller doesn't *want* 
to give me access to x.  

> Oren>   An object is re-iterable if its iterators do not modify its state.
> 
> Oren> The iterator of an iterator is itself.  Calling the next method,
> Oren> by definition, modifies the internal state of an
> Oren> object. Therefore anything that has a next method is not
> Oren> re-iterable.
> 
> That's not the only possible definition of an iterator.

It isn't a definition of an iterator.  It isn't even a definition of a 
re-iterable object, it's a sufficient (but not required) condition for 
objects to be re-iterable.

> I'm thinking, in part, about how one might translate some of the C++
> standard-library algorithms into Python.  

Why not translate *what* they do instead of *how* they do it? I'm pretty
sure the Python way would be shorter and simpler anyway.

	Oren




From oren-py-d@hishome.net  Fri Jul 12 14:17:31 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 16:17:31 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 08:46:26AM -0400
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712161731.A977@hishome.net>

On Fri, Jul 12, 2002 at 08:46:26AM -0400, Guido van Rossum wrote:
> > I have just submitted a patch that makes a file into an iterator
> > (i.e. adds a .next method to files).  With this change all Python
> > objects that have an __iter__ method and no next method produce
> > iterators that do not modify the container.  Another possibility
> > would be to make file iterators that use seek or re-open the file to
> > avoid modifying the file position of the parent file object.  I
> > don't think that would be a good idea because files can be devices,
> > pipes or sockets which are not seekable.
> 
> Cute trick, but I think it's too fragile.  You don't know about 3rd
> party iterables that have the same problem as file.

I don't understand what you mean by fragile. I'm not suggesting anything
that actually depends on this behavior so I don't see what could break.

I think it's semantically cleaner for iterable objects to produce iterators 
that do not modify the state of the original iterable object. There's no 
way to force extension writers to adhere to this but Python should at least 
set a good example. Python file objects are not a good example. The xrange
object that was its own iterator was not a good example.
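
The difference can be seen in a current Python, using io.StringIO as a
stand-in for a file object (an assumption made so the sketch needs no real
file) and range as a container-like object with independent iterators:

```python
import io

# File-like object: iterating advances the shared position.
stream = io.StringIO("a\nb\nc\n")
first_line = next(iter(stream))
remaining = list(iter(stream))      # a "new" iterator does not start over

# range: each iterator keeps its own state.
numbers = range(3)
first_number = next(iter(numbers))
all_numbers = list(iter(numbers))   # a new iterator starts from the beginning

print(repr(first_line), remaining)
print(first_number, all_numbers)
```

Taking one item through an iterator changes what the stream yields next,
while the range is unaffected.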

	Oren




From guido@python.org  Fri Jul 12 14:26:37 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 09:26:37 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Thu, 11 Jul 2002 09:43:28 +0200."
 <3D2D3720.9040100@lemburg.com>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net>
 <3D2D3720.9040100@lemburg.com>
Message-ID: <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net>

I have thought some more about the idea of moving the entire stdlib
into a package named "python" and I reject the idea.

Think of the impact the change would have on the tutorial.

Think of the amount of needless changes to perfectly working code it
would entail.

If you want to avoid 3rd party module/package names to be invalidated
by additions to the standard library, you might just as well introduce
a "nonstd" package into which all 3rd party extensions must be placed.
This at least doesn't require people who don't use 3rd party code to
change their programs.

Maybe we should create a standard package hierarchy; Eric Raymond once
started working on such a proposal but I have discouraged him because
I think it would cause too much upheaval.  But for Python 3 I would
consider it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 14:31:36 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 09:31:36 -0400
Subject: [Python-Dev] The C API and wide unicode support
In-Reply-To: Your message of "Wed, 10 Jul 2002 23:21:30 +0200."
 <3D2CA55A.6080803@lemburg.com>
References: <2mr8ibzmy2.fsf@starship.python.net> <3D2C4B4C.6050204@livinglogic.de> <200207101502.g6AF2rP26398@pcp02138704pcs.reston01.va.comcast.net> <3D2C51D5.8060000@livinglogic.de> <2m1yabiotx.fsf@starship.python.net> <3D2C6C90.7090006@lemburg.com> <200207101826.g6AIQSY27317@pcp02138704pcs.reston01.va.comcast.net>
 <3D2CA55A.6080803@lemburg.com>
Message-ID: <200207121331.g6CDVa707544@pcp02138704pcs.reston01.va.comcast.net>

[me]
> > Maybe because other macros are often disallowed in (3rd party)
> > extensions, the reason being that the macros dig in the internal
> > representation which isn't guaranteed to be binary compatible?  It
> > would make sense that the same rules applies to the Unicode macros in
> > 3rd party extensions.
> 
> Which macros would that be ? I modelled the macros in the
> Unicode implementation after those of the string
> implementation. And those macros are certainly used in
> a lot of 3rd party extensions.

I take it back.  We're anal about binary compatibility in part because
of this.  There are (or were? it's changed so much!) a few macros in
the memory allocator API that were not supposed to be used except in
core code; I think I was thinking of those.

> I guess, having the macros in the header files without an
> explicit warning marks them as public interface. That's how
> I have used them in tons of code and I think that I'm not
> alone in using this approach.

If there was a warning in the docs, that would prove you wrong, but
fortunately for you there isn't. :-)

> I think that the fact that Michael is seeing breakage is
> a good thing. Otherwise, he would probably not have noticed
> that RedHat chose to use the wide build as default.

Exactly.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ark@research.att.com  Fri Jul 12 14:32:36 2002
From: ark@research.att.com (Andrew Koenig)
Date: Fri, 12 Jul 2002 09:32:36 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020712160418.A412@hishome.net> (message from Oren Tirosh on
 Fri, 12 Jul 2002 16:04:18 +0300)
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com> <20020712160418.A412@hishome.net>
Message-ID: <200207121332.g6CDWaE25713@europa.research.att.com>

Oren> 1. There is no iterable object. An iterator object was created
Oren> directly.  For example, the result of a generator function is an
Oren> iterator which isn't the result of some container's __iter__
Oren> method.

Yes.

Oren> 2. The iterator was received as an argument and the caller sent
Oren> iter(x) instead of x.  In that case I guess it means that the
Oren> caller doesn't *want* to give me access to x.

3. The caller sent an iterator that refers to an element of the
container other than the initial one.  For example:

	  def findafter(it, x):
	      it = iter(it)
	      while it.next() != x:
	          pass
	      return it

This function locates the first element equal to x in the sequence
denoted by it, and returns an iterator that refers to the element
after the one equal to x.  It raises StopIteration if no such element
exists.

Now, suppose you want to use this function to find all of the elements
in a sequence that are equal to x.  On the second and subsequent calls,
you're going to have to pass an iterator as the first argument, because
passing the container isn't going to give you the right answer.
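
A minimal sketch of that repeated-call pattern, assuming a current Python
in which next(it) replaces it.next().  The helper name count_matches is
hypothetical and only serves to show the iterator being passed back in:

```python
def findafter(it, x):
    # Same logic as the function above, with next(it) instead of it.next().
    it = iter(it)
    while next(it) != x:
        pass
    return it


def count_matches(seq, x):
    # Hypothetical helper: counts the elements equal to x by handing the
    # iterator returned by findafter back to findafter on every round.
    it = iter(seq)
    count = 0
    while True:
        try:
            it = findafter(it, x)
        except StopIteration:
            # findafter ran off the end: no further match exists.
            return count
        count += 1
```

Passing seq itself on every call would restart the search from the first
element each time, so the loop would never get past the first match.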

For another, more detailed example of how sensitive library design
is to the details of iterator behavior, please look at
http://www.research.att.com/~ark/design.pdf
(I hope I have uttered the right incantations to make it available
outside our firewall; if I haven't, please let me know)



From guido@python.org  Fri Jul 12 14:36:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 09:36:22 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 16:17:31 +0300."
 <20020712161731.A977@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net>
 <20020712161731.A977@hishome.net>
Message-ID: <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net>

> > > I have just submitted a patch that makes a file into an iterator
> > > (i.e. adds a .next method to files).  With this change all Python
> > > objects that have an __iter__ method and no next method produce
> > > iterators that do not modify the container.  Another possibility
> > > would be to make file iterators that use seek or re-open the file to
> > > avoid modifying the file position of the parent file object.  I
> > > don't think that would be a good idea because files can be devices,
> > > pipes or sockets which are not seekable.
> > 
> > Cute trick, but I think it's too fragile.  You don't know about 3rd
> > party iterables that have the same problem as file.
> 
> I don't understand what you mean by fragile. I'm not suggesting anything
> that actually depends on this behavior so I don't see what could break.

If nothing depends on it, what's the point?

> I think it's semantically cleaner for iterable objects to produce
> iterators that do not modify the state of the original iterable
> object.

Too bad.  Files are only the first but certainly not the only example,
and saying it's cleaner doesn't make it so.

> There's no way to force extension writers to adhere to this but
> Python should at least set a good example. Python file objects are
> not a good example. The xrange object that was its own iterator was
> not a good example.

That version of the xrange object was broken.

I don't see what's wrong with the file object.  Iterating over a file
changes the file's state, that's just a fact of life.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ark@research.att.com  Fri Jul 12 14:37:39 2002
From: ark@research.att.com (Andrew Koenig)
Date: 12 Jul 2002 09:37:39 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020712160418.A412@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
 <20020712160418.A412@hishome.net>
Message-ID: <yu99lm8hhxss.fsf@europa.research.att.com>

Oh yes -- one more use for being able to copy an iterator: If you
can't copy an iterator, how can you determine the value to which the
iterator refers without losing access to that value?
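
One way to sketch this in a later Python is with itertools.tee, which
was not available when this was written.  The helper name peek is
hypothetical, and the approach assumes the caller stops using the
original iterator afterwards:

```python
import itertools


def peek(it):
    # Hypothetical helper: split the iterator in two, advance only one
    # half, and hand back the untouched half together with the value.
    # The original `it` must not be used again by the caller.
    probe, keep = itertools.tee(it)
    return next(probe), keep


value, it = peek(iter([10, 20, 30]))
remaining = list(it)   # still starts with the peeked value
```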

Oren> Why not translate *what* they do instead of *how* they do it?
Oren> I'm pretty sure the Python way would be shorter and simpler
Oren> anyway.

Maybe yes, maybe no.  It would certainly be different, because the C++
algorithms generally assume that iterators support comparison
operations.  That assumption makes possible algorithms in C++ that are
difficult to express at all using Python iterators as they stand.

On the other hand, the availability of garbage collection in Python,
combined with the dynamic nature of its type system, makes it possible
to express algorithms in Python that cannot be expressed easily in
C++ using C++ iterators as they now stand.

Details about language design can have a profound effect on usage,
which in turn has a profound effect on future design.

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark



From aleax@aleax.it  Fri Jul 12 14:46:10 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 12 Jul 2002 15:46:10 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Su9a-0005Fb-00@mail.python.org> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17T0ku-0008Ap-00@mail.python.org>

On Friday 12 July 2002 02:50 pm, Guido van Rossum wrote:
	...
> I think Alex is in a great position to become co-author of PEP 246.

Aye aye, cap'n.  What's the procedure for "becoming co-author" -- edit
python/nondist/peps/pep-0246.txt and send the cvs diff to Barry, or ... ?


Alex



From aahz@pythoncraft.com  Fri Jul 12 15:07:57 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 12 Jul 2002 10:07:57 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17T0ku-0008Ap-00@mail.python.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Su9a-0005Fb-00@mail.python.org> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net> <E17T0ku-0008Ap-00@mail.python.org>
Message-ID: <20020712140757.GB24795@panix.com>

On Fri, Jul 12, 2002, Alex Martelli wrote:
> On Friday 12 July 2002 02:50 pm, Guido van Rossum wrote:
>>
>> I think Alex is in a great position to become co-author of PEP 246.
> 
> Aye aye, cap'n.  What's the procedure for "becoming co-author" -- edit
> python/nondist/peps/pep-0246.txt and send the cvs diff to Barry, or ... ?

Get the original author's permission first, if possible.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Fri Jul 12 15:15:23 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 10:15:23 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 15:46:10 +0200."
 <E17T0ku-0008Ap-00@mail.python.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Su9a-0005Fb-00@mail.python.org> <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net>
 <E17T0ku-0008Ap-00@mail.python.org>
Message-ID: <200207121415.g6CEFNr07738@pcp02138704pcs.reston01.va.comcast.net>

> > I think Alex is in a great position to become co-author of PEP 246.
> 
> Aye aye, cap'n.  What's the procedure for "becoming co-author" -- edit
> python/nondist/peps/pep-0246.txt and send the cvs diff to Barry, or ... ?

I expect Barry won't accept your changes unless the original author
agrees.  This just happened to the logging PEP (which was completely
transferred to the new author).

(Barry: maybe PEP 1 should discuss transfer of PEP ownership?  I
think that Trent should actually have remained co-author of PEP 282,
even if he intends not to contribute another line.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Fri Jul 12 15:16:26 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 17:16:26 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 09:36:22AM -0400
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> <20020712161731.A977@hishome.net> <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712171626.A2253@hishome.net>

On Fri, Jul 12, 2002 at 09:36:22AM -0400, Guido van Rossum wrote:
> > I don't understand what you mean by fragile. I'm not suggesting anything
> > that actually depends on this behavior so I don't see what could break.
> 
> If nothing depends on it, what's the point?

To satisfy my perverted obsession for semantic hygiene, of course!

> That version of the xrange object was broken.

That's exactly my point.  There will be more broken code like this as long 
as people keep confusing iterators and iterables. Making the language 
semantically cleaner should help prevent things like this in the long run.

I remember it was pretty hard to actually convince anyone that xrange was 
broken. When I pointed out that the xrange 'iterator' modified the state 
of the xrange 'container' people responded that it's ok because this 
happens with file objects, too...

> I don't see what's wrong with the file object.  Iterating over a file
> changes the file's state, that's just a fact of life.

A file object is an iterator pretending to be a container. For historical
reasons it uses 'readline' instead of 'next' and an empty string instead of
StopIteration but it basically does the same job. A file object is not 
really a container that can produce iterators of itself.
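
The distinction can be sketched as follows, assuming a current Python
in which file-like objects have a next method of their own.  io.StringIO
stands in for an open text file here:

```python
import io

f = io.StringIO("a\nb\nc\n")      # stands in for an open text file
file_is_own_iterator = iter(f) is f
first = next(iter(f))             # consumes the first line of f itself
rest = list(iter(f))              # a "new" iterator resumes where f is now

lst = ["a", "b", "c"]
list_is_own_iterator = iter(lst) is lst
first_again = next(iter(lst))     # does not disturb the list
all_items = list(iter(lst))       # a fresh iterator starts from the top
```

The list hands out independent iterators; the file-like object hands
out itself, so every iteration shares and advances the same position.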

	Oren




From guido@python.org  Fri Jul 12 15:30:08 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 10:30:08 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 17:16:26 +0300."
 <20020712171626.A2253@hishome.net>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121246.g6CCkQi32082@pcp02138704pcs.reston01.va.comcast.net> <20020712161731.A977@hishome.net> <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net>
 <20020712171626.A2253@hishome.net>
Message-ID: <200207121430.g6CEU8t07911@pcp02138704pcs.reston01.va.comcast.net>

I think this thread is ready to die.

> > That version of the xrange object was broken.
> 
> That's exactly my point.  There will be more broken code like this
> as long as people keep confusing iterators and iterables. Making the
> language semantically cleaner should help prevent things like this
> in the long run.

I don't think that the language can help this.  There's nothing you
can do to remove the wart from file objects.

> I remember it was pretty hard to actually convince anyone that
> xrange was broken.

Huh?  IIRC I said it was broken right away and pushed Raymond to fix
it.

> When I pointed out that the xrange 'iterator' modified the state of
> the xrange 'container' people responded that it's ok because this
> happens with file objects, too...

A confusion that you don't stamp out by "fixing" files.

> > I don't see what's wrong with the file object.  Iterating over a file
> > changes the file's state, that's just a fact of life.
> 
> A file object is an iterator pretending to be a container.

In what sense does it pretend to be a container?  File objects are
what they are; they have rich semantics for a reason.

> For historical reasons it uses 'readline' instead of 'next' and an
> empty string instead of StopIteration but it basically does the same
> job. A file object is not really a container that can produce
> iterators of itself.

I think this thread is ready to die.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 15:40:45 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 10:40:45 -0400
Subject: [Python-Dev] Status of various Python branches
Message-ID: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>

I believe the PBF has reached consensus that Python 2.2 will be the
tie-wearing release.

IMO this means that backporting fixes to 2.2 will continue to be
valuable; I don't see the PBF coming up with a volunteer to do this
right away.  If you can't backport a fix yourself, at least add
something like "bugfix candidate" to the checkin message.

I think that backporting fixes to 2.1 is *not* worth our time any
more, with the exception of (a) critical security fixes, and (b) fixes
for severe problems that we know affect Python 2.1 users who cannot
upgrade to 2.2.  Example: Zope 2.5 requires Python 2.1.  I'm not aware
of any such fixes now.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 15:47:34 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 10:47:34 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: Your message of "Sun, 23 Jun 2002 15:16:30 -0300."
 <20020623181630.GN25927@laranja.org>
References: <20020623181630.GN25927@laranja.org>
Message-ID: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>

> Guido, can you please, for our enlightenment, tell us what are the
> reasons you feel %(foo)s was a mistake?

Because of the trailing 's'.  It's very easy to leave it out by
mistake, and because the definition of printf formats skips over
spaces (don't ask me why), the first character of the following word
is used as the type indicator.
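
A small illustration of the failure mode, with made-up keys and values.
In current Python the space is accepted as a format flag, so the letter
after it is consumed as the conversion type:

```python
values = {"count": 3, "name": "Guido"}

# What was meant: the trailing 's' closes the format.
intended = "%(count)s days" % values

# Trailing 's' forgotten: the 'd' of "days" becomes the conversion type
# and the space acts as a flag on it.
mangled_int = "%(count) days" % values

# Same mistake before a word starting with 's': the 's' of "said" is
# eaten as the conversion type.
mangled_str = "%(name) said" % values
```

Neither mangled format raises an error, which is what makes the slip
easy to miss.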

(FWIW, I agree with your other observations -- this was why I
support exploring an alternative in PEP 292.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 16:01:22 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 11:01:22 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: Your message of "Mon, 24 Jun 2002 23:01:40 +0300."
 <20020624230140.B3555@hishome.net>
References: <20020624230140.B3555@hishome.net>
Message-ID: <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net>

I'd like to reject PEP 294.

Adding the type names that are already builtins to types.py is
definitely a bad idea (the patch is full of lines like "int = int" --
this can only serve to confuse).

I propose to leave types.py alone.

If we need a place to name types that don't deserve being builtins,
perhaps new.py is a better place?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 12 16:12:07 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 11:12:07 -0400
Subject: [Python-Dev] Status of various Python branches
In-Reply-To: <15662.61124.842141.265751@slothrop.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBCAEAB.tim.one@comcast.net>

> Speaking of maintenance branches, the test suite currently fails on
> the release22-maint branch.  test_descr encounters a fatal Python
> error.  The tail of the output is:
>
> Testing deepcopy of recursive objects...
> Testing uninitialized module objects...
> Testing pickling of classes with __slots__ ...
> Testing __doc__ descriptor...
> Testing for __imul__ problems...
> Testing that copy.*copy() correctly uses __setstate__...
> Testing resurrection of new-style instance...
> Fatal Python error: GC object already in linked list

Did you do an update and a fresh build?  That's exactly how the current
branch test_gc would fail if you're using the released 2.2.1 Python, or
anything after that older than about yesterday.




From jeremy@zope.com  Fri Jul 12 15:59:16 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 12 Jul 2002 10:59:16 -0400
Subject: [Python-Dev] Status of various Python branches
In-Reply-To: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>
References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15662.61124.842141.265751@slothrop.zope.com>

Speaking of maintenance branches, the test suite currently fails on
the release22-maint branch.  test_descr encounters a fatal Python
error.  The tail of the output is:

Testing deepcopy of recursive objects...
Testing uninitialized module objects...
Testing pickling of classes with __slots__ ...
Testing __doc__ descriptor...
Testing for __imul__ problems...
Testing that copy.*copy() correctly uses __setstate__...
Testing resurrection of new-style instance...
Fatal Python error: GC object already in linked list

Jeremy




From jeremy@zope.com  Fri Jul 12 16:20:06 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 12 Jul 2002 11:20:06 -0400
Subject: [Python-Dev] Status of various Python branches
In-Reply-To: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>
References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15662.62374.495006.119684@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  GvR> I believe the PBF has reached consensus that Python 2.2 will be
  GvR> the tie-wearing release.

Did we ever establish what kind of tie?  I was thinking a string tie
would be distinctive!

  GvR> IMO this means that backporting fixes to 2.2 will continue to
  GvR> be valuable; I don't see the PBF coming up with a volunteer to
  GvR> do this right away.  If you can't backport a fix yourself, at
  GvR> least add something like "bugfix candidate" to the checkin
  GvR> message.

It would be helpful if the Snake Farm was more accessible to
developers.  Specifically, I see that they are running regular builds
of Python and, apparently, collecting the output of "make test."  It
is hard, however, to find the actual results of these test runs.  I've
got a bunch of concrete suggestions, but I don't know who to make them
to.

The test results we get from the Zope CVS are quite helpful, and I'd
find similar results for the Python CVS equally helpful.  The results
could show several branches, debug vs. normal build, and different
platforms.  Getting those results every night would notify us of
errors much more reliably than depending on individual developers
checking in changes to run all those various tests.

  GvR> I think that backporting fixes to 2.1 is *not* worth our time
  GvR> any more, with the exception of (a) critical security fixes,
  GvR> and (b) fixes for severe problems that we know affect Python
  GvR> 2.1 users who cannot upgrade to 2.2.  Example: Zope 2.5
  GvR> requires Python 2.1.  I'm not aware of any such fixes now.

I'm going to make one more change on the release21-maint branch,
because my earlier httplib bug fix had a few bugs of its own.

Jeremy




From guido@python.org  Fri Jul 12 16:26:41 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 11:26:41 -0400
Subject: [Python-Dev] Indexing builtin sequences with objects which supply __int__
In-Reply-To: Your message of "Fri, 21 Jun 2002 12:00:27 EDT."
 <3D134D9B.7030601@stsci.edu>
References: <3D123B1E.6050600@stsci.edu> <200206202053.g5KKrCA05552@odiug.zope.com> <3D1254E5.6010007@stsci.edu> <200206210129.g5L1TV509345@pcp02138704pcs.reston01.va.comcast.net> <3D132D2A.7080801@stsci.edu> <200206211359.g5LDxoF25028@pcp02138704pcs.reston01.va.comcast.net>
 <3D134D9B.7030601@stsci.edu>
Message-ID: <200207121526.g6CFQfI08263@pcp02138704pcs.reston01.va.comcast.net>

I've thought about this more, and I think I don't want to make the
requested change (accept objects which implement __int__ as valid
sequence indices).  I also don't want to add a new protocol
(the proposed __index__).

I suggest that you try to find a solution that works without requiring
changes to Python -- that way you have a much better chance that your
code will work with Python 2.2, which will very likely have a lifetime
comparable to that of Python 1.5.2 (in parallel with 2.3, for sure).

I understand your desire to equate 0-D arrays and scalars, but I'm
afraid that's not how the rest of Python works.  I don't think we
should replace Python's semantic framework with APL's.

I'm neutral on what you should do instead; personally, I'd continue to
return Python scalars for 0-D arrays, but you could switch to 0-D
arrays if you think the advantages outweigh the disadvantages.
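
A sketch of the behaviour under discussion, using a hypothetical
ZeroDArray class as a stand-in for a 0-D array.  Defining __int__ alone
does not make an object usable as a sequence index, so an explicit
conversion is one workaround that needs no change to Python:

```python
class ZeroDArray:
    # Hypothetical stand-in for a 0-D array wrapping a single scalar.
    def __init__(self, value):
        self.value = value

    def __int__(self):
        return int(self.value)


data = ["a", "b", "c"]
idx = ZeroDArray(1)

try:
    data[idx]
    accepted = True
except TypeError:
    # Built-in sequences do not call __int__ on their index.
    accepted = False

item = data[int(idx)]   # explicit conversion works
```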

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 16:30:55 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 11:30:55 -0400
Subject: [Python-Dev] Status of various Python branches
In-Reply-To: Your message of "Fri, 12 Jul 2002 11:20:06 EDT."
 <15662.62374.495006.119684@slothrop.zope.com>
References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>
 <15662.62374.495006.119684@slothrop.zope.com>
Message-ID: <200207121530.g6CFUtA08320@pcp02138704pcs.reston01.va.comcast.net>

> It would be helpful if the Snake Farm was more accessible to
> developers.  Specifically, I see that they are running regular builds
> of Python and, apparently, collecting the output of "make test."  It
> is hard, however, to find the actual results of these test runs.  I've
> got a bunch of concrete suggestions, but I don't know who to make them
> to.

Subscribe to http://lists.lysator.liu.se/mailman/listinfo/snake-farm

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Fri Jul 12 16:37:56 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 12 Jul 2002 08:37:56 -0700
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <200207121252.g6CCq1u32115@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 08:52:01AM -0400
References: <LNBBLJKPBEHFEDALKOLCCEPFADAB.tim.one@comcast.net> <3D2E922A.4040005@lemburg.com> <200207121252.g6CCq1u32115@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712083756.A18124@glacier.arctrix.com>

Guido van Rossum wrote:
> Yuck.  This is an implementation detail.  While it's unlikely to go
> away in Python 2.0, please don't rely on this in portable Python.

The time machine in action. :-)

  Neil



From guido@python.org  Fri Jul 12 16:36:52 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 11:36:52 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Sun, 23 Jun 2002 15:22:09 PDT."
 <20020623222209.62675.qmail@web40105.mail.yahoo.com>
References: <20020623222209.62675.qmail@web40105.mail.yahoo.com>
Message-ID: <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net>

I'm a little surprised.  Raymond Hettinger checked in a change that
makes all slices of buffer objects return strings.  His comments on SF
bug 546434 say that only one person replied and that they agreed
returning strings was the better solution.  But that's not how I read
the only response to his query that I see in python-dev, from Scott
Gilbert:

> Since the array module already has a way to create a ByteArray (and a
> ShortArray, and...), buffer objects don't really need to duplicate that
> effort.  Except creating an array from your own "special memory" (mmap,
> DMA, third party API), and backwards compatibility in general.  :-)
> 
> 
> 
> BTW: I chuckled when I saw you post this the first time.  This topic seems
> to draw a lot of silence.
> 
> I know that I would suggest deprecating the PyBufferObject to just being a
> BufferInspector, and taking what little extra functionality was in there
> and stuffing it into arraymodule.c.  Another solution would be to factor
> PyBufferObject into PyBufferInspector and a "bytes" object.  A few months
> ago, I was tempted to submit a PEP saying as much, but I think that would
> have quietly fallen to the floor.  Nobody seems to like this topic too
> much...

I read this as a recommendation to forget about returning strings.  Am
I mistaken?

Also, I wish you'd submitted that PEP.  IMO the reason that nobody
likes this topic is that there is much confusion about why we have
buffer objects in the first place.  Any attempt at clarifying this
(e.g. proposing separate byte arrays and buffer inspectors) would be
welcome.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From oren-py-d@hishome.net  Fri Jul 12 16:44:28 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 18:44:28 +0300
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 11:01:22AM -0400
References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712184428.A4777@hishome.net>

On Fri, Jul 12, 2002 at 11:01:22AM -0400, Guido van Rossum wrote:
> I'd like to reject PEP 294.
> 
> Adding the type names that are already builtins to types.py is
> definitely a bad idea (the patch is full of lines like "int = int" --
> this can only serve to confuse).
> 
> I propose to leave types.py alone.
> 
> If we need a place to name types that don't deserve being builtins,
> perhaps new.py is a better place?

The new. prefix is natural enough for 

	m = new.module('name')

type but it looks pretty awkward in 

	if isinstance(obj, new.generator):

What's the meaning of 'new' in this context?

The idea of using the types module turned out to have more problems than 
appeared at first but new doesn't look much better to me.

Does anyone have other suggestions?

	Oren



From guido@python.org  Fri Jul 12 16:51:27 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 11:51:27 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: Your message of "Fri, 12 Jul 2002 18:44:28 +0300."
 <20020712184428.A4777@hishome.net>
References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net>
 <20020712184428.A4777@hishome.net>
Message-ID: <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net>

> > If we need a place to name types that don't deserve being builtins,
> > perhaps new.py is a better place?
> 
> The new. prefix is natural enough for 
> 
> 	m = new.module('name')
> 
> type but it looks pretty awkward in 
> 
> 	if isinstance(obj, new.generator):
> 
> What's the meaning of 'new' in this context?

Sometimes you ask too many questions. :-)

Let's just say that this is a historically available name.  I don't
expect that isinstance(obj, generator) is a very common question to
ask, so I don't mind if you have to ask it in a somewhat awkward way.

> The idea of using the types module turned out to have more problems than 
> appeared at first but new doesn't look much better to me.

Using new.py looks much better to me because it already works.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 16:44:38 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 11:44:38 -0400
Subject: [Python-Dev] Status of various Python branches
In-Reply-To: Your message of "Fri, 12 Jul 2002 10:59:16 EDT."
 <15662.61124.842141.265751@slothrop.zope.com>
References: <200207121440.g6CEeka07973@pcp02138704pcs.reston01.va.comcast.net>
 <15662.61124.842141.265751@slothrop.zope.com>
Message-ID: <200207121544.g6CFick10578@pcp02138704pcs.reston01.va.comcast.net>

> Speaking of maintenance branches, the test suite currently fails on
> the release22-maint branch.  test_descr encounters a fatal Python
> error.  The tail of the output is:
> 
[...]
> Fatal Python error: GC object already in linked list

I don't see this.  But for me, two tests fail in the "release22-maint"
branch:

    test_httplib test_pyclbr

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy@zope.com  Fri Jul 12 17:06:25 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 12 Jul 2002 12:06:25 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net>
References: <20020624230140.B3555@hishome.net>
 <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net>
 <20020712184428.A4777@hishome.net>
 <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15662.65153.90462.540450@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  >> > If we need a place to name types that don't deserve being
  >> > builtins, perhaps new.py is a better place?
  >>
  >> The new. prefix is natural enough for
  >>
  >> m = new.module('name')
  >>
  >> type but it looks pretty awkward in
  >>
  >> if isinstance(obj, new.generator):
  >>
  >> What's the meaning of 'new' in this context?

  GvR> Sometimes you ask too many questions. :-)

  GvR> Let's just say that this is a historically available name.  I
  GvR> don't expect that isinstance(obj, generator) is a very common
  GvR> question to ask, so I don't mind if you have to ask it in a
  GvR> somewhat awkward way.

I recently wrote some code that needed to look for functions.  I wrote
it this way:

from new import function

# ...

if isinstance(obj, function):
    # ...

It didn't look odd at all.  And I don't care much where I import
function from.  I wouldn't mind if all the type objects defined in new
were available in types.  IOW, the names exported by new could also
be exported by types.

This means types would fall into two categories: types with builtin
names and types available in the types module.  I expect the current
set of types with builtin names is sufficient.
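
For comparison, a sketch of the same checks in a current Python, where
the new module no longer exists and the types module spells these names
FunctionType and GeneratorType.  The functions plain and gen are
hypothetical examples:

```python
import types


def plain():
    return 1


def gen():
    yield 1


is_function = isinstance(plain, types.FunctionType)
gen_func_is_function = isinstance(gen, types.FunctionType)
is_generator = isinstance(gen(), types.GeneratorType)
builtin_is_function = isinstance(len, types.FunctionType)
```

Note that a generator function is itself an ordinary function; only the
object returned by calling it is a generator.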

Jeremy




From barry@zope.com  Fri Jul 12 16:48:57 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 11:48:57 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
Message-ID: <15662.64105.997215.294990@anthem.wooz.org>

>>>>> "AK" == Andrew Koenig <ark@research.att.com> writes:

    AK> You are assuming that you still have access to the original
    AK> iterable object.  But what if all you have is an iterator?
    AK> Then you need to be able to ask the iterator for a new
    AK> iterator.

Would it be useful to add to the iterator "interface" a method which
would retrieve the original iterable object?  I've no idea what that
method should be called, but it seems like it would be trivial to add
since most (all?) iterators have a pointer to their underlying object
anyway, don't they?

-Barry



From tim.one@comcast.net  Fri Jul 12 17:10:30 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 12:10:30 -0400
Subject: [Python-Dev] long long configuration
In-Reply-To: <14ea01c2298b$b91a59a0$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEBKAEAB.tim.one@comcast.net>

>>> In LongObject.h we have:
>>>
>>>     #ifdef HAVE_LONG_LONG
>>>
>>>     /* Hopefully this is portable... */
>>>     #ifndef ULONG_MAX
>>>     #define ULONG_MAX 4294967295U
>>>     #endif
>>>     #ifndef LONGLONG_MAX
>>>     #define LONGLONG_MAX 9223372036854775807LL
>>>     #endif
>>>     #ifndef ULONGLONG_MAX
>>>     #define ULONGLONG_MAX 0xffffffffffffffffULL
>>>     #endif

Note that I already removed all this from current CVS (except for the #ifdef
HAVE_LONG_LONG, which is still needed for code following the quoted block).

That's for 2.3.  Would it be of value to remove it from 2.2.2 too?

>> What problem?

> Uh, sorry. Depending on the order of #includes, Python's headers can
> confuse Boost's configuration.
> ...
> Because one translation unit said (in effect):
>
> #include <Python.h>          // defines ULONGLONG_MAX
> #include <boost/config.hpp>  // decides long long is available
>
> and the other said:
>
> #include <boost/config.hpp> // decides long long is unavailable
> #include <Python.h>         // defines ULONGLONG_MAX (harmless this time)

OK, that's what I figured -- blatant user error, and probably a deliberate
and malicious one too <wink>.

> ...
> I'd also suggest prefixing HAVE_LONG_LONG with some kind of PYTHON_
> grist to keep it out of the way of more-naive applications, but I don't
> want to push my luck <wink> -- I still remember what happened when I
> suggested that _Py_... names should be avoided!

IIRC, we said we wouldn't avoid them, and I agree that if you were to
suggest it, you'd likely get the same kind of response to suggesting we slap
PYTHON_ in front of HAVE_XYZ names.  A problem is that those more-naive
applications are at least equally likely to *rely* on Python.h continuing to
expose the same set of names it currently exposes, advertised or not.
Indeed, I'm afraid there's a real chance I broke someone's extension by
removing the unadvertised LONGLONG_MAX name.  In any case, it's too much
fiddling just to save you the effort of ordering a pair of includes
consistently <0.9 wink>.




From guido@python.org  Fri Jul 12 17:12:58 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 12:12:58 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 11:48:57 EDT."
 <15662.64105.997215.294990@anthem.wooz.org>
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com> <yu99bs9ejn6k.fsf@europa.research.att.com> <20020712054332.GA77883@hishome.net> <200207121227.g6CCRtI24509@europa.research.att.com>
 <15662.64105.997215.294990@anthem.wooz.org>
Message-ID: <200207121612.g6CGCwF12306@pcp02138704pcs.reston01.va.comcast.net>

> Would it be useful to add to the iterator "interface" a method which
> would retrieve the original iterable object?  I've no idea what that
> method should be called, but it seems like it would be trivial to add
> since most (all?) iterators have a pointer to their underlying object
> anyway, don't they?

No.  The (important!) class of generator-iterators does not have an
underlying container object.
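
A minimal sketch of the distinction, written for a current Python 3
rather than the 2.2/2.3 under discussion. The names count_up, items,
list_iter and gen_iter are made up for illustration. It shows that a
list iterator has a container it could be restarted from, while a
generator-iterator has nothing behind it but a suspended frame:

```python
def count_up(n):
    """Generator: values are computed on the fly; no container exists."""
    for i in range(n):
        yield i * i

items = [0, 1, 4]
list_iter = iter(items)        # backed by the list `items`
gen_iter = count_up(3)         # backed only by a suspended function frame

# Both obey the same protocol: an iterator is its own iterator.
assert iter(list_iter) is list_iter
assert iter(gen_iter) is gen_iter

first_pass = list(gen_iter)    # consumes the generator
second_pass = list(gen_iter)   # exhausted: there is nothing to restart from
restarted = list(iter(items))  # a container can always hand out a fresh iterator
```

Only the container offers a second pass; asking the generator-iterator
for "the original iterable" has no meaningful answer.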

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Fri Jul 12 17:12:00 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 12:12:00 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17Su9a-0005Fb-00@mail.python.org>
 <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net>
 <E17T0ku-0008Ap-00@mail.python.org>
Message-ID: <15662.65488.741894.155099@anthem.wooz.org>

>>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:

    >> I think Alex is in a great position to become co-author of PEP
    >> 246.

    AM> Aye aye, cap'n.  What's the procedure for "becoming co-author"
    AM> -- edit python/nondist/peps/pep-0246.txt and send the cvs diff
    AM> to Barry, or ... ?

That would work fine, although I would like to get /some/
acknowledgement from Clark Evans that passing the torch (or sharing
the flame as it were) was okay with him.

-Barry



From jeremy@zope.com  Fri Jul 12 17:11:03 2002
From: jeremy@zope.com (Jeremy Hylton)
Date: Fri, 12 Jul 2002 12:11:03 -0400
Subject: [Python-Dev] Status of various Python branches
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBCAEAB.tim.one@comcast.net>
References: <15662.61124.842141.265751@slothrop.zope.com>
 <LNBBLJKPBEHFEDALKOLCCEBCAEAB.tim.one@comcast.net>
Message-ID: <15662.65431.129271.558320@slothrop.zope.com>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

  >> Speaking of maintenance branches, the test suite currently fails
  >> on the release22-maint branch.  test_descr encounters a fatal
  >> Python error.  The tail of the output is:
  >>
  >> Testing deepcopy of recursive objects...  Testing uninitialized
  >> module objects...  Testing pickling of classes with __slots__ ...
  >> Testing __doc__ descriptor...  Testing for __imul__ problems...
  >> Testing that copy.*copy() correctly uses __setstate__...  Testing
  >> resurrection of new-style instance...  Fatal Python error: GC
  >> object already in linked list

  TP> Did you do an update and a fresh build?  That's exactly how the
  TP> current branch test_gc would fail if you're using the released
  TP> 2.2.1 Python, or anything after that older than about yesterday.

I thought I was, but apparently not.  Another round of update and
build and the problem went away.

Jeremy




From guido@python.org  Fri Jul 12 17:15:02 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 12:15:02 -0400
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: Your message of "Fri, 12 Jul 2002 12:06:25 EDT."
 <15662.65153.90462.540450@slothrop.zope.com>
References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> <20020712184428.A4777@hishome.net> <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net>
 <15662.65153.90462.540450@slothrop.zope.com>
Message-ID: <200207121615.g6CGF2q12335@pcp02138704pcs.reston01.va.comcast.net>

> from new import function
> 
> # ...
> 
> if isinstance(obj, function):
>     # ...
> 
> It didn't look odd at all.  And I don't care much where I import
> function from.  I wouldn't mind if all the type objects defined in new
> were available in types.  IOW, the names exported by new could also
> be exported by types.

No, the docs for types.py promise that it only exports names ending
in 'Type'.  That's not a promise I want to break lightly.

> This means types would fall into two categories: types with builtin
> names and types available in the types module.  I expect the current
> set of types with builtin names is sufficient.

This is already the case, but the names exported by types.py don't
match the __name__ attribute of those types.  Is that a problem?
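
The mismatch can be seen directly. This is a small sketch against a
current Python 3, where the `new` module quoted above no longer exists
and types.FunctionType is the remaining spelling; the function f is
only a placeholder:

```python
import types

def f():
    pass

name_in_module = "FunctionType"               # the name types.py exports
func_type = getattr(types, name_in_module)
type_name = func_type.__name__                # the name the type itself reports
is_func = isinstance(f, func_type)
```

Here type_name is 'function', not 'FunctionType', which is the
discrepancy being asked about.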

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Fri Jul 12 17:16:54 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 12 Jul 2002 18:16:54 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <15662.65488.741894.155099@anthem.wooz.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T0ku-0008Ap-00@mail.python.org> <15662.65488.741894.155099@anthem.wooz.org>
Message-ID: <E17T36k-0003xt-00@mail.python.org>

On Friday 12 July 2002 06:12 pm, Barry A. Warsaw wrote:
> >>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:
>     >> I think Alex is in a great position to become co-author of PEP
>     >> 246.
>
>     AM> Aye aye, cap'n.  What's the procedure for "becoming co-author"
>     AM> -- edit python/nondist/peps/pep-0246.txt and send the cvs diff
>     AM> to Barry, or ... ?
>
> That would work fine, although I would like to get /some/
> acknowledgement from Clark Evans that passing the torch (or sharing
> the flame as it were) was okay with him.

Makes sense (& thanks to the others who suggested the same thing).
I mailed Clark and I'll wait to hear from him.


Alex



From guido@python.org  Fri Jul 12 17:23:37 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 12:23:37 -0400
Subject: [Python-Dev] String substitution: compile-time versus runtime
In-Reply-To: Your message of "Thu, 20 Jun 2002 20:05:17 PDT."
 <3D1297ED.3990C30F@prescod.net>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <200206210141.g5L1fDv09800@pcp02138704pcs.reston01.va.comcast.net>
 <3D1297ED.3990C30F@prescod.net>
Message-ID: <200207121623.g6CGNbf12384@pcp02138704pcs.reston01.va.comcast.net>

[Paul]
> I think that what I hear you saying is that interpolation should ideally
> be done at a compile time for simple uses and at runtime for i18n. The
> compile-time version should have the ability to do full expressions
> (array indexes and self.members at the very least) and will have access
> to nested scopes. The runtime version should only work with
> dictionaries.

Yes.

> I think you also said that they should both use named parameters instead
> of positional parameters. And presumably just for simplicity they would
> use similar syntax although one would be triggered at compile time and
> one at runtime.

Yes.

> If "%" survives, it would be used for positional parameters, instead of
> named parameters.

Yes (in Python 3).  I can also see the viewpoint that the printf
syntax should be abandoned entirely (in Python 3), in favor of a
different (and probably more verbose) way to spell things like "%6.3f"
or "%04x".  Although there may be application areas (like producing
output from numeric programs) where the formatting options are very
convenient.  In that case Python 3 could retain the positional %
syntax but drop the by-name syntax.  I'm undecided on this.

> Is that your current thinking on the matter? 

Yes.  But based on a lot of feedback (e.g. Alex's anecdote) I'm
inclined to let the matter rest rather than rush to add a new language
feature.

> I think we are making progress if we're coming to understand that the
> two different problem domains (simple scripts versus i18n) have
> different needs and that there is probably no one solution that fits
> both.

OTOH, there's François's position:

[François]
> The mantra I repeated all along had two key points:
> 
> 1) internationalisation will only be successful if designed to be
>    unobtrusive, otherwise average maintainers and implementors will
>    resist it.
> 
> 2) programmer duties and translation duties are to be kept separate,
>    so these activities could be done asynchronously from one
>    another.[1]
> 
> I really, really think that with enough and proper care, Python
> could be set so internationalisation of Python scripts is just
> unobtrusive routine.  There should not be one way to write Python
> when one does not internationalise, and another different way to use
> it when one internationalises.  The full power and facilities of
> Python should be available at all times, unrelated to
> internationalisation intents.  Non-English people should not have to
> pay a penalty, or if they do, the penalty should be minimised As
> Much As Possible.

However, he fails to suggest even a glimpse of a solution that
satisfies his requirements, so I'm inclined to write him off as the
crank he usually is. ;-)

> Our BDFL, Guido, should favour internationalisation as a principle
> in the evolution for the language, that is, more than a random
> negligible feature.  I sincerely hope he will do.  For many people,
> internationalisation issues cannot be separated out that simply, or
> otherwise dismissed.  We should rather learn to collaborate at
> properly addressing and solving them at each evolutionary step, so
> Python really remains a language for everybody.

To the contrary, I think most users don't care about writing code that
can be switched easily from one language to the next.  They only care
about being able to write code that prints text in their own language
(and perhaps about being able to use words in their own language as
identifiers).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 17:37:10 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 12:37:10 -0400
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
In-Reply-To: Your message of "Tue, 25 Jun 2002 09:29:34 MDT."
 <3D188C5D.D519DD90@3captus.com>
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
 <3D188C5D.D519DD90@3captus.com>
Message-ID: <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net>

[Skip Montanaro]
> > I just noticed in the development docs that when a timeout on a socket
> > occurs, socket.error is raised.  I rather liked the idea that a different
> > exception was raised for timeouts (I used Tim O'Malley's timeout_socket
> > module).  Making a TimeoutError exception a subclass of socket.error would
> > be fine so you can catch it with existing code, but I could see recovering
> > differently for a timeout as opposed to other possible errors:
> > 
> >     sock.settimeout(5.0)
> >     try:
> >         data = sock.recv(8192)
> >     except socket.TimeoutError:
> >         # maybe requeue the request
> >         ...
> >     except socket.error, codes:
> >         # some more drastic solution is needed
> >         ...
> > 

[Bernard Yue]
> +1 on your suggestion.  Anyway, under Windows, the current
> implementation returns an incorrect socket.error code for timeout.  I am
> working on the test suite as well as a fix for the problem found.  Once the
> code is bug free maybe we can put the TimeoutError in.
> 
> I will leave the approval of the change to Guido, when he comes
> back from his holiday.

The way I restructured the code it is impossible to distinguish a
timeout error from other errors; you simply get the "no data
available" error from the socket operation.  This is the same error
you'd get in non-blocking mode.

Before I recomplicate the code so that it can raise a separate error
when the select fails, I'd like to understand the use case better.
Why would you want to make this distinction?  Requeueing the request
(as in Skip's example) doesn't make sense IMO: you set the timeout for
a reason, and that reason is that you want to give up if it takes too
long.  If you really intend to retry you're better off disabling the
timeout!

If you really want to, you can already distinguish the timeout case,
because you get an EAGAIN error then (maybe something else on Windows
-- Bernard, if you have a fix for that, please send it to me).

So a -0 unless more evidence is brought forward.
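
For comparison, here is a sketch of what Skip's use case looks like in
a current Python 3, where a dedicated socket.timeout exception exists
and is a subclass of the general socket error (OSError). This is not
the code being discussed in this thread; recv_or_none is a hypothetical
helper, and socket.socketpair() is used only to get a connected pair
without a network:

```python
import socket

def recv_or_none(sock, nbytes, seconds):
    """Return received data, or None if nothing arrived within `seconds`."""
    sock.settimeout(seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # The timeout case: the caller may requeue the request or give up.
        return None
    # Any other socket error propagates to the caller unchanged.

a, b = socket.socketpair()
try:
    timed_out = recv_or_none(a, 8192, 0.05)   # nothing has been sent yet
    b.sendall(b"ping")
    got = recv_or_none(a, 8192, 1.0)          # data is now waiting
finally:
    a.close()
    b.close()
```

Because the timeout exception subclasses the general error, existing
"except socket.error" handlers would still catch it, which is the
compatibility point Skip raised.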

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ark@research.att.com  Fri Jul 12 17:40:03 2002
From: ark@research.att.com (Andrew Koenig)
Date: Fri, 12 Jul 2002 12:40:03 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <15662.64105.997215.294990@anthem.wooz.org> (barry@zope.com)
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com> <15662.64105.997215.294990@anthem.wooz.org>
Message-ID: <200207121640.g6CGe3O26775@europa.research.att.com>

Barry> Would it be useful to add to the iterator "interface" a method
Barry> which would retrieve the original iterable object?

What if there isn't one?



From oren-py-d@hishome.net  Fri Jul 12 17:41:51 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 19:41:51 +0300
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <15662.65153.90462.540450@slothrop.zope.com>; from jeremy@alum.mit.edu on Fri, Jul 12, 2002 at 12:06:25PM -0400
References: <20020624230140.B3555@hishome.net> <200207121501.g6CF1ME08105@pcp02138704pcs.reston01.va.comcast.net> <20020712184428.A4777@hishome.net> <200207121551.g6CFpRg10647@pcp02138704pcs.reston01.va.comcast.net> <15662.65153.90462.540450@slothrop.zope.com>
Message-ID: <20020712194151.A6406@hishome.net>

On Fri, Jul 12, 2002 at 12:06:25PM -0400, Jeremy Hylton wrote:
> I recently wrote some code that needed to look for functions.  I wrote
> it this way:
> 
> from new import function
> 
> # ...
> 
> if isinstance(obj, function):
>     # ...
> 
> It didn't look odd at all.  And I don't care much where I import
> function from.  I wouldn't mind if all the type objects defined in new
> were available in types.  IOW, the names exported by new could also
> be exported by types.

That's exactly what PEP 294 proposed. The primary objection was that the 
documentation for the types module says that names exported by future 
versions will all end in "Type".  People that do 'from types import *' based
on this promise will tend to get offended if a variable called 'code' is 
clobbered.  Anyway, my mother also told me that breaking promises is not a 
nice thing to do so I try to keep that in mind when I design programming 
interfaces.

	Oren




From barry@zope.com  Fri Jul 12 17:48:16 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 12:48:16 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
 <15662.64105.997215.294990@anthem.wooz.org>
 <200207121612.g6CGCwF12306@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15663.2128.143056.795328@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> Would it be useful to add to the iterator "interface" a method
    >> which would retrieve the original iterable object?  I've no
    >> idea what that method should be called, but it seems like it
    >> would be trivial to add since most (all?) iterators have a
    >> pointer to their underlying object anyway, don't they?

    GvR> No.  The (important!) class of generator-iterators does not
    GvR> have an underlying container object.

Yup, but in that case I think it would be fine if
it.gimme_the_underlying_iteratable_object() returned None.

It still may be useless. ;)
-Barry



From martin@v.loewis.de  Fri Jul 12 17:49:59 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 12 Jul 2002 18:49:59 +0200
Subject: [Python-Dev] long long configuration
In-Reply-To: <151501c2298d$25186b00$6601a8c0@boostconsulting.com>
References: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com>
 <m3adoxh18r.fsf@mira.informatik.hu-berlin.de>
 <151501c2298d$25186b00$6601a8c0@boostconsulting.com>
Message-ID: <m3ptxszya0.fsf@mira.informatik.hu-berlin.de>

"David Abrahams" <david.abrahams@rcn.com> writes:

> > > I'm surprised it wasn't a
> > > worse problem with MSVC6, because after all, it doesn't even supply a
> > > type called "long long".
> >
> > Could that have resulted from defining BOOST_MSVC?
> 
> Sorry, I don't understand the question. Could *what* have resulted from
> defining BOOST_MSVC?

That it (the Python long long configuration) wasn't a worse problem
with MSVC6, even though it doesn't even supply a type called "long
long".

Regards,
Martin



From mal@lemburg.com  Fri Jul 12 17:58:41 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 18:58:41 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F0AC1.508@lemburg.com>

Guido van Rossum wrote:
> I'm a little surprised.  Raymond Hettinger checked in a change that
> makes all slices of buffer objects return strings.  His comments on SF
> bug 546434 say that only one person replied and that they agreed
> returning strings was the better solution.  But that's not how I read
> the only response to his query that I see in python-dev, from Scott
> Gilbert:

Interesting. I must have skipped that message.

IMHO, all slices of buffer object should return buffer objects,
but since all Python releases return strings, I guess this is too
late to change.

Note that the only case where a buffer object
is returned in Python 2.x (x < 3) is if you write
buffer()[:], i.e. you want a copy of the buffer object.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From david.abrahams@rcn.com  Fri Jul 12 17:43:56 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 12:43:56 -0400
Subject: [Python-Dev] long long configuration
References: <LNBBLJKPBEHFEDALKOLCAEBKAEAB.tim.one@comcast.net>
Message-ID: <178d01c229c3$b27e5320$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> In any case, it's too much
> fiddling just to save you the effort of ordering a pair of includes
> consistently <0.9 wink>.

Just addressing the <0.1 wink> you left out: even if I get the include
order "right", my users are still screwed if they don't do it the same way.

-Dave





From guido@python.org  Fri Jul 12 18:00:10 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 13:00:10 -0400
Subject: [Python-Dev] New Subscriber Introduction
In-Reply-To: Your message of "Tue, 25 Jun 2002 12:06:26 PDT."
 <Pine.SOL.4.44.0206251202000.12420-100000@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.44.0206251202000.12420-100000@death.OCF.Berkeley.EDU>
Message-ID: <200207121700.g6CH0AO12617@pcp02138704pcs.reston01.va.comcast.net>

> Ah, OK.  Well, that is handy, but since this is meant to be a
> drop-in replacement for strptime, I don't think it is warranted
> here.  Perhaps something like that could be put into Python when
> Guido starts putting in new fxns for the forthcoming new datetime
> type?

No, parsing dates is specifically not part of the datetime proposal.
The examples shown of mxDateTime.Parser behavior here reinforce my
desire to stay out of the time parsing business. :-)

> And I do agree that strptime is not needed most of the time.  But it is
> there so might as well fix that non-portable wart.

Exactly.

Brett: I'm reviewing your SF patch 474274, but I'm finding problems.
I've added a comment to the SF page.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Fri Jul 12 17:58:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 12:58:59 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
 <15662.64105.997215.294990@anthem.wooz.org>
 <200207121640.g6CGe3O26775@europa.research.att.com>
Message-ID: <15663.2771.621927.778230@anthem.wooz.org>

>>>>> "AK" == Andrew Koenig <ark@research.att.com> writes:

    Barry> Would it be useful to add to the iterator "interface" a
    Barry> method which would retrieve the original iterable object?

    AK> What if there isn't one?

The method would return None.
-Barry



From ark@research.att.com  Fri Jul 12 18:02:15 2002
From: ark@research.att.com (Andrew Koenig)
Date: Fri, 12 Jul 2002 13:02:15 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <15663.2771.621927.778230@anthem.wooz.org> (barry@zope.com)
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
 <15662.64105.997215.294990@anthem.wooz.org>
 <200207121640.g6CGe3O26775@europa.research.att.com> <15663.2771.621927.778230@anthem.wooz.org>
Message-ID: <200207121702.g6CH2F827735@europa.research.att.com>

>>>>>> "AK" == Andrew Koenig <ark@research.att.com> writes:

Barry> Would it be useful to add to the iterator "interface" a
Barry> method which would retrieve the original iterable object?

AK> What if there isn't one?

Barry> The method would return None.

But then you can't rely on it.  That is, if you want to write code
that depends on the ability to retrieve the original iterable,
you have to give up the ability for that code to work on
generators, for example.

I'm not saying it's not a useful thing to have; I'm just saying
it might not be as useful as it appears at first.





From barry@zope.com  Fri Jul 12 18:06:57 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 13:06:57 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <051f01c226ac$38f4e880$6601a8c0@boostconsulting.com>
 <yu99bs9ejn6k.fsf@europa.research.att.com>
 <20020712054332.GA77883@hishome.net>
 <200207121227.g6CCRtI24509@europa.research.att.com>
 <15662.64105.997215.294990@anthem.wooz.org>
 <200207121640.g6CGe3O26775@europa.research.att.com>
 <15663.2771.621927.778230@anthem.wooz.org>
 <200207121702.g6CH2F827735@europa.research.att.com>
Message-ID: <15663.3249.952554.249795@anthem.wooz.org>

>>>>> "AK" == Andrew Koenig <ark@research.att.com> writes:

    Barry> Would it be useful to add to the iterator "interface" a
    Barry> method which would retrieve the original iterable object?

    AK> What if there isn't one?

    Barry> The method would return None.

    AK> But then you can't rely on it.  That is, if you want to write
    AK> code that depends on the ability to retrieve the original
    AK> iterable, you have to give up the ability for that code to
    AK> work on generators, for example.

    AK> I'm not saying it's not a useful thing to have; I'm just
    AK> saying it might not be as useful as it appears at first.

I'm not sure it's even useful at all, e.g. I've never had a use for
it.  But if you have code that depends on the ability to retrieve the
original iterable, and you have iterators for which there /is no/
original iterable, it doesn't matter how you spell it, you're going to
have to special case around that fact.
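
To make the special-casing concrete, here is a sketch in current
Python 3. No such method exists in the iterator protocol; TrackedIter,
tracked and two_passes are hypothetical names invented for this
illustration, and the "is the object its own iterator?" test is only
one assumed way of deciding that there is no underlying iterable:

```python
class TrackedIter:
    """Hypothetical wrapper: an iterator that remembers its source, if any."""

    def __init__(self, iterator, source=None):
        self._it = iterator
        self.source = source      # None when there is no underlying iterable

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

def tracked(obj):
    it = iter(obj)
    # An object that is its own iterator (e.g. a generator) has no
    # separate container to go back to.
    return TrackedIter(it, None if it is obj else obj)

def two_passes(t):
    """Iterate once, then try a second pass; must special-case None."""
    first = list(t)
    if t.source is None:
        return first, None        # no way to restart
    return first, list(iter(t.source))

from_list = two_passes(tracked([1, 2, 3]))
from_gen = two_passes(tracked(x for x in range(3)))
```

However the "no original iterable" result is spelled, two_passes needs
the extra branch, which is the point made on both sides above.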

-Barry



From guido@python.org  Fri Jul 12 18:09:32 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 13:09:32 -0400
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: Your message of "Sun, 30 Jun 2002 13:39:03 EDT."
 <20020630173903.GA37045@hishome.net>
References: <000d01c21cdb$eb03b720$91d8accf@othello>
 <20020630173903.GA37045@hishome.net>
Message-ID: <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net>

[Raymond Hettinger]
> > Merge the code for xrange() into slice().

[Oren Tirosh]
> There's a patch pending for this: www.python.org/sf/575515

I've rejected this.  It's better to let these two be different, so
that it's clear what the intended use is.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Fri Jul 12 17:52:01 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 12:52:01 -0400
Subject: [Python-Dev] long long configuration
References: <14b201c2294d$fdebb9e0$6601a8c0@boostconsulting.com><m3adoxh18r.fsf@mira.informatik.hu-berlin.de><151501c2298d$25186b00$6601a8c0@boostconsulting.com> <m3ptxszya0.fsf@mira.informatik.hu-berlin.de>
Message-ID: <17a901c229c5$1a54dae0$6601a8c0@boostconsulting.com>

Sorry, I just wasn't looking. Yes, that's probably the right explanation.

Thanks,
Dave


From: "Martin v. Loewis" <martin@v.loewis.de>


> "David Abrahams" <david.abrahams@rcn.com> writes:
>
> > > > I'm surprised it wasn't a
> > > > worse problem with MSVC6, because after all, it doesn't even supply
> > > > a type called "long long".
> > >
> > > Could that have resulted from defining BOOST_MSVC?
> >
> > Sorry, I don't understand the question. Could *what* have resulted from
> > defining BOOST_MSVC?
>
> That it (the Python long long configuration) wasn't a worse problem
> with MSVC6, even though it doesn't even supply a type called "long
> long".
>
> Regards,
> Martin




From barry@zope.com  Fri Jul 12 17:42:14 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 12:42:14 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17Su9a-0005Fb-00@mail.python.org>
 <200207121250.g6CCoWr32099@pcp02138704pcs.reston01.va.comcast.net>
 <E17T0ku-0008Ap-00@mail.python.org>
 <200207121415.g6CEFNr07738@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15663.1766.101214.339822@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> I expect Barry won't accept your changes unless the original
    GvR> author agrees.  This just happened to the logging PEP (which
    GvR> was completely transferred to the new author).

Right.  I sent a previous message about this, but my email's been
flakey lately.

    GvR> (Barry: maybe PEP 1 should discuss transfer of PEP ownership?

Yes, good idea.  I've added a paragraph.

    GvR> I think that Trent should actually have remained co-author of
    GvR> PEP 282, even if he intends not to contribute another line.)

I'll leave that up to the original author for each specific transfer.

In this case, Trent, let me know if you'd like to remain a co-author
of PEP 282 (it's not like this stuff is set in stone. :).

-Barry



From guido@python.org  Fri Jul 12 18:17:57 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 13:17:57 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: Your message of "Wed, 26 Jun 2002 03:36:21 EDT."
 <008101c21ce4$2b504fc0$91d8accf@othello>
References: <008101c21ce4$2b504fc0$91d8accf@othello>
Message-ID: <200207121717.g6CHHvr12817@pcp02138704pcs.reston01.va.comcast.net>

> Second wild idea of the day:
> 
> The dict constructor currently accepts sequences where each element has
> length 2, interpreted as a key-value pair.
> 
> Let's have it also accept sequences with elements of length 1, interpreted
> as a key:None pair.
> 
> The benefit is that it provides a way to rapidly construct sets:
> 
> lowercase = dict('abcdefghijklmnopqrstuvwxyz')
> if char in lowercase: ...
> 
> dict([key1, key2, key3, key1]).keys()  # eliminate duplicate keys

Rejecting (even in the modified form you showed after prompting from
Tim).  I think the dict() constructor is already overloaded to the
brim.  Let's do a set module instead.  There's only one hurdle to take
for a set module, and that's the issue with using mutable sets as
keys.  Let's just pick one solution and implement it (my favorite
being that sets simply cannot be used as keys, since it's the
simplest, and matches dicts and lists).
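
As a sketch of where this leads, here are the two examples from the
proposal written with facilities that exist in current Python 3 (a
built-in set type and dict.fromkeys) rather than with an overloaded
dict() constructor. None of this was available when the message was
written. The last lines show the chosen answer to the key question:
a mutable set is rejected as a dict key, while the immutable frozenset
is accepted.

```python
lowercase = set("abcdefghijklmnopqrstuvwxyz")
has_q = "q" in lowercase                       # fast membership test

# Eliminate duplicate keys; sorted() only to get a stable order.
unique = sorted(set(["key1", "key2", "key3", "key1"]))

# The key:None mapping from the proposal, spelled explicitly.
as_dict = dict.fromkeys("abc")

# Mutable sets cannot be dict keys, matching dicts and lists.
try:
    {set("ab"): 1}
    mutable_set_as_key = True
except TypeError:
    mutable_set_as_key = False

# An immutable frozenset can, and compares by content.
frozen_lookup = {frozenset("ab"): 1}[frozenset("ba")]
```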

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 12 18:22:18 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 13:22:18 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <3D2E922A.4040005@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECBAEAB.tim.one@comcast.net>

This is a multi-part message in MIME format.

--Boundary_(ID_RM57m/IJOIS03WJjE8ryaQ)
Content-type: text/plain; charset=Windows-1252
Content-transfer-encoding: 7BIT

[M.-A. Lemburg]
> If you could spell out what exactly you mean by "indirect interning"
> that would help.

Actually, I don't think it would -- the issue is whether the possibility for
the ob_sinterned member of a PyStringObject not to *be* the string object
itself ever saves time in your extensions, and it's darned hard to guess
that.  If you apply the attached patch to current CVS, though, it will tell
you whenever your code benefits from it.

AFAICT, there are only 3 routines where it *might* save cycles (but note
that checking for the possibility costs cycles whether or not it pays; it's
a net loss when it doesn't pay):

+ PyDict_SetItem:  I believe this is the only real possibility for gain.  If
it ever helps you here, the patch arranges to print

    ii paid on a setitem

to stderr whenever it does pay.  I haven't yet seen that get printed.

+ PyString_InternInPlace:  Whenever it pays here, the patch spits

    ii paid on an InternInPlace

That triggers 6 times in the Python test suite, all from test_descr.  Since
this one is an optimization *of* setting ob_sinterned, it's a
snake-eating-its-tail kind of thing -- it's of no real benefit unless
ob_sinterned pays off somewhere else too.

+ string_hash:  The patch spits

    ii paid on a hash???

The question marks are there because I don't see how it's possible for this
to get printed.

> What I do need and rely on is the fact that the
> Python compiler interns all constant strings and identifiers in
> Python programs. This makes switching like so:

Ya, while that's evil, it's not affected by indirect interning.

--Boundary_(ID_RM57m/IJOIS03WJjE8ryaQ)
Content-type: text/plain; name=ii.txt
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=ii.txt

Index: Objects/dictobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/dictobject.c,v
retrieving revision 2.126
diff -c -c -r2.126 dictobject.c
*** Objects/dictobject.c	13 Jun 2002 20:32:57 -0000	2.126
--- Objects/dictobject.c	12 Jul 2002 17:14:19 -0000
***************
*** 512,517 ****
--- 512,519 ----
  	mp = (dictobject *)op;
  	if (PyString_CheckExact(key)) {
  		if (((PyStringObject *)key)->ob_sinterned != NULL) {
+ 			if (key != ((PyStringObject *)key)->ob_sinterned)
+ 				fprintf(stderr, "ii paid on a setitem\n");
  			key = ((PyStringObject *)key)->ob_sinterned;
  			hash = ((PyStringObject *)key)->ob_shash;
  		}
Index: Objects/stringobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/stringobject.c,v
retrieving revision 2.169
diff -c -c -r2.169 stringobject.c
*** Objects/stringobject.c	11 Jul 2002 06:23:50 -0000	2.169
--- Objects/stringobject.c	12 Jul 2002 17:14:20 -0000
***************
*** 925,933 ****
  
  	if (a->ob_shash != -1)
  		return a->ob_shash;
! 	if (a->ob_sinterned != NULL)
  		return (a->ob_shash =
  			((PyStringObject *)(a->ob_sinterned))->ob_shash);
  	len = a->ob_size;
  	p = (unsigned char *) a->ob_sval;
  	x = *p << 7;
--- 925,940 ----
  
  	if (a->ob_shash != -1)
  		return a->ob_shash;
! 	if (a->ob_sinterned != NULL) {
! 		if ((PyObject *)a != a->ob_sinterned)
! 			/* This shouldn't be possible?  'a' would have
! 			 * had its ob_shash set as part of a->ob_sinterned
! 			 * getting set.
! 			 */
! 			fprintf(stderr, "ii paid on a hash???\n");
  		return (a->ob_shash =
  			((PyStringObject *)(a->ob_sinterned))->ob_shash);
+ 	}
  	len = a->ob_size;
  	p = (unsigned char *) a->ob_sval;
  	x = *p << 7;
***************
*** 3829,3834 ****
--- 3836,3842 ----
  	if ((t = s->ob_sinterned) != NULL) {
  		if (t == (PyObject *)s)
  			return;
+ 		fprintf(stderr, "ii paid on an InternInPlace\n");
  		Py_INCREF(t);
  		*p = t;
  		Py_DECREF(s);

--Boundary_(ID_RM57m/IJOIS03WJjE8ryaQ)--



From guido@python.org  Fri Jul 12 18:24:31 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 13:24:31 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 18:58:41 +0200."
 <3D2F0AC1.508@lemburg.com>
References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F0AC1.508@lemburg.com>
Message-ID: <200207121724.g6CHOVt12881@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> > I'm a little surprised.  Raymond Hettinger checked in a change that
> > makes all slices of buffer objects return strings.  His comments on SF
> > bug 546434 say that only one person replied and that they agreed
> > returning strings was the better solution.  But that's not how I read
> > the only response to his query that I see in python-dev, from Scott
> > Gilbert:
> 
> Interesting. I must have skipped that message.

You blink, and you find that the world has changed.

> IMHO, all slices of buffer object should return buffer objects,
> but since all Python releases return strings, I guess this is too
> late to change.

That was my preference too, but Raymond disagreed and somehow tried to
find support for his position :-).

Since buffer objects (of course :-) support the C-level buffer
protocol, they can still be used in most places where strings are
needed.  But it would be incompatible.  But so is Raymond's solution
(because it changes buffer()[:] to also return a string).

> Note that the only case where a buffer object
> is returned in Python 2.x (x < 3) is if you write
> buffer()[:], i.e. you want a copy of the buffer object.

What does a copy of a buffer object buy you?

It's not too late to revert Raymond's changes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mclay@nist.gov  Fri Jul 12 18:24:21 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 12 Jul 2002 13:24:21 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207121324.21609.mclay@nist.gov>

On Friday 12 July 2002 10:47 am, Guido van Rossum wrote:

> (FWIW, I agree with your other observations -- this was why I
> support exploring an alternative in PEP 292.)

The syntax rules of PEP 292 are likely to cause confusion for newbies who have 
never used sh or perl. They will ask why Python has two syntaxes for doing 
string substitutions. Why not always spell the substitution string with 
${identifier} or %(identifier)? The third rule of PEP 292 in particular looks 
like a patch to fix a kludge when an unanticipated exception was discovered.

   3. ${identifier} is equivalent to $identifier.  It is required for
          when valid identifier characters follow the placeholder but are
          not part of the placeholder, e.g. "${noun}ification".

> On Sunday 23 June 2002 02:16 pm, Lalo Martins wrote:
> > More, I'm completely opposed to "<<name>> is <<age:.0d>> years old"
> > because it's still cryptic and invasive. This should instead read similar 
> > to "<<name>> is <<age>> years old".sub({'name': x.name, 'age':
> > x.age.format(None, 0)})
>
> > Guido, can you please, for our enlightenment, tell us what are the
> > reasons you feel %(foo)s was a mistake?
>
> Because of the trailing 's'.  It's very easy to leave it out by
> mistake, and because the definition of printf formats skips over
> spaces (don't ask me why), the first character of the following word
> is used as the type indicator.

It's easy to leave it out by mistake, but the error is almost always 
immediately obvious. In the interest of keeping the language as simple as 
possible, I hope no changes are made. If a method based .sub() capability is 
to be added, why not reuse the %(identifier) syntax instead of introducing $ 
and ${} syntax? The .sub() string method would use the %(identifier) syntax 
without the 's' to spell the new substitution format. Instead of the 
proposed:

	'$name was born in ${country}'.sub()

the phrase would be spelled:

	'%(name) was born in %(country)'.sub()

This approach would introduce one new string method with a small variation on 
the existing '%' substitution syntax. 
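
A minimal sketch of the spellings under discussion, for comparison.  It does
not use the proposed .sub() method (which is only a proposal in this thread);
it assumes string.Template, the class that later Python versions ship for
$-style substitution.  The dictionary contents are made up for illustration.

```python
from string import Template

person = {"name": "Guido", "country": "the Netherlands", "age": 5}

# PEP 292 style: $identifier and ${identifier} are interchangeable.
dollar = Template("$name was born in ${country}").substitute(person)

# Rule 3: braces are needed when identifier characters follow the placeholder.
braced = Template("${noun}ification").substitute(noun="Ident")

# Existing % syntax with the trailing 's' spelled out.
percent = "%(name)s was born in %(country)s" % person

# Leaving the type character out: the space is read as a flag and the
# 'd' of "days" is consumed as the conversion type.
oops = "%(age) days" % person
```

The last line is the failure mode described above: nothing is raised, and the
first letter of the following word silently disappears into the format.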





From mal@lemburg.com  Fri Jul 12 18:30:45 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 19:30:45 +0200
Subject: [Python-Dev] python package
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net>              <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F1245.1030804@lemburg.com>

Guido van Rossum wrote:
> I have thought some more about the idea of moving the entire stdlib
> into a package named "python" and I reject the idea.
> 
> Think of the impact the change would have on the tutorial.
> 
> Think of the amount of needless changes to perfectly working code it
> would entail.
> 
> If you want to avoid 3rd party module/package names to be invalidated
> by additions to the standard library, you might just as well introduce
> a "nonstd" package into which all 3rd party extensions must be placed.
> This at least doesn't require people who don't use 3rd party code to
> change their programs.

Uhm, the point I was trying to make was to provide a long
running upgrade path from the current situation (everything is
top-level) to the single package structure.

It is fairly easy to move from 'import os' to 'from python import os',
but I understand that people will not want to do this until
Python 3.

I was not suggesting to start breaking code by enforcing this
strategy in some way, I just thought it would be a good idea
to start providing means to work with the single python package
approach now to make the transition less painful in Python 3.

> Maybe we should create a standard package hierarchy; Eric Raymond once
> started working on such a proposal but I have discouraged him because
> I think it would cause too much upheaval.  But for Python 3 I would
> consider it.

That's what I was targetting :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Fri Jul 12 18:36:38 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 13:36:38 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Fri, 12 Jul 2002 19:30:45 +0200."
 <3D2F1245.1030804@lemburg.com>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F1245.1030804@lemburg.com>
Message-ID: <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> > I have thought some more about the idea of moving the entire stdlib
> > into a package named "python" and I reject the idea.
> > 
> > Think of the impact the change would have on the tutorial.
> > 
> > Think of the amount of needless changes to perfectly working code it
> > would entail.
> > 
> > If you want to avoid 3rd party module/package names to be invalidated
> > by additions to the standard library, you might just as well introduce
> > a "nonstd" package into which all 3rd party extensions must be placed.
> > This at least doesn't require people who don't use 3rd party code to
> > change their programs.

[MAL]
> Uhm, the point I was trying to make was to provide a long
> running upgrade path from the current situation (everything is
> top-level) to the single package structure.

And my suggestion of a "nonstd" toplevel package had the same goal. :-)

> It is fairly easy to move from 'import os' to 'from python import os',
> but I understand that people will not want to do this until
> Python 3.
> 
> I was not suggesting to start breaking code by enforcing this
> strategy in some way, I just thought it would be a good idea
> to start providing means to work with the single python package
> approach now to make the transition less painful in Python 3.

Two problems.  First, your proposal has lots of practical warts that I
already pointed out; your suggestion to fix one of them by making all
the old names stubs would require a massive set of changes to the CVS
repository.  Second, I don't think a 'python' toplevel package is the
right solution.

> > Maybe we should create a standard package hierarchy; Eric Raymond once
> > started working on such a proposal but I have discouraged him because
> > I think it would cause too much upheaval.  But for Python 3 I would
> > consider it.
> 
> That's what I was targetting :-)

Then please think about a proper solution rather than proposing
something whose only virtue seems to be that you can implement a poor
approximation of it in two lines.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 12 18:36:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 13:36:22 -0400
Subject: [Python-Dev] long long configuration
In-Reply-To: <178d01c229c3$b27e5320$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECEAEAB.tim.one@comcast.net>

[David Abrahams]
> Just addressing the <0.1 wink> you left out: even if I get the include
> order "right", my users are still screwed if they don't do it the
> same way.

Give them a pythonboost.h instead that contains the includes in the right
order, or make your boost.h smarter, or ...




From barry@zope.com  Fri Jul 12 18:33:35 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 13:33:35 -0400
Subject: [Python-Dev] Dict constructor
References: <008101c21ce4$2b504fc0$91d8accf@othello>
 <200207121717.g6CHHvr12817@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15663.4847.187726.359608@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> Let's do a set module instead.

+1
    
    GvR> There's only one hurdle to take for a set module, and that's
    GvR> the issue with using mutable sets as keys.  Let's just pick
    GvR> one solution and implement it (my favorite being that sets
    GvR> simply cannot be used as keys, since it's the simplest, and
    GvR> matches dicts and lists).

+1
-Barry
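
A small sketch of the "sets cannot be used as keys" choice.  The set module
being proposed here does not exist yet, so this assumes the built-in set and
frozenset types of later Python versions; the variable names are illustrative
only.

```python
mutable = {1, 2, 3}

# A mutable set is unhashable, just like a dict or a list.
try:
    {mutable: "value"}
    set_as_key_ok = True
except TypeError:
    set_as_key_ok = False

# An immutable snapshot can be hashed and used as a key.
frozen = frozenset(mutable)
table = {frozen: "value"}
lookup = table[frozenset([3, 2, 1])]
```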



From mal@lemburg.com  Fri Jul 12 18:47:09 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 19:47:09 +0200
Subject: [Python-Dev] python package
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net>              <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F161D.40005@lemburg.com>

Guido van Rossum wrote:
>>Guido van Rossum wrote:
>>
>>>I have thought some more about the idea of moving the entire stdlib
>>>into a package named "python" and I reject the idea.
>>>
>>>Think of the impact the change would have on the tutorial.
>>>
>>>Think of the amount of needless changes to perfectly working code it
>>>would entail.
>>>
>>>If you want to avoid 3rd party module/package names to be invalidated
>>>by additions to the standard library, you might just as well introduce
>>>a "nonstd" package into which all 3rd party extensions must be placed.
>>>This at least doesn't require people who don't use 3rd party code to
>>>change their programs.
>>
> 
> [MAL]
> 
>>Uhm, the point I was trying to make was to provide a long
>>running upgrade path from the current situation (everything is
>>top-level) to the single package structure.
> 
> 
> And my suggestion of a "nonstd" toplevel package had the same goal. :-)

With the exception that we have control over the Python core
code while we don't over third party extensions, so providing
means to simplify the transition for the standard lib is easier
than trying to enforce your proposed 'nonstd' package.

>>It is fairly easy to move from 'import os' to 'from python import os',
>>but I understand that people will not want to do this until
>>Python 3.
>>
>>I was not suggesting to start breaking code by enforcing this
>>strategy in some way, I just thought it would be a good idea
>>to start providing means to work with the single python package
>>approach now to make the transition less painful in Python 3.
> 
> 
> Two problems.  First, your proposal has lots of practical warts that I
> already pointed out; your suggestion to fix one of them by making all
> the old names stubs would require a massive set of changes to the CVS
> repository.  Second, I don't think a 'python' toplevel package is the
> right solution.
> 
> 
>>>Maybe we should create a standard package hierarchy; Eric Raymond once
>>>started working on such a proposal but I have discouraged him because
>>>I think it would cause too much upheaval.  But for Python 3 I would
>>>consider it.
>>
>>That's what I was targetting :-)
> 
> 
> Then please think about a proper solution rather than proposing
> something whose only virtue seems to be that you can implement a poor
> approximation of it in two lines.

Just testing waters here... there's no point in trying to
find a solution to something which is not regarded as problem
anyway.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From david.abrahams@rcn.com  Fri Jul 12 18:37:35 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 13:37:35 -0400
Subject: [Python-Dev] long long configuration
References: <LNBBLJKPBEHFEDALKOLCOECEAEAB.tim.one@comcast.net>
Message-ID: <181901c229ca$ee810280$6601a8c0@boostconsulting.com>

From: "Tim Peters" <tim.one@comcast.net>

> [David Abrahams]
> > Just addressing the <0.1 wink> you left out: even if I get the include
> > order "right", my users are still screwed if they don't do it the
> > same way.
>
> Give them a pythonboost.h instead that contains the includes in the right
> order, or make your boost.h smarter, or ...

You fixed the LONGLONG_MAX stuff already, so I don't think there's anything
to discuss here, is there? None of my code is confused by HAVE_LONG_LONG.

-Dave




From guido@python.org  Fri Jul 12 18:54:20 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 13:54:20 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Fri, 12 Jul 2002 19:47:09 +0200."
 <3D2F161D.40005@lemburg.com>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F161D.40005@lemburg.com>
Message-ID: <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>

> With the exception that we have control over the Python core
> code while we don't over third party extensions, so providing
> means to simplify the transition for the standard lib is easier
> than trying to enforce your proposed 'nonstd' package.

I think you could get a long way with minor changes along the lines of
making site-packages a package itself.

> > Then please think about a proper solution rather than proposing
> > something whose only virtue seems to be that you can implement a poor
> > approximation of it in two lines.
> 
> Just testing waters here... there's no point in trying to
> find a solution to something which is not regarded as problem
> anyway.

You started by claiming that there's a problem: expansion of the
stdlib could conflict with 3rd party module/package names.

I don't regard it as a problem that's so bad that we need to make big
changes to solve it.

If you still think a solution is desired, you could start by proposing
a new standard package hierarchy.  Then new standard modules could be
placed in that new hierarchy rather than at the top level.

I'm rejecting the proposal of a single top-level package named "python".

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 12 18:57:59 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 13:57:59 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D2F161D.40005@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECHAEAB.tim.one@comcast.net>

[M.-A. Lemburg]
> ...
> Just testing waters here... there's no point in trying to
> find a solution to something which is not regarded as problem
> anyway.

There is something to be solved here.  Anecdote:  I sucked an early version
of Greg's textwrap.py module into my build directory.  After he checked it
in, I changed regrtest.py to use textwrap.  This kept failing with baffling
errors, until I realized I was still picking up an incompatible textwrap.py
from the build directory.  So I got rid of the latter.  Somewhere in
between, I synched my desktop and laptop machines and so got another copy on
my laptop that way, which I didn't notice.  When I got home and synched the
laptop back to the desktop, it then restored the deleted textwrap.py to the
desktop machine, and I got the same round of impossible errors all over
again.  I deleted it from the home machine again, but the next time I used my
laptop to run the test suite got the impossible errors yet another time --
and had synched the machines again in the meantime so that it once again
showed up on the desktop disk.

So there's one use case <wink>.




From guido@python.org  Fri Jul 12 19:02:45 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:02:45 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Fri, 12 Jul 2002 13:57:59 EDT."
 <LNBBLJKPBEHFEDALKOLCIECHAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIECHAEAB.tim.one@comcast.net>
Message-ID: <200207121802.g6CI2j113182@pcp02138704pcs.reston01.va.comcast.net>

> There is something to be solved here.  Anecdote: I sucked an early
> version of Greg's textwrap.py module into my build directory.  After
> he checked it in, I changed regrtest.py to use textwrap.  This kept
> failing with baffling errors, until I realized I was still picking
> up an incompatible textwrap.py from the build directory.  So I got
> rid of the latter.  Somewhere in between, I synched my desktop and
> laptop machines and so got another copy on my laptop that way, which
> I didn't notice.  When I got home and synched the laptop back to the
> desktop, it then restored the deleted textwrap.py to the desktop
> machine, and I got the same round of impossible errors all over
> again.  I deleted it from home machine again, but the next time I
> used my laptop to run the test suite got the impossible errors yet
> another time -- and had synched the machines again in the meantime
> so that it once again showed up on the desktop disk.

This just shows that having the current directory on sys.path
(especially at the front) causes problems.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 12 19:01:58 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 14:01:58 -0400
Subject: [Python-Dev] long long configuration
In-Reply-To: <181901c229ca$ee810280$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECIAEAB.tim.one@comcast.net>

[David Abrahams]
>>> Just addressing the <0.1 wink> you left out: even if I get the include
>>> order "right", my users are still screwed if they don't do it the
>>> same way.

>> Give them a pythonboost.h instead that contains the includes in
>> the right order, or make your boost.h smarter, or ...

> You fixed the LONGLONG_MAX stuff already, so I don't think
> there's anything to discuss here, is there?

I thought you thought there was, else there was no apparent reason for the
"Just addressing" message I replied to.

BTW, do you need this in 2.2.2 too, or is 2.3 good enough?  I didn't change
anything on the 2.2 branch.




From tim.one@comcast.net  Fri Jul 12 19:11:28 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 14:11:28 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121802.g6CI2j113182@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECJAEAB.tim.one@comcast.net>

[Guido]
> This just shows that having the current directory on sys.path
> (especially at the front) causes problems.

I thought it showed I shouldn't be so careless when synching machines, but
I'll take an excuse to blame Python instead <wink>.

Still, it's something that would not have happened had I needed to prefix
the import of the standard textwrap with a "standard" name -- or of my
private textwrap with a "non-standard" name.

Putting the current directory in sys.path is just too useful to give up.  I
suspect that putting it specifically at the front is only "a feature" for
Python library developers, though, and "a bug" for others -- end users
stumble into this a lot by unhappy accident, like when creating a random.py
to hold their initial experiments with Python's random-number facilities.
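
A sketch of the random.py accident, assuming a current Python with importlib.
The helper names resolve and demo_shadowing are made up for illustration.
PathFinder.find_spec is used so the lookup depends only on the search path
passed in, not on whatever is already imported.

```python
import os
import sys
import tempfile
from importlib.machinery import PathFinder

def resolve(name, search_path):
    """Return the file the path-based finder picks for `name`, or None."""
    spec = PathFinder.find_spec(name, search_path)
    return spec.origin if spec else None

def demo_shadowing():
    with tempfile.TemporaryDirectory() as workdir:
        # A user's experiment file that happens to share a stdlib name.
        with open(os.path.join(workdir, "random.py"), "w") as f:
            f.write("# initial experiments with random numbers\n")
        # Directory searched first: the experiment file wins.
        front = resolve("random", [workdir] + sys.path)
        # Directory searched last: the standard library wins.
        back = resolve("random", sys.path + [workdir])
        return workdir, front, back
```

With the directory at the front, the first result points at the experiment
file; with it at the end, the lookup falls through to the standard library.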




From xscottg@yahoo.com  Fri Jul 12 19:17:06 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Fri, 12 Jul 2002 11:17:06 -0700 (PDT)
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712181706.63704.qmail@web40101.mail.yahoo.com>

--- Guido van Rossum wrote:
>
> I'm a little surprised.  Raymond Hettinger checked in a change that
> makes all slices of buffer objects return strings.  His comments on SF
> bug 546434 say that only one person replied and that they agreed
> returning strings was the better solution.  But that's not how I read
> the only response to his query that I see in python-dev, from Scott
> Gilbert:
> 

After the message you're referring to, Raymond Hettinger and I corresponded
a little bit off of the list.  I think these are probably the most relevant
snippets:

--- Raymond Hettinger:
> 
> For the problem at hand, do you recommend returning buffer objects or
> strings?
> 

--- To which I responded:
>
> I wish I could give you an easy A or B answer.  What I would like to see
> is for the PyBufferObject to be nothing more than a BufferInspector.  As
> such, it would make more sense to have slices return another BufferObject
> that is inspecting the same data.  In other words, "View Behavior".  In 
> this context, repetition of buffer objects doesn't make any sense and 
> should raise an exception.
> 
> However, that's going to break somebody's code somewhere, so I can't see
> Guido allowing that for a problem he doesn't really care about.  I think
> you're stuck returning strings until Python 3000.  So the best bet would
> be to have it just always return a string...
> 

Forgive the bit about "Guido not caring about it", it seemed that way to
me at the time.  Silence comes off as disinterest or annoyance.

So my suggestion was that since taking away the implicit promotion of
buffer slices/repetitions/concatenations to strings was going to break
someone's code, that just can't be done.  If we want sane behavior, then
any slice, be it buf[1:2] or buf[:], ought to at least return the same type
of object.  Those two in conjunction mean they ought to always return
strings.
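
For contrast, a sketch of the "View Behavior" described in the quoted reply.
This is not the buffer object under discussion; it assumes memoryview from
later Python versions, which happens to behave this way.  The data is made up.

```python
data = bytearray(b"hello world")
view = memoryview(data)

# Every slice, partial or full, comes back as the same type.
part = view[0:5]
whole = view[:]
same_type = type(part) is type(whole) is memoryview

# A slice inspects the same memory, so writing through it changes the source.
part[0] = ord("J")
shared = bytes(data)

# Repetition makes no sense for a view and is rejected.
try:
    view * 2
    repetition_allowed = True
except TypeError:
    repetition_allowed = False

# Getting a string (bytes) copy is an explicit step.
as_string = part.tobytes()
```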



--- Raymond Hettinger also wrote:
> 
> Thanks for your input, this topic doesn't seem to interest anyone,
> 

--- To which I responded:
>
> I think there are others that are interested, but it's pretty tough to get
> anything done without breaking backwards compatibility.  Mark Hammond
> indicated he wants a usable buffer object for some asynchronous I/O 
> stuff, and the Numarray stuff addresses the shortcomings of the buffer 
> object by reinventing yet another wheel.
> 
> I've said this before, but I think the problem basically boils down to the
> following - once you realize what the limitations of the buffer object 
> are, you realize that even if you fixed it, it isn't useful for what you 
> wanted to use it for.
>


--- Back to Guido van Rossum:
> 
> I read this as a recommendation to forget about returning strings.  Am
> I mistaken?
> 

Only if breaking backwards compatibility is an option.  I'd like to see
that happen, but I think that would take a pronouncement from someone in
authority.


--- More of Guido van Rossum:
>
> Also, I wish you'd submitted that PEP.  IMO the reason that nobody
> likes this topic is that there is much confusion about why we have
> buffer objects in the first place.  Any attempt at clarifying this
> (e.g. proposing separate byte arrays and buffer inspectors) would be
> welcome.
>

I'm glad to hear this.  I'll submit the PEP sometime in the next week.


Cheers,
    -Scott Gilbert







From oren-py-d@hishome.net  Fri Jul 12 19:21:05 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 21:21:05 +0300
Subject: [Python-Dev] Xrange and Slices
In-Reply-To: <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 01:09:32PM -0400
References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712212105.A8666@hishome.net>

On Fri, Jul 12, 2002 at 01:09:32PM -0400, Guido van Rossum wrote:
> [Raymond Hettinger]
> > > Merge the code for xrange() into slice().
> 
> [Oren Tirosh]
> > There's a patch pending for this: www.python.org/sf/575515
> 
> I've rejected this.  It's better to let these two be different, so
> that it's clear what the intended use is.

When I was going through the sources of sliceobject.c I found the function 
PySlice_GetIndicesEx.  It performs the magic of trimming a slice into the 
range of indices of a sequence, including negative indices and intervals 
with None as start or stop value.  A comment in this function says:

 /* this is harder to get right than you might think */

Wouldn't it be a good idea to expose this nontrivial functionality to 
Python code as a method of slice objects?  The method would take an integer 
argument (length) and return an xrange object.  It should make it much 
easier to implement user types that support extended slicing:

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [get_item_at(i) for i in index.trim(len(self))]
        else:
            return get_item_at(index)

Suggestions for a better name than trim?  Any reason why this API should 
stay exposed only to C and not to Python?

	Oren
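
A runnable sketch of the __getitem__ idea above.  The trim name is only a
proposal here; this assumes the slice.indices(length) method that later
Python versions provide, which returns a (start, stop, step) tuple rather
than an xrange object.  The Squares class and _get_item_at are made up for
illustration.

```python
class Squares:
    """A toy sequence whose item i is i*i, supporting extended slicing."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def _get_item_at(self, i):
        if i < 0:
            i += self._n
        if not 0 <= i < self._n:
            raise IndexError(i)
        return i * i

    def __getitem__(self, index):
        if isinstance(index, slice):
            # indices() clips the slice to the sequence length and resolves
            # negative and omitted (None) bounds.
            start, stop, step = index.indices(len(self))
            return [self._get_item_at(i) for i in range(start, stop, step)]
        return self._get_item_at(index)
```

Out-of-range bounds, negative steps and omitted endpoints are all handled by
the one indices() call, which is the part that is hard to get right by hand.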




From guido@python.org  Fri Jul 12 19:34:34 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:34:34 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 11:17:06 PDT."
 <20020712181706.63704.qmail@web40101.mail.yahoo.com>
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com>
Message-ID: <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>

It seems we're still in the same boat.  It would be saner to change
buffer slices to return buffer objects, except for backward
compatibility.  I was hoping to hear from someone who uses buffer
objects and knows that this would break his code.  Scott apparently
doesn't have this problem with his own code, so his opinion doesn't
help. :-(

Raymond's change still breaks compatibility, though, for slices
without begin and end points.  So we have a contradiction: out of fear
of breaking compatibility, we make a change that breaks
compatibility.

Maybe we should do the same with the buffer object as we did with
xrange(), and plan to remove all functionality that we aren't sure is
useful?  In 2.3, we would have to maintain compatibility but we could
warn about features that will go away; in 2.4, we could remove
unwanted features.

Maybe the name 'buffer' suggests false expectations?  It's not a
buffer, it's an alias for a memory area.

Maybe we should do something stronger, and deprecate the buffer type
altogether.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mclay@nist.gov  Fri Jul 12 19:31:30 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 12 Jul 2002 14:31:30 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207121431.30722.mclay@nist.gov>

On Friday 12 July 2002 01:54 pm, Guido van Rossum wrote:
> If you still think a solution is desired, you could start by proposing
> a new standard package hierarchy.  Then new standard modules could be
> placed in that new hierarchy rather than at the top level.
>
> I'm rejecting the proposal of a single top-level package named "python".

I've read the entire thread and still do not understand why you are suggesting 
the new standard package hierarchy should be named "new". The contents will 
eventually grow old and they will still be in something called "new". 
Why not use a name like "std", "misc", "core", or "sph" for the top of the 
standard package hierarchy? It doesn't matter what the name will be, but I 
hope it will be something that isn't confusing.




From guido@python.org  Fri Jul 12 19:38:31 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:38:31 -0400
Subject: [Python-Dev] Python version of PySlice_GetIndicesEx
In-Reply-To: Your message of "Fri, 12 Jul 2002 21:21:05 +0300."
 <20020712212105.A8666@hishome.net>
References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net>
 <20020712212105.A8666@hishome.net>
Message-ID: <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>

(I changed the subject)

> When I was going through the sources of sliceobject.c I found the function 
> PySlice_GetIndicesEx.  It performs the magic of trimming a slice into the 
> range of indices of a sequence, including negative indices and intervals 
> with None as start or stop value.  A comment in this function says:
> 
>  /* this is harder to get right than you might think */
> 
> Wouldn't it be a good idea to expose this nontrivial functionality to 
> Python code as a method of slice objects?

I dunno.  It seems that most code that actually uses slices is written
in C anyway.

> The method would take an integer argument (length) and return an
> xrange object.

Why an xrange object?  That's not inspectable.  *If* we were to do
this (which I doubt) it should return a tuple of three ints.
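
For illustration, a sketch of how a tuple-of-three-ints API could be
used from Python.  Later Python versions expose this as
slice.indices(length); the Squares class below is a hypothetical
example type, not code from this thread:

```python
class Squares:
    """Hypothetical read-only sequence holding the first n squares."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if isinstance(index, slice):
            # indices() clips the slice to len(self) and returns a
            # (start, stop, step) tuple of plain ints.
            start, stop, step = index.indices(len(self))
            return [i * i for i in range(start, stop, step)]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return index * index


sq = Squares(10)
every_third = sq[::3]
last_three = sq[-3:]
backwards = sq[::-4]
```

Because the result is a plain tuple, the caller can inspect it or feed
it to range(), which an opaque xrange object would not allow.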

> It should make it much 
> easier to implement user types that support extended slicing:
> 
>     def __getitem__(self, index):
>         if isinstance(index, slice):
>             return [get_item_at(i) for i in index.trim(len(self))]
>         else:
>             return get_item_at(index)
> 
> Suggestions for a better name than trim?

getindices()

> Any reason why this API should stay exposed only to C and not to
> Python?

Have you got a real use case?  I'm a bit weary of hypothetical use
cases (that's what got us xrange repetition in the first place).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mclay@nist.gov  Fri Jul 12 19:34:26 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 12 Jul 2002 14:34:26 -0400
Subject: [Python-Dev] String substitution: compile-time versus runtime
In-Reply-To: <200207121623.g6CGNbf12384@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0206201537410.1419-100000@ziggy> <3D1297ED.3990C30F@prescod.net> <200207121623.g6CGNbf12384@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207121434.26976.mclay@nist.gov>

On Friday 12 July 2002 12:23 pm, Guido van Rossum wrote:
> (and perhaps about being able to use words in their own language as
> identifiers).

Beware of possible lookalike characters.  I recently learned that it is 
possible to register a domain name with Unicode characters and since there 
are indistinguishable character symbols on different code pages (for 
instance, the Cyrillic 'o' is indistinguishable from the English 'o') this 
has created an interesting opportunity for domain name exploits. It probably 
isn't dangerous in the Python source code, but limiting the character set of 
identifiers to a small number of characters seems prudent.




From guido@python.org  Fri Jul 12 19:42:26 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:42:26 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Fri, 12 Jul 2002 14:31:30 EDT."
 <200207121431.30722.mclay@nist.gov>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
 <200207121431.30722.mclay@nist.gov>
Message-ID: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>

[me]
> > If you still think a solution is desired, you could start by
> > proposing a new standard package hierarchy.  Then new standard
> > modules could be placed in that new hierarchy rather than at the
> > top level.
> >
> > I'm rejecting the proposal of a single top-level package named "python".

[Michael]
> I've read the entire thread and still do not understand why you are
> suggesting the new standard package hierarchy should be named
> "new". The contents will eventually grow old and they will
> still be in something called "new".  Why not use a name like "std",
> "misc", "core", or "sph" for the top of the standard package
> hierarchy? It doesn't matter what the name will be, but I hope it
> will be something that isn't confusing.

Uh?  Who is proposing to name it "new"?  Not me!  Maybe you should
read the entire thread again? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Fri Jul 12 19:44:31 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 12 Jul 2002 20:44:31 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com>  <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0b3901c229d4$2924d340$e000a8c0@thomasnotebook>

I'm not too interested in this anymore (I _was_ a year ago, IIRC).
I have given up using the buffer object myself, I've written
my own (maybe in the same way as others).

> Maybe the name 'buffer' suggests false expectations?  It's not a
> buffer, it's an alias for a memory area.
> 
Hm. The name could be right (and I could give up my own memory
object) if there were a way to create a buffer owning its
own memory.

> Maybe we should do something stronger, and deprecate the buffer type
> altogether.

Or this.

Thomas




From guido@python.org  Fri Jul 12 19:51:06 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 14:51:06 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 20:44:31 +0200."
 <0b3901c229d4$2924d340$e000a8c0@thomasnotebook>
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>
 <0b3901c229d4$2924d340$e000a8c0@thomasnotebook>
Message-ID: <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net>

> I'm not too interested in this anymore (I _was_ a year ago, IIRC).
> I have given up using the buffer object myself, I've written
> my own (maybe in the same way as others).

Right.

> > Maybe the name 'buffer' suggests false expectations?  It's not a
> > buffer, it's an alias for a memory area.
> > 
> Hm. The name could be right (and I could give up my own memory
> object) if there were a way to create a buffer owning its
> own memory.

Maybe your memory object could become a standard Python extension.
Extra points if it works well with the mmap and the array modules.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Fri Jul 12 19:50:15 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Fri, 12 Jul 2002 14:50:15 -0400
Subject: [Python-Dev] long long configuration
References: <LNBBLJKPBEHFEDALKOLCEECIAEAB.tim.one@comcast.net>
Message-ID: <187601c229d4$fd7a2000$6601a8c0@boostconsulting.com>

I'm trying to work around it for 2.2. I'll let you know if there are
insurmountable problems.

-Dave

----- Original Message -----
From: "Tim Peters" <tim.one@comcast.net>
To: "David Abrahams" <david.abrahams@rcn.com>
Cc: <python-dev@python.org>
Sent: Friday, July 12, 2002 2:01 PM
Subject: RE: [Python-Dev] long long configuration


> [David Abrahams]
> >>> Just addressing the <0.1 wink> you left out: even if I get the
> >>> include order "right", my users are still screwed if they don't
> >>> do it the same way.
>
> >> Give them a pythonboost.h instead that contains the includes in
> >> the right order, or make your boost.h smarter, or ...
>
> > You fixed the LONGLONG_MAX stuff already, so I don't think
> > there's anything to discuss here, is there?
>
> I thought you thought there was, else there was no apparent reason
> for the "Just addressing" message I replied to.
>
> BTW, do you need this in 2.2.2 too, or is 2.3 good enough?  I didn't
> change anything on the 2.2 branch.
>




From thomas.heller@ion-tof.com  Fri Jul 12 20:03:58 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 12 Jul 2002 21:03:58 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>              <0b3901c229d4$2924d340$e000a8c0@thomasnotebook>  <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook>

> > > Maybe the name 'buffer' suggests false expectations?  It's not a
> > > buffer, it's an alias for a memory area.
> > > 
> > Hm. The name could be right (and I could give up my own memory
> > object) if there were a way to create a buffer owning its
> > own memory.
> 
> Maybe your memory object could become a standard Python extension.
> Extra points if it works well with the mmap and the array modules.
> 
What do you mean by 'works well with the mmap and array modules'?

Thomas




From guido@python.org  Fri Jul 12 20:07:06 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 15:07:06 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: Your message of "Fri, 12 Jul 2002 13:24:21 EDT."
 <200207121324.21609.mclay@nist.gov>
References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
 <200207121324.21609.mclay@nist.gov>
Message-ID: <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net>

> The syntax rules of PEP 292 are likely to cause confusion for
> newbies who have never used sh or perl. They will ask why Python
> have two syntaxes for doing string substitutions? Why not always
> spell the substitution string with ${identifier} or %(identifier)?
> The third rule of PEP292 in particular look like a patch to fix a
> kludge when an unanticipated exception was discovered.
> 
>    3. ${identifier} is equivalent to $identifier.  It is required for
>           when valid identifier characters follow the placeholder but are
>           not part of the placeholder, e.g. "${noun}ification".
> 
> > On Sunday 23 June 2002 02:16 pm, Lalo Martins wrote:
> > > More, I'm completely opposed to "<<name>> is <<age:.0d>> years
> > > old" because it's still cryptic and invasive. This should
> > > instead read similar to "<<name>> is <<age>> years
> > > old".sub({'name': x.name, 'age': x.age.format(None, 0)})
> >
> > > Guido, can you please, for our enlightenment, tell us what are the
> > > reasons you feel %(foo)s was a mistake?
> >
> > Because of the trailing 's'.  It's very easy to leave it out by
> > mistake, and because the definition of printf formats skips over
> > spaces (don't ask me why), the first character of the following word
> > is used as the type indicator.
> 
> It's easy to leave it out by mistake, but the error is almost always
> immediately obvious. In the interest of keeping the language as
> simple as possible, I hope no changes are made. If a method based
> .sub() capability is to be added, why not reuse the %(identifier)
> syntax instead of introducing $ and ${} syntax? The .sub() string
> method would use the %(identifier) syntax without the 's' to spell
> the new substitution format. Instead of the proposed:
> 
> 	'$name was born in ${country}'.sub()
> 
> the phrase would be spelled:
> 
> 	'%(name) was born in %(country)'.sub()
> 
> This approach would introduce one new string method with a small
> variation on the existing '%' substitution syntax.

An argument can be made that since this works rather differently from
the current % operator, it's better to avoid confusion by using a
different character.  One can also argue that many Perl and shell
programmers are migrating to Python, for whom this would be helpful --
for others, $ or % makes little difference (DOS batch file programmers
aren't that common, most Windows users never get to this).

But the exact syntax to use in the template is a relatively trivial
detail IMO.  Whether to pick `name`, <<name>>, $name, $(name),
${name}, %name, %{name}, or %(name), is a choice we can make later.
Ditto about whether to allow full expressions, dotted names only, or
simple names only, and whether to allow leaving off the brackets for
simple names (or even for dotted names, as in PEP 215).  User testing
would be good.
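
As a sketch of the $name / ${name} spelling: later Python versions
ship string.Template, which grew out of PEP 292.  It is shown here
only to make the syntax concrete; the mapping keys and values are made
up for the example:

```python
from string import Template

# Illustrative values only.
data = {"name": "Alice", "country": "Wonderland", "noun": "Python"}

plain = Template("$name was born in ${country}").substitute(data)

# Braces mark where the placeholder ends when identifier characters
# follow it directly.
braced = Template("${noun}ification").substitute(data)
```

Note that this does run-time substitution from an explicit mapping; it
does not address the compile-time parsing question raised below.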

User testing has already shown that the current %(name)s notation
causes too many mistakes, because of the odd trailing 's'.  These
errors may be immediately obvious when you run the code, but
constructs that are easily mistyped should still be avoided if
possible.  Also, I believe that the error has actually been puzzling
for many people (e.g. sometimes no error is raised but on close
inspection a few characters appear to be omitted from the output).
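
A small sketch of both failure modes with the % operator.  The
variable names are made up for the example:

```python
values = {"n": 3, "name": "spam"}

# Correct: the trailing 's' closes the format.
ok = "%(n)s dogs" % values

# 's' forgotten: the space is read as a flag and the 'd' of "dogs"
# becomes the conversion type.  No error is raised, but the 'd'
# silently disappears from the output.
silent = "%(n) dogs" % values

# 's' forgotten before a word whose first letter is not a valid
# conversion type: this one does raise ValueError.
try:
    "%(name) was here" % values
    failed = False
except ValueError:
    failed = True
```

Here ok is "3 dogs" while silent is " 3ogs", which is the kind of
puzzling output described above.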

The real issues are IMO:

- Compile-time vs. run-time parsing.  I've become convinced that the
  compiler should do the parsing: this is the only way to make access
  to variables in nested scopes work, avoids security issues, and
  makes it easier to diagnose errors (e.g. in PyChecker).

- How to support translation.  Here the template must be replaced at
  run-time, but it is still desirable that the collection of available
  names is known at compile time (to avoid the security issues).

- Optional formatting specifiers.  I agree with Lalo that these should
  not be part of the interpolation syntax but need to be dealt with at
  a different level.  I think these are only relevant for numeric
  data.  Funny, there's still a (now-deprecated) module fpformat.py
  that supports arbitrary floating point formatting, and
  string.zfill() supports a bit of integer formatting.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 20:08:54 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 15:08:54 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 21:03:58 +0200."
 <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook>
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> <0b3901c229d4$2924d340$e000a8c0@thomasnotebook> <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net>
 <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook>
Message-ID: <200207121908.g6CJ8sx13530@pcp02138704pcs.reston01.va.comcast.net>

> What do you mean by 'works well with the mmap and array modules'?

I'm not sure, since I don't know what your memory object does (and
frankly, I don't really understand what the mmap module does either
:-).

I was just mentioning these because they are other modules that have
been used and/or proposed for buffering needs.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas.heller@ion-tof.com  Fri Jul 12 20:19:53 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 12 Jul 2002 21:19:53 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020712181706.63704.qmail@web40101.mail.yahoo.com> <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net> <0b3901c229d4$2924d340$e000a8c0@thomasnotebook> <200207121851.g6CIp6i13450@pcp02138704pcs.reston01.va.comcast.net>              <0b5b01c229d6$e0df5cb0$e000a8c0@thomasnotebook>  <200207121908.g6CJ8sx13530@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0b8501c229d9$19a34e10$e000a8c0@thomasnotebook>

> > What do you mean by 'works well with the mmap and array modules'?
> 
> I'm not sure, since I don't know what your memory object does (and
> frankly, I don't really understand what the mmap module does either
> :-).
> 
"Memory-mapped file objects behave like both strings and like file objects.
Unlike normal string objects, however, these are mutable."
More in the Python manual...

Optionally they can be backed up by files in the file system,
and optionally they can be shared between processes. At least that's
what they are under Windows.
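
A minimal sketch of the string-like and file-like sides of an mmap
object, using an anonymous mapping (no backing file) and the API as it
exists in later Python versions:

```python
import mmap

# -1 asks for an anonymous mapping; 16 is its size in bytes.
m = mmap.mmap(-1, 16)

m[0:5] = b"hello"        # string-like: in-place slice assignment
m.seek(0)                # file-like: seek, read, write
first = m.read(5)
m.write(b" world")       # continues at the current position
whole = m[0:11]
m.close()
```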

> I was just mentioning these because they are other modules that have
> been used and/or proposed for buffering needs.

Now that you mention this, mmap could be used as a 'memory' object,
although it would have to be converted into a new style class.

My own memory object currently supports a private protocol
which doesn't make sense for core Python. But that can be fixed.

Thomas




From oren-py-d@hishome.net  Fri Jul 12 20:23:26 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 12 Jul 2002 22:23:26 +0300
Subject: [Python-Dev] Re: Python version of PySlice_GetIndicesEx
In-Reply-To: <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 12, 2002 at 02:38:31PM -0400
References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> <20020712212105.A8666@hishome.net> <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712222326.A10011@hishome.net>

On Fri, Jul 12, 2002 at 02:38:31PM -0400, Guido van Rossum wrote:
> Have you got a real use case?  I'm a bit weary of hypothetical use
> cases (that's what got us xrange repetition in the first place).

Umm.. implementing sliceable user types?  I've written some indexable 
objects with a __getitem__ magic method. Making them fully sliceable with 
extended slicing format almost for free would have been really nice. 

Yes, you're right. It's just a nice-to-have. I don't care about it that 
much.  

	Oren



From aahz@pythoncraft.com  Fri Jul 12 20:49:54 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 12 Jul 2002 15:49:54 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net>
References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net> <200207121324.21609.mclay@nist.gov> <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020712194954.GA18925@panix.com>

On Fri, Jul 12, 2002, Guido van Rossum wrote:
>
> - Optional formatting specifiers.  I agree with Lalo that these should
>   not be part of the interpolation syntax but need to be dealt with at
>   a different level.  I think these are only relevant for numeric
>   data.  

I've used "%20s" * 5 frequently enough in the past to do crude tables.
That's not a feature I'd like to lose.
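
A sketch of the idiom, with made-up rows and a narrower column width
so the example stays short:

```python
rows = [("name", "qty", "price"), ("spam", 3, 1.5), ("eggs", 12, 0.25)]

right = "%10s" * 3       # three right-justified columns, 10 wide
left = "%-10s" * 3       # "-" switches to left justification

table = [right % row for row in rows]
table_left = [left % row for row in rows]

for line in table:
    print(line)
```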
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Fri Jul 12 21:00:21 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 16:00:21 -0400
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
In-Reply-To: Your message of "Fri, 12 Jul 2002 13:29:17 MDT."
 <3D2F2E0D.2C4FD92F@3captus.com>
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F2E0D.2C4FD92F@3captus.com>
Message-ID: <200207122000.g6CK0Lw13863@pcp02138704pcs.reston01.va.comcast.net>

> > The way I restructured the code it is impossible to distinguish a
> > timeout error from other errors; you simply get the "no data
> > available" error from the socket operation.  This is the same error
> > you'd get in non-blocking mode.
> > 
> 
> To distinguish a timeout error, the caller can check s->sock_timeout
> when a non-blocking mode error occurred, or just return an error code
> from internal_select() (I guess you must have had your reasons for
> taking it out in the first place)

I don't understand your first suggestion.  Not all errors mean that
the timeout triggered!

I took it out because it is much less code this way.

> > Before I recomplicate the code so that it can raise a separate error
> > when the select fails, I'd like to understand the use case better.
> > Why would you want to make this distinction?  Requeueing the request
> > (as in Skip's example) doesn't make sense IMO: you set the timeout for
> > a reason, and that reason is that you want to give up if it takes too
> > long.  If you really intend to retry you're better off disabling the
> > timeout!
> >
> 
> How about the following (assume we have socket.setDefaultTimeout()):
> 
>     import socket
>     import urllib
> 
>     socket.setDefaultTimeout(5.0)
>     retry = 0
>     url = 'some url'
> 
>     while retry < 3:
>         try:
>             file = urllib.urlretrieve(url)
>         except socket.TimeoutError:
>             if retry == 2:
>                 print "Server too busy, given up!"
>                 raise
>             else:
>                 print "Server busy, retry!"
>                 retry += 1
>         else:
>             break
> 
> MS IIS behaves strangely with HTTP requests.  When the server is very busy,
> it will randomly drop some requests without disconnecting the client. 
> So the best approach for the client is to timeout and retry.  I guess
> that might be the reason why people needed timeoutsocket in the first
> place.

One of the reasons (there are lots of reasons why a connect or receive
attempt may be very slow to time out, or even never time out).

Of course, this still doesn't distinguish between a timeout from
connect() and one from recv().

Have you ever written code like this?

> > If you really want to, you can already distinguish the timeout case,
> > because you get an EAGAIN error then (maybe something else on Windows
> > -- Bernard, if you have a fix for that, please send it to me).
> 
> I am struggling with the test case for the new socket code.  The timeout
> test case I've sent you works with the old socketmodule.c (attached),
> but not with the latest version (on linux or windows).  It's strange,
> your new implementation looks much cleaner.

No need to attach copies of old versions -- just give me the CVS
revision number. :-)

> Please bear with me a bit longer for a patch  :.(

OK.

Anyway, I have no time to play with this right now, so I'm glad you
aren't giving up just yet. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Fri Jul 12 21:13:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 22:13:11 +0200
Subject: [Python-Dev] python package
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net>              <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F3857.3010700@lemburg.com>

Guido van Rossum wrote:
>>With the exception that we have control over the Python core
>>code while we don't over third party extensions, so providing
>>means to simplify the transition for the standard lib is easier
>>than trying to enforce your proposed 'nonstd' package.
> 
> 
> I think you could get a long way with minor changes along the lines of
> making site-packages a package itself.

This wouldn't work since in that case you'd have the problem
of having to fix class names in e.g. pickles for objects
which you don't know anything about. We do know about objects
in the Python standard lib, so we could take care to have mechanisms
like pickle deal with them properly.

>>>Then please think about a proper solution rather than proposing
>>>something whose only virtue seems to be that you can implement a poor
>>>approximation of it in two lines.
>>
>>Just testing waters here... there's no point in trying to
>>find a solution to something which is not regarded as problem
>>anyway.
> 
> 
> You started by claiming that there's a problem: expansion of the
> stdlib could conflict with 3rd party module/package names.
> 
> I don't regard it as a problem that's so bad that we need to make big
> changes to solve it.

I believe that the more Python grows (not only the core,
but the complete set of available modules and packages in
the Python universe), the more likely we are to
hit a problem.

> If you still think a solution is desired, you could start by proposing
> a new standard package hierarchy.  Then new standard modules could be
> placed in that new hierarchy rather than at the top level.
> 
> I'm rejecting the proposal of a single top-level package named "python".

You've written that before, but you still haven't given any
explanation of why a single package would be worse than a
multi-level hierarchy of modules (e.g. grouped by application
space).

I think that simply moving to one package would cause less
breakage and make the whole transition process much easier
than having to tweak code into using some complicated
multi-package structure.

FWIW, I've been through all this with the mx packages
and using a single new package caused the least amount
of work. Even better: it turned out to be easy to provide
backwards compatibility code so that applications still
using the old layout continue to run, but start using the
new structure in their pickles.

No need to get heated, though. I just thought that it would
be a good time to start thinking about this option again.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim.one@comcast.net  Fri Jul 12 21:11:29 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 16:11:29 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the
 same operation as data formatting
In-Reply-To: <20020712194954.GA18925@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDDAEAB.tim.one@comcast.net>

[Aahz]
> I've used "%20s" * 5 frequently enough in the past to do crude tables.
> That's not a feature I'd like to lose.

So has Guido -- he'll remember that before it's too late <wink>.  Ditto "-"
to switch string justification.  Prediction:  the $(name:optional_format)
notation will win in the end.




From tim.one@comcast.net  Fri Jul 12 21:15:42 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 16:15:42 -0400
Subject: [Python-Dev] Re: Python version of PySlice_GetIndicesEx
In-Reply-To: <20020712222326.A10011@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDEAEAB.tim.one@comcast.net>

Just to be helpfully irritating, I'll note that Zope's C implementation of
slice index normalization for BTreeItems objects was off in nearly every way
possible, until a few weeks ago.  It really is difficult to get this right.
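
To show where the difficulty lies, here is a pure-Python sketch of the
normalization.  The function name is hypothetical; the logic follows
the documented behaviour of slice.indices() in later Python versions
and is not a transcription of the C code:

```python
def normalize_slice(start, stop, step, length):
    """Clip (start, stop, step) to a sequence of the given length."""
    if step is None:
        step = 1
    if step == 0:
        raise ValueError("slice step cannot be zero")

    # The legal range differs by direction: a backward slice may stop
    # at -1 (one before the first item) and start at length - 1.
    if step > 0:
        lower, upper = 0, length
    else:
        lower, upper = -1, length - 1

    def clip(value, default):
        if value is None:
            return default
        if value < 0:
            value += length          # negative index counts from the end
            if value < lower:
                value = lower
        elif value > upper:
            value = upper
        return value

    if step > 0:
        return clip(start, lower), clip(stop, upper), step
    return clip(start, upper), clip(stop, lower), step


example = normalize_slice(None, None, -2, 5)
```

The easy mistakes are the asymmetric bounds for negative steps and
clipping before, rather than after, adding the length to a negative
index.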




From gmcm@hypernet.com  Fri Jul 12 21:26:29 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 12 Jul 2002 16:26:29 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEDDAEAB.tim.one@comcast.net>
References: <20020712194954.GA18925@panix.com>
Message-ID: <3D2F0335.25365.5FE8BAB@localhost>

On 12 Jul 2002 at 16:11, Tim Peters wrote:

[Aahz]
> I've used "%20s" * 5 frequently enough in the past to
> do crude tables. That's not a feature I'd like to
> lose.

[Tim] 
> So has Guido -- he'll remember that before it's too
> late <wink>.  Ditto "-" to switch string
> justification. Prediction:  the
> $(name:optional_format) notation will win in the
> end. 

Good. I use both just enough that I'd really miss
them, but not frequently enough to remember exactly
what each modifier does with each data type.

-- Gordon
http://www.mcmillan-inc.com/




From mal@lemburg.com  Fri Jul 12 21:31:41 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 22:31:41 +0200
Subject: [Python-Dev] Alternative implementation of interning, take 2
References: <LNBBLJKPBEHFEDALKOLCKECBAEAB.tim.one@comcast.net>
Message-ID: <3D2F3CAD.801@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>If you could spell out what exactly you mean by "indirect interning"
>>that would help.
> 
> 
> Actually, I don't think it would -- the issue is whether the possibility for
> the ob_sinterned member of a PyStringObject not to *be* the string object
> itself ever saves time in your extensions, and it's darned hard to guess
> that.  If you apply the attached patch to current CVS, though, it will tell
> you whenever your code benefits from it.

Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3
though ;-)

> AFAICT, there are only 3 routines where it *might* save cycles (but note
> that checking for the possibility costs cycles whether or not it pays; it's
> a net loss when it doesn't pay):
> 
> + PyDict_SetItem:  I believe this is the only real possibility for gain.  If
> it ever helps you here, the patch arranges to print
> 
>     ii paid on a setitem

Scanning the source code: I hardly use PyDict_SetItem(); most usages
are PyDict_SetItemString().

> to stderr whenever it does pay.  I haven't yet seen that get printed.
> 
> + PyString_InternInPlace:  Whenever it pays here, the patch spits
> 
>     ii paid on an InternInPlace

I do use this API, but only in mxURL and mxXMLTools (which is
closed source and works with the evil code below I mentioned ;-).

> That triggers 6 times in the Python test suite, all from test_descr.  Since
> this one is an optimization *of* setting ob_sinterned, it's a
> snake-eating-its-tail kind of thing -- it's of no real benefit unless
> ob_sinterned pays off somewhere else too.
> 
> + string_hash:  The patch spits
> 
>     ii paid on a hash???
> 
> The question marks are there because I don't see how it's possible for this
> to get printed.
> 
> 
>>What I do need and rely on is the fact that the
>>Python compiler interns all constant strings and identifiers in
>>Python programs. This makes switching like so:
> 
> 
> Ya, while that's evil, it's not affected by indirect interning.

Cool :-)

If Guido should ever decide to rip this out, I can always switch
to a different technique, e.g. use my own interning token type.
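
For readers unfamiliar with the trick: interned strings with equal
contents are the same object, so they can be compared by identity
instead of by value.  A minimal sketch using sys.intern from later
Python versions (the builtin was spelled intern() at the time); the
string contents are made up:

```python
import sys

# Build equal strings at run time so they are normally separate
# objects before interning.
a = "".join(["token", "_", "name"])
b = "".join(["token", "_", "name"])

ia = sys.intern(a)
ib = sys.intern(b)

same_object = ia is ib       # interned copies are one object
```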

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim.one@comcast.net  Fri Jul 12 21:33:51 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 16:33:51 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDGAEAB.tim.one@comcast.net>

[Guido, to Scott Gilbert]
> It seems we're still in the same boat.  It would be saner to change
> buffer slices to return buffer objects, except for backward
> compatibility.  I was hoping to hear from someone who uses buffer
> objects and knows that this would break his code.

Raymond did a survey on c.l.py, asking anyone who used buffer objects at
*all* to speak up.  IIRC, he got no replies.  On Python-Dev, apart from
musing whether they might conceivably use them, the only person who
eventually said they actually used them was Marc-Andre.  Fredrik pressed for
details, but we haven't seen any concrete use cases.  In the absence of the
latter, it's impossible to guess what would be backward compatible for MAL's
purposes.

> ...
> Maybe we should do something stronger, and deprecate the buffer type
> altogether.

I told everyone you forgot the essay you wrote suggesting this the last time
this rose above everyone's pain threshold.  It's a comfort to know that my
channeling powers have not diminished with exponentially advancing age
<wink>:

     http://mail.python.org/pipermail/python-dev/2000-October/009974.html




From guido@python.org  Fri Jul 12 21:37:41 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 16:37:41 -0400
Subject: [Python-Dev] python package
In-Reply-To: Your message of "Fri, 12 Jul 2002 22:13:11 +0200."
 <3D2F3857.3010700@lemburg.com>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net> <3D2CA81E.6060408@lemburg.com> <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net> <3D2D3720.9040100@lemburg.com> <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net> <3D2F1245.1030804@lemburg.com> <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net> <3D2F161D.40005@lemburg.com> <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F3857.3010700@lemburg.com>
Message-ID: <200207122037.g6CKbf514140@pcp02138704pcs.reston01.va.comcast.net>

> > I think you could get a long way with minor changes along the lines of
> > making site-packages a package itself.
> 
> This wouldn't work since in that case you'd have the problem
> of having to fix class names in e.g. pickles for objects
> which you don't know anything about. We do know about objects
> in the Python standard lib, so we could take care to have mechanisms
> like pickle deal with them properly.

IOW you're suggesting we do a near-infinite amount of work to the core
just so that others can be sloppy in their choice of names for their
modules.  Bah.
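The pickle concern quoted above comes from the fact that a pickle records
the module path and class name of each object rather than the class itself,
so a module that moves leaves old pickles pointing at a name that no longer
exists. A minimal sketch, assuming a current standard-library pickle module
and using Fraction purely as an arbitrary example class:

```python
import pickle
from fractions import Fraction

# Protocol 0 is the text-based format, so the recorded names are easy to spot.
payload = pickle.dumps(Fraction(1, 3), protocol=0)

# The pickle stores *where to find* the class: its module and its name.
module_recorded = b"fractions" in payload
class_recorded = b"Fraction" in payload

# Loading works only while "fractions.Fraction" still resolves to the class.
restored = pickle.loads(payload)
print(module_recorded, class_recorded, restored)
```

If the module were renamed, loading this payload would fail unless some
compatibility mapping translated the old name.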

> I believe that the more Python grows (not only the core,
> but the complete set of available modules and packages in
> the Python universe), the less likely we are going to
> hit a problem.

I would say, OK, so it will go away by itself, but I guess you made a
typo there, and really meant "the more likely...". :-)

But making the core go away doesn't reduce the problem enough: the
more likely problem is two 3rd parties unaware of each other each
picking the same name.

> > I'm rejecting the proposal of a single top-level package named "python".
> 
> You've written that before, but you still haven't given any
> explanation of why a single package would be worse than a
> multi-level hierarchy of modules (e.g. grouped by application
> space).

Because a single package doesn't have any other benefits besides
getting out of the way from 3rd party developers.

At least a proper hierarchy would have the other benefits of grouping.
(But better make it a shallow hierarchy!  remember "Flat is better
than nested.")

> I think that simply moving to one package would cause less
> breakage and make the whole transition process much easier
> than having to tweak code into using some complicated
> multi-package structure.

Given that you now want us to add special counter-measures to pickle, I
doubt that very much.

> FWIW, I've been through all this with the mx packages
> and using a single new package caused the least amount
> of work. Even better: it turned out to be easy to provide
> backwards compatibility code so that applications still
> using the old layout continue to run, but start using the
> new structure in their pickles.

So it's no big deal for 3rd party developers to do what they should do
to deal with this problem.  Good to hear.  Given that when we change
the standard library, *every* Python user (and developer) is affected,
I prefer the status quo.

> No need to get heated, though. I just thought that it would
> be a good time to start thinking about this option again.

And this would be a good time to end this thread. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 21:39:12 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 16:39:12 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: Your message of "Fri, 12 Jul 2002 22:31:41 +0200."
 <3D2F3CAD.801@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCKECBAEAB.tim.one@comcast.net>
 <3D2F3CAD.801@lemburg.com>
Message-ID: <200207122039.g6CKdD314156@pcp02138704pcs.reston01.va.comcast.net>

> If Guido should ever decide to rip this out, I can always switch
> to a different technique, e.g. use my own interning token type.

Why wait?  Rip it out now!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 12 21:41:29 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 16:41:29 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 16:33:51 EDT."
 <LNBBLJKPBEHFEDALKOLCMEDGAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEDGAEAB.tim.one@comcast.net>
Message-ID: <200207122041.g6CKfTt14197@pcp02138704pcs.reston01.va.comcast.net>

> Raymond did a survey on c.l.py, asking anyone who used buffer objects at
> *all* to speak up.  IIRC, he got no replies.  On Python-Dev, apart from
> musing whether they might conceivably use them, the only person who
> eventually said they actually used them was Marc-Andre.  Fredrik pressed for
> details, but we haven't seen any concrete use cases.  In the absence of the
> latter, it's impossible to guess what would be backward compatible for MAL's
> purposes.
> 
> > ...
> > Maybe we should do something stronger, and deprecate the buffer type
> > altogether.
> 
> I told everyone you forgot the essay you wrote suggesting this the last time
> this rose above everyone's pain threshold.  It's a comfort to know that my
> channeling powers have not diminished with exponentially advancing age
> <wink>:
> 
>      http://mail.python.org/pipermail/python-dev/2000-October/009974.html

But at least I didn't change my mind. :-)

So let's deprecate buffer().  I also suggest rolling back Raymond's
changes to make slices more consistent -- there's no point in changing
something that's only kept for backwards compatibility reasons.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From mal@lemburg.com  Fri Jul 12 21:39:23 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 22:39:23 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net>              <3D2F0AC1.508@lemburg.com> <200207121724.g6CHOVt12881@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F3E7B.5030101@lemburg.com>

Guido van Rossum wrote:
>>Guido van Rossum wrote:
>>
>>>I'm a little surprised.  Raymond Hettinger checked in a change that
>>>makes all slices of buffer objects return strings.  His comments on SF
>>>bug 546434 say that only one person replied and that they agreed
>>>returning strings was the better solution.  But that's not how I read
>>>the only response to his query that I see in python-dev, from Scott
>>>Gilbert:
>>
>>Interesting. I must have skipped that message.
> 
> 
> You blink, and you find that the world has changed.

Indeed :-)

>>IMHO, all slices of buffer object should return buffer objects,
>>but since all Python releases return strings, I guess this is too
>>late to change.
> 
> 
> That was my preference too, but Raymond disagreed and somehow tried to
> find support for his position :-).
> 
> Since buffer objects (of course :-) support the C-level buffer
> protocol, they can still be used in most places where strings are
> needed.  But it would be incompatible.  But so is Raymond's solution
> (because it changes buffer()[:] to also return a string).
> 
>>Note that the only case where a buffer object
>>is returned in Python 2.x (x < 3) is if you write
>>buffer()[:], i.e. you want a copy of the buffer object.
> 
> What does a copy of a buffer object buy you?

Nothing... since you only get a new reference, not an
independent copy.

> It's not too late to revert Raymond's changes.

Why not try the "buffer slice returns buffer" logic for
a few alphas, then betas, and then, if no one complains,
the final release?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Fri Jul 12 21:45:39 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jul 2002 16:45:39 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Fri, 12 Jul 2002 22:39:23 +0200."
 <3D2F3E7B.5030101@lemburg.com>
References: <20020623222209.62675.qmail@web40105.mail.yahoo.com> <200207121536.g6CFaqr09850@pcp02138704pcs.reston01.va.comcast.net> <3D2F0AC1.508@lemburg.com> <200207121724.g6CHOVt12881@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F3E7B.5030101@lemburg.com>
Message-ID: <200207122045.g6CKje914283@pcp02138704pcs.reston01.va.comcast.net>

> Why not try the buffer slice returns buffer logic for
> a few alphas, then betas, and then if noone complains
> the final release ?

Since nobody cares, we won't get complaints.  But it's a waste of
time.  I'm going to deprecate it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 12 21:48:09 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 16:48:09 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <3D2F3CAD.801@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDJAEAB.tim.one@comcast.net>

[M.-A. Lemburg, on my "does ii help at all?" ii patch]
> Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3
> though ;-)

Your codebase doesn't run under current CVS?  If so, I would have guessed
you would have mentioned that before this <wink>.

> Scanning the source code: I hardly use PyDict_SetItem(); most usages
> are PyDict_SetItemString().

That's why you shouldn't try to guess.  The latter calls the former, and the
real target here is actually indirect optimization of different ways to
spell setattr.  They all end up in PyDict_SetItem; it doesn't matter whether
you call that directly.

>> + PyString_InternInPlace:  Whenever it pays here, the patch spits
>>
>>     ii paid on an InternInPlace

> I do use this API, but only in mxURL and mxXMLTools (which is
> closed source and works with the evil code below I mentioned ;-).

As mentioned before, the optimization in this doesn't do you any good
overall unless it triggers in PyDict_SetItem() later.  If it doesn't trigger
in the latter, your code will run faster overall if we removed the
optimization from PyString_InternInPlace (although probably not measurably
faster in this routine; a never-pays anti-optimization in PyDict_SetItem is
a much more serious matter).

>> Ya, while that's evil, it's not affected by indirect interning.

> Cool :-)
>
> If Guido should ever decide to rip this out,

He won't, but it's quite likely to either not do you any good, or actually
do you harm, in an alternate implementation of Python (e.g., I doubt-- but
don't know --that Jython bothers with this).

> I can always switch to a different technique, e.g. use my own interning
> token type.

Or you could call intern() explicitly.  That's what I usually do.

IF_TOKEN, ELSE_TOKEN, ... = map(intern, "if else ...".split())
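
A fuller sketch of the explicit-interning idiom follows. The token set and
the classify() helper are made up for illustration. intern() was a builtin
in the Python of this thread; later versions expose it as sys.intern, so the
sketch accepts either.

```python
import sys

# Python 2 had a builtin intern(); Python 3 moved it to sys.intern.
try:
    intern_string = intern  # noqa: F821 (only defined on Python 2)
except NameError:
    intern_string = sys.intern

# Intern the tokens once, up front, instead of relying on the compiler.
IF_TOKEN, ELSE_TOKEN, WHILE_TOKEN = map(intern_string, "if else while".split())


def classify(word):
    """Hypothetical tokenizer step: switch on identity of interned strings."""
    word = intern_string(word)  # make identity comparison safe for any input
    if word is IF_TOKEN or word is ELSE_TOKEN:
        return "conditional"
    if word is WHILE_TOKEN:
        return "loop"
    return "other"


# Strings built at run time are not interned until we ask for it.
runtime_word = "".join(["wh", "ile"])
print(classify(runtime_word))
```

Interning the input inside classify() is what makes the "is" comparisons
reliable; without it, an equal but distinct string object would fall through
to "other".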




From mclay@nist.gov  Fri Jul 12 21:46:13 2002
From: mclay@nist.gov (Michael McLay)
Date: Fri, 12 Jul 2002 16:46:13 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk> <200207121431.30722.mclay@nist.gov> <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207121646.13992.mclay@nist.gov>

On Friday 12 July 2002 02:42 pm, Guido van Rossum wrote:
> [me] ...
> Uh?  Who is proposing to name it "new"?  Not me!  Maybe you should
> read the entire thread again? :-)

Ok, I guess I'm just a bit more confused than usual today. I had also read the
following message and made the unfortunate assumption that you were proposing
"new" as the name of a new top level module to contain all the standard
python modules. Oops, I merged the threads in my head.

On Friday 12 July 2002 11:51 am, Guido van Rossum wrote:
> > > If we need a place to name types that don't deserve being builtins,
> > > perhaps new.py is a better place?
> >
> > The new. prefix is natural enough for
> >
> > 	m = new.module('name')
> >
> > type but it looks pretty awkward in
> >
> > 	if isinstance(obj, new.generator):
> >
> > What's the meaning of 'new' in this context?
>
> Sometimes you ask too many questions. :-)
>
> Let's just say that this is a historically available name.  I don't
> expect that isinstance(obj, generator) is a very common question to
> ask, so I don't mind if you have to ask it in a somewhat awkward way.

Now back to the issue of moving all the top level names in the standard 
distribution into a "python" namespace. For the remainder of the 2.X release 
cycle it is important to not remove the existing names from the top level 
namespace. However, it might be reasonable to move all standard distribution 
names into a single top level namespace and grandfather the existing top 
level names into the top level namespace for the remainder of the 2.x series. 
The existing set of names would be available from either namespace. All new 
names for the standard distribution would only be placed in the new top level 
standard package namespace. 

With this approach all old names would still be accessible to the existing 
code base as top level names and introducing new names to the standard 
distribution will not clobber third party modules and packages. For the 
remainder of 2.X the rules will be messy because some standard names will be 
accessible from either the top level namespace or from the standard "python" 
namespace. Then for Python 3.0 the grandfathered names would be removed from 
the top level namespace. This approach should enable a smoother transition in 
the documentation and coding practices.
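
To make "available from either namespace" concrete, here is a rough sketch
that registers one module object under two names in sys.modules. The names
legacy_demo and std_demo are invented for the example, and this is only an
illustration of the requirement, not the mechanism anyone in the thread has
proposed for the real import machinery.

```python
import importlib
import sys
import types

# A stand-in for an existing top-level standard module (hypothetical name).
legacy = types.ModuleType("legacy_demo")
legacy.answer = 42

# A stand-in for the new top-level package (hypothetical name).
package = types.ModuleType("std_demo")
package.__path__ = []          # an empty __path__ marks it as a package
package.legacy_demo = legacy   # attribute access: std_demo.legacy_demo

# Register the *same* module object under both the old and the new name.
sys.modules["legacy_demo"] = legacy
sys.modules["std_demo"] = package
sys.modules["std_demo.legacy_demo"] = legacy

flat = importlib.import_module("legacy_demo")
nested = importlib.import_module("std_demo.legacy_demo")
print(flat is nested)
```

Because both names map to one object, state set through one name is visible
through the other, which is the property the grandfathering scheme needs.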

The preferred coding style guide, the tutorial, and other documentation would 
be used to explain the transition plan. The new guidelines would promote the 
use of the new namespace for all cases, but it would not preclude the use of 
the older coding style. 

I'm not keen on the use of the name "python" for the top level namespace. Perhaps
the name "std" would be more desirable (and shorter to type).




From mal@lemburg.com  Fri Jul 12 21:41:42 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 22:41:42 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com>
Message-ID: <3D2F3F06.1060800@lemburg.com>

Any news on this one ?

M.-A. Lemburg wrote:
> I noticed yesterday that the xmlrpclib.py version in CVS
> is incompatible with the version in Python 2.2: all the
> .dump_XXX() interfaces changed and now include a third
> argument.
> 
> Since the Marshaller can be subclassed, this breaks all
> existing application space subclasses extending or changing
> the default xmlrpclib behaviour.
> 
> I'd opt for moving back to the previous style of calling the
> write method via self.write.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Fri Jul 12 21:54:11 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 22:54:11 +0200
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <LNBBLJKPBEHFEDALKOLCMEDGAEAB.tim.one@comcast.net>
Message-ID: <3D2F41F3.9070805@lemburg.com>

Tim Peters wrote:
> [Guido, to Scott Gilbert]
> 
>>It seems we're still in the same boat.  It would be saner to change
>>buffer slices to return buffer objects, except for backward
>>compatibility.  I was hoping to hear from someone who uses buffer
>>objects and knows that this would break his code.
> 
> 
> Raymond did a survey on c.l.py, asking anyone who used buffer objects at
> *all* to speak up.  IIRC, he got no replies.  On Python-Dev, apart from
> musing whether they might conceivably use them, the only person who
> eventually said they actually used them was Marc-Andre.  Fredrik pressed for
> details, but we haven't seen any concrete use cases.  In the absence of the
> latter, it's impossible to guess what would be backward compatible for MAL's
> purposes.

For my purposes, the "buffer slice returns a buffer" strategy
would be more appropriate because it would preserve the buffer type
information across the slicing operation... I mean, you don't
want to get bananas when you slice an apple in real life either ;-)

I use buffers to mean: this is a chunk of binary data. The purpose
is to recognize this type of data for pickling via xml-rpc,
soap and other rpc mechanisms etc.

Strings don't provide this information (since they can be a mix of
text and binary data). Buffers are compatible enough with most tools
working on strings that they represent a good alternative to tag data
as being binary while not losing all the nice advantages of
strings. The downside is that most of these tools return their
results as strings :-(

Now it would be nice if at least the type itself would behave in a
sane way.
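
The "slice an apple, get an apple" behaviour can be sketched with
memoryview, a type from later Python versions. It is not the buffer object
under discussion here and is swapped in only because it is what can be run
today. The wire_type() helper is a made-up stand-in for an RPC marshaller
that tags binary data by type.

```python
def wire_type(value):
    """Hypothetical marshaller hook: pick an RPC encoding from the type."""
    if isinstance(value, memoryview):
        return "binary"
    return "text"


data = bytearray(b"\x00\x01binary payload")
view = memoryview(data)

# Slicing keeps the type, so the "this is binary" tag survives the slice.
chunk = view[2:8]
print(type(chunk) is memoryview, wire_type(chunk))

# The slice is a window onto the same memory, not a copy.
data[2:8] = b"BINARY"
print(bytes(chunk))
```

A string slice, by contrast, would be tagged "text" by the same helper,
which is the loss of information described above.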

>>Maybe we should do something stronger, and deprecate the buffer type
>>altogether.
> 
> 
> I told everyone you forgot the essay you wrote suggesting this the last time
> this rose above everyone's pain threshold.  It's a comfort to know that my
> channeling powers have not diminished with exponentially advancing age
> <wink>:
> 
>      http://mail.python.org/pipermail/python-dev/2000-October/009974.html

Oh yeah, that was during the Unicode implementation wars... :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Fri Jul 12 22:04:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 12 Jul 2002 23:04:15 +0200
Subject: [Python-Dev] Alternative implementation of interning, take 2
References: <LNBBLJKPBEHFEDALKOLCOEDJAEAB.tim.one@comcast.net>
Message-ID: <3D2F444F.8030203@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg, on my "does ii help at all?" ii patch]
> 
>>Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3
>>though ;-)
> 
> 
> Your codebase doesn't run under current CVS?  If so, I would have guessed
> you would have mentioned that before this <wink>.

I don't test against the current CVS -- no time for that.

>>Scanning the source code: I hardly use PyDict_SetItem(); most usages
>>are PyDict_SetItemString().
> 
> 
> That's why you shouldn't try to guess.  The latter calls the former, and the
> real target here is actually indirect optimization of different ways to
> spell setattr.  They all end up in PyDict_SetItem; it doesn't matter whether
> you call that directly.

Sure, but SetItemString() does some extra magic: it interns the
key for me.

>>>+ PyString_InternInPlace:  Whenever it pays here, the patch spits
>>>
>>>    ii paid on an InternInPlace
>>
> 
>>I do use this API, but only in mxURL and mxXMLTools (which is
>>closed source and works with the evil code below I mentioned ;-).
> 
> 
> As mentioned before, the optimization in this doesn't do you any good
> overall unless it triggers in PyDict_SetItem() later.  If it doesn't trigger
> in the latter, your code will run faster overall if we removed the
> optimization from PyString_InternInPlace (although probably not measurably
> faster in this routine; a never-pays anti-optimization in PyDict_SetItem is
> a much more serious matter).

I only use PyString_InternInPlace() on strings which will be
used as dict keys or for string compares in tokenizers and
parsers.

>>>Ya, while that's evil, it's not affected by indirect interning.
>>
> 
>>Cool :-)
>>
>>If Guido should ever decide to rip this out,
> 
> 
> He won't, but it's quite likely to either not do you any good, or actually
> do you harm, in an alternate implementation of Python (e.g., I doubt-- but
> don't know --that Jython bothers with this_).

Jaja... as soon as PEP 275 is implemented I won't have to
worry any more :-)

>>I can always switch to a different technique, e.g. use my own interning
>>token type.
> 
> 
> Or you could call intern() explicitly.  That's what I usually do.
> 
> IF_TOKEN, ELSE_TOKEN, ... = map(intern, "if else ...". split())

True, but Python's compiler already does this for me. You're right,
though, I should make this explicit...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From bernie@3captus.com  Fri Jul 12 21:56:11 2002
From: bernie@3captus.com (Bernard Yue)
Date: Fri, 12 Jul 2002 14:56:11 -0600
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises
 socket.error
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com> <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F2E0D.2C4FD92F@3captus.com> <200207122000.g6CK0Lw13863@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F426A.E71B489B@3captus.com>

Guido van Rossum wrote:

> > To distinguish a timeout error, the caller can check s->sock_timeout
> > when a non-blocking mode error occurred, or just return an error code
> > from internal_select() (I guess you must have had your reason for taking
> > it out in the first place)
> 
> I don't understand your first suggestion.  Not all errors mean that
> the timeout triggered!
> 

For example, when accept() fails with error code EAGAIN and
s->sock_timeout = 5.0, it indicates a timeout.  The same goes for connect()
failing with EINPROGRESS.  Anyway, on second thought, it is messy.

> >
> > How about the following (assume we have socket.setDefaultTimeout()):
> >
> >     import socket
> >     import urllib
> >
> >     socket.setDefaultTimeout(5.0)
> >     retry = 0
> >     url = 'some url'
> >
> >     while retry < 3:
> >         try:
> >             file = urllib.urlretrieve(url)
> >         except socket.TimeoutError:
> >             if retry == 2:
> >                 print "Server too busy, given up!"
> >                 raise
> >             else:
> >                 print "Server busy, retry!"
> >                 retry += 1
> >         else:
> >             break
> >
> > MS IIS behave strangely to http request.  When the server is very busy,
> > it will randomly drop some requests without disconnecting the client.
> > So the best approach for the client is to timeout and retry.  I guess
> > that might be the reason why people needed timeoutsocket in the first
> > place.
> 
> One of the reasons (there are lots of reasons why a connect or receive
> attempt may be very slow to time out, or even never time out).
> 
> Of course, this still doesn't distinguish between a timeout from
> connect() and one from recv().
> 

I think you are right on the point.  The client might not care whether the
call timed out in connect() or recv().  In this case a timeout error comes
in handy.
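
A small sketch of the retry pattern with a dedicated timeout exception
follows. It assumes the socket.timeout exception found in later
standard-library releases rather than the socket.TimeoutError name floated
above, and call_with_retries() and flaky_fetch() are invented for the
example (the latter fakes a busy server so no network is needed).

```python
import socket


def call_with_retries(operation, attempts=3):
    """Call operation(); retry only when it times out, re-raise on the last try."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except socket.timeout:
            if attempt == attempts:
                raise  # give up: the caller sees the timeout
            # Any other exception propagates immediately; only timeouts retry.


calls = []


def flaky_fetch():
    """Stand-in for a network call: times out twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise socket.timeout("timed out")
    return "payload"


result = call_with_retries(flaky_fetch)
print(result, len(calls))
```

The caller never needs to know whether connect() or recv() was the slow
step; catching one exception type is enough to drive the retry.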

> Have you ever written code like this?
> 

Yes I did.  

> > I am struggling with the test case for the new socket code.  The timeout
> > test case I've sent you works with the old socketmodule.c (attached),
> > but not with the latest version (on linux or windows).  It's strange,
> > your new implementation looks much cleaner.
> 
> No need to attach copies of old versions -- just give me the CVS
> revision number. :-)
> 

socketmodule.c  version 1.225
socketmodule.h  version 1.7

> > Please bear with me a bit longer for a patch  :.(
> 
> OK.
> 
> Anyway, I have no time to play with this right now, so I'm glad you
> aren't giving up just yet. :-)
> 

It is very painful indeed (Tim was so right).

> --Guido van Rossum (home page: http://www.python.org/~guido/)

Bernie



From gmcm@hypernet.com  Fri Jul 12 22:32:12 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Fri, 12 Jul 2002 17:32:12 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121646.13992.mclay@nist.gov>
References: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F129C.12231.63AB67B@localhost>

On 12 Jul 2002 at 16:46, Michael McLay wrote:

> ... it might be reasonable to move all standard
> distribution names into a single top level namespace
> and grandfather the existing top level names into
> the top level namespace for the remainder of
> the 2.x series. 

Getting
 from <toplevelname> import urllib
and 
 import urllib

to return the same (is, not equals) object will
require very delicate surgery on some very difficult
code. And without it, most non-trivial scripts will
break in very mysterious ways.
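
The failure mode can be reproduced in miniature: when one source file is
reachable under two names, the import system loads it twice and the two
module objects do not share state. The sketch below builds a throwaway
package on disk; dup_demo_pkg and dup_demo_mod are invented names.

```python
import importlib
import os
import sys
import tempfile


def load_both_ways():
    """Import one file as 'dup_demo_mod' and as 'dup_demo_pkg.dup_demo_mod'."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "dup_demo_pkg")
        os.mkdir(pkg_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "dup_demo_mod.py"), "w") as handle:
            handle.write("registry = []\n")
        # Put both the parent directory and the package directory on the path.
        sys.path[:0] = [root, pkg_dir]
        importlib.invalidate_caches()
        try:
            flat = importlib.import_module("dup_demo_mod")
            nested = importlib.import_module("dup_demo_pkg.dup_demo_mod")
        finally:
            sys.path.remove(root)
            sys.path.remove(pkg_dir)
    return flat, nested


flat, nested = load_both_ways()
flat.registry.append("handler")
print(flat is nested, flat.registry, nested.registry)
```

Anything registered through one name is invisible through the other, which
is the sort of mysterious breakage a naive dual-namespace scheme would cause.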

-- Gordon
http://www.mcmillan-inc.com/




From tim.one@comcast.net  Fri Jul 12 22:33:18 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 17:33:18 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <3D2F444F.8030203@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPAEAB.tim.one@comcast.net>

[MAL]
> Sure, but SetItemString() does some extra magic: it interns the
> key for me.

As a directly interned string.  Indirect interning is irrelevant to this
benefit.

Don't argue about this, run the patched code <wink>:  it will tell you
directly whether ii is doing you any good.

> ...
> I only use PyString_InternInPlace() on strings which will be
> used as dict keys or for string compares in tokenizers and
> parsers.

Again it doesn't really matter when you call it; if the indirect interning
optimization is doing you any good, it will be because of stuff Python is
doing under the covers.




From tim.one@comcast.net  Fri Jul 12 22:38:32 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 17:38:32 -0400
Subject: [Python-Dev] python package
In-Reply-To: <200207121646.13992.mclay@nist.gov>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEAAEAB.tim.one@comcast.net>

[Michael McLay]
> ...
> I had also read the following message and made the unfortunate
> assumption that you were proposing "new" as the name of a new top level
> module to contain all the standard python modules.

Note that "new" is already the name of a top-level module, and has been for
years.  That other thread was about drawing useless distinctions between the
already-existing "new" and "types" modules with respect to where to house
new type names that nobody needs <0.9 wink>.




From fredrik@pythonware.com  Fri Jul 12 22:53:16 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 12 Jul 2002 23:53:16 +0200
Subject: [Python-Dev] Alternative implementation of interning, take 2
References: <LNBBLJKPBEHFEDALKOLCCEPFADAB.tim.one@comcast.net>
Message-ID: <052101c229ee$880465a0$0900a8c0@spiff>

tim wrote:

> It would help if you could get Marc-Andre and /F to pronounce on whether
> their code benefits from it -- they're the most prolific extension authors
> we've got.

no problem here, from what I can tell.  we can live with or
without this change.

</F>




From barry@zope.com  Sat Jul 13 00:22:01 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 12 Jul 2002 19:22:01 -0400
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
References: <20020623181630.GN25927@laranja.org>
 <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
 <200207121324.21609.mclay@nist.gov>
 <200207121907.g6CJ76N13511@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15663.25753.999787.858627@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> The real issues are IMO:

I've added these to the PEP, thanks.

-Barry



From tim.one@comcast.net  Sat Jul 13 02:57:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 21:57:33 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207122041.g6CKfTt14197@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEENAEAB.tim.one@comcast.net>

[Guido]
> But at least I didn't change my mind. :-)

I would not have pointed out your previous position if you had <wink>.

> So let's deprecate buffer().  I also suggest to roll back Raymond's
> changes to make slices more consistent -- there's no point in changing
> something that's only kept for backwards compatibility reasons.

I expect Raymond will be agreeable, but he announced he'll be missing in
action for about another month.  If rollback can wait, I prefer that to
electing me to do it just because I replied <0.9 wink>.




From tim.one@comcast.net  Sat Jul 13 03:15:19 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 22:15:19 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <3D2F41F3.9070805@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net>

[Tim]
> Fredrik pressed for details, but we haven't seen any concrete use cases.
> In the absence of the latter, it's impossible to guess what would be
> backward compatible for MAL's purposes.

[M.-A. Lemburg]
> For my purposes, the strategy buffer slice returns a buffer
> would be more appropriate because it would save the buffer type
> information across the slicing operation... I mean, you don't
> want to get bananas when you slice an apple in real life either ;-)
>
> I use buffers to mean: this is a chunk of binary data. The purpose
> is to recognize this type of data for pickling via xml-rpc,
> soap and other rpc mechanisms etc.

How do you use buffers?  Do you stick to their C API?  Do you use the
Python-level buffer() function?  If the latter, what do you do in Python
code with a buffer object after you get one?  The only use I've seen made of
a buffer object in Python code is as a way to trick the interpreter into
crashing (via recycling the memory the buffer object points to).

And from where do you get a buffer?  There are darned few types in Python
that buffer() accepts as an argument.  Do your extension types implement
tp_as_buffer?  I'm blindly casting for a reason why your appreciation of the
buffer object seems unique.

> Strings don't provide this information (since they can be a mix of
> text and binary data). Buffers are compatible enough with most tools
> working on strings that they represent a good alternative to tag data
> as being binary while not losing all the nice advantages of
> strings. The downside is that most of these tools return their
> results as strings :-(
>
> Now it would be nice if at least the type itself would behave in a
> sane way.

Overall, this reinforces the repeated observation that we don't know why the
buffer object exists -- it doesn't appear to do what you really want, but
you've found some way to get it to do part of what you want, up until the
point you actually use it <0.7 wink>.




From tim.one@comcast.net  Sat Jul 13 03:23:00 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 12 Jul 2002 22:23:00 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <052101c229ee$880465a0$0900a8c0@spiff>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEPAEAB.tim.one@comcast.net>

[Tim, to Oren]
> It would help if you could get Marc-Andre and /F to pronounce on whether
> their code benefits from it -- they're the most prolific extension
> authors we've got.

[/F]
> no problem here, from what I can tell.  we can live with or
> without this change.

Note that there are (at least) two parts to Oren's agenda:

1. Removing the possibility for indirect interning.

2. Making interned strings mortal, via the usual refcount rules.

In context, I was asking only about #1, and I'm sure your reply was meant to
include #1.  What I remain unclear about is whether you've also got no fear
of #2.

I'm also wondering whether we somehow broke indirect interning since it was
introduced -- so far nobody has found a program or extension module where it
even triggers (not counting the 6 instances in the Python test suite in
intern-in-place, since no use of the indirect interning was made in those
cases).




From tim.one@comcast.net  Sat Jul 13 08:12:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 03:12:44 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <200207121717.g6CHHvr12817@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net>

[Guido]
> ...
> Let's do a set module instead.  There's only one hurdle to take
> for a set module, and that's the issue with using mutable sets as
> keys.  Let's just pick one solution and implement it (my favorite
> being that sets simply cannot be used as keys, since it's the
> simplest, and matches dicts and lists).

I want a set module, but if I finish Greg's abandoned work I want sets of
sets too.  Sets don't have "keys", they're conceptually collections of
values, and it would be as odd not to allow sets containing sets as not to
allow lists containing lists, or to ban dicts as dict values.  Greg needed
sets of sets for his work, and I've often faked them too.  I'm not going to
be paralyzed by that combining mutable sets with sets of sets requires that
some uses of set-as-set-element will be expensive, fragile, and/or hard to
explain.  If you don't want that pain, don't play that game.  If you do want
sets of sets, though, and aren't willing to live with a purely functional
(immutable) set type, it's non-trivial to implement correctly -- I don't
want to leave it as a term project for the reader.

There's also the Zope BTrees idea of sets of sets:

>>> s1 = OISet()
>>> s1 = OISet(range(10))
>>> s1.keys()
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> s2 = OISet([5])
>>> s2.keys()
[5]
>>> s1.insert(s2)
1
>>> s2 in s1
1
>>> OISet([5]) in s1
0
>>>

That is, like sets of sets in Icon too, this is a notion of inclusion by
object identity (although Icon does that on purpose, while the BTree-based
set mostly inherits it from that BTrees don't implement any comparison
slots).  That's very easy to implement.  It's braindead if you think of sets
as collections of values, but that's what taking pain too seriously leads
to.




From aleax@aleax.it  Sat Jul 13 09:51:59 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 13 Jul 2002 10:51:59 +0200
Subject: [Python-Dev] Dict constructor
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net>
Message-ID: <E17TIdj-0005XX-00@mail.python.org>

On Saturday 13 July 2002 09:12 am, Tim Peters wrote:
> [Guido]
>
> > ...
> > Let's do a set module instead.  There's only one hurdle to take
> > for a set module, and that's the issue with using mutable sets as
> > keys.  Let's just pick one solution and implement it (my favorite
> > being that sets simply cannot be used as keys, since it's the
> > simplest, and matches dicts and lists).
>
> I want a set module, but if I finish Greg's abandoned work I want sets of
> sets too.  Sets don't have "keys", they're conceptually collections of
> values, and it would be as odd not to allow sets containing sets as not to
> allow lists containing lists, or to ban dicts as dict values.  Greg needed
> sets of sets for his work, and I've often faked them too.  I'm not going to

I agree that having sets without having sets of sets would not be anywhere
near as useful.

> be paralyzed by that combining mutable sets with sets of sets requires that
> some uses of set-as-set-element will be expensive, fragile, and/or hard to
> explain.  If you don't want that pain, don't play that game.  If you do

What about the following compromise: there are two set types, ImmutableSet 
and MutableSet, with a common supertype Set.  ImmutableSet adds __hash__,
while MutableSet adds insert and remove, to the common core of methods
inherited from Set, such as __contains__ and __iter__.  It's easy to make a
MutableSet instance m from an ImmutableSet instance x, such that m == x,
either by letting each __init__ accept an argument of the other kind (maybe
just a special case of such an __init__ accepting any iterable), or, if that 
can afford very substantial performance improvements, via ad-hoc methods.

The second part of the puzzle is that hash(x) tries to adapt x to the Hashable
protocol before calling x.__hash__.  Types that are already hashable adapt
to Hashable by just returning the same instance, of course.  A MutableSet
instance adapts to Hashable by returning the equivalent ImmutableSet.

Since it's apparently too wild an idea to say "adapt to protocol" when one
means "adapt to protocol", at least for the next few releases (and that, in
the optimistic hypothesis that my future rewrite of the adaptation PEP is
favorably received), there will of course need to arise yet another special
purpose way to express this same general idea, such as:


class MutableSet(Set):
    ...
    def insert(self, item):
        try: item = item.asSetItem()
        except AttributeError: pass
        self.data[item] = True

    def asSetItem(self):
        return ImmutableSet(self)


or the like.  
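
A fuller, self-contained sketch of this scheme might look as follows.  The
dict-backed `data` attribute, the `_add` helper and the XOR-based hash are
assumptions made for illustration, not a settled design:

```python
class Set(object):
    """Common core: a dict whose keys are the set's elements."""

    __hash__ = None  # unhashable unless a subclass says otherwise

    def __init__(self, iterable=()):
        self.data = {}
        for item in iterable:
            self._add(item)

    def _add(self, item):
        # Ad-hoc "adaptation": a mutable set is replaced by an
        # immutable copy before it is used as a dict key.
        try:
            item = item.asSetItem()
        except AttributeError:
            pass
        self.data[item] = True

    def __contains__(self, item):
        return item in self.data

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)

    def __eq__(self, other):
        if not isinstance(other, Set):
            return NotImplemented
        return self.data == other.data


class ImmutableSet(Set):
    def __hash__(self):
        # Order-independent combination of the element hashes.
        result = 0
        for item in self.data:
            result ^= hash(item)
        return result


class MutableSet(Set):
    def insert(self, item):
        self._add(item)

    def remove(self, item):
        del self.data[item]

    def asSetItem(self):
        return ImmutableSet(self)


inner = MutableSet([1, 2])
outer = MutableSet()
outer.insert(inner)   # stores an ImmutableSet snapshot of inner
inner.insert(3)       # later mutation does not touch the snapshot
assert ImmutableSet([1, 2]) in outer
assert ImmutableSet([1, 2, 3]) not in outer
```

Note that `inner in outer` would still fail in this sketch, because a
MutableSet is not hashable; only insertion adapts its argument.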


Alex



From martin@v.loewis.de  Sat Jul 13 10:25:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Jul 2002 11:25:40 +0200
Subject: [Python-Dev] PEP 11: unsupported platforms
Message-ID: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>

Following a recent discussion of the introduction of new platforms
(AtheOS in this case), I've written a PEP on removing support for
platforms that nobody is interested in.

If you find that specific platforms should be moved to "unsupported"
status as well, please let me know. Likewise, if you think that some
of the platforms I recommend to unsupport should see continued
support, let me know as well. In this case, it would be good if you
could name a user of Python on this platform.

Regards,
Martin

PEP: 11
Title: Unsupported Platforms
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/07/12 22:31:47 $
Author: martin@v.loewis.de (Martin v. Löwis)
Status: Active
Type: Informational
Created: 07-Jul-2002
Post-History: 07-Jul-2002


Abstract

    This PEP documents operating systems (platforms) which are not
    supported in Python anymore.  For some of these systems,
    supporting code might be still part of Python, but will be removed
    in a future release - unless somebody steps forward as a volunteer
    to maintain this code.


Rationale

    Over time, the Python source code has collected various pieces of
    platform-specific code, which, at some point in time, was
    considered necessary to use Python on a specific platform.
    Without access to this platform, it is not possible to determine
    whether this code is still needed.  As a result, this code may
    either break during the Python evolution, or it may become
    unnecessary as the platforms evolve as well.

    The growing amount of these fragments poses the risk of
    unmaintainability: without having experts for a large number of
    platforms, it is not possible to determine whether a certain
    change to the Python source code will work on all supported
    platforms.

    To reduce this risk, this PEP proposes a procedure to remove code
    for platforms with no Python users.


Unsupporting platforms

    If a certain platform that currently has special code in it is
    deemed to be without Python users, a note must be posted in this
    PEP that this platform is no longer actively supported.  This
    note must include:

    - the name of the system
    - the first release number that does not support this platform
      anymore, and
    - the first release where the historical support code is actively
      removed

    In some cases, it is not possible to identify the specific list of
    systems for which some code is used (e.g. when autoconf tests for
    absence of some feature which is considered present on all
    supported systems).  In this case, the name will give the precise
    condition (usually a preprocessor symbol) that will become
    unsupported.

    At the same time, the Python source code must be changed to
    produce a build-time error if somebody tries to install Python on
    this platform.  On platforms using autoconf, configure must fail.
    This gives potential users of the platform a chance to step
    forward and offer maintenance.


Resupporting platforms

    If a user of a platform wants to see this platform supported
    again, he may volunteer to maintain the platform support.  Such an
    offer must be recorded in the PEP, and the user can submit patches
    to remove the build-time errors, and perform any other maintenance
    work for the platform.


Unsupported platforms

    Name:             SunOS 4
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             DYNIX
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             dgux
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             Systems defining __d6_pthread_create (configure.in)
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4

    Name:             Systems defining PY_PTHREAD_D4, PY_PTHREAD_D6,
                      or PY_PTHREAD_D7 in thread_pthread.h
    Unsupported in:   Python 2.3
    Code removed in:  Python 2.4


Copyright

    This document has been placed in the public domain.



From oren-py-d@hishome.net  Sat Jul 13 12:04:09 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 13 Jul 2002 07:04:09 -0400
Subject: [Python-Dev] Alternative implementation of interning, take 2
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEEPAEAB.tim.one@comcast.net>
References: <052101c229ee$880465a0$0900a8c0@spiff> <LNBBLJKPBEHFEDALKOLCKEEPAEAB.tim.one@comcast.net>
Message-ID: <20020713110409.GA72037@hishome.net>

On Fri, Jul 12, 2002 at 10:23:00PM -0400, Tim Peters wrote:
> Note that there are (at least) two parts to Oren's agenda:
> 
> 1. Removing the possibility for indirect interning.
> 
> 2. Making interned strings mortal, via the usual refcount rules.

In fact, #1 is only "indirectly" on my agenda. My goal was making interned 
strings mortal and indirectly interned strings kept messing up the reference
counts so I ripped them out after I found out that they're not effective in
the core.

The current version of my patch supports both mortal and immortal
interned strings for backward compatibility. Anything that is silently
interned by the Python core uses mortal interned strings.  Explicit calls 
from Python code or extensions get immortal strings because they might 
depend on this behavior.

	Oren




From guido@python.org  Sat Jul 13 13:27:46 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 08:27:46 -0400
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To: Your message of "Sat, 13 Jul 2002 11:25:40 +0200."
 <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
References: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200207131227.g6DCRkk17108@pcp02138704pcs.reston01.va.comcast.net>

> Following a recent discussion of the introduction of new platforms
> (AtheOS in this case), I've written a PEP on removing support for
> platforms that nobody is interested in.

Did you post this to python-list too?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 13 13:34:57 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 08:34:57 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: Your message of "Sat, 13 Jul 2002 03:12:44 EDT."
 <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net>
Message-ID: <200207131234.g6DCYvj17144@pcp02138704pcs.reston01.va.comcast.net>

> I want a set module, but if I finish Greg's abandoned work I want sets of
> sets too.  Sets don't have "keys", they're conceptually collections of
> values, and it would be as odd not to allow sets containing sets as not to
> allow lists containing lists, or to ban dicts as dict values.

IMO it's no odder than disallowing dicts as dict keys: it's a hack
that allows a much faster implementation.

> That is, like sets of sets in Icon too, this is a notion of inclusion by
> object identity (although Icon does that on purpose, while the BTree-based
> set mostly inherits it from that BTrees don't implement any comparison
> slots).  That's very easy to implement.  It's braindead if you think of sets
> as collections of values, but that's what taking pain too seriously leads
> to.

I don't think it is acceptable to have sets-of-sets but test for
membership (in that case) by object identity.

If you really think object identity is all that's needed, I suggest we
stick to disallowing sets of sets; algorithms needing
sets-of-set-object-identities can use id() on the inner sets.
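
A minimal sketch of that id()-based approach, assuming a hypothetical
IdentitySet helper (plain lists stand in for mutable inner sets here):

```python
class IdentitySet(object):
    """Membership by object identity; a made-up helper for illustration."""

    def __init__(self):
        # Map id(obj) -> obj.  Keeping a reference to obj matters: an id
        # is only unique while the object it belongs to is alive.
        self._by_id = {}

    def insert(self, obj):
        self._by_id[id(obj)] = obj

    def __contains__(self, obj):
        return id(obj) in self._by_id

    def __len__(self):
        return len(self._by_id)


inner = [5]        # stand-in for a mutable set
lookalike = [5]    # equal value, different object
outer = IdentitySet()
outer.insert(inner)

assert inner in outer
assert lookalike not in outer   # same value, but not the same object
```

This reproduces the BTrees behaviour quoted above without claiming that
the outer container is a set of values.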

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Sat Jul 13 13:35:24 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 13 Jul 2002 08:35:24 -0400
Subject: [Python-Dev] Re: python package
In-Reply-To: <200207122037.g6CKbf514140@pcp02138704pcs.reston01.va.comcast.net>
References: <5.1.1.6.0.20020621200955.00ac9638@spey.st-andrews.ac.uk>
 <200207101925.g6AJPMg27619@pcp02138704pcs.reston01.va.comcast.net>
 <3D2CA81E.6060408@lemburg.com>
 <200207102252.g6AMq0k28152@pcp02138704pcs.reston01.va.comcast.net>
 <3D2D3720.9040100@lemburg.com>
 <200207121326.g6CDQbm07504@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F1245.1030804@lemburg.com>
 <200207121736.g6CHact13010@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F161D.40005@lemburg.com>
 <200207121754.g6CHsKQ13108@pcp02138704pcs.reston01.va.comcast.net>
 <3D2F3857.3010700@lemburg.com>
 <200207122037.g6CKbf514140@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqu1n33iwj.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> And this would be a good time to end this thread. :-)

Agreed.  Yet, allow me a tiny suggestion that could solve the stated
problem at little cost.  It suffices to choose, then announce, a convention
about a set of names which the Python distribution agrees never to use.

It could be anything.  Like, Python could guarantee that it will never
ever install a standard module with a name starting with capital `W', say.
If a user wants to make absolutely sure his/her module does not and will
not conflict with a standard module, just prepend a `W' to its name.
It is likely that people will rarely resort to this convention, but it
will be there for the paranoid, and should be easy to support.  Yet, it
will not cure users' paranoia about each other's package names.

Had this come up many years ago, the convention I would have preferred is that
Python never uses a capital letter as the first letter of a module name, but it
seems to be a little late for this, and I'm not so sure of the benefit. :-)

The most Python could achieve with some `from python import ...' or a `W'
convention is to get itself out of the name fight between users; it
does not take part in it.  It does not really solve the problem, anyway.

I guess you are right, in that whatever the direction taken, this thread is
probably doomed to fall into various dead-ends.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From david.abrahams@rcn.com  Sat Jul 13 14:00:01 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Sat, 13 Jul 2002 09:00:01 -0400
Subject: [Python-Dev] PyList_Insert() et al.
Message-ID: <1a5f01c22a6d$34e12f50$6601a8c0@boostconsulting.com>

Check it out:

int
PyList_Insert(PyObject *op, int where, PyObject *newitem)
{
 if (!PyList_Check(op)) {
  PyErr_BadInternalCall();
  return -1;
 }
 return ins1((PyListObject *)op, where, newitem);
}

Since the implementation of ins1 gives the subclasses' re-implementation of
insert() no chance to execute, shouldn't this check be changed to
PyList_CheckExact?

If not, what needs to be added to the documentation to make it clear that
these functions really do subclass slicing?
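
A Python-level analogue of the problem, as a rough sketch: the LoggedList
class is made up for illustration, and the unbound call list.insert(obj, ...)
stands in for what the C function does by going straight to ins1:

```python
class LoggedList(list):
    """A list subclass whose insert() records every call it sees."""

    def __init__(self, *args):
        list.__init__(self, *args)
        self.log = []

    def insert(self, index, item):
        self.log.append((index, item))
        list.insert(self, index, item)


items = LoggedList()
items.insert(0, "a")         # goes through the subclass override
list.insert(items, 0, "b")   # base-class call: the override never runs

assert items == ["b", "a"]
assert items.log == [(0, "a")]   # the second insertion was not logged

# The two C-level checks differ on exactly this kind of object:
assert isinstance(items, list)   # roughly what PyList_Check accepts
assert type(items) is not list   # PyList_CheckExact would reject it
```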

-Dave




From guido@python.org  Sat Jul 13 14:04:40 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 09:04:40 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: Your message of "Sat, 13 Jul 2002 10:51:59 +0200."
 <E17TIdj-0005XX-00@mail.python.org>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net>
 <E17TIdj-0005XX-00@mail.python.org>
Message-ID: <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net>

> What about the following compromise: there are two set types,
> ImmutableSet and MutableSet, with a common supertype Set.
> ImmutableSet adds __hash__, while MutableSet adds insert and remove,
> to the common core of methods inherited from Set, such as
> __contains__ and __iter__.

Reasonable.

> Since it's apparently too wild an idea to say "adapt to protocol" when one
> means "adapt to protocol", at least for the next few releases (and that, in
> the optimistic hypothesis that my future rewrite of the adaptation PEP is
> favorably received), there will of course need to arise yet another special
> purpose way to express this same general idea, such as:
> 
> 
> class MutableSet(Set):
>     ...
>     def insert(self, item):
>         try: item = item.asSetItem()
>         except AttributeError: pass
>         self.data[item] = True
> 
>     def asSetItem(self):
>         return ImmutableSet(self)
> 
> 
> or the like.  

This would run into similar problems as the PEP's auto-freeze approach
when using "s1 in s2".  If s1 is a mutable set, this creates an
immutable copy for the test and then throws it away.  The PEP's
problem is that it's too easy to accidentally freeze a set; the
problem with your proposal is "merely" one of performance.  Yet I
think both are undesirable, although I still prefer your solution.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 13 14:34:19 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 09:34:19 -0400
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: Your message of "Wed, 03 Jul 2002 07:07:35 EDT."
 <20020703110735.GA50268@hishome.net>
References: <20020703095915.GA43336@hishome.net> <Pine.LNX.4.44.0207030309320.5227-100000@ziggy>
 <20020703110735.GA50268@hishome.net>
Message-ID: <200207131334.g6DDYJD17519@pcp02138704pcs.reston01.va.comcast.net>

> The warm fuzzy feeling that you have a real symbol type :-)

Doesn't give me a warm fuzzy feeling at all.  A symbol type is just
another compiler implementation detail IMO.  Strings are natural to
designate identifiers.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From aleax@aleax.it  Sat Jul 13 14:42:46 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 13 Jul 2002 15:42:46 +0200
Subject: [Python-Dev] Dict constructor
In-Reply-To: <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net> <E17TIdj-0005XX-00@mail.python.org> <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17TNB9-0008Gp-00@mail.python.org>

On Saturday 13 July 2002 03:04 pm, Guido van Rossum wrote:
	...
> > What about the following compromise: there are two set types,
> > ImmutableSet and MutableSet, with a common supertype Set.
> > ImmutableSet adds __hash__, while MutableSet adds insert and remove,
> > to the common core of methods inherited from Set, such as
> > __contains__ and __iter__.
>
> Reasonable.
	...
> This would run into similar problems as the PEP's auto-freeze approach
> when using "s1 in s2".  If s1 is a mutable set, this creates an
> immutable copy for the test and then throws it away.  The PEP's
> problem is that it's too easy to accidentally freeze a set; the
> problem with your proposal is "merely" one of performance.  Yet I
> think both are undesirable, although I still prefer your solution.

If performance is a problem (and I can well see it might be!) then
Set.__contains__(self, x) needs to use a specialized version of
the ad-hoc adaptation code I proposed for insertion:

>     def insert(self, item):
>         try: item = item.asSetItem()
>         except AttributeError: pass
>         self.data[item] = True

One possible route to such optimization is to introduce another class, called 
_TemporarilyImmutableSet, able to wrap a MutableSet x, have the same hash 
value that x would have if x were immutable, and compare == to whatever x
compares == to.

Set would then expose a private method _asTemporarilyImmutable.
ImmutableSet._asTemporarilyImmutable would just return self;
MutableSet._asTemporarilyImmutable would return _TemporarilyImmutableSet(self).

Then:

    class Set(object):
        ...
        def __contains__(self, item):
            try: item = item._asTemporarilyImmutable()
            except AttributeError: pass
            return item in self.data
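
Fleshed out into something runnable, the wrapper could look roughly like
this.  The shared `_hash_elements` helper, the dict-backed `data` attribute
and the use of NotImplemented in __eq__ are assumptions of this sketch:

```python
def _hash_elements(elements):
    # Order-independent hash shared by ImmutableSet and the wrapper,
    # so that both hash alike for equal contents.
    result = 0
    for item in elements:
        result ^= hash(item)
    return result


class Set(object):
    __hash__ = None  # unhashable unless a subclass says otherwise

    def __init__(self, iterable=()):
        self.data = {}
        for item in iterable:
            self.data[item] = True

    def __iter__(self):
        return iter(self.data)

    def __eq__(self, other):
        if not isinstance(other, Set):
            # Let the other operand (e.g. the wrapper) decide.
            return NotImplemented
        return self.data == other.data

    def __contains__(self, item):
        try:
            item = item._asTemporarilyImmutable()
        except AttributeError:
            pass
        return item in self.data


class ImmutableSet(Set):
    def __hash__(self):
        return _hash_elements(self.data)

    def _asTemporarilyImmutable(self):
        return self


class _TemporarilyImmutableSet(object):
    """Hashable view of a mutable set; valid only while it is not mutated."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __hash__(self):
        return _hash_elements(self.wrapped.data)

    def __eq__(self, other):
        return self.wrapped == other


class MutableSet(Set):
    def insert(self, item):
        try:
            item = item.asSetItem()
        except AttributeError:
            pass
        self.data[item] = True

    def asSetItem(self):
        return ImmutableSet(self)

    def _asTemporarilyImmutable(self):
        return _TemporarilyImmutableSet(self)


inner = MutableSet([1, 2])
outer = MutableSet()
outer.insert(inner)
assert inner in outer                 # no ImmutableSet copy is built
assert MutableSet([1]) not in outer
```

The membership test only allocates the small wrapper object; the elements
of the mutable set are hashed but never copied.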


Alex



From guido@python.org  Sat Jul 13 14:56:06 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 09:56:06 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: Your message of "Sat, 13 Jul 2002 15:42:46 +0200."
 <E17TNB9-0008Gp-00@mail.python.org>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net> <E17TIdj-0005XX-00@mail.python.org> <200207131304.g6DD4eE17350@pcp02138704pcs.reston01.va.comcast.net>
 <E17TNB9-0008Gp-00@mail.python.org>
Message-ID: <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net>

> > This would run into similar problems as the PEP's auto-freeze approach
> > when using "s1 in s2".  If s1 is a mutable set, this creates an
> > immutable copy for the test and then throws it away.  The PEP's
> > problem is that it's too easy to accidentally freeze a set; the
> > problem with your proposal is "merely" one of performance.  Yet I
> > think both are undesirable, although I still prefer your solution.
> 
> If performance is a problem (and I can well see it might be!) then
> Set.__contains__(self, x) needs to use a specialized version of
> the ad-hoc adaptation code I proposed for insertion:
> 
> >     def insert(self, item):
> >         try: item = item.asSetItem()
> >         except AttributeError: pass
> >         self.data[item] = True
> 
> One possible route to such optimization is to introduce another
> class, called _TemporarilyImmutableSet, able to wrap a MutableSet x,
> have the same hash value that x would have if x were immutable, and
> compare == to whatever x compares == to.
> 
> Set would then expose a private method _asTemporarilyImmutable.
> ImmutableSet._asTemporarilyImmutable would just return self;
> MutableSet._asTemporarilyImmutable would return
> _TemporarilyImmutableSet(self).
> 
> Then:
> 
>     class Set(object):
>         ...
>         def __contains__(self, item):
>             try: item = item._asTemporarilyImmutable()
>             except AttributeError: pass
>             return item in self.data

Sounds reasonable.  Who's gonna do an implementation?  There's Greg
Wilson's version, and there's an alternative by Aric Coady
<coady@bent-arrow.com> that could be used as a comparison.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Sat Jul 13 15:19:31 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 13 Jul 2002 17:19:31 +0300
Subject: [Python-Dev] Re: Alternative implementation of string interning
In-Reply-To: <200207131334.g6DDYJD17519@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jul 13, 2002 at 09:34:19AM -0400
References: <20020703095915.GA43336@hishome.net> <Pine.LNX.4.44.0207030309320.5227-100000@ziggy> <20020703110735.GA50268@hishome.net> <200207131334.g6DDYJD17519@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020713171931.A5083@hishome.net>

On Sat, Jul 13, 2002 at 09:34:19AM -0400, Guido van Rossum wrote:
> > The warm fuzzy feeling that you have a real symbol type :-)
> 
> Doesn't give me a warm fuzzy feeling at all.  A symbol type is just
> another compiler implementation detail IMO.  Strings are natural to
> designate identifiers.

Making interned strings a type was just idle speculation, don't take it 
too seriously...

	Oren



From aleax@aleax.it  Sat Jul 13 15:58:00 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 13 Jul 2002 16:58:00 +0200
Subject: [Python-Dev] Dict constructor
In-Reply-To: <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net> <E17TNB9-0008Gp-00@mail.python.org> <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17TOLv-000272-00@mail.python.org>

On Saturday 13 July 2002 03:56 pm, Guido van Rossum wrote:
	...
> Sounds reasonable.  Who's gonna do an implementation?  There's Greg
> Wilson's version, and there's an alternative by Aric Coady
> <coady@bent-arrow.com> that could be used as a comparison.

I'm gonna give it a try, unless somebody more qualified volunteers -- Greg's
version's in nondist/sandbox/sets, right?  Where's Aric's?

What should I do with the modified set.py -- submit it as a patch, or ... ?


Alex



From guido@python.org  Sat Jul 13 16:04:32 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 11:04:32 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: Your message of "Sat, 13 Jul 2002 16:58:00 +0200."
 <E17TOLv-000272-00@mail.python.org>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net> <E17TNB9-0008Gp-00@mail.python.org> <200207131356.g6DDu6u17726@pcp02138704pcs.reston01.va.comcast.net>
 <E17TOLv-000272-00@mail.python.org>
Message-ID: <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net>

> > Sounds reasonable.  Who's gonna do an implementation?  There's Greg
> > Wilson's version, and there's an alternative by Aric Coady
> > <coady@bent-arrow.com> that could be used as a comparison.
> 
> I'm gonna give it a try, unless somebody more qualified volunteers -- Greg's
> version's in nondist/sandbox/sets, right?  Where's Aric's?

http://bent-arrow.com/python
> 
> What should I do with the modified set.py -- submit it as a patch, or ... ?

I forget -- do you have SF commit permission?  If so, feel free to
add a competing version to the sandbox.  Otherwise, a SF submission
would be good (and post a link to python-dev when you upload it).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Sat Jul 13 17:07:26 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 13 Jul 2002 18:07:26 +0200
Subject: [Python-Dev] Dict constructor
In-Reply-To: <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net> <E17TOLv-000272-00@mail.python.org> <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17TPR8-0007bC-00@mail.python.org>

On Saturday 13 July 2002 05:04 pm, Guido van Rossum wrote:
	...
> > Greg's version's in nondist/sandbox/sets, right?  Where's Aric's?
>
> http://bent-arrow.com/python

Ah, a C implementation.  It seems premature to me to consider such
optimization -- for now, it appears, we're still looking around for the
right architecture, and that's much more plastic and faster to
experiment with in Python.  So, I have not studied set.c in detail,
just browsed the readme to get an idea of the interface -- and that
seems even more peculiar to me than freeze-on-hashing, although
generally similar.  So, for now, I've stuck to Python, and I think it
will be time to move to C once the Python-level part appears good.


> > What should I do with the modified set.py -- submit it as a patch, or ...
> > ?
>
> I forget -- do you have SF commit permission?  If so, feel free to

Nope -- I may be the only PSF member without commit permission, I
suspect.

> add a competing version to the sandbox.  Otherwise, a SF submission
> would be good (and post a link to python-dev when you upload it).

Done -- it's patch 580995 (not sure how that translates to an URL --
the tracker's resulting URL is quite complicated:-).


Alex



From fredrik@pythonware.com  Sat Jul 13 17:15:35 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 13 Jul 2002 18:15:35 +0200
Subject: [Python-Dev] Dict constructor
References: <LNBBLJKPBEHFEDALKOLCMEFHAEAB.tim.one@comcast.net> <E17TOLv-000272-00@mail.python.org> <200207131504.g6DF4Xl18048@pcp02138704pcs.reston01.va.comcast.net> <E17TPR8-0007bC-00@mail.python.org>
Message-ID: <001501c22a88$87d53150$ced241d5@hagrid>

> Done -- it's patch 580995 (not sure how that translates to an URL --
> the tracker's resulting URL is quite complicated:-).

just prepend http://python.org/sf/ to the patch/bug identifier.
the rest is magic (or perhaps barry dealing with 404 log entries
in real time):

    http://python.org/sf/580995

</F>




From martin@v.loewis.de  Sat Jul 13 18:29:42 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 13 Jul 2002 19:29:42 +0200
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To: <200207131227.g6DCRkk17108@pcp02138704pcs.reston01.va.comcast.net>
References: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
 <200207131227.g6DCRkk17108@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m38z4f35a1.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> > Following a recent discussion of the introduction of new platforms
> > (AtheOS in this case), I've written a PEP on removing support for
> > platforms that nobody is interested in.
> 
> Did you post this to python-list too?

Not yet. I'll post it on python-list when I get no more comments here,
then I'll produce a patch to generate the build-time errors for the
unsupported platforms.

Regards,
Martin




From mal@lemburg.com  Sat Jul 13 18:58:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 13 Jul 2002 19:58:38 +0200
Subject: [Python-Dev] python package
References: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net> <3D2F129C.12231.63AB67B@localhost>
Message-ID: <3D306A4E.5050703@lemburg.com>

Gordon McMillan wrote:
> On 12 Jul 2002 at 16:46, Michael McLay wrote:
> 
> 
>>... it might be reasonable to move all standard
>>distribution names into a single top level namespace
>>and grandfather the existing top level names into
>>the top level namespace for the remainder of
>>the 2.x series. 
> 
> 
> Getting
>  from <toplevelname> import urllib
> and 
>  import urllib
> 
> to return the same (is, not equals) object will
> require very delicate surgery on some very difficult
> code. And without it, most non-trivial scripts will
> break in very mysterious ways.

Not really. The following code does all it takes to
make this work for e.g. having 'import DateTime'
and 'from mx import DateTime' provide the same
symbols:

# Redirect all imports to the corresponding mx package
def _redirect(mx_subpackage):
     global __path__
     import os,mx
     __path__ = [os.path.join(mx.__path__[0],mx_subpackage)]
_redirect('DateTime')

# Now load all important symbols
from mx.DateTime import *
from mx.DateTime import __version__,_DT,_DTD

The module objects would be different, but that's just
about it.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From gmcm@hypernet.com  Sat Jul 13 19:21:07 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 13 Jul 2002 14:21:07 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D306A4E.5050703@lemburg.com>
Message-ID: <3D303753.27392.AB222A8@localhost>

On 13 Jul 2002 at 19:58, M.-A. Lemburg wrote:

> Gordon McMillan wrote:

> > Getting
> >  from <toplevelname> import urllib
> > and 
> >  import urllib
> > 
> > to return the same (is, not equals) object will
> > require very delicate surgery on some very difficult
> > code. And without it, most non-trivial scripts will
> > break in very mysterious ways.
> 
> Not really. The following code does all it takes to
> make this work for e.g. having 'import DateTime' and
> 'from mx import DateTime' provide the same symbols:

[snip hackery]

> The module objects would be different, but that's
> just about it. 

Which was exactly my point. Much code that does
*not* use "from ... import ..." in fact relies on
having the same module object.

-- Gordon
http://www.mcmillan-inc.com/




From mal@lemburg.com  Sat Jul 13 20:07:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 13 Jul 2002 21:07:05 +0200
Subject: [Python-Dev] python package
References: <3D303753.27392.AB222A8@localhost>
Message-ID: <3D307A59.5040707@lemburg.com>

Gordon McMillan wrote:
> On 13 Jul 2002 at 19:58, M.-A. Lemburg wrote:
> 
> 
>>Gordon McMillan wrote:
> 
> 
>>>Getting
>>> from <toplevelname> import urllib
>>>and 
>>> import urllib
>>>
>>>to return the same (is, not equals) object will
>>>require very delicate surgery on some very difficult
>>>code. And without it, most non-trivial scripts will
>>>break in very mysterious ways.
>>
>>Not really. The following code does all it takes to
>>make this work for e.g. having 'import DateTime' and
>>'from mx import DateTime' provide the same symbols:
> 
> 
> [snip hackery]
> 
> 
>>The module objects would be different, but that's
>>just about it. 
> 
> 
> Which was exactly my point. Much code that does
> *not* use "from ... import ..." in fact relies on
> having the same module object.

You mean for e.g. hacking the module's globals ? To solve
that, you'd probably need to manipulate sys.modules as
well... I'm just not sure whether this is possible from
within the module implementing the redirection.

Hmm, running this:

testmodload.py:
import sys, os
sys.modules['testmodload'] = os
print 'worked'


Python 2.1.3 (#1, May 16 2002, 18:59:26)
 >>> import testmodload
worked
 >>> testmodload
<module 'os' from '/usr/local/lib/python2.1/os.pyc'>
 >>>

Looks like this is possible, so you probably don't even
need the 'from mx.DateTime import *' in the code I posted.
A simple 'sys.modules['DateTime'] = mx.DateTime' would
give you an even better solution.
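
The sys.modules aliasing idea can be sketched in current Python as follows.
This is only an illustration under stated assumptions: the module names
"pkg_datetime_demo" and "DateTimeAliasDemo" are hypothetical stand-ins for
mx.DateTime and DateTime, and the module is built in memory rather than
loaded from disk.

```python
import sys
import types

# Hypothetical stand-in for a real package module such as mx.DateTime.
real = types.ModuleType("pkg_datetime_demo")
real.counter = 0

# Register the very same module object under two names.
sys.modules["pkg_datetime_demo"] = real
sys.modules["DateTimeAliasDemo"] = real

# Both import statements are satisfied straight from sys.modules.
import pkg_datetime_demo
import DateTimeAliasDemo

same_object = pkg_datetime_demo is DateTimeAliasDemo

# State changed through one name is visible through the other.
DateTimeAliasDemo.counter += 1
shared_state = pkg_datetime_demo.counter
```

Because both names resolve to one object, module-level state is shared,
which the "from ... import *" approach cannot provide.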

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Sat Jul 13 20:18:36 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 15:18:36 -0400
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
Message-ID: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>

There's a full implementation for PEP 263.  Martin von Loewis is ready
to commit it.  It's of course possible to let him do this and deal
with the consequences once they're in CVS, but I'd like to see if there's
anyone who'd like to review the code before it goes in.  The patch is
at http://python.org/sf/534304.  I like the PEP fine, I just don't
have time to review the patch, and I'm not sure that review by just
Martin and Hisao (the original patch author) is enough.  If nobody
comes forward, Martin will commit it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Sat Jul 13 20:24:55 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 15:24:55 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <9aZX8.47377$n4.11526798@newsc.telia.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHAAEAB.tim.one@comcast.net>

A post on c.l.py raises an interesting issue, here illustrated in an
all-Python example:

"""
class C:
    def __init__(self):
        self.i = 0
    def get(self):
        self.i += 1
        if self.i <= 5:
            return self.i
        self.i = 1 / 0

x = iter(C().get, 5)
try:
    while 1:
        print x.next()
except StopIteration:
    pass
print x.next()
"""

That prints 1 thru 4, then dies with a ZeroDivisionError.  This is because
two-arg iter works as documented <wink>:

    The iterator created in this case [iter(o, sentinel)] will call o
    with no arguments for each call to its next() method; if the value
    returned is equal to sentinel, StopIteration will be raised, otherwise
    the value will be returned.

The question is whether this is intentional:  for all other iterators Python
supplies, StopIteration is a "sink state":  once an iterator raises
StopIteration, calling its next() method any number of times again will just
continue raising StopIteration.  Python's calliterobject doesn't arrange for
that, though.

PEP 234 doesn't explicitly say what happens if next() is called after
StopIteration has been raised, although it clearly has in mind a model where
iteration eventually "ends".

The use case from which two-arg iter() got generalized was

    iter(file.readline, "")

and in that case file.readline returns "" forever after hitting EOF the
first time.  So in this specific case, StopIteration acts like a sink too,
but for a reason that sheds no light on the question at hand.

The base question:  does the iteration protocol define what happens if an
iterator's next() method is called after the iterator has raised
StopIteration?  Or is that left up to the discretion of the iterator?

If the answer is that it's the iterator's choice, is 2-argument iter()
making the best choice?  The rub here is that 2-arg iter was (IMO)
introduced to help iteration-ignorant callables fit into the iteration
protocol, and *because* they're iteration-ignorant they may do something
foolish if called again after their "sentinel" value is seen.
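
The documented behavior quoted above can be modelled in a few lines of
current Python.  This is a sketch, not the C implementation: the class
StatelessCallIter and the Counter helper are hypothetical names, and the
model deliberately keeps no "finished" flag.

```python
class StatelessCallIter:
    """Model of iter(callable, sentinel) that keeps no 'finished' flag."""

    def __init__(self, func, sentinel):
        self.func = func
        self.sentinel = sentinel

    def __iter__(self):
        return self

    def __next__(self):
        # Every call invokes the callable, even after a previous stop.
        value = self.func()
        if value == self.sentinel:
            raise StopIteration
        return value


class Counter:
    """Iteration-ignorant callable that misbehaves past its sentinel."""

    def __init__(self):
        self.i = 0

    def get(self):
        self.i += 1
        if self.i <= 5:
            return self.i
        raise ZeroDivisionError("called again after the sentinel")


it = StatelessCallIter(Counter().get, 5)
seen = list(it)  # stops when get() returns the sentinel 5

try:
    next(it)
    outcome = "got a value"
except StopIteration:
    outcome = "sink state"
except ZeroDivisionError:
    outcome = "callable was invoked again"
```

With no state bit, the second round of next() reaches the callable again,
which is the situation the question is about.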




From guido@python.org  Sat Jul 13 20:37:06 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 15:37:06 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sat, 13 Jul 2002 15:24:55 EDT."
 <LNBBLJKPBEHFEDALKOLCGEHAAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCGEHAAEAB.tim.one@comcast.net>
Message-ID: <200207131937.g6DJb6K18523@pcp02138704pcs.reston01.va.comcast.net>

> The base question:  does the iteration protocol define what happens if an
> iterator's next() method is called after the iterator has raised
> StopIteration?  Or is that left up to the discretion of the iterator?

The latter.  Believe it or not, I thought about this during the design
of the protocol, and decided that if someone wanted to create an
iterator that could somehow continue after raising StopIteration, that
should be their problem.  Basically, the effect of calling next()
after StopIteration is raised is undefined.

> If the answer is that it's the iterator's choice, is 2-argument iter()
> making the best choice?  The rub here is that 2-arg iter was (IMO)
> introduced to help iteration-ignorant callables fit into the iteration
> protocol, and *because* they're iteration-ignorant they may do something
> foolish if called again after their "sentinel" value is seen.

If the caller stops calling next(), nothing's wrong.  I don't think
the callable-iterator object should grow another state bit.

But I'm willing to be convinced by information you withheld.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik@pythonware.com  Sat Jul 13 20:31:53 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 13 Jul 2002 21:31:53 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <018901c22aa4$e0f41190$ced241d5@hagrid>

guido wrote:


> There's a full implementation for PEP 263.  Martin von Loewis is ready
> to commit it.  It's of course possible to let him do this and deal
> with the consequences once they're in CVS, but I'd like to see if there's
> anyone who'd like to review the code before it goes in.  The patch is
> at http://python.org/sf/534304.  I like the PEP fine, I just don't
> have time to review the patch

hmm.  I'm tempted to think that there's a major
flaw in the PEP, caused by the fact that

    compile(unicode(script, extract_encoding(script)))

will, from what I can tell, not compile to the same
thing as:

    compile(script)

but I've had too many holy [gr]ails [1] tonight to
be sure if that's really a flaw at all...

</F>

1) see http://www.blacksheepbrewery.com/




From fredrik@pythonware.com  Sat Jul 13 20:38:31 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sat, 13 Jul 2002 21:38:31 +0200
Subject: [Python-Dev] Termination of two-arg iter()
References: <LNBBLJKPBEHFEDALKOLCGEHAAEAB.tim.one@comcast.net>
Message-ID: <018a01c22aa4$e1344ee0$ced241d5@hagrid>

tim wrote:

> The question is whether this is intentional:  for all other iterators Python
> supplies, StopIteration is a "sink state":  once an iterator raises
> StopIteration, calling its next() method any number of times again will just
> continue raising StopIteration.

except SRE's finditer method, that is (also reported on c.l.python)

> Or is that left up to the discretion of the iterator?

if you don't know, it probably is undefined, which means that SRE's
finditer does the best thing possible: accept a few mistakes, and then
punish the poor fool who cannot follow instructions. (but to be nice,
cut them a bit more slack if they're too cheap to buy a real operating
system ;-)

</F>




From tim.one@comcast.net  Sat Jul 13 20:50:18 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 15:50:18 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207131937.g6DJb6K18523@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net>

[Tim]
>> The base question:  does the iteration protocol define what
>> happens if an iterator's next() method is called after the iterator
>> has raised StopIteration?  Or is that left up to the discretion of the
>> iterator?

[Guido]
> The latter.  Believe it or not, I thought about this during the design
> of the protocol, and decided that if someone wanted to create an
> iterator that could somehow continue after raising StopIteration, that
> should be their problem.

I believe it -- I even vaguely recall discussions about it.  Unfortunately,
they don't seem to be recorded anywhere I can find now.

> Basically, the effect of calling next() after StopIteration is raised is
> undefined.

That's consistent with a lawyer's reading of the relevant PEP <wink>.

>> If the answer is that it's the iterator's choice, is 2-argument iter()
>> making the best choice?  The rub here is that 2-arg iter was (IMO)
>> introduced to help iteration-ignorant callables fit into the iteration
>> protocol, and *because* they're iteration-ignorant they may do something
>> foolish if called again after their "sentinel" value is seen.

> If the caller stops calling next(), nothing's wrong.  I don't think
> the callable-iterator object should grow another state bit.
>
> But I'm willing to be convinced by information you withheld.

I entered the c.l.py report as a bug against re.  The user provoked re into
hanging via using re's new finditer() method.  The connection to 2-arg iter
is buried in re's C implementation, via PyCallIter_New.  I didn't think it
added anything useful here to spell all that out.

It turns out (and unsurprisingly so with hindsight) that re can be provoked
into the same bad behavior without involving the iteration protocol at all,
so in this case I think finditer() just made it easier to expose a flaw that
was present regardless.  I'm happy to leave this be:  the docs match the
implementation, I'm sure *someone* relies on that by now, and the behavior is
easy to explain as-is.




From mal@lemburg.com  Sat Jul 13 21:25:23 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 13 Jul 2002 22:25:23 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <018901c22aa4$e0f41190$ced241d5@hagrid>
Message-ID: <3D308CB3.4000302@lemburg.com>

Fredrik Lundh wrote:
> guido wrote:
> 
> 
> 
>>There's a full implementation for PEP 263.  Martin von Loewis is ready
>>to commit it.  It's of course possible to let him do this and deal
>>with the consequences once they're in CVS, but I'd like to see if there's
>>anyone who'd like to review the code before it goes in.  The patch is
>>at http://python.org/sf/534304.  I like the PEP fine, I just don't
>>have time to review the patch
> 
> 
> hmm.  I'm tempted to think that there's a major
> flaw in the PEP, caused by the fact that
> 
>     compile(unicode(script, extract_encoding(script)))
> 
> will, from what I can tell, not compile to the same
> thing as:
> 
>     compile(script)
> 
> but I've had too many holy [gr]ails [1] tonight to
> be sure if that's really a flaw at all...

Right.

The implementation is not a full implementation
of what is defined as step 2 in the PEP. However, I
don't think that we're that far away from that: all that's
needed is to encode a Unicode argument to compile()
to UTF-8 and then either prepend it with a BOM mark or
a coding spec before passing it to the compiler.
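
The encode-then-mark idea can be tried out in current Python, where
compile() accepts bytes and honors a BOM or a coding comment.  This is a
sketch of the general technique only, not of the patch under review; the
helper name compile_text_as_utf8 is hypothetical.

```python
import codecs


def compile_text_as_utf8(text, filename="<string>"):
    """Encode text to UTF-8, prepend a BOM, and hand the bytes to compile()."""
    data = codecs.BOM_UTF8 + text.encode("utf-8")
    return compile(data, filename, "exec")


# Source text containing a non-ASCII string literal.
source = "s = '\u00e9'\n"

ns_bom = {}
exec(compile_text_as_utf8(source), ns_bom)

# The alternative marker: a coding comment in front of Latin-1 bytes.
latin1_bytes = b"# -*- coding: latin-1 -*-\n" + source.encode("latin-1")
ns_cookie = {}
exec(compile(latin1_bytes, "<string>", "exec"), ns_cookie)
```

Both routes end with the same one-character string bound to s, which is
the equivalence the compile() comparison above is asking for.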

Nice would be to add a new tokenizer API which treats
the input as UTF-8 without looking for the coding
comment or BOM at all.

BTW, the approach mentioned in that PEP is no longer needed
(converting the complete tokenizer to using Py_UNICODE
internally).

I think that the only way to give this code enough testing
is by letting Martin check it in and see what happens. Except
for the few XXX and CAUTION marks, the code looks OK.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim.one@comcast.net  Sat Jul 13 21:32:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 16:32:16 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D307A59.5040707@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHEAEAB.tim.one@comcast.net>

[MAL]
> The module objects would be different, but that's just about it.

[Gordon]
> Which was exactly my point.  Much code that does *not* use
> "from ... import ..." in fact relies on having the same module object.

[MAL]
> You mean for e.g. hacking the module's globals ?

If you consider a module maintaining pieces of its own state in its own
globals as an instance of hacking the module's globals, yes, that's the main
problem.  For example (there are many, this isn't stretching), if the user
ends up with two distinct copies of the tempfile module, its  "global"
_tempdir_lock becomes two distinct locks, and the truly global mutual
exclusion _tempdir_lock was supposed to supply is lost.  Ditto for the lock
used internally by tempfile's global _counter object.  The system-wide
uniqueness of some globals is crucial to some modules' correct functioning.
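
The loss of mutual exclusion can be reproduced with a toy module loaded
twice.  This is a sketch under stated assumptions: SOURCE is a hypothetical
miniature of a module that guards a counter with a module-level lock, not
the real tempfile code, and load_copy is a hypothetical helper.

```python
import types

# Hypothetical miniature of a module keeping state in its own globals.
SOURCE = '''
import threading
_lock = threading.Lock()
_counter = 0

def next_name():
    global _counter
    with _lock:
        _counter += 1
        return "tmp%d" % _counter
'''


def load_copy(name):
    """Execute SOURCE into a fresh module object, as a second import would."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


copy_a = load_copy("tempdemo")
copy_b = load_copy("pkg.tempdemo")

# Each copy has its own lock and its own counter.
names = [copy_a.next_name(), copy_b.next_name()]
distinct_locks = copy_a._lock is not copy_b._lock
```

Both copies hand out the same "unique" name, because neither lock nor
counter is global to the process any more.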




From tim.one@comcast.net  Sat Jul 13 21:51:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 16:51:47 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <E17TPR8-0007bC-00@mail.python.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHGAEAB.tim.one@comcast.net>

[Alex Martelli]
> Greg's version's in nondist/sandbox/sets, right?  Where's Aric's?

[Guido]
> http://bent-arrow.com/python

[Alex]
> Ah, a C implementation.  It seems premature to me to consider such
> optimization -- for now, it appears, we're still looking around for the
> right architecture, and that's much more plastic and faster to
> experiment with in Python.  So, I have not studied set.c in detail,

I have, and I'm -1 on it:  it's largely a copy-paste-small-edit of massive
portions of dictobject.c.  If it has to be implemented via massive code
duplication, there are less maintenance-intense ways to do that.

> just browsed the readme to get an idea of the interface -- and that
> seems even more peculiar to me than freeze-on-hashing, although
> generally similar.

freeze-on-hashing was pioneered <wink> in the Python world by kjbuckets.
I've used it in my own Set code for years without particular pain.  Greg
Wilson seemed to hate it, though.

> So, for now, I've stuck to Python,

+1




From gmcm@hypernet.com  Sat Jul 13 23:44:59 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sat, 13 Jul 2002 18:44:59 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D307A59.5040707@lemburg.com>
Message-ID: <3D30752B.9129.BA3B5E1@localhost>

Marc-Andre,

In this thread you have posted:

> python.py:
> __path__ = ['.']

and

> def _redirect(mx_subpackage):
>     global __path__
>     import os,mx
>     __path__ = \
>  [os.path.join(mx.__path__[0],mx_subpackage)]

and

> testmodload.py:
> import sys, os
> sys.modules['testmodload'] = os

None of these will freeze successfully.

Two of them appear to rely on an implementation
detail - that __path__ (only defined for
imp.PKG_DIRECTORY's) will be followed even in
a plain module.

The third is exactly what _xmlplus does, and
consensus appears to be that that was a 
mistake.

"Clever" does not mean "good".

-- Gordon
http://www.mcmillan-inc.com/




From guido@python.org  Sun Jul 14 00:07:36 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 19:07:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sat, 13 Jul 2002 15:50:18 EDT."
 <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net>
Message-ID: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>

> [Tim]
> >> The base question:  does the iteration protocol define what
> >> happens if an iterator's next() method is called after the iterator
> >> has raised StopIteration?  Or is that left up to the discretion of the
> >> iterator?
> 
> [Guido]
> > The latter.  Believe it or not, I thought about this during the design
> > of the protocol, and decided that if someone wanted to create an
> > iterator that could somehow continue after raising StopIteration, that
> > should be their problem.

[Tim]
> I believe it -- I even vaguely recall discussions about it.  Unfortunately,
> they don't seem to be recorded anywhere I can find now.
> 
> > Basically, the effect of calling next() after StopIteration is raised is
> > undefined.
> 
> That's consistent with a lawyer's reading of the relevant PEP <wink>.

Actually, not.  Under "Resolved Issues" the PEP has this:

    - Once a particular iterator object has raised StopIteration, will
      it also raise StopIteration on all subsequent next() calls?
      Some say that it would be useful to require this, others say
      that it is useful to leave this open to individual iterators.
      Note that this may require an additional state bit for some
      iterator implementations (e.g. function-wrapping iterators).

      Resolution: once StopIteration is raised, calling it.next()
      continues to raise StopIteration.

So I misremembered, and Tim didn't read the PEP closely enough. :-)

> I'm happy to leave this be: the docs match the implementation, I'm
> sure *someone* relies on that by now, and the behavior is easy to
> explain as-is.

Hm.  Given what the PEP says, I'm ready to have this fixed (even in
2.2.2).  I can't call code relying on this sane.
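
The "additional state bit" the PEP mentions can be sketched in pure Python.
This is only an illustration of the resolution quoted above, not the
eventual C fix; the class name SinkCallIter and the source() helper are
hypothetical.

```python
class SinkCallIter:
    """Function-wrapping iterator for which StopIteration is a sink state."""

    def __init__(self, func, sentinel):
        self.func = func
        self.sentinel = sentinel
        self.done = False  # the extra state bit

    def __iter__(self):
        return self

    def __next__(self):
        if self.done:
            raise StopIteration
        value = self.func()
        if value == self.sentinel:
            self.done = True
            self.func = None  # never call the wrapped function again
            raise StopIteration
        return value


calls = []


def source():
    """Return 1, 2, 3, ... and record every invocation."""
    calls.append(None)
    return len(calls)


it = SinkCallIter(source, 3)
first_pass = list(it)
after = next(it, "exhausted")
```

Once the sentinel has been seen, further next() calls stop immediately
and the wrapped callable is left alone.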

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jepler@unpythonic.net  Sun Jul 14 01:04:26 2002
From: jepler@unpythonic.net (jepler@unpythonic.net)
Date: Sat, 13 Jul 2002 19:04:26 -0500
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net> <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020713190420.A2256@unpythonic.net>

On Sat, Jul 13, 2002 at 07:07:36PM -0400, Guido van Rossum wrote:
> Actually, not.  Under "Resolved Issues" the PEP has this:
> 
>     - Once a particular iterator object has raised StopIteration, will
>       it also raise StopIteration on all subsequent next() calls?
>       Some say that it would be useful to require this, others say
>       that it is useful to leave this open to individual iterators.
>       Note that this may require an additional state bit for some
>       iterator implementations (e.g. function-wrapping iterators).
> 
>       Resolution: once StopIteration is raised, calling it.next()
>       continues to raise StopIteration.
> 
> So I misremembered, and Tim didn't read the PEP closely enough. :-)
> 
> > I'm happy to leave this be: the docs match the implementation, I'm
> > sure *someone* relies on that by now, and the behavior is easy to
> > explain as-is.
> 
> Hm.  Given what the PEP says, I'm ready to have this fixed (even in
> 2.2.2).  I can't call code relying on this sane.

What about this example?
>>> l = []
>>> li = iter(l)
>>> li.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>> l.extend([1, 2, 3])
>>> li.next()
1

does the list iterator violate the proposed behavior?

Jeff



From guido@python.org  Sun Jul 14 01:42:14 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jul 2002 20:42:14 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sat, 13 Jul 2002 19:04:26 CDT."
 <20020713190420.A2256@unpythonic.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net> <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
 <20020713190420.A2256@unpythonic.net>
Message-ID: <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net>

> > So I misremembered, and Tim didn't read the PEP closely enough. :-)
> > 
> > > I'm happy to leave this be: the docs match the implementation, I'm
> > > sure *someone* relies on that by now, and the behavior is easy to
> > > explain as-is.
> > 
> > Hm.  Given what the PEP says, I'm ready to have this fixed (even in
> > 2.2.2).  I can't call code relying on this sane.
> 
> What about this example?
> >>> l = []
> >>> li = iter(l)
> >>> li.next()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> StopIteration
> >>> l.extend([1, 2, 3])
> >>> li.next()
> 1
> 
> does the list iterator violate the proposed behavior?

Alternatively, we could change the PEP to make this officially
undefined (or at least up to the iterator used).  I'm not sure which I
like better -- the PEP or reality. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Sun Jul 14 02:49:32 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 13 Jul 2002 21:49:32 -0400
Subject: [Python-Dev] Re: Termination of two-arg iter()
In-Reply-To: <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net>
 <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
 <20020713190420.A2256@unpythonic.net>
 <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqu1n313kj.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> > What about this example?
> > >>> l = []
> > >>> li = iter(l)
> > >>> li.next()
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in ?
> > StopIteration
> > >>> l.extend([1, 2, 3])
> > >>> li.next()
> > 1
> > 
> > does the list iterator violate the proposed behavior?

> Alternatively, we could change the PEP to make this officially
> undefined (or at least up to the iterator used).

If you change the PEP so the behaviour is undefined in the protocol, then,
you will have to separately document the behaviour for all iterators which
are produced by the various means available in standard Python, and people
will have to remember these differences.

Would it be perceived as shocking (or not?) in the example above, having
to produce another iterator "li = iter(l)" before reusing it?  If not,
then I presume regularity and consistency of behaviour should prevail.
Are there other problematic cases from the Python distribution itself?

Maybe the iteration protocol should invite implementors to keep raising
StopIteration forever once a particular iterator instance has raised it,
only for the sake of consistency with all iterators provided by Python
itself, but without making this a hard requirement.  So if, for some strange
application, users want to do otherwise, they could still validly do so.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From mhammond@skippinet.com.au  Sun Jul 14 03:49:18 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Sun, 14 Jul 2002 12:49:18 +1000
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207121834.g6CIYYM13332@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEKAFPAA.mhammond@skippinet.com.au>

> It seems we're still in the same boat.  It would be saner to change
> buffer slices to return buffer objects, except for backward
> compatibility.  I was hoping to hear from someone who uses buffer
> objects and knows that this would break his code.  Scott apparently
> doesn't have this problem with his own code, so his opinion doesn't
> help. :-(

There may be some breakage in the Win32 overlapped IO world.  A common
pattern is:

  buf = allocate_buffer_somehow(size)
  Perform_Async_Read(size)
  #  wait for notification of read completing
  nbytes = Wait_For_Read_Notification()
  data = buf[:nbytes]

Currently, "data" is a string.  Changing this to a buffer object will
presumably break this code once "data" is passed to some other function that
truly requires a string.
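
The same pattern can be written in current Python, where the buffer() type
no longer exists and memoryview is the nearest analogue.  This is a sketch
under stated assumptions: the asynchronous read is simulated by writing
into the view directly, and no Win32 call is involved.

```python
# Pre-allocated byte buffer plus a view that aliases its memory.
buf = bytearray(16)
view = memoryview(buf)

# Stand-in for the OS filling the buffer during an overlapped read.
incoming = b"hello"
view[:len(incoming)] = incoming
nbytes = len(incoming)

# Slicing the view yields another view (no copy) ...
window = view[:nbytes]
# ... so an explicit conversion is needed where real bytes are required.
data = bytes(window)

# The view tracks later changes to the buffer; the copy does not.
buf[0] = ord("J")
first_in_view = window[0]
```

Here the slice is an alias and the copy is explicit, which is the split
between "buffer" and "string" that the pattern above relies on implicitly.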

> Maybe the name 'buffer' suggests false expectations?  It's not a
> buffer, it's an alias for a memory area.

This distinction is a little gray.  In my example, it is truly a buffer -
but also an alias for a memory area.  In my example though, it is *not*
conceptually an alias for memory owned by another object.

> Maybe we should do something stronger, and deprecate the buffer type
> altogether.

Maybe.  However, as you have seen over the years, *something* from all this
mess is a real requirement.  This example of asynch IO is the only example I
have ever used, but IMO, it is a real and reasonable requirement.  My
example *could* have been done with array() (assuming the array module had a
C API exposed which it doesn't/didn't) but that too looks like a square peg
in a round hole - my requirements call for a pre-allocated byte buffer, not
an array.

All that said: if the worst came to the worst, I could ensure that the Win32
extensions are left compatible with the way they are.  All such buffers are
allocated using a function inside one of my modules.  Currently this just
returns a buffer() object, but could be changed to a private object with the
same semantics as the existing buffer() object.  So consider this more a
data point than an attempted veto.

Mark.




From tim.one@comcast.net  Sun Jul 14 04:15:50 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 23:15:50 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIBAEAB.tim.one@comcast.net>

[Guido]
> Actually, not.  Under "Resolved Issues" the PEP has this:
>
>     - Once a particular iterator object has raised StopIteration, will
>       it also raise StopIteration on all subsequent next() calls?
>       Some say that it would be useful to require this, others say
>       that it is useful to leave this open to individual iterators.
>       Note that this may require an additional state bit for some
>       iterator implementations (e.g. function-wrapping iterators).
>
>       Resolution: once StopIteration is raised, calling it.next()
>       continues to raise StopIteration.
>
> So I misremembered, and Tim didn't read the PEP closely enough. :-)

Not so.  I read the PEP *very* closely, twice even.  It's just that both
times, I gave up in boredom a few points above that one <wink>.  I think I
used to know it, though, and made sure StopIteration is a sink state for
generator-iterators because of it.

> ...
> Hm.  Given what the PEP says, I'm ready to have this fixed (even in
> 2.2.2).

Well, the PEP proper just doesn't say.  In a court of Standard Law, I'm
pretty sure the "Resolved Issues" section would be ruled to be in the nature
of a non-normative appendix.  Now that the PSF has some funds, I'm sure we
can buy that decision if need be <wink>.

> I can't call code relying on this sane.

Now that I've seen what it actually does, I think it's kind of cute.  Like

f = file('somefile')
get = iter(f.readline, '\n')

while 1:
    paragraph = list(get)
    if not paragraph:
        break
    # deal with paragraph, a list of lines

The only big problem is that once you hit the end of the file, this hangs in
an infinite loop inside the list() implementation, accumulating an unbounded
number of empty strings.  But that just makes it extra cute.  Cute enough to
be insane, probably.
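
A terminating variant of the paragraph reader can be sketched in current
Python.  This is an illustration only: the generator name paragraphs is
hypothetical, an in-memory io.StringIO stands in for the file, and runs of
consecutive blank lines are collapsed, which the loop above does not do.

```python
import io


def paragraphs(f):
    """Yield lists of lines separated by blank lines, stopping at EOF."""
    while True:
        paragraph = []
        # "" is what readline() returns at end of file.
        for line in iter(f.readline, ""):
            if line == "\n":
                break  # blank line: the paragraph is complete
            paragraph.append(line)
        else:
            # The for loop ran out without a break, so EOF was reached.
            if paragraph:
                yield paragraph
            return
        if paragraph:
            yield paragraph


text = io.StringIO("a\nb\n\nc\n")
result = list(paragraphs(text))
```

Using "" as the sentinel and testing for "\n" separately keeps the end of
the file and the end of a paragraph apart, so the loop cannot spin on EOF.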




From tim.one@comcast.net  Sun Jul 14 04:19:43 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 23:19:43 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020713190420.A2256@unpythonic.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIBAEAB.tim.one@comcast.net>

[jepler@unpythonic.net]
> What about this example?
> >>> l = []
> >>> li = iter(l)
> >>> li.next()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> StopIteration
> >>> l.extend([1, 2, 3])
> >>> li.next()
> 1
>
> does the list iterator violate the proposed behavior?

Oh yes.  OTOH, its current behavior isn't defined well enough anywhere
(short of reading the source code) that raising StopIteration on the second
call today could be called "a bug" either.
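
The resumption in Jeff's transcript is what falls out of an iterator that
remembers only an index into a live list.  The class below is a
hypothetical pure-Python model written for current Python, not the actual
list iterator source.

```python
class IndexIter:
    """Iterator that keeps just a position, with no 'exhausted' flag."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # The length is re-read on every call, so growth is noticed.
        if self.index >= len(self.seq):
            raise StopIteration
        value = self.seq[self.index]
        self.index += 1
        return value


x = [0, 1]
it = IndexIter(x)
first = list(it)    # runs until StopIteration
x.extend([6, 7])
resumed = list(it)  # picks up where the index left off
```

Making StopIteration a sink state here would take the same kind of extra
flag discussed for function-wrapping iterators.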




From tim.one@comcast.net  Sun Jul 14 04:33:21 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 13 Jul 2002 23:33:21 -0400
Subject: [Python-Dev] Re: Termination of two-arg iter()
In-Reply-To: <oqu1n313kj.fsf@titan.progiciels-bpi.ca>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIDAEAB.tim.one@comcast.net>

[François Pinard]
> If you change the PEP so the behaviour is undefined in the protocol,
> then, you will have to separately document the behaviour for all
> iterators which are produced by the various means available in standard
> Python, and people will have to remember these differences.

Not necessarily.  The standard dodge is to say "undefined" and just leave it
at that.  This is a way of saying that the language so strongly discourages
the practice that it refuses to say anything about what happens if you do
it, but that it's not going to stop you if you're determined to do it.  If
you do it anyway, it's at your own risk (as if anything you do is ever done
at someone else's risk <wink>).

> Would it be perceived as shocking (or not?) in the example above, having
> to produce another iterator "li = iter(l)" before reusing it?

Jeff's example was too simple to make "the problem" here clear.  If you get
a new iterator, you'll start over from the beginning of the list.  As is,
you continue where the last next() call left off:

>>> x = range(2)
>>> n = iter(x).next
>>> n()
0
>>> n()
1
>>> n()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>> x.extend([6, 7])
>>> n()
6
>>> n()
7
>>> n()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>>

*Some* code out there may be relying on that, despite that the behavior
violates what the tail end of the PEP says.

thank-god-the-protocol-doesn't-have-three-methods<wink>-ly y'rs  - tim
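
A sketch of what "StopIteration as a sink state" could look like, in modern Python 3 syntax (next(it) rather than it.next()). Both class names are hypothetical. IndexIterator is only an illustrative stand-in that re-checks len() on every call, like the list iterator in the session above; StickyIterator wraps any iterator and keeps raising StopIteration once it has done so.

```python
class IndexIterator:
    """Index-based iterator that re-checks len() on every call (illustrative)."""

    def __init__(self, seq):
        self.seq = seq
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= len(self.seq):
            raise StopIteration
        value = self.seq[self.i]
        self.i += 1
        return value


class StickyIterator:
    """Wrapper that stays exhausted after the first StopIteration."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        if self._it is None:
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            self._it = None  # forget the underlying iterator for good
            raise


x = [0, 1]
revivable = IndexIterator(x)
first_pass = list(revivable)
x.extend([6, 7])
second_pass = list(revivable)      # resumes where it left off

y = [0, 1]
sticky = StickyIterator(IndexIterator(y))
sticky_first = list(sticky)
y.extend([6, 7])
sticky_second = list(sticky)       # stays empty
```

The unwrapped iterator comes back to life after the list grows; the wrapped one does not.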





From tim.one@comcast.net  Sun Jul 14 05:00:04 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 00:00:04 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <0GZ600EDOJAORQ@mtain01.icomcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIDAEAB.tim.one@comcast.net>

[Alex Martelli]
> What about the following compromise: there are two set types,
> ImmutableSet and MutableSet, with a common supertype Set.
> ...

This sounds fine to me, except I'd call them Set (mutable) and something
else <wink>.  I'd also check the code into the library now, so lots of
people can hack on it before it "becomes real".  People just won't play with
branches or sandboxes in sufficient numbers to do any collaborative good.
If Guido hates what it turns into, we can pull it out again.




From tim.one@comcast.net  Sun Jul 14 05:17:00 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 00:17:00 -0400
Subject: [Python-Dev] Dict constructor
In-Reply-To: <200207131234.g6DCYvj17144@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIEAEAB.tim.one@comcast.net>

[Guido]
> IMO it's no odder than disallowing dicts as dict keys: it's a hack
> that allows a much faster implementation.

Except that sets are an extremely well-developed concept apart from Python,
and you can only go a little way using set-based approaches before sets of
sets are a screamingly natural occurrence.  In that respect, sets that can't
contain sets are akin to limiting integer arithmetic to 32 bits (also a hack
that allows a much faster implementation, but screaming speed just isn't
Python's forte -- this line of argument belongs more in Fortran-Dev).

>> That is, like sets of sets in Icon too, this is a notion of inclusion by
>> object identity (although Icon does that on purpose, while the
>> BTree-based set mostly inherits it from that BTrees don't implement any
>> comparison slots).  That's very easy to implement.  It's braindead if
>> you think of sets as collections of values, but that's what taking pain
>> too seriously leads to.

> I don't think it is acceptable to have sets-of-sets but test for
> membership (in that case) by object identity.
>
> If you really think object identity is all that's needed, I suggest we
> stick to disallowing sets of sets; algorithms needing
> sets-of-set-object-identities can use id() on the inner sets.

I called the object identity approach "braindead" for those who think of
sets as collections of values, and I previously identified myself as one of
those suffering the collection-of-values delusion.  You can do the modus
ponens bit from there <wink>.




From oren-py-d@hishome.net  Sun Jul 14 05:31:13 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 14 Jul 2002 00:31:13 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net> <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020714043113.GA53342@hishome.net>

On Sat, Jul 13, 2002 at 07:07:36PM -0400, Guido van Rossum wrote:
> Actually, not.  Under "Resolved Issues" the PEP has this:
> 
>     - Once a particular iterator object has raised StopIteration, will
>       it also raise StopIteration on all subsequent next() calls?
>       Some say that it would be useful to require this, others say
>       that it is useful to leave this open to individual iterators.

At the time this was discussed on the list has anyone considered the
possibility of raising an exception?  Something like 'IteratorExhausted'?

If the current definition is ruled to be 'undefined' then an iterator MAY
raise an exception in this case.  

Iterable objects can often serve as a replacement for lists in many places 
and even passed successfully to a lot of old code that was written before 
the iteration protocol. But an iterable object is not always a suitable 
replacement for a sequence when the code needs to iterate multiple times and 
the object is not re-iterable.  This will fail in a very nonobvious way 
without raising an exception because an exhausted iterator looks just like 
an empty sequence to a for loop.  

I think this kind of error should not pass silently.

Yes, I have been bitten by this. Perhaps this was a result of overzealous 
use of iterators because I was so excited with them, but it's a real 
problem, not some contrived example.

	Oren
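
A small sketch of the failure mode described above, in modern Python 3. The function name variance is just an example of code that makes two passes over its argument. Given a list it works; given a one-shot iterator the second loop sees nothing and the result is silently wrong.

```python
def variance(values):
    """Population variance, computed with two passes over `values`."""
    count = 0
    total = 0.0
    for v in values:              # first pass: mean
        count += 1
        total += v
    mean = total / count

    squares = 0.0
    for v in values:              # second pass: empty if `values` is exhausted
        squares += (v - mean) ** 2
    return squares / count


data = [1, 2, 3, 4]
from_list = variance(data)            # two real passes
from_iterator = variance(iter(data))  # second pass silently does nothing
```

No exception is raised in the second call; the answer is simply 0.0 instead of 1.25.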




From tim.one@comcast.net  Sun Jul 14 05:44:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 00:44:16 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <14fb01c2298c$71a145b0$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEIGAEAB.tim.one@comcast.net>

[David Abrahams]
> Yep, I know about PySequence_Fast(), and we're currently using that.
> However I have a bunch of numerics users who will undoubtedly be working
> with some kind of array from NumPy or something -- they'll be really
> unimpressed with me when PySequence_Fast() copies their huge multi-pass
> sequence without individual Python objects for the elements into a tuple
> with each double expressed as a separate Python float.

Now that you have a concrete use case (to the extent that "some kind of
NumPy array or something" can be called concrete <wink>), have you talked to
the NumPy people about it?  They're very clever about making things run fast
(that's the reason for NumPy's existence), and they may want a different
approach entirely.

averse-to-generalizing-from-0-examples-ly y'rs  - tim




From aleax@aleax.it  Sun Jul 14 07:09:27 2002
From: aleax@aleax.it (Alex Martelli)
Date: Sun, 14 Jul 2002 08:09:27 +0200
Subject: [Python-Dev] Dict constructor
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEIDAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEIDAEAB.tim.one@comcast.net>
Message-ID: <02071408092701.18713@arthur>

On Sunday 14 July 2002 06:00, Tim Peters wrote:
> [Alex Martelli]
>
> > What about the following compromise: there are two set types,
> > ImmutableSet and MutableSet, with a common supertype Set.
> > ...
>
> This sounds fine to me, except I'd call them Set (mutable) and something

Yes, that's what I did in the submission -- Set is the name of the mutable 
one, BaseSet the common base type (meant as abstract).  Please see
http://python.org/sf/580995 -- I'm sure there will be other glitches worth
fixing.


Alex



From martin@v.loewis.de  Sun Jul 14 09:02:15 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jul 2002 10:02:15 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <018901c22aa4$e0f41190$ced241d5@hagrid>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <018901c22aa4$e0f41190$ced241d5@hagrid>
Message-ID: <m37kjyohyw.fsf@mira.informatik.hu-berlin.de>

"Fredrik Lundh" <fredrik@pythonware.com> writes:

> hmm.  I'm tempted to think that there's a major
> flaw in the PEP, caused by the fact that
> 
>     compile(unicode(script, extract_encoding(script)))
> 
> will, from what I can tell, not compile to the same
> thing as:
> 
>     compile(script)

Can you elaborate what you think the difference is? I believe the PEP
is silent on this specific aspect, but I think what should happen is
(in the Unicode case):

- compile will convert the script to UTF-8, which is then tokenized.
- in the process of parsing, the encoding declaration (that presumably
  extract_encoding was looking at as well) is recognized, if any.
- Unicode literals are left as-is; byte string literals are converted
  back to the original encoding.

So if there is an encoding declaration in script, then I cannot see a
difference. If there is none, the PEP does not elaborate what should
happen. Leaving the byte strings as UTF-8 seems safest, since the only
way to get "correct" non-ASCII strings without the encoding comment is
to use the UTF-8 signature.

In any case, this can't cause backwards compatibility
problems. compile accepts Unicode strings today only if they can be
converted to a byte string. In the standard installation, this will
fail today if there is non-ASCII in script. So allowing Unicode in
compile is a pure extension. If its precise meaning is underspecified,
it should be clarified before stage 2 is implemented.

Regards,
Martin
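
For comparison, a sketch of how the two compile() paths behave in today's Python 3, where the question took a different shape because all string literals are Unicode. It uses tokenize.detect_encoding from the standard library in the role the thread calls extract_encoding; the sample source is made up for illustration.

```python
import io
import tokenize

# A script stored as bytes, declaring Latin-1 and containing one non-ASCII byte.
source_bytes = b"# -*- coding: latin-1 -*-\ns = '\xe9'\n"

# Find the declared encoding from the first lines of the source.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)

ns_bytes, ns_text = {}, {}
# Path 1: hand the raw bytes to compile(); it honours the declaration.
exec(compile(source_bytes, "<bytes>", "exec"), ns_bytes)
# Path 2: decode first, then compile the resulting text.
exec(compile(source_bytes.decode(encoding), "<text>", "exec"), ns_text)
```

In this setting both paths produce the same value for s. That says nothing about the byte-string literal case debated here, which was specific to Python 2.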




From drifty@bigfoot.com  Sun Jul 14 09:16:10 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sun, 14 Jul 2002 01:16:10 -0700 (PDT)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020714043113.GA53342@hishome.net>
Message-ID: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>

[Oren Tirosh]

> At the time this was discussed on the list has anyone considered the
> possibility of raising an exception?  Something like 'IteratorExhausted'?
>

I have no idea whether this was discussed before or not, but I personally
don't like the idea of having another exception being raised by iterators.
Without reading the PEP this exact second, my gut response is that
iterators should have a single exception that signals it has reached its
end.  It seems like StopIteration is saying "stop please" and
IteratorExhausted would be like screaming "STOP CALLING .next()!!!".
Either you force them to get the clue the first time or you let them
continue being rude; Python shouldn't need to raise its voice and act like
an over-bearing parent.  If we wanted over-bearing parents we would be
yelling for typing of arguments.  =)

<snip>
> Iterable objects can often serve as a replacement for lists in many places
> and even passed successfully to a lot of old code that was written before
> the iteration protocol. But an iterable object is not always a suitable
> replacement for a sequence when the code needs to iterate multiple times and
> the object is not re-iterable.  This will fail in a very nonobvious way
> without raising an exception because an exhausted iterator looks just like
> an empty sequence to a for loop.
>

After reading this email I felt like not raising StopIteration
continuously was like warning that once you hit \0 in a C char array you
have reached the end of the string, but keep going if you care to.  We all
know that ain't a good idea.  =)

Personally, I say continuously raise StopIteration.  I feel that
StopIteration says the iterator is done, period.  Being able to go beyond
the signalled end seems like it is not a true once-through iterator with
an actual end but starting to seem like a stream.  I thought the point of
putting in something like the sentinel was so that you could force the end
of an iterator and not just have it be a suggestion.

I can also see beginners being bitten by this; if people as experienced as
Oren are getting bitten by this we know some person starting out
definitely will be.  I know I thought that StopIteration was continuously
raised until the emails on this subject started.  When I blindly read
"StopIteration", I don't feel this is a warning that one shouldn't keep
going but a notice that the iterator is done, thanks for coming but please
don't come again (unless restartable iterators are supported =).  In other
words, I feel that StopIteration sounds like a notice that the end has
occurred and you can't do any more, rather than an advisement that you
should stop.

>
> 	Oren
>

-Brett C.




From bsder@mail.allcaps.org  Sun Jul 14 11:23:25 2002
From: bsder@mail.allcaps.org (Andrew P. Lentvorski)
Date: Sun, 14 Jul 2002 03:23:25 -0700 (PDT)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020714023729.Y79323-100000@mail.allcaps.org>

On Sun, 14 Jul 2002, Brett Cannon wrote:

> end.  It seems like StopIteration is saying "stop please" and
> IteratorExhausted would be like screaming "STOP CALLING .next()!!!".

What about raising IndexError by default when someone attempts to call
.next() on an iterator already raising StopIteration?

In the case of a list, StopIteration signals that the iterator is pointing
to just beyond the end of the list.  An attempt to call .next() when
StopIteration is already true is effectively an attempt to dereference
past the end of a list (since .next() normally wants to return a value).

List accesses via an index past list end currently raise an IndexError.
Doing something similar for iterators would seem to keep things
consistent.

-a






From oren-py-d@hishome.net  Sun Jul 14 12:05:13 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 14 Jul 2002 07:05:13 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
References: <20020714043113.GA53342@hishome.net> <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020714110513.GA2280@hishome.net>

On Sun, Jul 14, 2002 at 01:16:10AM -0700, Brett Cannon wrote:
> [Oren Tirosh]
> 
> > At the time this was discussed on the list has anyone considered the
> > possibility of raising an exception?  Something like 'IteratorExhausted'?
> >
> 
> I have no idea whether this was discussed before or not, but I personally
> don't like the idea of having another exception being raised by iterators.
> Without reading the PEP this exact second, my gut response is that
> iterators should have a single exception that signals it has reached its
> end.  It seems like StopIteration is saying "stop please" and
> IteratorExhausted would be like screaming "STOP CALLING .next()!!!".
> Either you force them to get the clue the first time or you let them
> continue being rude; Python shouldn't need to raise its voice and act like
> an over-bearing parent.  If we wanted over-bearing parents we would be
> yelling for typing of arguments.  =)

This anthropomorphic description has too many irrelevant associations.  Let's 
leave the parents out of this.

The logic is simple:  StopIteration is not an error. It's not even a warning,
it's a normal part of program operation. It uses the exception mechanism 
because it is the most convenient form of out-of-band signalling.  The 
hypothetical IteratorExhausted is an error. The fact that both of them
happen to be exceptions is almost a coincidence.

Unlike IndexError, which is sometimes used to bail out of loops, the
IteratorExhausted exception is almost guaranteed to be a programmer error.
And it's an error that would otherwise pass silently and produce strange
results.
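
For context on IndexError being used to end loops: under the older sequence protocol, a for loop over an object that defines only __getitem__ calls it with 0, 1, 2, ... and treats IndexError as the normal end of iteration. A small sketch in modern Python 3 (the class name Squares is illustrative):

```python
class Squares:
    """Sequence-style object with __getitem__ only, no __iter__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            # Signals "past the end"; the for loop swallows it and stops.
            raise IndexError(index)
        return index * index


collected = [value for value in Squares(4)]
```

Here IndexError plays the same out-of-band role that StopIteration plays for iterators, which is why reusing it for "already exhausted" could be ambiguous.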

> definitely will be.  I know I thought that StopIteration was continuously
> raised until the emails on this subject started.  

For most Python iterators it is.  This behavior is OK but it could be changed 
to something stricter.  So far I thought this behavior was mandatory so I
didn't raise this proposal.  Now I learned that officially it is undefined
and that this behavior is just what most Python iterators do so it could be
possible to change it to something safer.

	Oren



From oren-py-d@hishome.net  Sun Jul 14 12:27:45 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 14 Jul 2002 07:27:45 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020714023729.Y79323-100000@mail.allcaps.org>
References: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU> <20020714023729.Y79323-100000@mail.allcaps.org>
Message-ID: <20020714112745.GB2280@hishome.net>

On Sun, Jul 14, 2002 at 03:23:25AM -0700, Andrew P. Lentvorski wrote:
> On Sun, 14 Jul 2002, Brett Cannon wrote:
> 
> > end.  It seems like StopIteration is saying "stop please" and
> > IteratorExhausted would be like screaming "STOP CALLING .next()!!!".
> 
> What about raising IndexError by default when someone attempts to call
> .next() on an iterator already raising StopIteration?

+1

IndexError is probably better than inventing a new exception. The description 
of what actually happened would be in the exception text.

StopIteration means "That was the last item, thank you. Sorry I couldn't tell
you my length in advance -- I'm an iterator and I don't even know it myself."

This type of IndexError would mean "Hey, I told you it was the last item. 
This would have been an out-of-bounds index if I were a sequence".

	Oren



From guido@python.org  Sun Jul 14 14:20:51 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jul 2002 09:20:51 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sun, 14 Jul 2002 07:27:45 EDT."
 <20020714112745.GB2280@hishome.net>
References: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU> <20020714023729.Y79323-100000@mail.allcaps.org>
 <20020714112745.GB2280@hishome.net>
Message-ID: <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net>

> > What about raising IndexError by default when someone attempts to call
> > .next() on an iterator already raising StopIteration?
> 
> +1
> 
> IndexError is probably better than inventing a new exception. The
> description of what actually happened would be in the exception
> text.

-1.  IndexError belongs to sequences.  I don't like the idea of
raising another exception at all -- we should either keep things the
way they are, or continue to raise StopIteration forever once it's
been raised.  Other suggestions don't make sense to me.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Sun Jul 14 14:33:24 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 14 Jul 2002 09:33:24 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020714043113.GA53342@hishome.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net> <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> <20020714043113.GA53342@hishome.net>
Message-ID: <20020714133324.GA17215@hishome.net>

On Sun, Jul 14, 2002 at 12:31:13AM -0400, Oren Tirosh wrote:
> On Sat, Jul 13, 2002 at 07:07:36PM -0400, Guido van Rossum wrote:
> > Actually, not.  Under "Resolved Issues" the PEP has this:
> > 
> >     - Once a particular iterator object has raised StopIteration, will
> >       it also raise StopIteration on all subsequent next() calls?
> >       Some say that it would be useful to require this, others say
> >       that it is useful to leave this open to individual iterators.
> 
> At the time this was discussed on the list has anyone considered the
> possibility of raising an exception?  Something like 'IteratorExhausted'?

An alternative approach would be to raise an exception when calling iter() on 
an exhausted iterator. This is orthogonal to whatever .next() does on such
an iterator.

Suggested implementation: when an iterator raises StopIteration it will
immediately clean up and decref any referenced objects and then alter its 
ob_type field to a special closed iterator type. This is similar to the
way closed files are handled - any attempt to perform I/O on them raises an
IOError. The behavior of this type's tp_iter and tp_iternext is open for 
discussion. This will "fix" the behavior of list iterators, for example, 
that can be revived by extending the list.  It's a matter of interpretation 
whether this is a bug or a feature, though.

	Oren
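
A pure-Python sketch of the idea above, in modern Python 3. It is not the C-level ob_type switch described in the message; it only mimics the visible behaviour with a wrapper. Both names, ClosingIterator and IteratorExhausted, are hypothetical. After exhaustion, next() keeps raising StopIteration while iter() raises an error, much as I/O on a closed file does.

```python
class IteratorExhausted(Exception):
    """Hypothetical error for reusing an iterator that has already finished."""


class ClosingIterator:
    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        # A for loop (or list()) calls iter() first, so reuse is caught here.
        if self._it is None:
            raise IteratorExhausted("iterator already exhausted")
        return self

    def __next__(self):
        if self._it is None:
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            self._it = None   # drop the reference, like closing a file
            raise


closing = ClosingIterator([1, 2])
first_use = list(closing)
try:
    list(closing)
    reuse_error = None
except IteratorExhausted as exc:
    reuse_error = exc
```

The second list() call fails loudly instead of returning an empty list.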



From mal@lemburg.com  Sun Jul 14 15:32:09 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 16:32:09 +0200
Subject: [Python-Dev] python package
References: <3D30752B.9129.BA3B5E1@localhost>
Message-ID: <3D318B69.4080401@lemburg.com>

Gordon McMillan wrote:
> Marc-Andre,
> 
> In this thread you have posted:
> 
> 
>>python.py:
>>__path__ = ['.']
> 
> 
> and
> 
> 
>>def _redirect(mx_subpackage):
>>    global __path__
>>    import os,mx
>>    __path__ = \
>> [os.path.join(mx.__path__[0],mx_subpackage)]
> 
> 
> and
> 
> 
>>testmodload.py:
>>import sys, os
>>sys.modules['testmodload'] = os
> 
> 
> None of these will freeze successfully.

Hmm, then how do you freeze _xmlplus?

> Two of them appear to rely on an implementation
> detail - that __path__ (only defined for
> imp.PKG_DIRECTORY's) will be followed even in
> a plain module.

AFAIK, that's not an implementation detail, but a documented
way of finding out whether a module is a package or not.

> The third is exactly what _xmlplus does, and
> consensus appears to be that that was a 
> mistake.
> 
> "Clever" does not mean "good".

But it works (tm) :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Sun Jul 14 15:32:36 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jul 2002 10:32:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sun, 14 Jul 2002 09:33:24 EDT."
 <20020714133324.GA17215@hishome.net>
References: <LNBBLJKPBEHFEDALKOLCMEHCAEAB.tim.one@comcast.net> <200207132307.g6DN7aA18799@pcp02138704pcs.reston01.va.comcast.net> <20020714043113.GA53342@hishome.net>
 <20020714133324.GA17215@hishome.net>
Message-ID: <200207141432.g6EEWa127865@pcp02138704pcs.reston01.va.comcast.net>

> An alternative approach would be to raise an exception when calling
> iter() on an exhausted iterator. This is orthogonal to whatever
> .next() does on such an iterator.

This just adds more complicated rules to no avail.  iter() on an
iterator should return that iterator itself.  The state of that
iterator is what it is.
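
A short illustration of that rule in modern Python 3: calling iter() on an iterator hands back the same object, with whatever state it already has.

```python
it = iter([1, 2, 3])
first = next(it)      # consume one item

same = iter(it)       # iter() on an iterator returns the iterator itself
rest = list(same)     # continues from the current position, not from the start
```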

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Sun Jul 14 15:44:23 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 14 Jul 2002 10:44:23 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
References: <20020714043113.GA53342@hishome.net> <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
Message-ID: <20020714144423.GA29033@panix.com>

On Sun, Jul 14, 2002, Brett Cannon wrote:
>
> Personally, I say continuously raise StopIteration.  I feel that
> StopIteration says the iterator is done, period.  Being able to go
> beyond the signalled end seems like it is not a true once-through
> iterator with an actual end but starting to seem like a stream.  

+1
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From barry@zope.com  Sun Jul 14 15:58:02 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 14 Jul 2002 10:58:02 -0400
Subject: [Python-Dev] Termination of two-arg iter()
References: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
 <20020714023729.Y79323-100000@mail.allcaps.org>
 <20020714112745.GB2280@hishome.net>
 <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15665.37242.446627.141013@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> -1.  IndexError belongs to sequences.  I don't like the idea
    GvR> of raising another exception at all -- we should either keep
    GvR> things the way they are, or continue to raise StopIteration
    GvR> forever once it's been raised.  Other suggestions don't make
    GvR> sense to me.

I think it would be fine to leave the situation as is
(i.e. undefined).  You can use the PEP to encourage a particular
behavior but I'm not sure it needs to be required ("SHOULD" in RFC
terms, but not "MUST").

-Barry



From oren-py-d@hishome.net  Sun Jul 14 17:06:11 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 14 Jul 2002 12:06:11 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <15665.37242.446627.141013@anthem.wooz.org>
References: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU> <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> <15665.37242.446627.141013@anthem.wooz.org>
Message-ID: <20020714160611.GA25950@hishome.net>

On Sun, Jul 14, 2002 at 10:58:02AM -0400, Barry A. Warsaw wrote:
> 
> >>>>> "GvR" == Guido van Rossum <guido@python.org> writes:
> 
>     GvR> -1.  IndexError belongs to sequences.  I don't like the idea
>     GvR> of raising another exception at all -- we should either keep
>     GvR> things the way they are, or continue to raise StopIteration
>     GvR> forever once it's been raised.  Other suggestions don't make
>     GvR> sense to me.
> 
> I think it would be fine to leave the situation as is
> (i.e. undefined).  You can use the PEP to encourage a particular
> behavior but I'm not sure it needs to be required ("SHOULD" in RFC
> terms, but not "MUST").

I'd like it to stay undefined. The issue is how the iterators of
builtin types should actually behave within this undefined space.

Iterables are very similar to sequences.  A lot of code could use either
one without any changes.  It's precisely because of this similarity that I
hate it when they do behave differently - and don't even report it. Files 
and pipes are very similar too. A lot of code could work with either one 
but if this code tries to seek on a pipe it will get an exception. Just
imagine what would happen if pipes failed silently if you tried to seek 
back to the beginning of the file.

I have much respect for whatever makes or doesn't make sense to Guido but I 
have been using iterators and generator functions extensively (obsessively?) 
for over 8 months now and the current behavior doesn't make sense to me.

I guess the reason I ran into this has to do with my style of interactive 
use of the Python prompt.  I recall a previous command, change the parameters 
of one of the processing stages in the dataflow and repeat the process.  
Then I wonder why I get an empty result - one of the temporary results I 
stored to a variable wasn't re-iterable. Is it too much to expect an
exception?  "Errors should never pass silently."

	Oren




From mal@lemburg.com  Sun Jul 14 17:17:05 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 18:17:05 +0200
Subject: [Python-Dev] python package
References: <LNBBLJKPBEHFEDALKOLCKEHEAEAB.tim.one@comcast.net>
Message-ID: <3D31A401.9000103@lemburg.com>

Tim Peters wrote:
> [MAL]
> 
>>The module objects would be different, but that's just about it.
> 
> 
> [Gordon]
> 
>>Which was exactly my point.  Much code that does *not* use
>>"from ... import ..." in fact relies on having the same module object.
> 
> 
> [MAL]
> 
>>You mean for e.g. hacking the module's globals ?
> 
> 
> If you consider a module maintaining pieces of its own state in its own
> globals as an instance of hacking the module's globals, yes, that's the main
> problem.  For example (there are many, this isn't stretching), if the user
> ends up with two distinct copies of the tempfile module, its  "global"
> _tempdir_lock becomes two distinct locks, and the truly global mutual
> exclusion _tempdir_lock was supposed to supply is lost.  Ditto for the lock
> used internally by tempfile's global _counter object.  The system-wide
> uniqueness of some globals is crucial to some modules' correct functioning.

Very true and that's why there is only one module containing the
actual code. Globals referenced by the code live in that module.
The other module only imports the symbols in the first solution
I posted. The second even avoids this extra step -- there's only
one module (the packaged one) left in sys.modules which is referenced
under two names.

pickles will gladly unpickle using this scheme while a pickle
operation automagically starts using the new packaged name.
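
A sketch of the "one module object, two names" scheme in modern Python 3. The module names demo_pkg_tempfile and demo_tempfile are made up for the demonstration, and the module is built in memory rather than loaded from a file.

```python
import importlib
import sys
import types

# One real module object holding the state.
packaged = types.ModuleType("demo_pkg_tempfile")
packaged.counter = 0

# Register it under the new name and under the old name.
sys.modules["demo_pkg_tempfile"] = packaged
sys.modules["demo_tempfile"] = packaged

old_name = importlib.import_module("demo_tempfile")
new_name = importlib.import_module("demo_pkg_tempfile")

# State changed through one name is visible through the other.
old_name.counter += 1
```

Because both imports return the same object, module-level state such as a lock or counter is not duplicated.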

-- 
Marc-Andre Lemburg




From mal@lemburg.com  Sun Jul 14 17:21:34 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 18:21:34 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>	<018901c22aa4$e0f41190$ced241d5@hagrid> <m37kjyohyw.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D31A50E.4050800@lemburg.com>

Martin v. Loewis wrote:
> "Fredrik Lundh" <fredrik@pythonware.com> writes:
> 
> 
>>hmm.  I'm tempted to think that there's a major
>>flaw in the PEP, caused by the fact that
>>
>>    compile(unicode(script, extract_encoding(script)))
>>
>>will, from what I can tell, not compile to the same
>>thing as:
>>
>>    compile(script)
> 
> 
> Can you elaborate what you think the difference is? I believe the PEP
> is silent on this specific aspect,

It does mention this as part of phase 2.

> but I think what should happen is
> (in the Unicode case):
> 
> - compile will convert the script to UTF-8, which is then tokenized.
> - in the process of parsing, the encoding declaration (that presumably
>   extract_encoding was looking at as well) is recognized, if any.
> - Unicode literals are left as-is; byte string literals are converted
>   back to the original encoding.

Right.

> So if there is an encoding declaration in script, then I cannot see a
> difference. If there is none, the PEP does not elaborate what should
> happen. Leaving the byte strings as UTF-8 seems safest, since the only
> way to get "correct" non-ASCII strings without the encoding comment is
> to use the UTF-8 signature.
> 
> In any case, this can't cause backwards compatibility
> problems. compile accepts Unicode strings today only if they can be
> converted to a byte string. In the standard installation, this will
> fail today if there is non-ASCII in script. So allowing Unicode in
> compile is a pure extension. If its precise meaning is underspecified,
> it should be clarified before stage 2 is implemented.

No need for this. The PEP already mentions it.

-- 
Marc-Andre Lemburg




From martin@v.loewis.de  Sun Jul 14 17:29:20 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jul 2002 18:29:20 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <3D31A50E.4050800@lemburg.com>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <018901c22aa4$e0f41190$ced241d5@hagrid>
 <m37kjyohyw.fsf@mira.informatik.hu-berlin.de>
 <3D31A50E.4050800@lemburg.com>
Message-ID: <m3it3i1den.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> > Can you elaborate what you think the difference is? I believe the PEP
> > is silent on this specific aspect,
> 
> It does mention this as part of phase 2.

All I can find is

<quote>
The builtin compile() API will be enhanced to accept Unicode as input.
</quote>

That leaves the question open what the compile function *does* beyond
merely accepting Unicode strings; it is canonical that it tries to
compile it, as it would with a byte string.

The unspecified aspect is the treatment of byte strings within the
Unicode string. The current compiler treats them "as-is"; this is
clearly not an option. The reasonable options are:

1. convert to byte string using "ascii" encoding,
2. convert to byte string using "utf-8" encoding,
3. convert to byte string using system default encoding,
4. convert to byte string using encoding declared inside the code
   string. If that route is taken, the question is what happens
   if no encoding declaration is found.

> No need for this. The PEP already mentions it.

Can you please quote the precise words in the text of the PEP that
answer the question which of the four options above is taken?

Regards,
Martin
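
A minimal sketch of how the options listed above differ for a single
non-ASCII string literal. It assumes Python 3 (where text.encode() plays
the role of "convert to byte string"), and the helper name encode_literal
is hypothetical. Option 3 is omitted because the system default encoding
varies by installation.

```python
# Python 3 sketch; the thread itself concerns Python 2.x byte strings.
literal = "caf\u00e9"  # text of a string literal found in Unicode source


def encode_literal(text, encoding):
    """Return the bytes for `text`, or None if the codec cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


options = {
    "1. ascii": encode_literal(literal, "ascii"),
    "2. utf-8": encode_literal(literal, "utf-8"),
    # Option 4 with a declared encoding of iso-8859-1 (an assumed example).
    "4. declared": encode_literal(literal, "iso-8859-1"),
}

for name, value in options.items():
    print(name, value)
```

Under option 1 the literal cannot be converted at all, while options 2
and 4 yield byte strings of different lengths for the same source text.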



From mal@lemburg.com  Sun Jul 14 18:02:13 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 19:02:13 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>	<018901c22aa4$e0f41190$ced241d5@hagrid>	<m37kjyohyw.fsf@mira.informatik.hu-berlin.de>	<3D31A50E.4050800@lemburg.com> <m3it3i1den.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D31AE95.6070804@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> 
>>>Can you elaborate what you think the difference is? I believe the PEP
>>>is silent on this specific aspect,
>>
>>It does mention this as part of phase 2.
> 
> 
> All I can find is
> 
> <quote>
> The builtin compile() API will be enhanced to accept Unicode as input.
> </quote>
> 
> That leaves the question open what the compile function *does* beyond
> merely accepting Unicode strings; it is canonical that it tries to
> compile it, as it would with a byte string.

Oh, I thought it would be natural from reading the complete
text:

"""
     2. Change the tokenizer/compiler base string type from char* to
        Py_UNICODE* and apply the encoding to the complete file.

        Source files which fail to decode cause an error to be raised
        during compilation.

        The builtin compile() API will be enhanced to accept Unicode as
        input. 8-bit string input is subject to the standard procedure
        for encoding detection as described above.
"""

Of course, we no longer need to convert the tokenizer to
work on Py_UNICODE, so the updated text should mention
that compile() encodes Unicode input to UTF-8 and then continues
with the usual processing. (Also see my reply to Fredrik).

> The unspecified aspect is the treatment of byte strings within the
> Unicode string. The current compiler treats them "as-is"; this is
> clearly not an option. The reasonable options are:
> 
> 1. convert to byte string using "ascii" encoding,
> 2. convert to byte string using "utf-8" encoding,
> 3. convert to byte string using system default encoding,
> 4. convert to byte string using encoding declared inside the code
>    string. If that route is taken, the question is what happens
>    if no encoding declaration is found.
> 
> 
>>No need for this. The PEP already mentions it.
> 
> 
> Can you please quote the precise words in the text of the PEP that
> answer the question which of the four options above is taken?

Option 2. Ideal would be to have the tokenizer skip the
encoding declaration detection and start directly with the
UTF-8 string (this also solves the problems you'd run into
in case the Unicode source code has a source code encoding
comment).

Is that possible with the implementation?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From gmcm@hypernet.com  Sun Jul 14 18:24:48 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sun, 14 Jul 2002 13:24:48 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D318B69.4080401@lemburg.com>
Message-ID: <3D317BA0.3011.FA4F068@localhost>

On 14 Jul 2002 at 16:32, M.-A. Lemburg wrote:

> Gordon McMillan wrote:

[various cute hacks]

> > None of these will freeze successfully.
> 
> Hmm, then how do you freeze _xmlplus?

Most people whine publicly until someone comes
up with a workaround. Installer has a way of hooking
modules & packages that play games like that, but
if you're using tools/freeze, you'll probably be told 
to overlay xml with _xmlplus.

If the package uses lots of nasty tricks (eg,
pyopengl), the answer is "you don't".

> > Two of them appear to rely on an implementation
> > detail - that __path__ (only defined for
> > imp.PKG_DIRECTORY's) will be followed even in
> > a plain module.
> 
> AFAIK, that's not an implementation detail, but a
> documented way of finding out whether a module is a
> package or not.

Correct. But stuffing a __path__ attribute into
a module does *not* make the module a package.

'''Whenever a submodule of a package is loaded, Python makes sure that the package itself is loaded first, loading its __init__.py file if necessary.'''

and

'''Once loaded, the difference between a package and a module is minimal.'''
 
> But it works (tm) :-)

For a sufficiently short-sighted definition of "work".

-- Gordon
http://www.mcmillan-inc.com/
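
A minimal sketch of the __path__ behaviour discussed above, assuming
Python 3 and importlib. The names fakepkg_demo and sub are hypothetical.
A plain module object is given a __path__ attribute, and the import
machinery then resolves a submodule through it, even though no
__init__.py exists anywhere.

```python
import importlib
import os
import sys
import tempfile
import types

# A directory holding one ordinary module file and no __init__.py.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "sub.py"), "w") as f:
    f.write("VALUE = 42\n")

# A plain module object with a __path__ attribute stuffed into it.
fake = types.ModuleType("fakepkg_demo")
fake.__path__ = [tmpdir]
sys.modules["fakepkg_demo"] = fake

importlib.invalidate_caches()
sub = importlib.import_module("fakepkg_demo.sub")

print(sub.VALUE)
print(hasattr(fake, "__file__"))  # the "package" has no file behind it
```

The import succeeds, which is the sense in which the runtime treats the
module as a package. Tools that look for an __init__.py or a __file__
attribute have nothing to find.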




From mal@lemburg.com  Sun Jul 14 18:53:32 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 19:53:32 +0200
Subject: [Python-Dev] python package
References: <3D317BA0.3011.FA4F068@localhost>
Message-ID: <3D31BA9C.3030309@lemburg.com>

Gordon McMillan wrote:
>>>Two of them appear to rely on an implementation
>>>detail - that __path__ (only defined for
>>>imp.PKG_DIRECTORY's) will be followed even in
>>>a plain module.
>>
>>AFAIK, that's not an implementation detail, but a
>>documented way of finding out whether a module is a
>>package or not.
> 
> 
> Correct. But stuffing a __path__ attribute into
> a module does *not* make the module a package.
> 
> '''Whenever a submodule of a package is loaded, Python makes sure that the package itself is loaded first, loading its __init__.py file if necessary.'''
> 
> and
> 
> '''Once loaded, the difference between a package and a module is minimal.'''

Hmm, I know that Python itself uses __path__ to tell
whether it has a package or not, so I don't see why
a module can't be regarded as package. Moving the
module into a directory of the same name and then
renaming it to __init__.py has the same effect. And
in that case, hacking __path__ is perfectly legal.

>>But it works (tm) :-)
> 
> 
> For a sufficiently short-sighted definition of "work".

You haven't commented on the sys.modules trick yet. This
one doesn't even use the __path__ hackery :-)

DateTime.py:
import sys
import mx.DateTime
sys.modules[__name__] = mx.DateTime

Python 2.1.3 (#1, May 16 2002, 18:59:26)
 >>> import DateTime
 >>> DateTime.now()
<DateTime object for '2002-07-14 19:51:03.34' at 82307a0>
 >>> DateTime
<module 'mx.DateTime' from '/home/lemburg/projects/mx/DateTime/__init__.py'>
 >>> id(DateTime)
135726540
 >>> from mx import DateTime
 >>> id(DateTime)
135726540

See: it's the same module :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From skip@pobox.com  Sun Jul 14 16:50:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 14 Jul 2002 10:50:20 -0500
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises socket.error
In-Reply-To: <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net>
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
 <3D188C5D.D519DD90@3captus.com>
 <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15665.40380.978210.579022@localhost.localdomain>

    >> > I just noticed in the development docs that when a timeout on a
    >> > socket occurs, socket.error is raised.  I rather liked the idea
    >> > that a different exception was raised for timeouts (I used Tim
    >> > O'Malley's timeout_socket module).

    Guido> I'd like to understand the use case better.  Why would you want
    Guido> to make this distinction?

In my application that uses Tim O'Malley's timeout_socket module, I do very
little different in the two cases other than to generate a different message
for the user.

As I mentioned in the subject, it is a minor quibble.  If it's a pain to
modify the code to raise a distinct error, I wouldn't bother.

Skip
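
A minimal sketch of the distinction asked for above, assuming Python 3,
where socket.timeout is a subclass of OSError (the class socket.error
refers to). The more specific handler must come first. A local
socket pair stands in for a real connection.

```python
import socket

a, b = socket.socketpair()
a.settimeout(0.05)
try:
    a.recv(16)  # nothing was sent, so the call times out
    outcome = "data"
except socket.timeout:
    # Timeout: only the message shown to the user differs.
    outcome = "timeout"
except OSError:
    # Any other socket-level failure.
    outcome = "other error"
finally:
    a.close()
    b.close()

print(outcome)
```

Code that catches only the base class keeps working, since a timeout is
still an instance of it.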



From gmcm@hypernet.com  Sun Jul 14 20:19:29 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Sun, 14 Jul 2002 15:19:29 -0400
Subject: [Python-Dev] python package
In-Reply-To: <3D31BA9C.3030309@lemburg.com>
Message-ID: <3D319681.9530.100DED6A@localhost>

On 14 Jul 2002 at 19:53, M.-A. Lemburg wrote:

> Gordon McMillan wrote:

> > ... But stuffing a __path__ attribute into
> > a module does *not* make the module a package.

> Hmm, I know that Python itself uses __path__ to
> tell whether it has a package or not, so I don't see
> why a
> module can't be regarded as package.

If you put on a Richard M. Nixon mask, you might
be mistaken for ("regarded as") Richard M. Nixon.
That doesn't make you Richard M. Nixon.

Stuffing __path__ into a module means that 
*most* of Python's runtime will regard your module as
a package. It doesn't make it a package.

In particular, most introspection tools and most
programmers will not recognize your module as
a package.

> Moving the module into a directory of the same name
> and then renaming it to __init__.py has the same
> effect. And in that case, hacking __path__ is
> perfectly legal. 

Yes, it now *is* a package. One which violates
recommended practice, which is to keep __init__.py
simple, but still a package.
 
> You haven't commented on the sys.modules trick yet.
> This one doesn't even use the __path__ hackery :-)
> 
> DateTime.py:
> import sys
> import mx.DateTime
> sys.modules[__name__] = mx.DateTime
[...]
> See: it's the same module :-)

Anytime x != sys.modules[x].__name__,
someone, sometime will suffer.

Installer and (I believe) py2exe have hooks
so that this gets analyzed properly. The hook
is keyed by "DateTime".

If you really find it intolerable to stick your
users with making a one line change in their
code, you might consider contributing hooks 
to Installer (or patches to py2exe).

Particularly for your non-free packages, since
I'm not going to download those and reverse-engineer 
them.

Or perhaps you could do like Pmw, and
include a "bundle" script.

-- Gordon
http://www.mcmillan-inc.com/
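
A minimal sketch of the sys.modules aliasing trick and the name mismatch
it produces, assuming Python 3. The module names mx_demo.DateTime and
DateTime_demo are hypothetical stand-ins for the real packages.

```python
import importlib
import sys
import types

# Stand-in for the real implementation module.
real = types.ModuleType("mx_demo.DateTime")
real.now = lambda: "2002-07-14"
sys.modules["mx_demo.DateTime"] = real

# The alias: a second sys.modules key pointing at the same object.
sys.modules["DateTime_demo"] = real

alias = importlib.import_module("DateTime_demo")
same_object = alias is real
name_mismatch = alias.__name__ != "DateTime_demo"

print(same_object, name_mismatch, alias.__name__)
```

The alias resolves to the same module object, but its __name__ no longer
matches the key it was imported under, which is the condition the
message above warns about.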




From martin@v.loewis.de  Sun Jul 14 20:31:27 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 14 Jul 2002 21:31:27 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <3D31AE95.6070804@lemburg.com>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <018901c22aa4$e0f41190$ced241d5@hagrid>
 <m37kjyohyw.fsf@mira.informatik.hu-berlin.de>
 <3D31A50E.4050800@lemburg.com>
 <m3it3i1den.fsf@mira.informatik.hu-berlin.de>
 <3D31AE95.6070804@lemburg.com>
Message-ID: <m3sn2myuls.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> Oh, I thought it would be natural from reading the complete
> text:

It still is not natural from reading the text you quote.

> 
> """
>      2. Change the tokenizer/compiler base string type from char* to
>         Py_UNICODE* and apply the encoding to the complete file.

As you say, this is more conveniently done with UTF-8 char*.

>         Source files which fail to decode cause an error to be raised
>         during compilation.

In the case of Unicode strings passed to compile(), this is
irrelevant; the string is already decoded.

>         The builtin compile() API will be enhanced to accept Unicode as
>         input. 8-bit string input is subject to the standard procedure
>         for encoding detection as described above.
> """

That only says that Unicode strings are processed; it still does not
say how string literals appearing in the source code are treated.

> Of course, we no longer need to convert the tokenizer to
> work on Py_UNICODE, so the updated text should mention
> that compile() encodes Unicode input to UTF-8 and then continues
> with the usual processing.

The PEP currently does not say that.

> > 2. convert to byte string using "utf-8" encoding,
[...]
> Option 2. 

I think this contradicts the current wording of the PEP. It says

"5. ... and creating string objects from the Unicode literal data by
first reencoding the UTF-8 data into 8-bit string data using the given
file encoding"

The phrasing "the given file encoding" is a bit lax, but given the
string

u"""
# -*- coding: iso-8859-1 -*-
s = 'some latin-1 text'
"""

I would expect that the encoding "given" is iso-8859-1, not utf-8.
Now, I interpret your message to mean that s will be encoded in
utf-8. Correct?

If so, I think Fredrik is right, and

  compile(unicode(script, extract_encoding(script)))

does indeed do something different than

  compile(script)

as the latter would give the string value assigned to s in its
original encoding, i.e. latin-1.

> Ideal would be to have the tokenizer skip the encoding declaration
> detection and start directly with the UTF-8 string 

"skip the encoding declaration" can't really work; you have to parse
the source code line by line. You can tell the implementation to
ignore the encoding declaration, if desired.

> (this also solves the problems you'd run into in case the Unicode
> source code has a source code encoding comment).

Well, that is precisely the issue that I'm trying to address here. I
still believe that the resulting behaviour is not specified in the PEP
at the moment (which is no big deal, since the current implementation
does not touch compile() at all).

Regards,
Martin
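
A minimal sketch of the discrepancy described above, assuming Python 3.
extract_encoding is the hypothetical helper named in the thread, here
implemented with a regular expression modelled on the one in PEP 263.
Re-encoding the decoded source with the declared encoding restores the
original bytes, whereas re-encoding as UTF-8 changes the bytes of the
literal.

```python
import re

# Pattern modelled on the PEP 263 coding-declaration regex.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def extract_encoding(script, default="ascii"):
    """Return the encoding declared in the first two lines, else `default`."""
    for line in script.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default


script = b"# -*- coding: iso-8859-1 -*-\ns = 'caf\xe9'\n"
encoding = extract_encoding(script)
text = script.decode(encoding)

roundtrip_declared = text.encode(encoding)  # literal bytes preserved
roundtrip_utf8 = text.encode("utf-8")       # literal bytes changed

print(encoding)
print(roundtrip_declared == script, roundtrip_utf8 == script)
```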




From bernie@3captus.com  Fri Jul 12 20:29:17 2002
From: bernie@3captus.com (Bernard Yue)
Date: Fri, 12 Jul 2002 13:29:17 -0600
Subject: [Python-Dev] Minor socket timeout quibble - timeout raises
 socket.error
References: <15639.52525.481846.601961@12-248-8-148.client.attbi.com>
 <3D188C5D.D519DD90@3captus.com> <200207121637.g6CGbAE12463@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D2F2E0D.2C4FD92F@3captus.com>

This is a multi-part message in MIME format.
--------------67BDF2B7F2453AAB6129FCFC
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Guido van Rossum wrote:
> 
> [Skip Montanaro]
> > > I just noticed in the development docs that when a timeout on a socket
> > > occurs, socket.error is raised.  I rather liked the idea that a different
> > > exception was raised for timeouts (I used Tim O'Malley's timeout_socket
> > > module).  Making a TimeoutError exception a subclass of socket.error would
> > > be fine so you can catch it with existing code, but I could see recovering
> > > differently for a timeout as opposed to other possible errors:
> > >
> > >     sock.settimeout(5.0)
> > >     try:
> > >         data = sock.recv(8192)
> > >     except socket.TimeoutError:
> > >         # maybe requeue the request
> > >         ...
> > >     except socket.error, codes:
> > >         # some more drastic solution is needed
> > >         ...
> > >
> 
> [Bernard Yue]
> > +1 on your suggestion.  Anyway, under windows, the current
> > implementation returns incorrect socket.error code for timeout.  I am
> > working on the test suite as well as a fix for problem found.  Once the
> > code is bug free maybe we can put the TimeoutError in.
> >
> > I will leave it to Guido for the approval of the change.  When he comes
> > back from his holiday.
> 
> The way I restructured the code it is impossible to distinguish a
> timeout error from other errors; you simply get the "no data
> available" error from the socket operation.  This is the same error
> you'd get in non-blocking mode.
> 

To distinguish a timeout error, the caller can check s->sock_timeout
when a non-blocking mode error occurs, or internal_select() could simply
return an error code (I guess you must have had a reason for taking it
out in the first place).

> Before I recomplicate the code so that it can raise a separate error
> when the select fails, I'd like to understand the use case better.
> Why would you want to make this distinction?  Requeueing the request
> (as in Skip's example) doesn't make sense IMO: you set the timeout for
> a reason, and that reason is that you want to give up if it takes too
> long.  If you really intend to retry you're better off disabling the
> timeout!
>

How about the following (assume we have socket.setDefaultTimeout()):

    import socket
    import urllib

    socket.setDefaultTimeout(5.0)
    retry = 0
    url = 'some url'

    while retry < 3:
        try:
            file = urllib.urlretrieve(url)
        except socket.TimeoutError:
            if retry == 2:
                print "Server too busy, given up!"
                raise
            else:
                print "Server busy, retry!"
                retry += 1
        else:
            break

MS IIS behaves strangely with HTTP requests.  When the server is very busy,
it will randomly drop some requests without disconnecting the client. 
So the best approach for the client is to timeout and retry.  I guess
that might be the reason why people needed timeoutsocket in the first
place.

> If you really want to, you can already distinguish the timeout case,
> because you get an EAGAIN error then (maybe something else on Windows
> -- Bernard, if you have a fix for that, please send it to me).
>

I am struggling with the test case for the new socket code.  The timeout
test case I've sent you works with the old socketmodule.c (attached),
but not with the latest version (on Linux or Windows).  It's strange;
your new implementation looks much cleaner.

Please bear with me a bit longer for a patch  :.(

> So a -0 unless more evidence is brought forward.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)


Bernie
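
A minimal sketch of the timeout-and-retry pattern described above,
assuming Python 3 and its socket.timeout exception. The helper
retry_on_timeout and the stand-in flaky_fetch are hypothetical; a real
caller would pass a function that performs the network request.

```python
import socket


def retry_on_timeout(operation, attempts=3):
    """Call `operation`, retrying only when it raises socket.timeout."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except socket.timeout:
            if attempt == attempts:
                raise  # give up after the final attempt


calls = []


def flaky_fetch():
    """Stand-in for a request that the server drops twice before answering."""
    calls.append(1)
    if len(calls) < 3:
        raise socket.timeout("timed out")
    return "payload"


result = retry_on_timeout(flaky_fetch)
print(result, len(calls))
```

Errors other than a timeout propagate immediately, so only the dropped
request case is retried.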
--------------67BDF2B7F2453AAB6129FCFC
Content-Type: application/vnd.lotus-organizer;
 name="socketmodule.c.org"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="socketmodule.c.org"

LyogU29ja2V0IG1vZHVsZSAqLwoKLyoKClRoaXMgbW9kdWxlIHByb3ZpZGVzIGFuIGludGVy
ZmFjZSB0byBCZXJrZWxleSBzb2NrZXQgSVBDLgoKTGltaXRhdGlvbnM6CgotIE9ubHkgQUZf
SU5FVCwgQUZfSU5FVDYgYW5kIEFGX1VOSVggYWRkcmVzcyBmYW1pbGllcyBhcmUgc3VwcG9y
dGVkIGluIGEKICBwb3J0YWJsZSBtYW5uZXIsIHRob3VnaCBBRl9QQUNLRVQgaXMgc3VwcG9y
dGVkIHVuZGVyIExpbnV4LgotIE5vIHJlYWQvd3JpdGUgb3BlcmF0aW9ucyAodXNlIHNlbmRh
bGwvcmVjdiBvciBtYWtlZmlsZSBpbnN0ZWFkKS4KLSBBZGRpdGlvbmFsIHJlc3RyaWN0aW9u
cyBhcHBseSBvbiBzb21lIG5vbi1Vbml4IHBsYXRmb3JtcyAoY29tcGVuc2F0ZWQKICBmb3Ig
Ynkgc29ja2V0LnB5KS4KCk1vZHVsZSBpbnRlcmZhY2U6CgotIHNvY2tldC5lcnJvcjogZXhj
ZXB0aW9uIHJhaXNlZCBmb3Igc29ja2V0IHNwZWNpZmljIGVycm9ycwotIHNvY2tldC5nYWll
cnJvcjogZXhjZXB0aW9uIHJhaXNlZCBmb3IgZ2V0YWRkcmluZm8vZ2V0bmFtZWluZm8gZXJy
b3JzLAoJYSBzdWJjbGFzcyBvZiBzb2NrZXQuZXJyb3IKLSBzb2NrZXQuaGVycm9yOiBleGNl
cHRpb24gcmFpc2VkIGZvciBnZXRob3N0YnkqIGVycm9ycywKCWEgc3ViY2xhc3Mgb2Ygc29j
a2V0LmVycm9yCi0gc29ja2V0LmdldGhvc3RieW5hbWUoaG9zdG5hbWUpIC0tPiBob3N0IElQ
IGFkZHJlc3MgKHN0cmluZzogJ2RkLmRkLmRkLmRkJykKLSBzb2NrZXQuZ2V0aG9zdGJ5YWRk
cihJUCBhZGRyZXNzKSAtLT4gKGhvc3RuYW1lLCBbYWxpYXMsIC4uLl0sIFtJUCBhZGRyLCAu
Li5dKQotIHNvY2tldC5nZXRob3N0bmFtZSgpIC0tPiBob3N0IG5hbWUgKHN0cmluZzogJ3Nw
YW0nIG9yICdzcGFtLmRvbWFpbi5jb20nKQotIHNvY2tldC5nZXRwcm90b2J5bmFtZShwcm90
b2NvbG5hbWUpIC0tPiBwcm90b2NvbCBudW1iZXIKLSBzb2NrZXQuZ2V0c2VydmJ5bmFtZShz
ZXJ2aWNlbmFtZSwgcHJvdG9jb2xuYW1lKSAtLT4gcG9ydCBudW1iZXIKLSBzb2NrZXQuc29j
a2V0KGZhbWlseSwgdHlwZSBbLCBwcm90b10pIC0tPiBuZXcgc29ja2V0IG9iamVjdAotIHNv
Y2tldC5udG9ocygxNiBiaXQgdmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5u
dG9obCgzMiBiaXQgdmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5odG9ucygx
NiBiaXQgdmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5odG9ubCgzMiBiaXQg
dmFsdWUpIC0tPiBuZXcgaW50IG9iamVjdAotIHNvY2tldC5nZXRhZGRyaW5mbyhob3N0LCBw
b3J0IFssIGZhbWlseSwgc29ja3R5cGUsIHByb3RvLCBmbGFnc10pCgktLT4gTGlzdCBvZiAo
ZmFtaWx5LCBzb2NrdHlwZSwgcHJvdG8sIGNhbm9ubmFtZSwgc29ja2FkZHIpCi0gc29ja2V0
LmdldG5hbWVpbmZvKHNvY2thZGRyLCBmbGFncykgLS0+IChob3N0LCBwb3J0KQotIHNvY2tl
dC5BRl9JTkVULCBzb2NrZXQuU09DS19TVFJFQU0sIGV0Yy46IGNvbnN0YW50cyBmcm9tIDxz
b2NrZXQuaD4KLSBzb2NrZXQuaW5ldF9hdG9uKElQIGFkZHJlc3MpIC0+IDMyLWJpdCBwYWNr
ZWQgSVAgcmVwcmVzZW50YXRpb24KLSBzb2NrZXQuaW5ldF9udG9hKHBhY2tlZCBJUCkgLT4g
SVAgYWRkcmVzcyBzdHJpbmcKLSBhbiBJbnRlcm5ldCBzb2NrZXQgYWRkcmVzcyBpcyBhIHBh
aXIgKGhvc3RuYW1lLCBwb3J0KQogIHdoZXJlIGhvc3RuYW1lIGNhbiBiZSBhbnl0aGluZyBy
ZWNvZ25pemVkIGJ5IGdldGhvc3RieW5hbWUoKQogIChpbmNsdWRpbmcgdGhlIGRkLmRkLmRk
LmRkIG5vdGF0aW9uKSBhbmQgcG9ydCBpcyBpbiBob3N0IGJ5dGUgb3JkZXIKLSB3aGVyZSBh
IGhvc3RuYW1lIGlzIHJldHVybmVkLCB0aGUgZGQuZGQuZGQuZGQgbm90YXRpb24gaXMgdXNl
ZAotIGEgVU5JWCBkb21haW4gc29ja2V0IGFkZHJlc3MgaXMgYSBzdHJpbmcgc3BlY2lmeWlu
ZyB0aGUgcGF0aG5hbWUKLSBhbiBBRl9QQUNLRVQgc29ja2V0IGFkZHJlc3MgaXMgYSB0dXBs
ZSBjb250YWluaW5nIGEgc3RyaW5nCiAgc3BlY2lmeWluZyB0aGUgZXRoZXJuZXQgaW50ZXJm
YWNlIGFuZCBhbiBpbnRlZ2VyIHNwZWNpZnlpbmcKICB0aGUgRXRoZXJuZXQgcHJvdG9jb2wg
bnVtYmVyIHRvIGJlIHJlY2VpdmVkLiBGb3IgZXhhbXBsZToKICAoImV0aDAiLDB4MTIzNCku
ICBPcHRpb25hbCAzcmQsNHRoLDV0aCBlbGVtZW50cyBpbiB0aGUgdHVwbGUKICBzcGVjaWZ5
IHBhY2tldC10eXBlIGFuZCBoYS10eXBlL2FkZHIgLS0gdGhlc2UgYXJlIGlnbm9yZWQgYnkK
ICBuZXR3b3JraW5nIGNvZGUsIGJ1dCBhY2NlcHRlZCBzaW5jZSB0aGV5IGFyZSByZXR1cm5l
ZCBieSB0aGUKICBnZXRzb2NrbmFtZSgpIG1ldGhvZC4KCkxvY2FsIG5hbWluZyBjb252ZW50
aW9uczoKCi0gbmFtZXMgc3RhcnRpbmcgd2l0aCBzb2NrXyBhcmUgc29ja2V0IG9iamVjdCBt
ZXRob2RzCi0gbmFtZXMgc3RhcnRpbmcgd2l0aCBzb2NrZXRfIGFyZSBtb2R1bGUtbGV2ZWwg
ZnVuY3Rpb25zCi0gbmFtZXMgc3RhcnRpbmcgd2l0aCBQeVNvY2tldCBhcmUgZXhwb3J0ZWQg
dGhyb3VnaCBzb2NrZXRtb2R1bGUuaAoKKi8KCi8qIFNvY2tldCBvYmplY3QgZG9jdW1lbnRh
dGlvbiAqLwpzdGF0aWMgY2hhciBzb2NrX2RvY1tdID0KInNvY2tldChbZmFtaWx5WywgdHlw
ZVssIHByb3RvXV1dKSAtPiBzb2NrZXQgb2JqZWN0XG5cClxuXApPcGVuIGEgc29ja2V0IG9m
IHRoZSBnaXZlbiB0eXBlLiAgVGhlIGZhbWlseSBhcmd1bWVudCBzcGVjaWZpZXMgdGhlXG5c
CmFkZHJlc3MgZmFtaWx5OyBpdCBkZWZhdWx0cyB0byBBRl9JTkVULiAgVGhlIHR5cGUgYXJn
dW1lbnQgc3BlY2lmaWVzXG5cCndoZXRoZXIgdGhpcyBpcyBhIHN0cmVhbSAoU09DS19TVFJF
QU0sIHRoaXMgaXMgdGhlIGRlZmF1bHQpXG5cCm9yIGRhdGFncmFtIChTT0NLX0RHUkFNKSBz
b2NrZXQuICBUaGUgcHJvdG9jb2wgYXJndW1lbnQgZGVmYXVsdHMgdG8gMCxcblwKc3BlY2lm
eWluZyB0aGUgZGVmYXVsdCBwcm90b2NvbC4gIEtleXdvcmQgYXJndW1lbnRzIGFyZSBhY2Nl
cHRlZC5cblwKXG5cCkEgc29ja2V0IG9iamVjdCByZXByZXNlbnRzIG9uZSBlbmRwb2ludCBv
ZiBhIG5ldHdvcmsgY29ubmVjdGlvbi5cblwKXG5cCk1ldGhvZHMgb2Ygc29ja2V0IG9iamVj
dHMgKGtleXdvcmQgYXJndW1lbnRzIG5vdCBhbGxvd2VkKTpcblwKXG5cCmFjY2VwdCgpIC0t
IGFjY2VwdCBhIGNvbm5lY3Rpb24sIHJldHVybmluZyBuZXcgc29ja2V0IGFuZCBjbGllbnQg
YWRkcmVzc1xuXApiaW5kKGFkZHIpIC0tIGJpbmQgdGhlIHNvY2tldCB0byBhIGxvY2FsIGFk
ZHJlc3NcblwKY2xvc2UoKSAtLSBjbG9zZSB0aGUgc29ja2V0XG5cCmNvbm5lY3QoYWRkcikg
LS0gY29ubmVjdCB0aGUgc29ja2V0IHRvIGEgcmVtb3RlIGFkZHJlc3NcblwKY29ubmVjdF9l
eChhZGRyKSAtLSBjb25uZWN0LCByZXR1cm4gYW4gZXJyb3IgY29kZSBpbnN0ZWFkIG9mIGFu
IGV4Y2VwdGlvblxuXApkdXAoKSAtLSByZXR1cm4gYSBuZXcgc29ja2V0IG9iamVjdCBpZGVu
dGljYWwgdG8gdGhlIGN1cnJlbnQgb25lIFsqXVxuXApmaWxlbm8oKSAtLSByZXR1cm4gdW5k
ZXJseWluZyBmaWxlIGRlc2NyaXB0b3JcblwKZ2V0cGVlcm5hbWUoKSAtLSByZXR1cm4gcmVt
b3RlIGFkZHJlc3MgWypdXG5cCmdldHNvY2tuYW1lKCkgLS0gcmV0dXJuIGxvY2FsIGFkZHJl
c3NcblwKZ2V0c29ja29wdChsZXZlbCwgb3B0bmFtZVssIGJ1Zmxlbl0pIC0tIGdldCBzb2Nr
ZXQgb3B0aW9uc1xuXApnZXR0aW1lb3V0KCkgLS0gcmV0dXJuIHRpbWVvdXQgb3IgTm9uZVxu
XApsaXN0ZW4obikgLS0gc3RhcnQgbGlzdGVuaW5nIGZvciBpbmNvbWluZyBjb25uZWN0aW9u
c1xuXAptYWtlZmlsZShbbW9kZSwgW2J1ZnNpemVdXSkgLS0gcmV0dXJuIGEgZmlsZSBvYmpl
Y3QgZm9yIHRoZSBzb2NrZXQgWypdXG5cCnJlY3YoYnVmbGVuWywgZmxhZ3NdKSAtLSByZWNl
aXZlIGRhdGFcblwKcmVjdmZyb20oYnVmbGVuWywgZmxhZ3NdKSAtLSByZWNlaXZlIGRhdGEg
YW5kIHNlbmRlcidzIGFkZHJlc3NcblwKc2VuZGFsbChkYXRhWywgZmxhZ3NdKSAtLSBzZW5k
IGFsbCBkYXRhXG5cCnNlbmQoZGF0YVssIGZsYWdzXSkgLS0gc2VuZCBkYXRhLCBtYXkgbm90
IHNlbmQgYWxsIG9mIGl0XG5cCnNlbmR0byhkYXRhWywgZmxhZ3NdLCBhZGRyKSAtLSBzZW5k
IGRhdGEgdG8gYSBnaXZlbiBhZGRyZXNzXG5cCnNldGJsb2NraW5nKDAgfCAxKSAtLSBzZXQg
b3IgY2xlYXIgdGhlIGJsb2NraW5nIEkvTyBmbGFnXG5cCnNldHNvY2tvcHQobGV2ZWwsIG9w
dG5hbWUsIHZhbHVlKSAtLSBzZXQgc29ja2V0IG9wdGlvbnNcblwKc2V0dGltZW91dChOb25l
IHwgZmxvYXQpIC0tIHNldCBvciBjbGVhciB0aGUgdGltZW91dFxuXApzaHV0ZG93bihob3cp
IC0tIHNodXQgZG93biB0cmFmZmljIGluIG9uZSBvciBib3RoIGRpcmVjdGlvbnNcblwKXG5c
CiBbKl0gbm90IGF2YWlsYWJsZSBvbiBhbGwgcGxhdGZvcm1zISI7CgojaW5jbHVkZSAiUHl0
aG9uLmgiCgovKiBYWFggVGhpcyBpcyBhIHRlcnJpYmxlIG1lc3Mgb2Ygb2YgcGxhdGZvcm0t
ZGVwZW5kZW50IHByZXByb2Nlc3NvciBoYWNrcy4KICAgSSBob3BlIHNvbWUgZGF5IHNvbWVv
bmUgY2FuIGNsZWFuIHRoaXMgdXAgcGxlYXNlLi4uICovCgovKiBIYWNrcyBmb3IgZ2V0aG9z
dGJ5bmFtZV9yKCkuICBPbiBzb21lIG5vbi1MaW51eCBwbGF0Zm9ybXMsIHRoZSBjb25maWd1
cmUKICAgc2NyaXB0IGRvZXNuJ3QgZ2V0IHRoaXMgcmlnaHQsIHNvIHdlIGhhcmRjb2RlIHNv
bWUgcGxhdGZvcm0gY2hlY2tzIGJlbG93LgogICBPbiB0aGUgb3RoZXIgaGFuZCwgbm90IGFs
bCBMaW51eCB2ZXJzaW9ucyBhZ3JlZSwgc28gdGhlcmUgdGhlIHNldHRpbmdzCiAgIGNvbXB1
dGVkIGJ5IHRoZSBjb25maWd1cmUgc2NyaXB0IGFyZSBuZWVkZWQhICovCgojaWZuZGVmIGxp
bnV4CiMgdW5kZWYgSEFWRV9HRVRIT1NUQllOQU1FX1JfM19BUkcKIyB1bmRlZiBIQVZFX0dF
VEhPU1RCWU5BTUVfUl81X0FSRwojIHVuZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SXzZfQVJH
CiNlbmRpZgoKI2lmbmRlZiBXSVRIX1RIUkVBRAojIHVuZGVmIEhBVkVfR0VUSE9TVEJZTkFN
RV9SCiNlbmRpZgoKI2lmZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SCiMgaWYgZGVmaW5lZChf
QUlYKSB8fCBkZWZpbmVkKF9fb3NmX18pCiMgIGRlZmluZSBIQVZFX0dFVEhPU1RCWU5BTUVf
Ul8zX0FSRwojIGVsaWYgZGVmaW5lZChfX3N1bikgfHwgZGVmaW5lZChfX3NnaSkKIyAgZGVm
aW5lIEhBVkVfR0VUSE9TVEJZTkFNRV9SXzVfQVJHCiMgZWxpZiBkZWZpbmVkKGxpbnV4KQov
KiBSZWx5IG9uIHRoZSBjb25maWd1cmUgc2NyaXB0ICovCiMgZWxzZQojICB1bmRlZiBIQVZF
X0dFVEhPU1RCWU5BTUVfUgojIGVuZGlmCiNlbmRpZgoKI2lmICFkZWZpbmVkKEhBVkVfR0VU
SE9TVEJZTkFNRV9SKSAmJiBkZWZpbmVkKFdJVEhfVEhSRUFEKSAmJiBcCiAgICAhZGVmaW5l
ZChNU19XSU5ET1dTKQojIGRlZmluZSBVU0VfR0VUSE9TVEJZTkFNRV9MT0NLCiNlbmRpZgoK
I2lmZGVmIFVTRV9HRVRIT1NUQllOQU1FX0xPQ0sKIyBpbmNsdWRlICJweXRocmVhZC5oIgoj
ZW5kaWYKCiNpZiBkZWZpbmVkKFBZQ0NfVkFDUFApCiMgaW5jbHVkZSA8dHlwZXMuaD4KIyBp
bmNsdWRlIDxpby5oPgojIGluY2x1ZGUgPHN5cy9pb2N0bC5oPgojIGluY2x1ZGUgPHV0aWxz
Lmg+CiMgaW5jbHVkZSA8Y3R5cGUuaD4KI2VuZGlmCgojaWYgZGVmaW5lZChQWU9TX09TMikK
IyBkZWZpbmUgIElOQ0xfRE9TCiMgZGVmaW5lICBJTkNMX0RPU0VSUk9SUwojIGRlZmluZSAg
SU5DTF9OT1BNQVBJCiMgaW5jbHVkZSA8b3MyLmg+CiNlbmRpZgoKLyogR2VuZXJpYyBpbmNs
dWRlcyAqLwojaW5jbHVkZSA8c3lzL3R5cGVzLmg+CiNpbmNsdWRlIDxzaWduYWwuaD4KCi8q
IEdlbmVyaWMgc29ja2V0IG9iamVjdCBkZWZpbml0aW9ucyBhbmQgaW5jbHVkZXMgKi8KI2Rl
ZmluZSBQeVNvY2tldF9CVUlMRElOR19TT0NLRVQKI2luY2x1ZGUgInNvY2tldG1vZHVsZS5o
IgoKLyogQWRkcmVzc2luZyBpbmNsdWRlcyAqLwoKI2lmbmRlZiBNU19XSU5ET1dTCgovKiBO
b24tTVMgV0lORE9XUyBpbmNsdWRlcyAqLwojIGluY2x1ZGUgPG5ldGRiLmg+CgovKiBIZWFk
ZXJzIG5lZWRlZCBmb3IgaW5ldF9udG9hKCkgYW5kIGluZXRfYWRkcigpICovCiMgaWZkZWYg
X19CRU9TX18KIyAgaW5jbHVkZSA8bmV0L25ldGRiLmg+CiMgZWxpZiBkZWZpbmVkKFBZT1Nf
T1MyKSAmJiBkZWZpbmVkKFBZQ0NfVkFDUFApCiMgIGluY2x1ZGUgPG5ldGRiLmg+CnR5cGVk
ZWYgc2l6ZV90IHNvY2tsZW5fdDsKIyBlbHNlCiMgICBpbmNsdWRlIDxhcnBhL2luZXQuaD4K
IyBlbmRpZgoKIyBpZm5kZWYgUklTQ09TCiMgIGluY2x1ZGUgPGZjbnRsLmg+CiMgZWxzZQoj
ICBpbmNsdWRlIDxzeXMvZmNudGwuaD4KIyAgZGVmaW5lIE5PX0RVUAppbnQgaF9lcnJubzsg
Lyogbm90IHVzZWQgKi8KIyBlbmRpZgoKI2Vsc2UKCi8qIE1TX1dJTkRPV1MgaW5jbHVkZXMg
Ki8KIyBpbmNsdWRlIDxmY250bC5oPgoKI2VuZGlmCgojaWZkZWYgSEFWRV9TVERERUZfSAoj
IGluY2x1ZGUgPHN0ZGRlZi5oPgojZW5kaWYKCiNpZm5kZWYgb2Zmc2V0b2YKIyBkZWZpbmUg
b2Zmc2V0b2YodHlwZSwgbWVtYmVyKQkoKHNpemVfdCkoJigodHlwZSAqKTApLT5tZW1iZXIp
KQojZW5kaWYKCiNpZm5kZWYgT19OREVMQVkKIyBkZWZpbmUgT19OREVMQVkgT19OT05CTE9D
SwkvKiBGb3IgUU5YIG9ubHk/ICovCiNlbmRpZgoKI2luY2x1ZGUgImFkZHJpbmZvLmgiCgoj
aWZuZGVmIEhBVkVfSU5FVF9QVE9OCmludCBpbmV0X3B0b24oaW50IGFmLCBjb25zdCBjaGFy
ICpzcmMsIHZvaWQgKmRzdCk7CmNvbnN0IGNoYXIgKmluZXRfbnRvcChpbnQgYWYsIGNvbnN0
IHZvaWQgKnNyYywgY2hhciAqZHN0LCBzb2NrbGVuX3Qgc2l6ZSk7CiNlbmRpZgoKI2lmZGVm
IF9fQVBQTEVfXwovKiBPbiBPUyBYLCBnZXRhZGRyaW5mbyByZXR1cm5zIG5vIGVycm9yIGlu
ZGljYXRpb24gb2YgbG9va3VwCiAgIGZhaWx1cmUsIHNvIHdlIG11c3QgdXNlIHRoZSBlbXVs
YXRpb24gaW5zdGVhZCBvZiB0aGUgbGliaW5mbwogICBpbXBsZW1lbnRhdGlvbi4gVW5mb3J0
dW5hdGVseSwgcGVyZm9ybWluZyBhbiBhdXRvY29uZiB0ZXN0CiAgIGZvciB0aGlzIGJ1ZyB3
b3VsZCByZXF1aXJlIEROUyBhY2Nlc3MgZm9yIHRoZSBtYWNoaW5lIHBlcmZvcm1pbmcKICAg
dGhlIGNvbmZpZ3VyYXRpb24sIHdoaWNoIGlzIG5vdCBhY2NlcHRhYmxlLiBUaGVyZWZvcmUs
IHdlCiAgIGRldGVybWluZSB0aGUgYnVnIGp1c3QgYnkgY2hlY2tpbmcgZm9yIF9fQVBQTEVf
Xy4gSWYgdGhpcyBidWcKICAgZ2V0cyBldmVyIGZpeGVkLCBwZXJoYXBzIGNoZWNraW5nIGZv
ciBzeXMvdmVyc2lvbi5oIHdvdWxkIGJlCiAgIGFwcHJvcHJpYXRlLCB3aGljaCBpcyAxMC8w
IG9uIHRoZSBzeXN0ZW0gd2l0aCB0aGUgYnVnLiAqLwojdW5kZWYgSEFWRV9HRVRBRERSSU5G
TwovKiBhdm9pZCBjbGFzaGVzIHdpdGggdGhlIEMgbGlicmFyeSBkZWZpbml0aW9uIG9mIHRo
ZSBzeW1ib2wuICovCiNkZWZpbmUgZ2V0YWRkcmluZm8gZmFrZV9nZXRhZGRyaW5mbwojZW5k
aWYKCi8qIEkga25vdyB0aGlzIGlzIGEgYmFkIHByYWN0aWNlLCBidXQgaXQgaXMgdGhlIGVh
c2llc3QuLi4gKi8KI2lmICFkZWZpbmVkKEhBVkVfR0VUQUREUklORk8pCiNpbmNsdWRlICJn
ZXRhZGRyaW5mby5jIgojZW5kaWYKI2lmICFkZWZpbmVkKEhBVkVfR0VUTkFNRUlORk8pCiNp
bmNsdWRlICJnZXRuYW1laW5mby5jIgojZW5kaWYKCiNpZiBkZWZpbmVkKE1TX1dJTkRPV1Mp
IHx8IGRlZmluZWQoX19CRU9TX18pCi8qIEJlT1Mgc3VmZmVycyBmcm9tIHRoZSBzYW1lIHNv
Y2tldCBkaWNob3RvbXkgYXMgV2luMzIuLi4gLSBbY2poXSAqLwovKiBzZWVtIHRvIGJlIGEg
ZmV3IGRpZmZlcmVuY2VzIGluIHRoZSBBUEkgKi8KI2RlZmluZSBTT0NLRVRDTE9TRSBjbG9z
ZXNvY2tldAojZGVmaW5lIE5PX0RVUCAvKiBBY3R1YWxseSBpdCBleGlzdHMgb24gTlQgMy41
LCBidXQgd2hhdCB0aGUgaGVjay4uLiAqLwojZW5kaWYKCiNpZmRlZiBNU19XSU4zMgojZGVm
aW5lIEVBRk5PU1VQUE9SVCBXU0FFQUZOT1NVUFBPUlQKI2RlZmluZSBzbnByaW50ZiBfc25w
cmludGYKI2VuZGlmCgojaWYgZGVmaW5lZChQWU9TX09TMikgJiYgIWRlZmluZWQoUFlDQ19H
Q0MpCiNkZWZpbmUgU09DS0VUQ0xPU0Ugc29jbG9zZQojZGVmaW5lIE5PX0RVUCAvKiBTb2Nr
ZXRzIGFyZSBOb3QgQWN0dWFsIEZpbGUgSGFuZGxlcyB1bmRlciBPUy8yICovCiNlbmRpZgoK
I2lmbmRlZiBTT0NLRVRDTE9TRQojZGVmaW5lIFNPQ0tFVENMT1NFIGNsb3NlCiNlbmRpZgoK
LyogWFhYIFRoZXJlJ3MgYSBwcm9ibGVtIGhlcmU6ICpzdGF0aWMqIGZ1bmN0aW9ucyBhcmUg
bm90IHN1cHBvc2VkIHRvIGhhdmUKICAgYSBQeSBwcmVmaXggKG9yIHVzZSBDYXBpdGFsaXpl
ZFdvcmRzKS4gIExhdGVyLi4uICovCgovKiBHbG9iYWwgdmFyaWFibGUgaG9sZGluZyB0aGUg
ZXhjZXB0aW9uIHR5cGUgZm9yIGVycm9ycyBkZXRlY3RlZAogICBieSB0aGlzIG1vZHVsZSAo
YnV0IG5vdCBhcmd1bWVudCB0eXBlIG9yIG1lbW9yeSBlcnJvcnMsIGV0Yy4pLiAqLwpzdGF0
aWMgUHlPYmplY3QgKnNvY2tldF9lcnJvcjsKc3RhdGljIFB5T2JqZWN0ICpzb2NrZXRfaGVy
cm9yOwpzdGF0aWMgUHlPYmplY3QgKnNvY2tldF9nYWllcnJvcjsKCiNpZmRlZiBSSVNDT1MK
LyogR2xvYmFsIHZhcmlhYmxlIHdoaWNoIGlzICE9MCBpZiBQeXRob24gaXMgcnVubmluZyBp
biBhIFJJU0MgT1MgdGFza3dpbmRvdyAqLwpzdGF0aWMgaW50IHRhc2t3aW5kb3c7CiNlbmRp
ZgoKLyogQSBmb3J3YXJkIHJlZmVyZW5jZSB0byB0aGUgc29ja2V0IHR5cGUgb2JqZWN0Lgog
ICBUaGUgc29ja190eXBlIHZhcmlhYmxlIGNvbnRhaW5zIHBvaW50ZXJzIHRvIHZhcmlvdXMg
ZnVuY3Rpb25zLAogICBzb21lIG9mIHdoaWNoIGNhbGwgbmV3X3NvY2tvYmplY3QoKSwgd2hp
Y2ggdXNlcyBzb2NrX3R5cGUsIHNvCiAgIHRoZXJlIGhhcyB0byBiZSBhIGNpcmN1bGFyIHJl
ZmVyZW5jZS4gKi8Kc3RhdGljZm9yd2FyZCBQeVR5cGVPYmplY3Qgc29ja190eXBlOwoKLyog
Q29udmVuaWVuY2UgZnVuY3Rpb24gdG8gcmFpc2UgYW4gZXJyb3IgYWNjb3JkaW5nIHRvIGVy
cm5vCiAgIGFuZCByZXR1cm4gYSBOVUxMIHBvaW50ZXIgZnJvbSBhIGZ1bmN0aW9uLiAqLwoK
c3RhdGljIFB5T2JqZWN0ICoKc2V0X2Vycm9yKHZvaWQpCnsKI2lmZGVmIE1TX1dJTkRPV1MK
CWludCBlcnJfbm8gPSBXU0FHZXRMYXN0RXJyb3IoKTsKCXN0YXRpYyBzdHJ1Y3QgewoJCWlu
dCBubzsKCQljb25zdCBjaGFyICptc2c7Cgl9ICptc2dwLCBtc2dzW10gPSB7CgkJe1dTQUVJ
TlRSLCAiSW50ZXJydXB0ZWQgc3lzdGVtIGNhbGwifSwKCQl7V1NBRUJBREYsICJCYWQgZmls
ZSBkZXNjcmlwdG9yIn0sCgkJe1dTQUVBQ0NFUywgIlBlcm1pc3Npb24gZGVuaWVkIn0sCgkJ
e1dTQUVGQVVMVCwgIkJhZCBhZGRyZXNzIn0sCgkJe1dTQUVJTlZBTCwgIkludmFsaWQgYXJn
dW1lbnQifSwKCQl7V1NBRU1GSUxFLCAiVG9vIG1hbnkgb3BlbiBmaWxlcyJ9LAoJCXtXU0FF
V09VTERCTE9DSywKCQkgICJUaGUgc29ja2V0IG9wZXJhdGlvbiBjb3VsZCBub3QgY29tcGxl
dGUgIgoJCSAgIndpdGhvdXQgYmxvY2tpbmcifSwKCQl7V1NBRUlOUFJPR1JFU1MsICJPcGVy
YXRpb24gbm93IGluIHByb2dyZXNzIn0sCgkJe1dTQUVBTFJFQURZLCAiT3BlcmF0aW9uIGFs
cmVhZHkgaW4gcHJvZ3Jlc3MifSwKCQl7V1NBRU5PVFNPQ0ssICJTb2NrZXQgb3BlcmF0aW9u
IG9uIG5vbi1zb2NrZXQifSwKCQl7V1NBRURFU1RBRERSUkVRLCAiRGVzdGluYXRpb24gYWRk
cmVzcyByZXF1aXJlZCJ9LAoJCXtXU0FFTVNHU0laRSwgIk1lc3NhZ2UgdG9vIGxvbmcifSwK
CQl7V1NBRVBST1RPVFlQRSwgIlByb3RvY29sIHdyb25nIHR5cGUgZm9yIHNvY2tldCJ9LAoJ
CXtXU0FFTk9QUk9UT09QVCwgIlByb3RvY29sIG5vdCBhdmFpbGFibGUifSwKCQl7V1NBRVBS
T1RPTk9TVVBQT1JULCAiUHJvdG9jb2wgbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FFU09DS1RO
T1NVUFBPUlQsICJTb2NrZXQgdHlwZSBub3Qgc3VwcG9ydGVkIn0sCgkJe1dTQUVPUE5PVFNV
UFAsICJPcGVyYXRpb24gbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FFUEZOT1NVUFBPUlQsICJQ
cm90b2NvbCBmYW1pbHkgbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FFQUZOT1NVUFBPUlQsICJB
ZGRyZXNzIGZhbWlseSBub3Qgc3VwcG9ydGVkIn0sCgkJe1dTQUVBRERSSU5VU0UsICJBZGRy
ZXNzIGFscmVhZHkgaW4gdXNlIn0sCgkJe1dTQUVBRERSTk9UQVZBSUwsICJDYW4ndCBhc3Np
Z24gcmVxdWVzdGVkIGFkZHJlc3MifSwKCQl7V1NBRU5FVERPV04sICJOZXR3b3JrIGlzIGRv
d24ifSwKCQl7V1NBRU5FVFVOUkVBQ0gsICJOZXR3b3JrIGlzIHVucmVhY2hhYmxlIn0sCgkJ
e1dTQUVORVRSRVNFVCwgIk5ldHdvcmsgZHJvcHBlZCBjb25uZWN0aW9uIG9uIHJlc2V0In0s
CgkJe1dTQUVDT05OQUJPUlRFRCwgIlNvZnR3YXJlIGNhdXNlZCBjb25uZWN0aW9uIGFib3J0
In0sCgkJe1dTQUVDT05OUkVTRVQsICJDb25uZWN0aW9uIHJlc2V0IGJ5IHBlZXIifSwKCQl7
V1NBRU5PQlVGUywgIk5vIGJ1ZmZlciBzcGFjZSBhdmFpbGFibGUifSwKCQl7V1NBRUlTQ09O
TiwgIlNvY2tldCBpcyBhbHJlYWR5IGNvbm5lY3RlZCJ9LAoJCXtXU0FFTk9UQ09OTiwgIlNv
Y2tldCBpcyBub3QgY29ubmVjdGVkIn0sCgkJe1dTQUVTSFVURE9XTiwgIkNhbid0IHNlbmQg
YWZ0ZXIgc29ja2V0IHNodXRkb3duIn0sCgkJe1dTQUVUT09NQU5ZUkVGUywgIlRvbyBtYW55
IHJlZmVyZW5jZXM6IGNhbid0IHNwbGljZSJ9LAoJCXtXU0FFVElNRURPVVQsICJPcGVyYXRp
b24gdGltZWQgb3V0In0sCgkJe1dTQUVDT05OUkVGVVNFRCwgIkNvbm5lY3Rpb24gcmVmdXNl
ZCJ9LAoJCXtXU0FFTE9PUCwgIlRvbyBtYW55IGxldmVscyBvZiBzeW1ib2xpYyBsaW5rcyJ9
LAoJCXtXU0FFTkFNRVRPT0xPTkcsICJGaWxlIG5hbWUgdG9vIGxvbmcifSwKCQl7V1NBRUhP
U1RET1dOLCAiSG9zdCBpcyBkb3duIn0sCgkJe1dTQUVIT1NUVU5SRUFDSCwgIk5vIHJvdXRl
IHRvIGhvc3QifSwKCQl7V1NBRU5PVEVNUFRZLCAiRGlyZWN0b3J5IG5vdCBlbXB0eSJ9LAoJ
CXtXU0FFUFJPQ0xJTSwgIlRvbyBtYW55IHByb2Nlc3NlcyJ9LAoJCXtXU0FFVVNFUlMsICJU
b28gbWFueSB1c2VycyJ9LAoJCXtXU0FFRFFVT1QsICJEaXNjIHF1b3RhIGV4Y2VlZGVkIn0s
CgkJe1dTQUVTVEFMRSwgIlN0YWxlIE5GUyBmaWxlIGhhbmRsZSJ9LAoJCXtXU0FFUkVNT1RF
LCAiVG9vIG1hbnkgbGV2ZWxzIG9mIHJlbW90ZSBpbiBwYXRoIn0sCgkJe1dTQVNZU05PVFJF
QURZLCAiTmV0d29yayBzdWJzeXN0ZW0gaXMgdW52YWlsYWJsZSJ9LAoJCXtXU0FWRVJOT1RT
VVBQT1JURUQsICJXaW5Tb2NrIHZlcnNpb24gaXMgbm90IHN1cHBvcnRlZCJ9LAoJCXtXU0FO
T1RJTklUSUFMSVNFRCwKCQkgICJTdWNjZXNzZnVsIFdTQVN0YXJ0dXAoKSBub3QgeWV0IHBl
cmZvcm1lZCJ9LAoJCXtXU0FFRElTQ09OLCAiR3JhY2VmdWwgc2h1dGRvd24gaW4gcHJvZ3Jl
c3MifSwKCQkvKiBSZXNvbHZlciBlcnJvcnMgKi8KCQl7V1NBSE9TVF9OT1RfRk9VTkQsICJO
byBzdWNoIGhvc3QgaXMga25vd24ifSwKCQl7V1NBVFJZX0FHQUlOLCAiSG9zdCBub3QgZm91
bmQsIG9yIHNlcnZlciBmYWlsZWQifSwKCQl7V1NBTk9fUkVDT1ZFUlksICJVbmV4cGVjdGVk
IHNlcnZlciBlcnJvciBlbmNvdW50ZXJlZCJ9LAoJCXtXU0FOT19EQVRBLCAiVmFsaWQgbmFt
ZSB3aXRob3V0IHJlcXVlc3RlZCBkYXRhIn0sCgkJe1dTQU5PX0FERFJFU1MsICJObyBhZGRy
ZXNzLCBsb29rIGZvciBNWCByZWNvcmQifSwKCQl7MCwgTlVMTH0KCX07CglpZiAoZXJyX25v
KSB7CgkJUHlPYmplY3QgKnY7CgkJY29uc3QgY2hhciAqbXNnID0gIndpbnNvY2sgZXJyb3Ii
OwoKCQlmb3IgKG1zZ3AgPSBtc2dzOyBtc2dwLT5tc2c7IG1zZ3ArKykgewoJCQlpZiAoZXJy
X25vID09IG1zZ3AtPm5vKSB7CgkJCQltc2cgPSBtc2dwLT5tc2c7CgkJCQlicmVhazsKCQkJ
fQoJCX0KCgkJdiA9IFB5X0J1aWxkVmFsdWUoIihpcykiLCBlcnJfbm8sIG1zZyk7CgkJaWYg
KHYgIT0gTlVMTCkgewoJCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2Vycm9yLCB2KTsKCQkJ
UHlfREVDUkVGKHYpOwoJCX0KCQlyZXR1cm4gTlVMTDsKCX0KCWVsc2UKI2VuZGlmCgojaWYg
ZGVmaW5lZChQWU9TX09TMikgJiYgIWRlZmluZWQoUFlDQ19HQ0MpCglpZiAoc29ja19lcnJu
bygpICE9IE5PX0VSUk9SKSB7CgkJQVBJUkVUIHJjOwoJCVVMT05HICBtc2dsZW47CgkJY2hh
ciBvdXRidWZbMTAwXTsKCQlpbnQgbXllcnJvcmNvZGUgPSBzb2NrX2Vycm5vKCk7CgoJCS8q
IFJldHJpZXZlIHNvY2tldC1yZWxhdGVkIGVycm9yIG1lc3NhZ2UgZnJvbSBNUFROLk1TRyBm
aWxlICovCgkJcmMgPSBEb3NHZXRNZXNzYWdlKE5VTEwsIDAsIG91dGJ1Ziwgc2l6ZW9mKG91
dGJ1ZiksCgkJCQkgICBteWVycm9yY29kZSAtIFNPQ0JBU0VFUlIgKyAyNiwKCQkJCSAgICJt
cHRuLm1zZyIsCgkJCQkgICAmbXNnbGVuKTsKCQlpZiAocmMgPT0gTk9fRVJST1IpIHsKCQkJ
UHlPYmplY3QgKnY7CgoJCQkvKiBPUy8yIGRvZXNuJ3QgZ3VhcmFudGVlIGEgdGVybWluYXRv
ciAqLwoJCQlvdXRidWZbbXNnbGVuXSA9ICdcMCc7CgkJCWlmIChzdHJsZW4ob3V0YnVmKSA+
IDApIHsKCQkJCS8qIElmIG5vbi1lbXB0eSBtc2csIHRyaW0gQ1JMRiAqLwoJCQkJY2hhciAq
bGFzdGMgPSAmb3V0YnVmWyBzdHJsZW4ob3V0YnVmKS0xIF07CgkJCQl3aGlsZSAobGFzdGMg
PiBvdXRidWYgJiYgaXNzcGFjZSgqbGFzdGMpKSB7CgkJCQkJLyogVHJpbSB0cmFpbGluZyB3
aGl0ZXNwYWNlIChDUkxGKSAqLwoJCQkJCSpsYXN0Yy0tID0gJ1wwJzsKCQkJCX0KCQkJfQoJ
CQl2ID0gUHlfQnVpbGRWYWx1ZSgiKGlzKSIsIG15ZXJyb3Jjb2RlLCBvdXRidWYpOwoJCQlp
ZiAodiAhPSBOVUxMKSB7CgkJCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2Vycm9yLCB2KTsK
CQkJCVB5X0RFQ1JFRih2KTsKCQkJfQoJCQlyZXR1cm4gTlVMTDsKCQl9Cgl9CiNlbmRpZgoK
CXJldHVybiBQeUVycl9TZXRGcm9tRXJybm8oc29ja2V0X2Vycm9yKTsKfQoKCnN0YXRpYyBQ
eU9iamVjdCAqCnNldF9oZXJyb3IoaW50IGhfZXJyb3IpCnsKCVB5T2JqZWN0ICp2OwoKI2lm
ZGVmIEhBVkVfSFNUUkVSUk9SCgl2ID0gUHlfQnVpbGRWYWx1ZSgiKGlzKSIsIGhfZXJyb3Is
IChjaGFyICopaHN0cmVycm9yKGhfZXJyb3IpKTsKI2Vsc2UKCXYgPSBQeV9CdWlsZFZhbHVl
KCIoaXMpIiwgaF9lcnJvciwgImhvc3Qgbm90IGZvdW5kIik7CiNlbmRpZgoJaWYgKHYgIT0g
TlVMTCkgewoJCVB5RXJyX1NldE9iamVjdChzb2NrZXRfaGVycm9yLCB2KTsKCQlQeV9ERUNS
RUYodik7Cgl9CgoJcmV0dXJuIE5VTEw7Cn0KCgpzdGF0aWMgUHlPYmplY3QgKgpzZXRfZ2Fp
ZXJyb3IoaW50IGVycm9yKQp7CglQeU9iamVjdCAqdjsKCiNpZmRlZiBFQUlfU1lTVEVNCgkv
KiBFQUlfU1lTVEVNIGlzIG5vdCBhdmFpbGFibGUgb24gV2luZG93cyBYUC4gKi8KCWlmIChl
cnJvciA9PSBFQUlfU1lTVEVNKQoJCXJldHVybiBzZXRfZXJyb3IoKTsKI2VuZGlmCgojaWZk
ZWYgSEFWRV9HQUlfU1RSRVJST1IKCXYgPSBQeV9CdWlsZFZhbHVlKCIoaXMpIiwgZXJyb3Is
IGdhaV9zdHJlcnJvcihlcnJvcikpOwojZWxzZQoJdiA9IFB5X0J1aWxkVmFsdWUoIihpcyki
LCBlcnJvciwgImdldGFkZHJpbmZvIGZhaWxlZCIpOwojZW5kaWYKCWlmICh2ICE9IE5VTEwp
IHsKCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2dhaWVycm9yLCB2KTsKCQlQeV9ERUNSRUYo
dik7Cgl9CgoJcmV0dXJuIE5VTEw7Cn0KCi8qIEZvciB0aW1lb3V0IGVycm9ycyAqLwpzdGF0
aWMgUHlPYmplY3QgKgp0aW1lb3V0X2Vycih2b2lkKQp7CglQeU9iamVjdCAqdjsKCiNpZmRl
ZiBNU19XSU5ET1dTCgl2ID0gUHlfQnVpbGRWYWx1ZSgiKGlzKSIsIFdTQUVUSU1FRE9VVCwg
IlNvY2tldCBvcGVyYXRpb24gdGltZWQgb3V0Iik7CiNlbHNlCgl2ID0gUHlfQnVpbGRWYWx1
ZSgiKGlzKSIsIEVUSU1FRE9VVCwgIlNvY2tldCBvcGVyYXRpb24gdGltZWQgb3V0Iik7CiNl
bmRpZgoKCWlmICh2ICE9IE5VTEwpIHsKCQlQeUVycl9TZXRPYmplY3Qoc29ja2V0X2Vycm9y
LCB2KTsKCQlQeV9ERUNSRUYodik7Cgl9CgoJcmV0dXJuIE5VTEw7Cn0KCi8qIEZ1bmN0aW9u
IHRvIHBlcmZvcm0gdGhlIHNldHRpbmcgb2Ygc29ja2V0IGJsb2NraW5nIG1vZGUKICAgaW50
ZXJuYWxseS4gYmxvY2sgPSAoMSB8IDApLiAqLwpzdGF0aWMgaW50CmludGVybmFsX3NldGJs
b2NraW5nKFB5U29ja2V0U29ja09iamVjdCAqcywgaW50IGJsb2NrKQp7CiNpZm5kZWYgUklT
Q09TCiNpZm5kZWYgTVNfV0lORE9XUwoJaW50IGRlbGF5X2ZsYWc7CiNlbmRpZgojZW5kaWYK
CglQeV9CRUdJTl9BTExPV19USFJFQURTCiNpZmRlZiBfX0JFT1NfXwoJYmxvY2sgPSAhYmxv
Y2s7CglzZXRzb2Nrb3B0KHMtPnNvY2tfZmQsIFNPTF9TT0NLRVQsIFNPX05PTkJMT0NLLAoJ
CSAgICh2b2lkICopKCZibG9jayksIHNpemVvZihpbnQpKTsKI2Vsc2UKI2lmbmRlZiBSSVND
T1MKI2lmbmRlZiBNU19XSU5ET1dTCiNpZiBkZWZpbmVkKFBZT1NfT1MyKSAmJiAhZGVmaW5l
ZChQWUNDX0dDQykKCWJsb2NrID0gIWJsb2NrOwoJaW9jdGwocy0+c29ja19mZCwgRklPTkJJ
TywgKGNhZGRyX3QpJmJsb2NrLCBzaXplb2YoYmxvY2spKTsKI2Vsc2UgLyogIVBZT1NfT1My
ICovCglkZWxheV9mbGFnID0gZmNudGwocy0+c29ja19mZCwgRl9HRVRGTCwgMCk7CglpZiAo
YmxvY2spCgkJZGVsYXlfZmxhZyAmPSAofk9fTkRFTEFZKTsKCWVsc2UKCQlkZWxheV9mbGFn
IHw9IE9fTkRFTEFZOwoJZmNudGwocy0+c29ja19mZCwgRl9TRVRGTCwgZGVsYXlfZmxhZyk7
CiNlbmRpZiAvKiAhUFlPU19PUzIgKi8KI2Vsc2UgLyogTVNfV0lORE9XUyAqLwoJYmxvY2sg
PSAhYmxvY2s7Cglpb2N0bHNvY2tldChzLT5zb2NrX2ZkLCBGSU9OQklPLCAodV9sb25nKikm
YmxvY2spOwojZW5kaWYgLyogTVNfV0lORE9XUyAqLwojZW5kaWYgLyogX19CRU9TX18gKi8K
I2VuZGlmIC8qIFJJU0NPUyAqLwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCgkvKiBTaW5jZSB0
aGVzZSBkb24ndCByZXR1cm4gYW55dGhpbmcgKi8KCXJldHVybiAxOwp9CgovKiBGb3IgYWNj
ZXNzIHRvIHRoZSBzZWxlY3QgbW9kdWxlIHRvIHBvbGwgdGhlIHNvY2tldCBmb3IgdGltZW91
dAogICBmdW5jdGlvbmFsaXR5LiB3cml0aW5nIGlzIDEgZm9yIHdyaXRpbmcsIDAgZm9yIHJl
YWRpbmcuCiAgIFJldHVybiB2YWx1ZTogLTEgaWYgZXJyb3IsIDAgaWYgbm90IHJlYWR5LCA+
PSAxIGlmIHJlYWR5LgogICBBbiBleGNlcHRpb24gaXMgc2V0IHdoZW4gdGhlIHJldHVybiB2
YWx1ZSBpcyA8PSAwICghKS4gKi8Kc3RhdGljIGludAppbnRlcm5hbF9zZWxlY3QoUHlTb2Nr
ZXRTb2NrT2JqZWN0ICpzLCBpbnQgd3JpdGluZykKewoJZmRfc2V0IGZkczsKCXN0cnVjdCB0
aW1ldmFsIHR2OwoJaW50IGNvdW50OwoKCS8qIENvbnN0cnVjdCB0aGUgYXJndW1lbnRzIHRv
IHNlbGVjdCAqLwoJdHYudHZfc2VjID0gKGludClzLT5zb2NrX3RpbWVvdXQ7Cgl0di50dl91
c2VjID0gKGludCkoKHMtPnNvY2tfdGltZW91dCAtIHR2LnR2X3NlYykgKiAxZTYpOwoJRkRf
WkVSTygmZmRzKTsKCUZEX1NFVChzLT5zb2NrX2ZkLCAmZmRzKTsKCgkvKiBTZWUgaWYgdGhl
IHNvY2tldCBpcyByZWFkeSAqLwoJaWYgKHdyaXRpbmcpCgkJY291bnQgPSBzZWxlY3Qocy0+
c29ja19mZCsxLCBOVUxMLCAmZmRzLCBOVUxMLCAmdHYpOwoJZWxzZQoJCWNvdW50ID0gc2Vs
ZWN0KHMtPnNvY2tfZmQrMSwgJmZkcywgTlVMTCwgTlVMTCwgJnR2KTsKCgkvKiBDaGVjayBm
b3IgZXJyb3JzICovCglpZiAoY291bnQgPCAwKSB7CgkJcy0+ZXJyb3JoYW5kbGVyKCk7CgkJ
cmV0dXJuIC0xOwoJfQoKCS8qIFNldCB0aGUgZXJyb3IgaWYgdGhlIHRpbWVvdXQgaGFzIGVs
YXBzZWQsIGkuZSwgd2Ugd2VyZSBub3QKCSAgcG9sbGVkLiAqLwoJaWYgKGNvdW50ID09IDAp
CgkJdGltZW91dF9lcnIoKTsKCglyZXR1cm4gY291bnQ7Cn0KCi8qIEluaXRpYWxpemUgYSBu
ZXcgc29ja2V0IG9iamVjdC4gKi8KCnN0YXRpYyB2b2lkCmluaXRfc29ja29iamVjdChQeVNv
Y2tldFNvY2tPYmplY3QgKnMsCgkJU09DS0VUX1QgZmQsIGludCBmYW1pbHksIGludCB0eXBl
LCBpbnQgcHJvdG8pCnsKI2lmZGVmIFJJU0NPUwoJaW50IGJsb2NrID0gMTsKI2VuZGlmCglz
LT5zb2NrX2ZkID0gZmQ7CglzLT5zb2NrX2ZhbWlseSA9IGZhbWlseTsKCXMtPnNvY2tfdHlw
ZSA9IHR5cGU7CglzLT5zb2NrX3Byb3RvID0gcHJvdG87CglzLT5zb2NrX2Jsb2NraW5nID0g
MTsgLyogU3RhcnQgaW4gYmxvY2tpbmcgbW9kZSAqLwoJcy0+c29ja190aW1lb3V0ID0gLTEu
MDsgLyogU3RhcnQgd2l0aG91dCB0aW1lb3V0ICovCgoJcy0+ZXJyb3JoYW5kbGVyID0gJnNl
dF9lcnJvcjsKI2lmZGVmIFJJU0NPUwoJaWYgKHRhc2t3aW5kb3cpCgkJc29ja2V0aW9jdGwo
cy0+c29ja19mZCwgMHg4MDA0NjY3OSwgKHVfbG9uZyopJmJsb2NrKTsKI2VuZGlmCn0KCgov
KiBDcmVhdGUgYSBuZXcgc29ja2V0IG9iamVjdC4KICAgVGhpcyBqdXN0IGNyZWF0ZXMgdGhl
IG9iamVjdCBhbmQgaW5pdGlhbGl6ZXMgaXQuCiAgIElmIHRoZSBjcmVhdGlvbiBmYWlscywg
cmV0dXJuIE5VTEwgYW5kIHNldCBhbiBleGNlcHRpb24gKGltcGxpY2l0CiAgIGluIE5FV09C
SigpKS4gKi8KCnN0YXRpYyBQeVNvY2tldFNvY2tPYmplY3QgKgpuZXdfc29ja29iamVjdChT
T0NLRVRfVCBmZCwgaW50IGZhbWlseSwgaW50IHR5cGUsIGludCBwcm90bykKewoJUHlTb2Nr
ZXRTb2NrT2JqZWN0ICpzOwoJcyA9IChQeVNvY2tldFNvY2tPYmplY3QgKikKCQlQeVR5cGVf
R2VuZXJpY05ldygmc29ja190eXBlLCBOVUxMLCBOVUxMKTsKCWlmIChzICE9IE5VTEwpCgkJ
aW5pdF9zb2Nrb2JqZWN0KHMsIGZkLCBmYW1pbHksIHR5cGUsIHByb3RvKTsKCXJldHVybiBz
Owp9CgoKLyogTG9jayB0byBhbGxvdyBweXRob24gaW50ZXJwcmV0ZXIgdG8gY29udGludWUs
IGJ1dCBvbmx5IGFsbG93IG9uZQogICB0aHJlYWQgdG8gYmUgaW4gZ2V0aG9zdGJ5bmFtZSAq
LwojaWZkZWYgVVNFX0dFVEhPU1RCWU5BTUVfTE9DSwpQeVRocmVhZF90eXBlX2xvY2sgZ2V0
aG9zdGJ5bmFtZV9sb2NrOwojZW5kaWYKCgovKiBDb252ZXJ0IGEgc3RyaW5nIHNwZWNpZnlp
bmcgYSBob3N0IG5hbWUgb3Igb25lIG9mIGEgZmV3IHN5bWJvbGljCiAgIG5hbWVzIHRvIGEg
bnVtZXJpYyBJUCBhZGRyZXNzLiAgVGhpcyB1c3VhbGx5IGNhbGxzIGdldGhvc3RieW5hbWUo
KQogICB0byBkbyB0aGUgd29yazsgdGhlIG5hbWVzICIiIGFuZCAiPGJyb2FkY2FzdD4iIGFy
ZSBzcGVjaWFsLgogICBSZXR1cm4gdGhlIGxlbmd0aCAoSVB2NCBzaG91bGQgYmUgNCBieXRl
cyksIG9yIG5lZ2F0aXZlIGlmCiAgIGFuIGVycm9yIG9jY3VycmVkOyB0aGVuIGFuIGV4Y2Vw
dGlvbiBpcyByYWlzZWQuICovCgpzdGF0aWMgaW50CnNldGlwYWRkcihjaGFyICpuYW1lLCBz
dHJ1Y3Qgc29ja2FkZHIgKmFkZHJfcmV0LCBpbnQgYWYpCnsKCXN0cnVjdCBhZGRyaW5mbyBo
aW50cywgKnJlczsKCWludCBlcnJvcjsKCgltZW1zZXQoKHZvaWQgKikgYWRkcl9yZXQsICdc
MCcsIHNpemVvZigqYWRkcl9yZXQpKTsKCWlmIChuYW1lWzBdID09ICdcMCcpIHsKCQlpbnQg
c2l6OwoJCW1lbXNldCgmaGludHMsIDAsIHNpemVvZihoaW50cykpOwoJCWhpbnRzLmFpX2Zh
bWlseSA9IGFmOwoJCWhpbnRzLmFpX3NvY2t0eXBlID0gU09DS19ER1JBTTsJLypkdW1teSov
CgkJaGludHMuYWlfZmxhZ3MgPSBBSV9QQVNTSVZFOwoJCWVycm9yID0gZ2V0YWRkcmluZm8o
TlVMTCwgIjAiLCAmaGludHMsICZyZXMpOwoJCWlmIChlcnJvcikgewoJCQlzZXRfZ2FpZXJy
b3IoZXJyb3IpOwoJCQlyZXR1cm4gLTE7CgkJfQoJCXN3aXRjaCAocmVzLT5haV9mYW1pbHkp
IHsKCQljYXNlIEFGX0lORVQ6CgkJCXNpeiA9IDQ7CgkJCWJyZWFrOwojaWZkZWYgRU5BQkxF
X0lQVjYKCQljYXNlIEFGX0lORVQ2OgoJCQlzaXogPSAxNjsKCQkJYnJlYWs7CiNlbmRpZgoJ
CWRlZmF1bHQ6CgkJCWZyZWVhZGRyaW5mbyhyZXMpOwoJCQlQeUVycl9TZXRTdHJpbmcoc29j
a2V0X2Vycm9yLAoJCQkJInVuc3VwcG9ydGVkIGFkZHJlc3MgZmFtaWx5Iik7CgkJCXJldHVy
biAtMTsKCQl9CgkJaWYgKHJlcy0+YWlfbmV4dCkgewoJCQlmcmVlYWRkcmluZm8ocmVzKTsK
CQkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJvciwKCQkJCSJ3aWxkY2FyZCByZXNvbHZl
ZCB0byBtdWx0aXBsZSBhZGRyZXNzIik7CgkJCXJldHVybiAtMTsKCQl9CgkJbWVtY3B5KGFk
ZHJfcmV0LCByZXMtPmFpX2FkZHIsIHJlcy0+YWlfYWRkcmxlbik7CgkJZnJlZWFkZHJpbmZv
KHJlcyk7CgkJcmV0dXJuIHNpejsKCX0KCWlmIChuYW1lWzBdID09ICc8JyAmJiBzdHJjbXAo
bmFtZSwgIjxicm9hZGNhc3Q+IikgPT0gMCkgewoJCXN0cnVjdCBzb2NrYWRkcl9pbiAqc2lu
OwoJCWlmIChhZiAhPSBQRl9JTkVUICYmIGFmICE9IFBGX1VOU1BFQykgewoJCQlQeUVycl9T
ZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkJImFkZHJlc3MgZmFtaWx5IG1pc21hdGNoZWQi
KTsKCQkJcmV0dXJuIC0xOwoJCX0KCQlzaW4gPSAoc3RydWN0IHNvY2thZGRyX2luICopYWRk
cl9yZXQ7CgkJbWVtc2V0KCh2b2lkICopIHNpbiwgJ1wwJywgc2l6ZW9mKCpzaW4pKTsKCQlz
aW4tPnNpbl9mYW1pbHkgPSBBRl9JTkVUOwojaWZkZWYgSEFWRV9TT0NLQUREUl9TQV9MRU4K
CQlzaW4tPnNpbl9sZW4gPSBzaXplb2YoKnNpbik7CiNlbmRpZgoJCXNpbi0+c2luX2FkZHIu
c19hZGRyID0gSU5BRERSX0JST0FEQ0FTVDsKCQlyZXR1cm4gc2l6ZW9mKHNpbi0+c2luX2Fk
ZHIpOwoJfQoJbWVtc2V0KCZoaW50cywgMCwgc2l6ZW9mKGhpbnRzKSk7CgloaW50cy5haV9m
YW1pbHkgPSBhZjsKCWVycm9yID0gZ2V0YWRkcmluZm8obmFtZSwgTlVMTCwgJmhpbnRzLCAm
cmVzKTsKI2lmIGRlZmluZWQoX19kaWdpdGFsX18pICYmIGRlZmluZWQoX191bml4X18pCglp
ZiAoZXJyb3IgPT0gRUFJX05PTkFNRSAmJiBhZiA9PSBBRl9VTlNQRUMpIHsKCQkvKiBPbiBU
cnU2NCBWNS4xLCBudW1lcmljLXRvLWFkZHIgY29udmVyc2lvbiBmYWlscwoJCSAgIGlmIG5v
IGFkZHJlc3MgZmFtaWx5IGlzIGdpdmVuLiBBc3N1bWUgSVB2NCBmb3Igbm93LiovCgkJaGlu
dHMuYWlfZmFtaWx5ID0gQUZfSU5FVDsKCQllcnJvciA9IGdldGFkZHJpbmZvKG5hbWUsIE5V
TEwsICZoaW50cywgJnJlcyk7Cgl9CiNlbmRpZgoJaWYgKGVycm9yKSB7CgkJc2V0X2dhaWVy
cm9yKGVycm9yKTsKCQlyZXR1cm4gLTE7Cgl9CgltZW1jcHkoKGNoYXIgKikgYWRkcl9yZXQs
IHJlcy0+YWlfYWRkciwgcmVzLT5haV9hZGRybGVuKTsKCWZyZWVhZGRyaW5mbyhyZXMpOwoJ
c3dpdGNoIChhZGRyX3JldC0+c2FfZmFtaWx5KSB7CgljYXNlIEFGX0lORVQ6CgkJcmV0dXJu
IDQ7CiNpZmRlZiBFTkFCTEVfSVBWNgoJY2FzZSBBRl9JTkVUNjoKCQlyZXR1cm4gMTY7CiNl
bmRpZgoJZGVmYXVsdDoKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLCAidW5rbm93
biBhZGRyZXNzIGZhbWlseSIpOwoJCXJldHVybiAtMTsKCX0KfQoKCi8qIENyZWF0ZSBhIHN0
cmluZyBvYmplY3QgcmVwcmVzZW50aW5nIGFuIElQIGFkZHJlc3MuCiAgIFRoaXMgaXMgYWx3
YXlzIGEgc3RyaW5nIG9mIHRoZSBmb3JtICdkZC5kZC5kZC5kZCcgKHdpdGggdmFyaWFibGUK
ICAgc2l6ZSBudW1iZXJzKS4gKi8KCnN0YXRpYyBQeU9iamVjdCAqCm1ha2VpcGFkZHIoc3Ry
dWN0IHNvY2thZGRyICphZGRyLCBpbnQgYWRkcmxlbikKewoJY2hhciBidWZbTklfTUFYSE9T
VF07CglpbnQgZXJyb3I7CgoJZXJyb3IgPSBnZXRuYW1laW5mbyhhZGRyLCBhZGRybGVuLCBi
dWYsIHNpemVvZihidWYpLCBOVUxMLCAwLAoJCU5JX05VTUVSSUNIT1NUKTsKCWlmIChlcnJv
cikgewoJCXNldF9nYWllcnJvcihlcnJvcik7CgkJcmV0dXJuIE5VTEw7Cgl9CglyZXR1cm4g
UHlTdHJpbmdfRnJvbVN0cmluZyhidWYpOwp9CgoKLyogQ3JlYXRlIGFuIG9iamVjdCByZXBy
ZXNlbnRpbmcgdGhlIGdpdmVuIHNvY2tldCBhZGRyZXNzLAogICBzdWl0YWJsZSBmb3IgcGFz
c2luZyBpdCBiYWNrIHRvIGJpbmQoKSwgY29ubmVjdCgpIGV0Yy4KICAgVGhlIGZhbWlseSBm
aWVsZCBvZiB0aGUgc29ja2FkZHIgc3RydWN0dXJlIGlzIGluc3BlY3RlZAogICB0byBkZXRl
cm1pbmUgd2hhdCBraW5kIG9mIGFkZHJlc3MgaXQgcmVhbGx5IGlzLiAqLwoKLypBUkdTVVNF
RCovCnN0YXRpYyBQeU9iamVjdCAqCm1ha2Vzb2NrYWRkcihpbnQgc29ja2ZkLCBzdHJ1Y3Qg
c29ja2FkZHIgKmFkZHIsIGludCBhZGRybGVuKQp7CglpZiAoYWRkcmxlbiA9PSAwKSB7CgkJ
LyogTm8gYWRkcmVzcyAtLSBtYXkgYmUgcmVjdmZyb20oKSBmcm9tIGtub3duIHNvY2tldCAq
LwoJCVB5X0lOQ1JFRihQeV9Ob25lKTsKCQlyZXR1cm4gUHlfTm9uZTsKCX0KCiNpZmRlZiBf
X0JFT1NfXwoJLyogWFhYOiBCZU9TIHZlcnNpb24gb2YgYWNjZXB0KCkgZG9lc24ndCBzZXQg
ZmFtaWx5IGNvcnJlY3RseSAqLwoJYWRkci0+c2FfZmFtaWx5ID0gQUZfSU5FVDsKI2VuZGlm
CgoJc3dpdGNoIChhZGRyLT5zYV9mYW1pbHkpIHsKCgljYXNlIEFGX0lORVQ6Cgl7CgkJc3Ry
dWN0IHNvY2thZGRyX2luICphOwoJCVB5T2JqZWN0ICphZGRyb2JqID0gbWFrZWlwYWRkcihh
ZGRyLCBzaXplb2YoKmEpKTsKCQlQeU9iamVjdCAqcmV0ID0gTlVMTDsKCQlpZiAoYWRkcm9i
aikgewoJCQlhID0gKHN0cnVjdCBzb2NrYWRkcl9pbiAqKWFkZHI7CgkJCXJldCA9IFB5X0J1
aWxkVmFsdWUoIk9pIiwgYWRkcm9iaiwgbnRvaHMoYS0+c2luX3BvcnQpKTsKCQkJUHlfREVD
UkVGKGFkZHJvYmopOwoJCX0KCQlyZXR1cm4gcmV0OwoJfQoKI2lmZGVmIEFGX1VOSVgKCWNh
c2UgQUZfVU5JWDoKCXsKCQlzdHJ1Y3Qgc29ja2FkZHJfdW4gKmEgPSAoc3RydWN0IHNvY2th
ZGRyX3VuICopIGFkZHI7CgkJcmV0dXJuIFB5U3RyaW5nX0Zyb21TdHJpbmcoYS0+c3VuX3Bh
dGgpOwoJfQojZW5kaWYgLyogQUZfVU5JWCAqLwoKI2lmZGVmIEVOQUJMRV9JUFY2CgljYXNl
IEFGX0lORVQ2OgoJewoJCXN0cnVjdCBzb2NrYWRkcl9pbjYgKmE7CgkJUHlPYmplY3QgKmFk
ZHJvYmogPSBtYWtlaXBhZGRyKGFkZHIsIHNpemVvZigqYSkpOwoJCVB5T2JqZWN0ICpyZXQg
PSBOVUxMOwoJCWlmIChhZGRyb2JqKSB7CgkJCWEgPSAoc3RydWN0IHNvY2thZGRyX2luNiAq
KWFkZHI7CgkJCXJldCA9IFB5X0J1aWxkVmFsdWUoIk9paWkiLAoJCQkJCSAgICBhZGRyb2Jq
LAoJCQkJCSAgICBudG9ocyhhLT5zaW42X3BvcnQpLAoJCQkJCSAgICBhLT5zaW42X2Zsb3dp
bmZvLAoJCQkJCSAgICBhLT5zaW42X3Njb3BlX2lkKTsKCQkJUHlfREVDUkVGKGFkZHJvYmop
OwoJCX0KCQlyZXR1cm4gcmV0OwoJfQojZW5kaWYKCiNpZmRlZiBIQVZFX05FVFBBQ0tFVF9Q
QUNLRVRfSAoJY2FzZSBBRl9QQUNLRVQ6Cgl7CgkJc3RydWN0IHNvY2thZGRyX2xsICphID0g
KHN0cnVjdCBzb2NrYWRkcl9sbCAqKWFkZHI7CgkJY2hhciAqaWZuYW1lID0gIiI7CgkJc3Ry
dWN0IGlmcmVxIGlmcjsKCQkvKiBuZWVkIHRvIGxvb2sgdXAgaW50ZXJmYWNlIG5hbWUgZ2l2
ZSBpbmRleCAqLwoJCWlmIChhLT5zbGxfaWZpbmRleCkgewoJCQlpZnIuaWZyX2lmaW5kZXgg
PSBhLT5zbGxfaWZpbmRleDsKCQkJaWYgKGlvY3RsKHNvY2tmZCwgU0lPQ0dJRk5BTUUsICZp
ZnIpID09IDApCgkJCQlpZm5hbWUgPSBpZnIuaWZyX25hbWU7CgkJfQoJCXJldHVybiBQeV9C
dWlsZFZhbHVlKCJzaGJocyMiLAoJCQkJICAgICBpZm5hbWUsCgkJCQkgICAgIG50b2hzKGEt
PnNsbF9wcm90b2NvbCksCgkJCQkgICAgIGEtPnNsbF9wa3R0eXBlLAoJCQkJICAgICBhLT5z
bGxfaGF0eXBlLAoJCQkJICAgICBhLT5zbGxfYWRkciwKCQkJCSAgICAgYS0+c2xsX2hhbGVu
KTsKCX0KI2VuZGlmCgoJLyogTW9yZSBjYXNlcyBoZXJlLi4uICovCgoJZGVmYXVsdDoKCQkv
KiBJZiB3ZSBkb24ndCBrbm93IHRoZSBhZGRyZXNzIGZhbWlseSwgZG9uJ3QgcmFpc2UgYW4K
CQkgICBleGNlcHRpb24gLS0gcmV0dXJuIGl0IGFzIGEgdHVwbGUuICovCgkJcmV0dXJuIFB5
X0J1aWxkVmFsdWUoImlzIyIsCgkJCQkgICAgIGFkZHItPnNhX2ZhbWlseSwKCQkJCSAgICAg
YWRkci0+c2FfZGF0YSwKCQkJCSAgICAgc2l6ZW9mKGFkZHItPnNhX2RhdGEpKTsKCgl9Cn0K
CgovKiBQYXJzZSBhIHNvY2tldCBhZGRyZXNzIGFyZ3VtZW50IGFjY29yZGluZyB0byB0aGUg
c29ja2V0IG9iamVjdCdzCiAgIGFkZHJlc3MgZmFtaWx5LiAgUmV0dXJuIDEgaWYgdGhlIGFk
ZHJlc3Mgd2FzIGluIHRoZSBwcm9wZXIgZm9ybWF0LAogICAwIG9mIG5vdC4gIFRoZSBhZGRy
ZXNzIGlzIHJldHVybmVkIHRocm91Z2ggYWRkcl9yZXQsIGl0cyBsZW5ndGgKICAgdGhyb3Vn
aCBsZW5fcmV0LiAqLwoKc3RhdGljIGludApnZXRzb2NrYWRkcmFyZyhQeVNvY2tldFNvY2tP
YmplY3QgKnMsIFB5T2JqZWN0ICphcmdzLAoJICAgICAgIHN0cnVjdCBzb2NrYWRkciAqKmFk
ZHJfcmV0LCBpbnQgKmxlbl9yZXQpCnsKCXN3aXRjaCAocy0+c29ja19mYW1pbHkpIHsKCiNp
ZmRlZiBBRl9VTklYCgljYXNlIEFGX1VOSVg6Cgl7CgkJc3RydWN0IHNvY2thZGRyX3VuKiBh
ZGRyOwoJCWNoYXIgKnBhdGg7CgkJaW50IGxlbjsKCQlhZGRyID0gKHN0cnVjdCBzb2NrYWRk
cl91biopJihzLT5zb2NrX2FkZHIpLnVuOwoJCWlmICghUHlBcmdfUGFyc2UoYXJncywgInQj
IiwgJnBhdGgsICZsZW4pKQoJCQlyZXR1cm4gMDsKCQlpZiAobGVuID4gc2l6ZW9mIGFkZHIt
PnN1bl9wYXRoKSB7CgkJCVB5RXJyX1NldFN0cmluZyhzb2NrZXRfZXJyb3IsCgkJCQkJIkFG
X1VOSVggcGF0aCB0b28gbG9uZyIpOwoJCQlyZXR1cm4gMDsKCQl9CgkJYWRkci0+c3VuX2Zh
bWlseSA9IHMtPnNvY2tfZmFtaWx5OwoJCW1lbWNweShhZGRyLT5zdW5fcGF0aCwgcGF0aCwg
bGVuKTsKCQlhZGRyLT5zdW5fcGF0aFtsZW5dID0gMDsKCQkqYWRkcl9yZXQgPSAoc3RydWN0
IHNvY2thZGRyICopIGFkZHI7CgkJKmxlbl9yZXQgPSBsZW4gKyBzaXplb2YoKmFkZHIpIC0g
c2l6ZW9mKGFkZHItPnN1bl9wYXRoKTsKCQlyZXR1cm4gMTsKCX0KI2VuZGlmIC8qIEFGX1VO
SVggKi8KCgljYXNlIEFGX0lORVQ6Cgl7CgkJc3RydWN0IHNvY2thZGRyX2luKiBhZGRyOwoJ
CWNoYXIgKmhvc3Q7CgkJaW50IHBvcnQ7CiAJCWFkZHI9KHN0cnVjdCBzb2NrYWRkcl9pbiop
JihzLT5zb2NrX2FkZHIpLmluOwoJCWlmICghUHlUdXBsZV9DaGVjayhhcmdzKSkgewoJCQlQ
eUVycl9Gb3JtYXQoCgkJCQlQeUV4Y19UeXBlRXJyb3IsCgkJCQkiZ2V0c29ja2FkZHJhcmc6
ICIKCQkJCSJBRl9JTkVUIGFkZHJlc3MgbXVzdCBiZSB0dXBsZSwgbm90ICUuNTAwcyIsCgkJ
CQlhcmdzLT5vYl90eXBlLT50cF9uYW1lKTsKCQkJcmV0dXJuIDA7CgkJfQoJCWlmICghUHlB
cmdfUGFyc2VUdXBsZShhcmdzLCAic2k6Z2V0c29ja2FkZHJhcmciLCAmaG9zdCwgJnBvcnQp
KQoJCQlyZXR1cm4gMDsKCQlpZiAoc2V0aXBhZGRyKGhvc3QsIChzdHJ1Y3Qgc29ja2FkZHIg
KilhZGRyLCBBRl9JTkVUKSA8IDApCgkJCXJldHVybiAwOwoJCWFkZHItPnNpbl9mYW1pbHkg
PSBBRl9JTkVUOwoJCWFkZHItPnNpbl9wb3J0ID0gaHRvbnMoKHNob3J0KXBvcnQpOwoJCSph
ZGRyX3JldCA9IChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRkcjsKCQkqbGVuX3JldCA9IHNpemVv
ZiAqYWRkcjsKCQlyZXR1cm4gMTsKCX0KCiNpZmRlZiBFTkFCTEVfSVBWNgoJY2FzZSBBRl9J
TkVUNjoKCXsKCQlzdHJ1Y3Qgc29ja2FkZHJfaW42KiBhZGRyOwoJCWNoYXIgKmhvc3Q7CgkJ
aW50IHBvcnQsIGZsb3dpbmZvLCBzY29wZV9pZDsKIAkJYWRkciA9IChzdHJ1Y3Qgc29ja2Fk
ZHJfaW42KikmKHMtPnNvY2tfYWRkcikuaW42OwoJCWZsb3dpbmZvID0gc2NvcGVfaWQgPSAw
OwoJCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAic2l8aWkiLCAmaG9zdCwgJnBvcnQs
ICZmbG93aW5mbywKCQkJCSAgICAgICZzY29wZV9pZCkpIHsKCQkJcmV0dXJuIDA7CgkJfQoJ
CWlmIChzZXRpcGFkZHIoaG9zdCwgKHN0cnVjdCBzb2NrYWRkciAqKWFkZHIsIEFGX0lORVQ2
KSA8IDApCgkJCXJldHVybiAwOwoJCWFkZHItPnNpbjZfZmFtaWx5ID0gcy0+c29ja19mYW1p
bHk7CgkJYWRkci0+c2luNl9wb3J0ID0gaHRvbnMoKHNob3J0KXBvcnQpOwoJCWFkZHItPnNp
bjZfZmxvd2luZm8gPSBmbG93aW5mbzsKCQlhZGRyLT5zaW42X3Njb3BlX2lkID0gc2NvcGVf
aWQ7CgkJKmFkZHJfcmV0ID0gKHN0cnVjdCBzb2NrYWRkciAqKSBhZGRyOwoJCSpsZW5fcmV0
ID0gc2l6ZW9mICphZGRyOwoJCXJldHVybiAxOwoJfQojZW5kaWYKCiNpZmRlZiBIQVZFX05F
VFBBQ0tFVF9QQUNLRVRfSAoJY2FzZSBBRl9QQUNLRVQ6Cgl7CgkJc3RydWN0IHNvY2thZGRy
X2xsKiBhZGRyOwoJCXN0cnVjdCBpZnJlcSBpZnI7CgkJY2hhciAqaW50ZXJmYWNlTmFtZTsK
CQlpbnQgcHJvdG9OdW1iZXI7CgkJaW50IGhhdHlwZSA9IDA7CgkJaW50IHBrdHR5cGUgPSAw
OwoJCWNoYXIgKmhhZGRyOwoKCQlpZiAoIVB5QXJnX1BhcnNlVHVwbGUoYXJncywgInNpfGlp
cyIsICZpbnRlcmZhY2VOYW1lLAoJCQkJICAgICAgJnByb3RvTnVtYmVyLCAmcGt0dHlwZSwg
JmhhdHlwZSwgJmhhZGRyKSkKCQkJcmV0dXJuIDA7CgkJc3RybmNweShpZnIuaWZyX25hbWUs
IGludGVyZmFjZU5hbWUsIHNpemVvZihpZnIuaWZyX25hbWUpKTsKCQlpZnIuaWZyX25hbWVb
KHNpemVvZihpZnIuaWZyX25hbWUpKS0xXSA9ICdcMCc7CgkJaWYgKGlvY3RsKHMtPnNvY2tf
ZmQsIFNJT0NHSUZJTkRFWCwgJmlmcikgPCAwKSB7CgkJICAgICAgICBzLT5lcnJvcmhhbmRs
ZXIoKTsKCQkJcmV0dXJuIDA7CgkJfQoJCWFkZHIgPSAmKHMtPnNvY2tfYWRkci5sbCk7CgkJ
YWRkci0+c2xsX2ZhbWlseSA9IEFGX1BBQ0tFVDsKCQlhZGRyLT5zbGxfcHJvdG9jb2wgPSBo
dG9ucygoc2hvcnQpcHJvdG9OdW1iZXIpOwoJCWFkZHItPnNsbF9pZmluZGV4ID0gaWZyLmlm
cl9pZmluZGV4OwoJCWFkZHItPnNsbF9wa3R0eXBlID0gcGt0dHlwZTsKCQlhZGRyLT5zbGxf
aGF0eXBlID0gaGF0eXBlOwoJCSphZGRyX3JldCA9IChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRk
cjsKCQkqbGVuX3JldCA9IHNpemVvZiAqYWRkcjsKCQlyZXR1cm4gMTsKCX0KI2VuZGlmCgoJ
LyogTW9yZSBjYXNlcyBoZXJlLi4uICovCgoJZGVmYXVsdDoKCQlQeUVycl9TZXRTdHJpbmco
c29ja2V0X2Vycm9yLCAiZ2V0c29ja2FkZHJhcmc6IGJhZCBmYW1pbHkiKTsKCQlyZXR1cm4g
MDsKCgl9Cn0KCgovKiBHZXQgdGhlIGFkZHJlc3MgbGVuZ3RoIGFjY29yZGluZyB0byB0aGUg
c29ja2V0IG9iamVjdCdzIGFkZHJlc3MgZmFtaWx5LgogICBSZXR1cm4gMSBpZiB0aGUgZmFt
aWx5IGlzIGtub3duLCAwIG90aGVyd2lzZS4gIFRoZSBsZW5ndGggaXMgcmV0dXJuZWQKICAg
dGhyb3VnaCBsZW5fcmV0LiAqLwoKc3RhdGljIGludApnZXRzb2NrYWRkcmxlbihQeVNvY2tl
dFNvY2tPYmplY3QgKnMsIHNvY2tsZW5fdCAqbGVuX3JldCkKewoJc3dpdGNoIChzLT5zb2Nr
X2ZhbWlseSkgewoKI2lmZGVmIEFGX1VOSVgKCWNhc2UgQUZfVU5JWDoKCXsKCQkqbGVuX3Jl
dCA9IHNpemVvZiAoc3RydWN0IHNvY2thZGRyX3VuKTsKCQlyZXR1cm4gMTsKCX0KI2VuZGlm
IC8qIEFGX1VOSVggKi8KCgljYXNlIEFGX0lORVQ6Cgl7CgkJKmxlbl9yZXQgPSBzaXplb2Yg
KHN0cnVjdCBzb2NrYWRkcl9pbik7CgkJcmV0dXJuIDE7Cgl9CgojaWZkZWYgRU5BQkxFX0lQ
VjYKCWNhc2UgQUZfSU5FVDY6Cgl7CgkJKmxlbl9yZXQgPSBzaXplb2YgKHN0cnVjdCBzb2Nr
YWRkcl9pbjYpOwoJCXJldHVybiAxOwoJfQojZW5kaWYKCiNpZmRlZiBIQVZFX05FVFBBQ0tF
VF9QQUNLRVRfSAoJY2FzZSBBRl9QQUNLRVQ6Cgl7CgkJKmxlbl9yZXQgPSBzaXplb2YgKHN0
cnVjdCBzb2NrYWRkcl9sbCk7CgkJcmV0dXJuIDE7Cgl9CiNlbmRpZgoKCS8qIE1vcmUgY2Fz
ZXMgaGVyZS4uLiAqLwoKCWRlZmF1bHQ6CgkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJv
ciwgImdldHNvY2thZGRybGVuOiBiYWQgZmFtaWx5Iik7CgkJcmV0dXJuIDA7CgoJfQp9CgoK
Lyogcy5hY2NlcHQoKSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfYWNjZXB0
KFB5U29ja2V0U29ja09iamVjdCAqcykKewoJY2hhciBhZGRyYnVmWzI1Nl07CglTT0NLRVRf
VCBuZXdmZDsKCXNvY2tsZW5fdCBhZGRybGVuOwoJUHlPYmplY3QgKnNvY2sgPSBOVUxMOwoJ
UHlPYmplY3QgKmFkZHIgPSBOVUxMOwoJUHlPYmplY3QgKnJlcyA9IE5VTEw7CgoJaWYgKCFn
ZXRzb2NrYWRkcmxlbihzLCAmYWRkcmxlbikpCgkJcmV0dXJuIE5VTEw7CgltZW1zZXQoYWRk
cmJ1ZiwgMCwgYWRkcmxlbik7CgoJZXJybm8gPSAwOyAvKiBSZXNldCBpbmRpY2F0b3IgZm9y
IHVzZSB3aXRoIHRpbWVvdXQgYmVoYXZpb3IgKi8KCglQeV9CRUdJTl9BTExPV19USFJFQURT
CgluZXdmZCA9IGFjY2VwdChzLT5zb2NrX2ZkLCAoc3RydWN0IHNvY2thZGRyICopIGFkZHJi
dWYsICZhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYgKHMtPnNvY2tfdGlt
ZW91dCA+PSAwLjApIHsKI2lmZGVmIE1TX1dJTkRPV1MKCQlpZiAobmV3ZmQgPT0gSU5WQUxJ
RF9TT0NLRVQpCgkJCWlmICghcy0+c29ja19ibG9ja2luZykKCQkJCXJldHVybiBzLT5lcnJv
cmhhbmRsZXIoKTsKCQkJLyogQ2hlY2sgaWYgd2UgaGF2ZSBhIHRydWUgZmFpbHVyZQoJCQkg
ICBmb3IgYSBibG9ja2luZyBzb2NrZXQgKi8KCQkJaWYgKGVycm5vICE9IFdTQUVXT1VMREJM
T0NLKQoJCQkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwojZWxzZQoJCWlmIChuZXdmZCA8
IDApIHsKCQkJaWYgKCFzLT5zb2NrX2Jsb2NraW5nKQoJCQkJcmV0dXJuIHMtPmVycm9yaGFu
ZGxlcigpOwoJCQkvKiBDaGVjayBpZiB3ZSBoYXZlIGEgdHJ1ZSBmYWlsdXJlCgkJCSAgIGZv
ciBhIGJsb2NraW5nIHNvY2tldCAqLwoJCQlpZiAoZXJybm8gIT0gRUFHQUlOICYmIGVycm5v
ICE9IEVXT1VMREJMT0NLKQoJCQkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJCX0KI2Vu
ZGlmCgoJCS8qIHRyeSB3YWl0aW5nIHRoZSB0aW1lb3V0IHBlcmlvZCAqLwoJCWlmIChpbnRl
cm5hbF9zZWxlY3QocywgMCkgPD0gMCkKCQkJcmV0dXJuIE5VTEw7CgoJCVB5X0JFR0lOX0FM
TE9XX1RIUkVBRFMKCQluZXdmZCA9IGFjY2VwdChzLT5zb2NrX2ZkLAoJCQkgICAgICAgKHN0
cnVjdCBzb2NrYWRkciAqKWFkZHJidWYsCgkJCSAgICAgICAmYWRkcmxlbik7CgkJUHlfRU5E
X0FMTE9XX1RIUkVBRFMKCX0KCgkvKiBBdCB0aGlzIHBvaW50LCB3ZSByZWFsbHkgaGF2ZSBh
biBlcnJvciwgd2hldGhlciB1c2luZyB0aW1lb3V0CgkgICBiZWhhdmlvciBvciByZWd1bGFy
IHNvY2tldCBiZWhhdmlvciAqLwojaWZkZWYgTVNfV0lORE9XUwoJaWYgKG5ld2ZkID09IElO
VkFMSURfU09DS0VUKQojZWxzZQoJaWYgKG5ld2ZkIDwgMCkKI2VuZGlmCgkJcmV0dXJuIHMt
PmVycm9yaGFuZGxlcigpOwoKCS8qIENyZWF0ZSB0aGUgbmV3IG9iamVjdCB3aXRoIHVuc3Bl
Y2lmaWVkIGZhbWlseSwKCSAgIHRvIGF2b2lkIGNhbGxzIHRvIGJpbmQoKSBldGMuIG9uIGl0
LiAqLwoJc29jayA9IChQeU9iamVjdCAqKSBuZXdfc29ja29iamVjdChuZXdmZCwKCQkJCQkg
ICBzLT5zb2NrX2ZhbWlseSwKCQkJCQkgICBzLT5zb2NrX3R5cGUsCgkJCQkJICAgcy0+c29j
a19wcm90byk7CgoJaWYgKHNvY2sgPT0gTlVMTCkgewoJCVNPQ0tFVENMT1NFKG5ld2ZkKTsK
CQlnb3RvIGZpbmFsbHk7Cgl9CglhZGRyID0gbWFrZXNvY2thZGRyKHMtPnNvY2tfZmQsIChz
dHJ1Y3Qgc29ja2FkZHIgKilhZGRyYnVmLAoJCQkgICAgYWRkcmxlbik7CglpZiAoYWRkciA9
PSBOVUxMKQoJCWdvdG8gZmluYWxseTsKCglyZXMgPSBQeV9CdWlsZFZhbHVlKCJPTyIsIHNv
Y2ssIGFkZHIpOwoKZmluYWxseToKCVB5X1hERUNSRUYoc29jayk7CglQeV9YREVDUkVGKGFk
ZHIpOwoJcmV0dXJuIHJlczsKfQoKc3RhdGljIGNoYXIgYWNjZXB0X2RvY1tdID0KImFjY2Vw
dCgpIC0+IChzb2NrZXQgb2JqZWN0LCBhZGRyZXNzIGluZm8pXG5cClxuXApXYWl0IGZvciBh
biBpbmNvbWluZyBjb25uZWN0aW9uLiAgUmV0dXJuIGEgbmV3IHNvY2tldCByZXByZXNlbnRp
bmcgdGhlXG5cCmNvbm5lY3Rpb24sIGFuZCB0aGUgYWRkcmVzcyBvZiB0aGUgY2xpZW50LiAg
Rm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmluZm8gaXMgYSBwYWlyIChob3N0YWRk
ciwgcG9ydCkuIjsKCi8qIHMuc2V0YmxvY2tpbmcoMSB8IDApIG1ldGhvZCAqLwoKc3RhdGlj
IFB5T2JqZWN0ICoKc29ja19zZXRibG9ja2luZyhQeVNvY2tldFNvY2tPYmplY3QgKnMsIFB5
T2JqZWN0ICphcmcpCnsKCWludCBibG9jazsKCglibG9jayA9IFB5SW50X0FzTG9uZyhhcmcp
OwoJaWYgKGJsb2NrID09IC0xICYmIFB5RXJyX09jY3VycmVkKCkpCgkJcmV0dXJuIE5VTEw7
CgoJcy0+c29ja19ibG9ja2luZyA9IGJsb2NrOwoJcy0+c29ja190aW1lb3V0ID0gLTEuMDsg
LyogQWx3YXlzIGNsZWFyIHRoZSB0aW1lb3V0ICovCglpbnRlcm5hbF9zZXRibG9ja2luZyhz
LCBibG9jayk7CgoJUHlfSU5DUkVGKFB5X05vbmUpOwoJcmV0dXJuIFB5X05vbmU7Cn0KCnN0
YXRpYyBjaGFyIHNldGJsb2NraW5nX2RvY1tdID0KInNldGJsb2NraW5nKGZsYWcpXG5cClxu
XApTZXQgdGhlIHNvY2tldCB0byBibG9ja2luZyAoZmxhZyBpcyB0cnVlKSBvciBub24tYmxv
Y2tpbmcgKGZhbHNlKS5cblwKVGhpcyB1c2VzIHRoZSBGSU9OQklPIGlvY3RsIHdpdGggdGhl
IE9fTkRFTEFZIGZsYWcuIjsKCi8qIHMuc2V0dGltZW91dChOb25lIHwgZmxvYXQpIG1ldGhv
ZC4KICAgQ2F1c2VzIGFuIGV4Y2VwdGlvbiB0byBiZSByYWlzZWQgd2hlbiB0aGUgZ2l2ZW4g
dGltZSBoYXMKICAgZWxhcHNlZCB3aGVuIHBlcmZvcm1pbmcgYSBibG9ja2luZyBzb2NrZXQg
b3BlcmF0aW9uLiAqLwpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX3NldHRpbWVvdXQoUHlTb2Nr
ZXRTb2NrT2JqZWN0ICpzLCBQeU9iamVjdCAqYXJnKQp7Cglkb3VibGUgdmFsdWU7CgoJaWYg
KGFyZyA9PSBQeV9Ob25lKQoJCXZhbHVlID0gLTEuMDsKCWVsc2UgewoJCXZhbHVlID0gUHlG
bG9hdF9Bc0RvdWJsZShhcmcpOwoJCWlmICh2YWx1ZSA8IDAuMCkgewoJCQlpZiAoIVB5RXJy
X09jY3VycmVkKCkpCgkJCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfVmFsdWVFcnJvciwKCQkJ
CQkJIkludmFsaWQgdGltZW91dCB2YWx1ZSIpOwoJCQlyZXR1cm4gTlVMTDsKCQl9Cgl9CgoJ
cy0+c29ja190aW1lb3V0ID0gdmFsdWU7CgoJLyogVGhlIHNlbWFudGljcyBvZiBzZXR0aW5n
IHNvY2tldCB0aW1lb3V0cyBhcmU6CgkgICBJZiB5b3Ugc2V0dGltZW91dCghPU5vbmUpOgoJ
ICAgICAgIFRoZSBhY3R1YWwgc29ja2V0IGdldHMgcHV0IGluIG5vbi1ibG9ja2luZyBtb2Rl
IGFuZCB0aGUgc2VsZWN0CgkgICAgICAgaXMgdXNlZCB0byBjb250cm9sIHRpbWVvdXRzLgoJ
ICAgRWxzZSBpZiB5b3Ugc2V0dGltZW91dChOb25lKSBbdGhlbiB2YWx1ZSBpcyAtMS4wXToK
CSAgICAgICBUaGUgb2xkIGJlaGF2aW9yIGlzIHVzZWQgQU5EIGF1dG9tYXRpY2FsbHksIHRo
ZSBzb2NrZXQgaXMgc2V0CgkgICAgICAgdG8gYmxvY2tpbmcgbW9kZS4gVGhhdCBtZWFucyB0
aGF0IHNvbWVvbmUgd2hvIHdhcyBkb2luZwoJICAgICAgIG5vbi1ibG9ja2luZyBzdHVmZiBi
ZWZvcmUsIHNldHMgYSB0aW1lb3V0LCBhbmQgdGhlbiB1bnNldHMKCSAgICAgICBvbmUsIHdp
bGwgaGF2ZSB0byBjYWxsIHNldGJsb2NraW5nKDApIGFnYWluIGlmIGhlIHdhbnRzCgkgICAg
ICAgbm9uLWJsb2NraW5nIHN0dWZmLiBUaGlzIG1ha2VzIHNlbnNlIGJlY2F1c2UgdGltZW91
dCBzdHVmZiBpcwoJICAgICAgIGJsb2NraW5nIGJ5IG5hdHVyZS4gKi8KCWludGVybmFsX3Nl
dGJsb2NraW5nKHMsIHZhbHVlIDwgMC4wKTsKCglzLT5zb2NrX2Jsb2NraW5nID0gMTsgLyog
QWx3YXlzIG5lZ2F0ZSBzZXRibG9ja2luZygpICovCgoJUHlfSU5DUkVGKFB5X05vbmUpOwoJ
cmV0dXJuIFB5X05vbmU7Cn0KCnN0YXRpYyBjaGFyIHNldHRpbWVvdXRfZG9jW10gPQoic2V0
dGltZW91dCh0aW1lb3V0KVxuXApcblwKU2V0IGEgdGltZW91dCBvbiBibG9ja2luZyBzb2Nr
ZXQgb3BlcmF0aW9ucy4gICd0aW1lb3V0JyBjYW4gYmUgYSBmbG9hdCxcblwKZ2l2aW5nIHNl
Y29uZHMsIG9yIE5vbmUuICBTZXR0aW5nIGEgdGltZW91dCBvZiBOb25lIGRpc2FibGVzIHRp
bWVvdXQuIjsKCi8qIHMuZ2V0dGltZW91dCgpIG1ldGhvZC4KICAgUmV0dXJucyB0aGUgdGlt
ZW91dCBhc3NvY2lhdGVkIHdpdGggYSBzb2NrZXQuICovCnN0YXRpYyBQeU9iamVjdCAqCnNv
Y2tfZ2V0dGltZW91dChQeVNvY2tldFNvY2tPYmplY3QgKnMpCnsKCWlmIChzLT5zb2NrX3Rp
bWVvdXQgPCAwLjApIHsKCQlQeV9JTkNSRUYoUHlfTm9uZSk7CgkJcmV0dXJuIFB5X05vbmU7
Cgl9CgllbHNlCgkJcmV0dXJuIFB5RmxvYXRfRnJvbURvdWJsZShzLT5zb2NrX3RpbWVvdXQp
Owp9CgpzdGF0aWMgY2hhciBnZXR0aW1lb3V0X2RvY1tdID0KImdldHRpbWVvdXQoKVxuXApc
blwKUmV0dXJucyB0aGUgdGltZW91dCBpbiBmbG9hdGluZyBzZWNvbmRzIGFzc29jaWF0ZWQg
d2l0aCBzb2NrZXQgXG5cCm9wZXJhdGlvbnMuIEEgdGltZW91dCBvZiBOb25lIGluZGljYXRl
cyB0aGF0IHRpbWVvdXRzIG9uIHNvY2tldCBcblwKb3BlcmF0aW9ucyBhcmUgZGlzYWJsZWQu
IjsKCiNpZmRlZiBSSVNDT1MKLyogcy5zbGVlcHRhc2t3KDEgfCAwKSBtZXRob2QgKi8KCnN0
YXRpYyBQeU9iamVjdCAqCnNvY2tfc2xlZXB0YXNrdyhQeVNvY2tldFNvY2tPYmplY3QgKnMs
UHlPYmplY3QgKmFyZ3MpCnsKCWludCBibG9jazsKCWludCBkZWxheV9mbGFnOwoJaWYgKCFQ
eUFyZ19QYXJzZShhcmdzLCAiaSIsICZibG9jaykpCgkJcmV0dXJuIE5VTEw7CglQeV9CRUdJ
Tl9BTExPV19USFJFQURTCglzb2NrZXRpb2N0bChzLT5zb2NrX2ZkLCAweDgwMDQ2Njc5LCAo
dV9sb25nKikmYmxvY2spOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCglQeV9JTkNSRUYoUHlf
Tm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQpzdGF0aWMgY2hhciBzbGVlcHRhc2t3X2RvY1td
ID0KInNsZWVwdGFza3coZmxhZylcblwKXG5cCkFsbG93IHNsZWVwcyBpbiB0YXNrd2luZG93
cy4iOwojZW5kaWYKCgovKiBzLnNldHNvY2tvcHQoKSBtZXRob2QuCiAgIFdpdGggYW4gaW50
ZWdlciB0aGlyZCBhcmd1bWVudCwgc2V0cyBhbiBpbnRlZ2VyIG9wdGlvbi4KICAgV2l0aCBh
IHN0cmluZyB0aGlyZCBhcmd1bWVudCwgc2V0cyBhbiBvcHRpb24gZnJvbSBhIGJ1ZmZlcjsK
ICAgdXNlIG9wdGlvbmFsIGJ1aWx0LWluIG1vZHVsZSAnc3RydWN0JyB0byBlbmNvZGUgdGhl
IHN0cmluZy4gKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfc2V0c29ja29wdChQeVNvY2tl
dFNvY2tPYmplY3QgKnMsIFB5T2JqZWN0ICphcmdzKQp7CglpbnQgbGV2ZWw7CglpbnQgb3B0
bmFtZTsKCWludCByZXM7CgljaGFyICpidWY7CglpbnQgYnVmbGVuOwoJaW50IGZsYWc7CgoJ
aWYgKFB5QXJnX1BhcnNlVHVwbGUoYXJncywgImlpaTpzZXRzb2Nrb3B0IiwKCQkJICAgICAm
bGV2ZWwsICZvcHRuYW1lLCAmZmxhZykpIHsKCQlidWYgPSAoY2hhciAqKSAmZmxhZzsKCQli
dWZsZW4gPSBzaXplb2YgZmxhZzsKCX0KCWVsc2UgewoJCVB5RXJyX0NsZWFyKCk7CgkJaWYg
KCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJpaXMjOnNldHNvY2tvcHQiLAoJCQkJICAgICAg
JmxldmVsLCAmb3B0bmFtZSwgJmJ1ZiwgJmJ1ZmxlbikpCgkJCXJldHVybiBOVUxMOwoJfQoJ
cmVzID0gc2V0c29ja29wdChzLT5zb2NrX2ZkLCBsZXZlbCwgb3B0bmFtZSwgKHZvaWQgKili
dWYsIGJ1Zmxlbik7CglpZiAocmVzIDwgMCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7
CglQeV9JTkNSRUYoUHlfTm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3RhdGljIGNoYXIg
c2V0c29ja29wdF9kb2NbXSA9CiJzZXRzb2Nrb3B0KGxldmVsLCBvcHRpb24sIHZhbHVlKVxu
XApcblwKU2V0IGEgc29ja2V0IG9wdGlvbi4gIFNlZSB0aGUgVW5peCBtYW51YWwgZm9yIGxl
dmVsIGFuZCBvcHRpb24uXG5cClRoZSB2YWx1ZSBhcmd1bWVudCBjYW4gZWl0aGVyIGJlIGFu
IGludGVnZXIgb3IgYSBzdHJpbmcuIjsKCgovKiBzLmdldHNvY2tvcHQoKSBtZXRob2QuCiAg
IFdpdGggdHdvIGFyZ3VtZW50cywgcmV0cmlldmVzIGFuIGludGVnZXIgb3B0aW9uLgogICBX
aXRoIGEgdGhpcmQgaW50ZWdlciBhcmd1bWVudCwgcmV0cmlldmVzIGEgc3RyaW5nIGJ1ZmZl
ciBvZiB0aGF0IHNpemU7CiAgIHVzZSBvcHRpb25hbCBidWlsdC1pbiBtb2R1bGUgJ3N0cnVj
dCcgdG8gZGVjb2RlIHRoZSBzdHJpbmcuICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2dl
dHNvY2tvcHQoUHlTb2NrZXRTb2NrT2JqZWN0ICpzLCBQeU9iamVjdCAqYXJncykKewoJaW50
IGxldmVsOwoJaW50IG9wdG5hbWU7CglpbnQgcmVzOwoJUHlPYmplY3QgKmJ1ZjsKCXNvY2ts
ZW5fdCBidWZsZW4gPSAwOwoKI2lmZGVmIF9fQkVPU19fCgkvKiBXZSBoYXZlIGluY29tcGxl
dGUgc29ja2V0IHN1cHBvcnQuICovCglQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLCAi
Z2V0c29ja29wdCBub3Qgc3VwcG9ydGVkIik7CglyZXR1cm4gTlVMTDsKI2Vsc2UKCglpZiAo
IVB5QXJnX1BhcnNlVHVwbGUoYXJncywgImlpfGk6Z2V0c29ja29wdCIsCgkJCSAgICAgICZs
ZXZlbCwgJm9wdG5hbWUsICZidWZsZW4pKQoJCXJldHVybiBOVUxMOwoKCWlmIChidWZsZW4g
PT0gMCkgewoJCWludCBmbGFnID0gMDsKCQlzb2NrbGVuX3QgZmxhZ3NpemUgPSBzaXplb2Yg
ZmxhZzsKCQlyZXMgPSBnZXRzb2Nrb3B0KHMtPnNvY2tfZmQsIGxldmVsLCBvcHRuYW1lLAoJ
CQkJICh2b2lkICopJmZsYWcsICZmbGFnc2l6ZSk7CgkJaWYgKHJlcyA8IDApCgkJCXJldHVy
biBzLT5lcnJvcmhhbmRsZXIoKTsKCQlyZXR1cm4gUHlJbnRfRnJvbUxvbmcoZmxhZyk7Cgl9
CglpZiAoYnVmbGVuIDw9IDAgfHwgYnVmbGVuID4gMTAyNCkgewoJCVB5RXJyX1NldFN0cmlu
Zyhzb2NrZXRfZXJyb3IsCgkJCQkiZ2V0c29ja29wdCBidWZsZW4gb3V0IG9mIHJhbmdlIik7
CgkJcmV0dXJuIE5VTEw7Cgl9CglidWYgPSBQeVN0cmluZ19Gcm9tU3RyaW5nQW5kU2l6ZSgo
Y2hhciAqKU5VTEwsIGJ1Zmxlbik7CglpZiAoYnVmID09IE5VTEwpCgkJcmV0dXJuIE5VTEw7
CglyZXMgPSBnZXRzb2Nrb3B0KHMtPnNvY2tfZmQsIGxldmVsLCBvcHRuYW1lLAoJCQkgKHZv
aWQgKilQeVN0cmluZ19BU19TVFJJTkcoYnVmKSwgJmJ1Zmxlbik7CglpZiAocmVzIDwgMCkg
ewoJCVB5X0RFQ1JFRihidWYpOwoJCXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCX0KCV9Q
eVN0cmluZ19SZXNpemUoJmJ1ZiwgYnVmbGVuKTsKCXJldHVybiBidWY7CiNlbmRpZiAvKiBf
X0JFT1NfXyAqLwp9CgpzdGF0aWMgY2hhciBnZXRzb2Nrb3B0X2RvY1tdID0KImdldHNvY2tv
cHQobGV2ZWwsIG9wdGlvblssIGJ1ZmZlcnNpemVdKSAtPiB2YWx1ZVxuXApcblwKR2V0IGEg
c29ja2V0IG9wdGlvbi4gIFNlZSB0aGUgVW5peCBtYW51YWwgZm9yIGxldmVsIGFuZCBvcHRp
b24uXG5cCklmIGEgbm9uemVybyBidWZmZXJzaXplIGFyZ3VtZW50IGlzIGdpdmVuLCB0aGUg
cmV0dXJuIHZhbHVlIGlzIGFcblwKc3RyaW5nIG9mIHRoYXQgbGVuZ3RoOyBvdGhlcndpc2Ug
aXQgaXMgYW4gaW50ZWdlci4iOwoKCi8qIHMuYmluZChzb2NrYWRkcikgbWV0aG9kICovCgpz
dGF0aWMgUHlPYmplY3QgKgpzb2NrX2JpbmQoUHlTb2NrZXRTb2NrT2JqZWN0ICpzLCBQeU9i
amVjdCAqYWRkcm8pCnsKCXN0cnVjdCBzb2NrYWRkciAqYWRkcjsKCWludCBhZGRybGVuOwoJ
aW50IHJlczsKCglpZiAoIWdldHNvY2thZGRyYXJnKHMsIGFkZHJvLCAmYWRkciwgJmFkZHJs
ZW4pKQoJCXJldHVybiBOVUxMOwoJUHlfQkVHSU5fQUxMT1dfVEhSRUFEUwoJcmVzID0gYmlu
ZChzLT5zb2NrX2ZkLCBhZGRyLCBhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCglp
ZiAocmVzIDwgMCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7CglQeV9JTkNSRUYoUHlf
Tm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3RhdGljIGNoYXIgYmluZF9kb2NbXSA9CiJi
aW5kKGFkZHJlc3MpXG5cClxuXApCaW5kIHRoZSBzb2NrZXQgdG8gYSBsb2NhbCBhZGRyZXNz
LiAgRm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzIGlzIGFcblwKcGFpciAoaG9zdCwgcG9y
dCk7IHRoZSBob3N0IG11c3QgcmVmZXIgdG8gdGhlIGxvY2FsIGhvc3QuIEZvciByYXcgcGFj
a2V0XG5cCnNvY2tldHMgdGhlIGFkZHJlc3MgaXMgYSB0dXBsZSAoaWZuYW1lLCBwcm90byBb
LHBrdHR5cGUgWyxoYXR5cGVdXSkiOwoKCi8qIHMuY2xvc2UoKSBtZXRob2QuCiAgIFNldCB0
aGUgZmlsZSBkZXNjcmlwdG9yIHRvIC0xIHNvIG9wZXJhdGlvbnMgdHJpZWQgc3Vic2VxdWVu
dGx5CiAgIHdpbGwgc3VyZWx5IGZhaWwuICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2Ns
b3NlKFB5U29ja2V0U29ja09iamVjdCAqcykKewoJU09DS0VUX1QgZmQ7CgoJaWYgKChmZCA9
IHMtPnNvY2tfZmQpICE9IC0xKSB7CgkJcy0+c29ja19mZCA9IC0xOwoJCVB5X0JFR0lOX0FM
TE9XX1RIUkVBRFMKCQkodm9pZCkgU09DS0VUQ0xPU0UoZmQpOwoJCVB5X0VORF9BTExPV19U
SFJFQURTCgl9CglQeV9JTkNSRUYoUHlfTm9uZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3Rh
dGljIGNoYXIgY2xvc2VfZG9jW10gPQoiY2xvc2UoKVxuXApcblwKQ2xvc2UgdGhlIHNvY2tl
dC4gIEl0IGNhbm5vdCBiZSB1c2VkIGFmdGVyIHRoaXMgY2FsbC4iOwoKCi8qIHMuY29ubmVj
dChzb2NrYWRkcikgbWV0aG9kICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2Nvbm5lY3Qo
UHlTb2NrZXRTb2NrT2JqZWN0ICpzLCBQeU9iamVjdCAqYWRkcm8pCnsKCXN0cnVjdCBzb2Nr
YWRkciAqYWRkcjsKCWludCBhZGRybGVuOwoJaW50IHJlczsKCglpZiAoIWdldHNvY2thZGRy
YXJnKHMsIGFkZHJvLCAmYWRkciwgJmFkZHJsZW4pKQoJCXJldHVybiBOVUxMOwoKCWVycm5v
ID0gMDsgLyogUmVzZXQgdGhlIGVyciBpbmRpY2F0b3IgZm9yIHVzZSB3aXRoIHRpbWVvdXRz
ICovCgoJUHlfQkVHSU5fQUxMT1dfVEhSRUFEUwoJcmVzID0gY29ubmVjdChzLT5zb2NrX2Zk
LCBhZGRyLCBhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYgKHMtPnNvY2tf
dGltZW91dCA+PSAwLjApIHsKCQlpZiAocmVzIDwgMCkgewoJCQkvKiBSZXR1cm4gaWYgd2Un
cmUgYWxyZWFkeSBjb25uZWN0ZWQgKi8KI2lmZGVmIE1TX1dJTkRPV1MKCQkJaWYgKGVycm5v
ID09IFdTQUVJTlZBTCB8fCBlcnJubyA9PSBXU0FFSVNDT05OKQojZWxzZQoJCQlpZiAoZXJy
bm8gPT0gRUlTQ09OTikKI2VuZGlmCgkJCQlnb3RvIGNvbm5lY3RlZDsKCgkJCS8qIENoZWNr
IGlmIHdlIGhhdmUgYW4gZXJyb3IgKi8KCQkJaWYgKCFzLT5zb2NrX2Jsb2NraW5nKQoJCQkJ
cmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJCQkvKiBDaGVjayBpZiB3ZSBoYXZlIGEgdHJ1
ZSBmYWlsdXJlCgkJCSAgIGZvciBhIGJsb2NraW5nIHNvY2tldCAqLwojaWZkZWYgTVNfV0lO
RE9XUwoJCQlpZiAoZXJybm8gIT0gV1NBRVdPVUxEQkxPQ0spCiNlbHNlCgkJCWlmIChlcnJu
byAhPSBFSU5QUk9HUkVTUyAmJiBlcnJubyAhPSBFQUxSRUFEWSAmJgoJCQkgICAgZXJybm8g
IT0gRVdPVUxEQkxPQ0spCiNlbmRpZgoJCQkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJ
CX0KCgkJLyogQ2hlY2sgaWYgd2UncmUgcmVhZHkgZm9yIHRoZSBjb25uZWN0IHZpYSBzZWxl
Y3QgKi8KCQlpZiAoaW50ZXJuYWxfc2VsZWN0KHMsIDEpIDw9IDApCgkJCXJldHVybiBOVUxM
OwoKCQkvKiBDb21wbGV0ZSB0aGUgY29ubmVjdGlvbiBub3cgKi8KCQlQeV9CRUdJTl9BTExP
V19USFJFQURTCgkJcmVzID0gY29ubmVjdChzLT5zb2NrX2ZkLCBhZGRyLCBhZGRybGVuKTsK
CQlQeV9FTkRfQUxMT1dfVEhSRUFEUwoJfQoKCWlmIChyZXMgPCAwKQoJCXJldHVybiBzLT5l
cnJvcmhhbmRsZXIoKTsKCmNvbm5lY3RlZDoKCVB5X0lOQ1JFRihQeV9Ob25lKTsKCXJldHVy
biBQeV9Ob25lOwp9CgpzdGF0aWMgY2hhciBjb25uZWN0X2RvY1tdID0KImNvbm5lY3QoYWRk
cmVzcylcblwKXG5cCkNvbm5lY3QgdGhlIHNvY2tldCB0byBhIHJlbW90ZSBhZGRyZXNzLiAg
Rm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmlzIGEgcGFpciAoaG9zdCwgcG9ydCku
IjsKCgovKiBzLmNvbm5lY3RfZXgoc29ja2FkZHIpIG1ldGhvZCAqLwoKc3RhdGljIFB5T2Jq
ZWN0ICoKc29ja19jb25uZWN0X2V4KFB5U29ja2V0U29ja09iamVjdCAqcywgUHlPYmplY3Qg
KmFkZHJvKQp7CglzdHJ1Y3Qgc29ja2FkZHIgKmFkZHI7CglpbnQgYWRkcmxlbjsKCWludCBy
ZXM7CgoJaWYgKCFnZXRzb2NrYWRkcmFyZyhzLCBhZGRybywgJmFkZHIsICZhZGRybGVuKSkK
CQlyZXR1cm4gTlVMTDsKCgllcnJubyA9IDA7IC8qIFJlc2V0IHRoZSBlcnIgaW5kaWNhdG9y
IGZvciB1c2Ugd2l0aCB0aW1lb3V0cyAqLwoKCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKCXJl
cyA9IGNvbm5lY3Qocy0+c29ja19mZCwgYWRkciwgYWRkcmxlbik7CglQeV9FTkRfQUxMT1df
VEhSRUFEUwoKCWlmIChzLT5zb2NrX3RpbWVvdXQgPj0gMC4wKSB7CgkJaWYgKHJlcyA8IDAp
IHsKCQkJLyogUmV0dXJuIGlmIHdlJ3JlIGFscmVhZHkgY29ubmVjdGVkICovCiNpZmRlZiBN
U19XSU5ET1dTCgkJCWlmIChlcnJubyA9PSBXU0FFSU5WQUwgfHwgZXJybm8gPT0gV1NBRUlT
Q09OTikKI2Vsc2UKCQkJaWYgKGVycm5vID09IEVJU0NPTk4pCiNlbmRpZgoJCQkJZ290byBj
b25leF9maW5hbGx5OwoKCQkJLyogQ2hlY2sgaWYgd2UgaGF2ZSBhbiBlcnJvciAqLwoJCQlp
ZiAoIXMtPnNvY2tfYmxvY2tpbmcpCgkJCQlnb3RvIGNvbmV4X2ZpbmFsbHk7CgkJCS8qIENo
ZWNrIGlmIHdlIGhhdmUgYSB0cnVlIGZhaWx1cmUKCQkJICAgZm9yIGEgYmxvY2tpbmcgc29j
a2V0ICovCiNpZmRlZiBNU19XSU5ET1dTCgkJCWlmIChlcnJubyAhPSBXU0FFV09VTERCTE9D
SykKI2Vsc2UKCQkJaWYgKGVycm5vICE9IEVJTlBST0dSRVNTICYmIGVycm5vICE9IEVBTFJF
QURZICYmCgkJCSAgICBlcnJubyAhPSBFV09VTERCTE9DSykKI2VuZGlmCgkJCQlnb3RvIGNv
bmV4X2ZpbmFsbHk7CgkJfQoKCQkvKiBDaGVjayBpZiB3ZSdyZSByZWFkeSBmb3IgdGhlIGNv
bm5lY3QgdmlhIHNlbGVjdCAqLwoJCWlmIChpbnRlcm5hbF9zZWxlY3QocywgMSkgPD0gMCkK
CQkJcmV0dXJuIE5VTEw7CgoJCS8qIENvbXBsZXRlIHRoZSBjb25uZWN0aW9uIG5vdyAqLwoJ
CVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKCQlyZXMgPSBjb25uZWN0KHMtPnNvY2tfZmQsIGFk
ZHIsIGFkZHJsZW4pOwoJCVB5X0VORF9BTExPV19USFJFQURTCgl9CgoJaWYgKHJlcyAhPSAw
KSB7CiNpZmRlZiBNU19XSU5ET1dTCgkJcmVzID0gV1NBR2V0TGFzdEVycm9yKCk7CiNlbHNl
CgkJcmVzID0gZXJybm87CiNlbmRpZgoJfQoKY29uZXhfZmluYWxseToKCXJldHVybiBQeUlu
dF9Gcm9tTG9uZygobG9uZykgcmVzKTsKfQoKc3RhdGljIGNoYXIgY29ubmVjdF9leF9kb2Nb
XSA9CiJjb25uZWN0X2V4KGFkZHJlc3MpXG5cClxuXApUaGlzIGlzIGxpa2UgY29ubmVjdChh
ZGRyZXNzKSwgYnV0IHJldHVybnMgYW4gZXJyb3IgY29kZSAodGhlIGVycm5vIHZhbHVlKVxu
XAppbnN0ZWFkIG9mIHJhaXNpbmcgYW4gZXhjZXB0aW9uIHdoZW4gYW4gZXJyb3Igb2NjdXJz
LiI7CgoKLyogcy5maWxlbm8oKSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tf
ZmlsZW5vKFB5U29ja2V0U29ja09iamVjdCAqcykKewojaWYgU0laRU9GX1NPQ0tFVF9UIDw9
IFNJWkVPRl9MT05HCglyZXR1cm4gUHlJbnRfRnJvbUxvbmcoKGxvbmcpIHMtPnNvY2tfZmQp
OwojZWxzZQoJcmV0dXJuIFB5TG9uZ19Gcm9tTG9uZ0xvbmcoKExPTkdfTE9ORylzLT5zb2Nr
X2ZkKTsKI2VuZGlmCn0KCnN0YXRpYyBjaGFyIGZpbGVub19kb2NbXSA9CiJmaWxlbm8oKSAt
PiBpbnRlZ2VyXG5cClxuXApSZXR1cm4gdGhlIGludGVnZXIgZmlsZSBkZXNjcmlwdG9yIG9m
IHRoZSBzb2NrZXQuIjsKCgojaWZuZGVmIE5PX0RVUAovKiBzLmR1cCgpIG1ldGhvZCAqLwoK
c3RhdGljIFB5T2JqZWN0ICoKc29ja19kdXAoUHlTb2NrZXRTb2NrT2JqZWN0ICpzKQp7CglT
T0NLRVRfVCBuZXdmZDsKCVB5T2JqZWN0ICpzb2NrOwoKCW5ld2ZkID0gZHVwKHMtPnNvY2tf
ZmQpOwoJaWYgKG5ld2ZkIDwgMCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7Cglzb2Nr
ID0gKFB5T2JqZWN0ICopIG5ld19zb2Nrb2JqZWN0KG5ld2ZkLAoJCQkJCSAgIHMtPnNvY2tf
ZmFtaWx5LAoJCQkJCSAgIHMtPnNvY2tfdHlwZSwKCQkJCQkgICBzLT5zb2NrX3Byb3RvKTsK
CWlmIChzb2NrID09IE5VTEwpCgkJU09DS0VUQ0xPU0UobmV3ZmQpOwoJcmV0dXJuIHNvY2s7
Cn0KCnN0YXRpYyBjaGFyIGR1cF9kb2NbXSA9CiJkdXAoKSAtPiBzb2NrZXQgb2JqZWN0XG5c
ClxuXApSZXR1cm4gYSBuZXcgc29ja2V0IG9iamVjdCBjb25uZWN0ZWQgdG8gdGhlIHNhbWUg
c3lzdGVtIHJlc291cmNlLiI7CgojZW5kaWYKCgovKiBzLmdldHNvY2tuYW1lKCkgbWV0aG9k
ICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2dldHNvY2tuYW1lKFB5U29ja2V0U29ja09i
amVjdCAqcykKewoJY2hhciBhZGRyYnVmWzI1Nl07CglpbnQgcmVzOwoJc29ja2xlbl90IGFk
ZHJsZW47CgoJaWYgKCFnZXRzb2NrYWRkcmxlbihzLCAmYWRkcmxlbikpCgkJcmV0dXJuIE5V
TEw7CgltZW1zZXQoYWRkcmJ1ZiwgMCwgYWRkcmxlbik7CglQeV9CRUdJTl9BTExPV19USFJF
QURTCglyZXMgPSBnZXRzb2NrbmFtZShzLT5zb2NrX2ZkLCAoc3RydWN0IHNvY2thZGRyICop
IGFkZHJidWYsICZhZGRybGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCglpZiAocmVzIDwg
MCkKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7CglyZXR1cm4gbWFrZXNvY2thZGRyKHMt
PnNvY2tfZmQsIChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRkcmJ1ZiwgYWRkcmxlbik7Cn0KCnN0
YXRpYyBjaGFyIGdldHNvY2tuYW1lX2RvY1tdID0KImdldHNvY2tuYW1lKCkgLT4gYWRkcmVz
cyBpbmZvXG5cClxuXApSZXR1cm4gdGhlIGFkZHJlc3Mgb2YgdGhlIGxvY2FsIGVuZHBvaW50
LiAgRm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmluZm8gaXMgYSBwYWlyIChob3N0
YWRkciwgcG9ydCkuIjsKCgojaWZkZWYgSEFWRV9HRVRQRUVSTkFNRQkJLyogQ3JheSBBUFAg
ZG9lc24ndCBoYXZlIHRoaXMgOi0oICovCi8qIHMuZ2V0cGVlcm5hbWUoKSBtZXRob2QgKi8K
CnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfZ2V0cGVlcm5hbWUoUHlTb2NrZXRTb2NrT2JqZWN0
ICpzKQp7CgljaGFyIGFkZHJidWZbMjU2XTsKCWludCByZXM7Cglzb2NrbGVuX3QgYWRkcmxl
bjsKCglpZiAoIWdldHNvY2thZGRybGVuKHMsICZhZGRybGVuKSkKCQlyZXR1cm4gTlVMTDsK
CW1lbXNldChhZGRyYnVmLCAwLCBhZGRybGVuKTsKCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMK
CXJlcyA9IGdldHBlZXJuYW1lKHMtPnNvY2tfZmQsIChzdHJ1Y3Qgc29ja2FkZHIgKikgYWRk
cmJ1ZiwgJmFkZHJsZW4pOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCWlmIChyZXMgPCAwKQoJ
CXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCXJldHVybiBtYWtlc29ja2FkZHIocy0+c29j
a19mZCwgKHN0cnVjdCBzb2NrYWRkciAqKSBhZGRyYnVmLCBhZGRybGVuKTsKfQoKc3RhdGlj
IGNoYXIgZ2V0cGVlcm5hbWVfZG9jW10gPQoiZ2V0cGVlcm5hbWUoKSAtPiBhZGRyZXNzIGlu
Zm9cblwKXG5cClJldHVybiB0aGUgYWRkcmVzcyBvZiB0aGUgcmVtb3RlIGVuZHBvaW50LiAg
Rm9yIElQIHNvY2tldHMsIHRoZSBhZGRyZXNzXG5cCmluZm8gaXMgYSBwYWlyIChob3N0YWRk
ciwgcG9ydCkuIjsKCiNlbmRpZiAvKiBIQVZFX0dFVFBFRVJOQU1FICovCgoKLyogcy5saXN0
ZW4obikgbWV0aG9kICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX2xpc3RlbihQeVNvY2tl
dFNvY2tPYmplY3QgKnMsIFB5T2JqZWN0ICphcmcpCnsKCWludCBiYWNrbG9nOwoJaW50IHJl
czsKCgliYWNrbG9nID0gUHlJbnRfQXNMb25nKGFyZyk7CglpZiAoYmFja2xvZyA9PSAtMSAm
JiBQeUVycl9PY2N1cnJlZCgpKQoJCXJldHVybiBOVUxMOwoJUHlfQkVHSU5fQUxMT1dfVEhS
RUFEUwoJaWYgKGJhY2tsb2cgPCAxKQoJCWJhY2tsb2cgPSAxOwoJcmVzID0gbGlzdGVuKHMt
PnNvY2tfZmQsIGJhY2tsb2cpOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCWlmIChyZXMgPCAw
KQoJCXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCVB5X0lOQ1JFRihQeV9Ob25lKTsKCXJl
dHVybiBQeV9Ob25lOwp9CgpzdGF0aWMgY2hhciBsaXN0ZW5fZG9jW10gPQoibGlzdGVuKGJh
Y2tsb2cpXG5cClxuXApFbmFibGUgYSBzZXJ2ZXIgdG8gYWNjZXB0IGNvbm5lY3Rpb25zLiAg
VGhlIGJhY2tsb2cgYXJndW1lbnQgbXVzdCBiZSBhdFxuXApsZWFzdCAxOyBpdCBzcGVjaWZp
ZXMgdGhlIG51bWJlciBvZiB1bmFjY2VwdGVkIGNvbm5lY3Rpb24gdGhhdCB0aGUgc3lzdGVt
XG5cCndpbGwgYWxsb3cgYmVmb3JlIHJlZnVzaW5nIG5ldyBjb25uZWN0aW9ucy4iOwoKCiNp
Zm5kZWYgTk9fRFVQCi8qIHMubWFrZWZpbGUobW9kZSkgbWV0aG9kLgogICBDcmVhdGUgYSBu
ZXcgb3BlbiBmaWxlIG9iamVjdCByZWZlcnJpbmcgdG8gYSBkdXBwZWQgdmVyc2lvbiBvZgog
ICB0aGUgc29ja2V0J3MgZmlsZSBkZXNjcmlwdG9yLiAgKFRoZSBkdXAoKSBjYWxsIGlzIG5l
Y2Vzc2FyeSBzbwogICB0aGF0IHRoZSBvcGVuIGZpbGUgYW5kIHNvY2tldCBvYmplY3RzIG1h
eSBiZSBjbG9zZWQgaW5kZXBlbmRlbnQKICAgb2YgZWFjaCBvdGhlci4pCiAgIFRoZSBtb2Rl
IGFyZ3VtZW50IHNwZWNpZmllcyAncicgb3IgJ3cnIHBhc3NlZCB0byBmZG9wZW4oKS4gKi8K
CnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfbWFrZWZpbGUoUHlTb2NrZXRTb2NrT2JqZWN0ICpz
LCBQeU9iamVjdCAqYXJncykKewoJZXh0ZXJuIGludCBmY2xvc2UoRklMRSAqKTsKCWNoYXIg
Km1vZGUgPSAiciI7CglpbnQgYnVmc2l6ZSA9IC0xOwojaWZkZWYgTVNfV0lOMzIKCVB5X2lu
dHB0cl90IGZkOwojZWxzZQoJaW50IGZkOwojZW5kaWYKCUZJTEUgKmZwOwoJUHlPYmplY3Qg
KmY7CgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJ8c2k6bWFrZWZpbGUiLCAmbW9k
ZSwgJmJ1ZnNpemUpKQoJCXJldHVybiBOVUxMOwojaWZkZWYgTVNfV0lOMzIKCWlmICgoKGZk
ID0gX29wZW5fb3NmaGFuZGxlKHMtPnNvY2tfZmQsIF9PX0JJTkFSWSkpIDwgMCkgfHwKCSAg
ICAoKGZkID0gZHVwKGZkKSkgPCAwKSB8fCAoKGZwID0gZmRvcGVuKGZkLCBtb2RlKSkgPT0g
TlVMTCkpCiNlbHNlCglpZiAoKGZkID0gZHVwKHMtPnNvY2tfZmQpKSA8IDAgfHwgKGZwID0g
ZmRvcGVuKGZkLCBtb2RlKSkgPT0gTlVMTCkKI2VuZGlmCgl7CgkJaWYgKGZkID49IDApCgkJ
CVNPQ0tFVENMT1NFKGZkKTsKCQlyZXR1cm4gcy0+ZXJyb3JoYW5kbGVyKCk7Cgl9CiNpZmRl
ZiBVU0VfR1VTSTIKCS8qIFdvcmthcm91bmQgZm9yIGJ1ZyBpbiBNZXRyb3dlcmtzIE1TTCB2
cy4gR1VTSSBJL08gbGlicmFyeSAqLwoJaWYgKHN0cmNocihtb2RlLCAnYicpICE9IE5VTEwp
CgkJYnVmc2l6ZSA9IDA7CiNlbmRpZgoJZiA9IFB5RmlsZV9Gcm9tRmlsZShmcCwgIjxzb2Nr
ZXQ+IiwgbW9kZSwgZmNsb3NlKTsKCWlmIChmICE9IE5VTEwpCgkJUHlGaWxlX1NldEJ1ZlNp
emUoZiwgYnVmc2l6ZSk7CglyZXR1cm4gZjsKfQoKc3RhdGljIGNoYXIgbWFrZWZpbGVfZG9j
W10gPQoibWFrZWZpbGUoW21vZGVbLCBidWZmZXJzaXplXV0pIC0+IGZpbGUgb2JqZWN0XG5c
ClxuXApSZXR1cm4gYSByZWd1bGFyIGZpbGUgb2JqZWN0IGNvcnJlc3BvbmRpbmcgdG8gdGhl
IHNvY2tldC5cblwKVGhlIG1vZGUgYW5kIGJ1ZmZlcnNpemUgYXJndW1lbnRzIGFyZSBhcyBm
b3IgdGhlIGJ1aWx0LWluIG9wZW4oKSBmdW5jdGlvbi4iOwoKI2VuZGlmIC8qIE5PX0RVUCAq
LwoKCi8qIHMucmVjdihuYnl0ZXMgWyxmbGFnc10pIG1ldGhvZCAqLwoKc3RhdGljIFB5T2Jq
ZWN0ICoKc29ja19yZWN2KFB5U29ja2V0U29ja09iamVjdCAqcywgUHlPYmplY3QgKmFyZ3Mp
CnsKCWludCBsZW4sIG4sIGZsYWdzID0gMDsKCVB5T2JqZWN0ICpidWY7CgoJaWYgKCFQeUFy
Z19QYXJzZVR1cGxlKGFyZ3MsICJpfGk6cmVjdiIsICZsZW4sICZmbGFncykpCgkJcmV0dXJu
IE5VTEw7CgoJaWYgKGxlbiA8IDApIHsKCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfVmFsdWVF
cnJvciwKCQkJCSJuZWdhdGl2ZSBidWZmZXJzaXplIGluIGNvbm5lY3QiKTsKCQlyZXR1cm4g
TlVMTDsKCX0KCglidWYgPSBQeVN0cmluZ19Gcm9tU3RyaW5nQW5kU2l6ZSgoY2hhciAqKSAw
LCBsZW4pOwoJaWYgKGJ1ZiA9PSBOVUxMKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2Nr
X3RpbWVvdXQgPj0gMC4wKSB7CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGlu
dGVybmFsX3NlbGVjdChzLCAwKSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5
X0JFR0lOX0FMTE9XX1RIUkVBRFMKCW4gPSByZWN2KHMtPnNvY2tfZmQsIFB5U3RyaW5nX0FT
X1NUUklORyhidWYpLCBsZW4sIGZsYWdzKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYg
KG4gPCAwKSB7CgkJUHlfREVDUkVGKGJ1Zik7CgkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigp
OwoJfQoJaWYgKG4gIT0gbGVuKQoJCV9QeVN0cmluZ19SZXNpemUoJmJ1Ziwgbik7CglyZXR1
cm4gYnVmOwp9CgpzdGF0aWMgY2hhciByZWN2X2RvY1tdID0KInJlY3YoYnVmZmVyc2l6ZVss
IGZsYWdzXSkgLT4gZGF0YVxuXApcblwKUmVjZWl2ZSB1cCB0byBidWZmZXJzaXplIGJ5dGVz
IGZyb20gdGhlIHNvY2tldC4gIEZvciB0aGUgb3B0aW9uYWwgZmxhZ3NcblwKYXJndW1lbnQs
IHNlZSB0aGUgVW5peCBtYW51YWwuICBXaGVuIG5vIGRhdGEgaXMgYXZhaWxhYmxlLCBibG9j
ayB1bnRpbFxuXAphdCBsZWFzdCBvbmUgYnl0ZSBpcyBhdmFpbGFibGUgb3IgdW50aWwgdGhl
IHJlbW90ZSBlbmQgaXMgY2xvc2VkLiAgV2hlblxuXAp0aGUgcmVtb3RlIGVuZCBpcyBjbG9z
ZWQgYW5kIGFsbCBkYXRhIGlzIHJlYWQsIHJldHVybiB0aGUgZW1wdHkgc3RyaW5nLiI7CgoK
Lyogcy5yZWN2ZnJvbShuYnl0ZXMgWyxmbGFnc10pIG1ldGhvZCAqLwoKc3RhdGljIFB5T2Jq
ZWN0ICoKc29ja19yZWN2ZnJvbShQeVNvY2tldFNvY2tPYmplY3QgKnMsIFB5T2JqZWN0ICph
cmdzKQp7CgljaGFyIGFkZHJidWZbMjU2XTsKCVB5T2JqZWN0ICpidWYgPSBOVUxMOwoJUHlP
YmplY3QgKmFkZHIgPSBOVUxMOwoJUHlPYmplY3QgKnJldCA9IE5VTEw7CglpbnQgbGVuLCBu
LCBmbGFncyA9IDA7Cglzb2NrbGVuX3QgYWRkcmxlbjsKCglpZiAoIVB5QXJnX1BhcnNlVHVw
bGUoYXJncywgIml8aTpyZWN2ZnJvbSIsICZsZW4sICZmbGFncykpCgkJcmV0dXJuIE5VTEw7
CgoJaWYgKCFnZXRzb2NrYWRkcmxlbihzLCAmYWRkcmxlbikpCgkJcmV0dXJuIE5VTEw7Cgli
dWYgPSBQeVN0cmluZ19Gcm9tU3RyaW5nQW5kU2l6ZSgoY2hhciAqKSAwLCBsZW4pOwoJaWYg
KGJ1ZiA9PSBOVUxMKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2NrX3RpbWVvdXQgPj0g
MC4wKSB7CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGludGVybmFsX3NlbGVj
dChzLCAwKSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5X0JFR0lOX0FMTE9X
X1RIUkVBRFMKCW1lbXNldChhZGRyYnVmLCAwLCBhZGRybGVuKTsKCW4gPSByZWN2ZnJvbShz
LT5zb2NrX2ZkLCBQeVN0cmluZ19BU19TVFJJTkcoYnVmKSwgbGVuLCBmbGFncywKI2lmbmRl
ZiBNU19XSU5ET1dTCiNpZiBkZWZpbmVkKFBZT1NfT1MyKSAmJiAhZGVmaW5lZChQWUNDX0dD
QykKCQkgICAgIChzdHJ1Y3Qgc29ja2FkZHIgKilhZGRyYnVmLCAmYWRkcmxlbgojZWxzZQoJ
CSAgICAgKHZvaWQgKilhZGRyYnVmLCAmYWRkcmxlbgojZW5kaWYKI2Vsc2UKCQkgICAgIChz
dHJ1Y3Qgc29ja2FkZHIgKilhZGRyYnVmLCAmYWRkcmxlbgojZW5kaWYKCQkgICAgICk7CglQ
eV9FTkRfQUxMT1dfVEhSRUFEUwoKCWlmIChuIDwgMCkgewoJCVB5X0RFQ1JFRihidWYpOwoJ
CXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCX0KCglpZiAobiAhPSBsZW4gJiYgX1B5U3Ry
aW5nX1Jlc2l6ZSgmYnVmLCBuKSA8IDApCgkJcmV0dXJuIE5VTEw7CgoJaWYgKCEoYWRkciA9
IG1ha2Vzb2NrYWRkcihzLT5zb2NrX2ZkLCAoc3RydWN0IHNvY2thZGRyICopYWRkcmJ1ZiwK
CQkJCSAgYWRkcmxlbikpKQoJCWdvdG8gZmluYWxseTsKCglyZXQgPSBQeV9CdWlsZFZhbHVl
KCJPTyIsIGJ1ZiwgYWRkcik7CgpmaW5hbGx5OgoJUHlfWERFQ1JFRihhZGRyKTsKCVB5X1hE
RUNSRUYoYnVmKTsKCXJldHVybiByZXQ7Cn0KCnN0YXRpYyBjaGFyIHJlY3Zmcm9tX2RvY1td
ID0KInJlY3Zmcm9tKGJ1ZmZlcnNpemVbLCBmbGFnc10pIC0+IChkYXRhLCBhZGRyZXNzIGlu
Zm8pXG5cClxuXApMaWtlIHJlY3YoYnVmZmVyc2l6ZSwgZmxhZ3MpIGJ1dCBhbHNvIHJldHVy
biB0aGUgc2VuZGVyJ3MgYWRkcmVzcyBpbmZvLiI7CgovKiBzLnNlbmQoZGF0YSBbLGZsYWdz
XSkgbWV0aG9kICovCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrX3NlbmQoUHlTb2NrZXRTb2Nr
T2JqZWN0ICpzLCBQeU9iamVjdCAqYXJncykKewoJY2hhciAqYnVmOwoJaW50IGxlbiwgbiwg
ZmxhZ3MgPSAwOwoKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAicyN8aTpzZW5kIiwg
JmJ1ZiwgJmxlbiwgJmZsYWdzKSkKCQlyZXR1cm4gTlVMTDsKCglpZiAocy0+c29ja190aW1l
b3V0ID49IDAuMCkgewoJCWlmIChzLT5zb2NrX2Jsb2NraW5nKSB7CgkJCWlmIChpbnRlcm5h
bF9zZWxlY3QocywgMSkgPD0gMCkKCQkJCXJldHVybiBOVUxMOwoJCX0KCX0KCglQeV9CRUdJ
Tl9BTExPV19USFJFQURTCgluID0gc2VuZChzLT5zb2NrX2ZkLCBidWYsIGxlbiwgZmxhZ3Mp
OwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCglpZiAobiA8IDApCgkJcmV0dXJuIHMtPmVycm9y
aGFuZGxlcigpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKChsb25nKW4pOwp9CgpzdGF0aWMg
Y2hhciBzZW5kX2RvY1tdID0KInNlbmQoZGF0YVssIGZsYWdzXSkgLT4gY291bnRcblwKXG5c
ClNlbmQgYSBkYXRhIHN0cmluZyB0byB0aGUgc29ja2V0LiAgRm9yIHRoZSBvcHRpb25hbCBm
bGFnc1xuXAphcmd1bWVudCwgc2VlIHRoZSBVbml4IG1hbnVhbC4gIFJldHVybiB0aGUgbnVt
YmVyIG9mIGJ5dGVzXG5cCnNlbnQ7IHRoaXMgbWF5IGJlIGxlc3MgdGhhbiBsZW4oZGF0YSkg
aWYgdGhlIG5ldHdvcmsgaXMgYnVzeS4iOwoKCi8qIHMuc2VuZGFsbChkYXRhIFssZmxhZ3Nd
KSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfc2VuZGFsbChQeVNvY2tldFNv
Y2tPYmplY3QgKnMsIFB5T2JqZWN0ICphcmdzKQp7CgljaGFyICpidWY7CglpbnQgbGVuLCBu
LCBmbGFncyA9IDA7CgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJzI3xpOnNlbmRh
bGwiLCAmYnVmLCAmbGVuLCAmZmxhZ3MpKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2Nr
X3RpbWVvdXQgPj0gMC4wKSB7CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGlu
dGVybmFsX3NlbGVjdChzLCAxKSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5
X0JFR0lOX0FMTE9XX1RIUkVBRFMKCWRvIHsKCQluID0gc2VuZChzLT5zb2NrX2ZkLCBidWYs
IGxlbiwgZmxhZ3MpOwoJCWlmIChuIDwgMCkKCQkJYnJlYWs7CgkJYnVmICs9IG47CgkJbGVu
IC09IG47Cgl9IHdoaWxlIChsZW4gPiAwKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYg
KG4gPCAwKQoJCXJldHVybiBzLT5lcnJvcmhhbmRsZXIoKTsKCglQeV9JTkNSRUYoUHlfTm9u
ZSk7CglyZXR1cm4gUHlfTm9uZTsKfQoKc3RhdGljIGNoYXIgc2VuZGFsbF9kb2NbXSA9CiJz
ZW5kYWxsKGRhdGFbLCBmbGFnc10pXG5cClxuXApTZW5kIGEgZGF0YSBzdHJpbmcgdG8gdGhl
IHNvY2tldC4gIEZvciB0aGUgb3B0aW9uYWwgZmxhZ3NcblwKYXJndW1lbnQsIHNlZSB0aGUg
VW5peCBtYW51YWwuICBUaGlzIGNhbGxzIHNlbmQoKSByZXBlYXRlZGx5XG5cCnVudGlsIGFs
bCBkYXRhIGlzIHNlbnQuICBJZiBhbiBlcnJvciBvY2N1cnMsIGl0J3MgaW1wb3NzaWJsZVxu
XAp0byB0ZWxsIGhvdyBtdWNoIGRhdGEgaGFzIGJlZW4gc2VudC4iOwoKCi8qIHMuc2VuZHRv
KGRhdGEsIFtmbGFncyxdIHNvY2thZGRyKSBtZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAq
CnNvY2tfc2VuZHRvKFB5U29ja2V0U29ja09iamVjdCAqcywgUHlPYmplY3QgKmFyZ3MpCnsK
CVB5T2JqZWN0ICphZGRybzsKCWNoYXIgKmJ1ZjsKCXN0cnVjdCBzb2NrYWRkciAqYWRkcjsK
CWludCBhZGRybGVuLCBsZW4sIG4sIGZsYWdzOwoKCWZsYWdzID0gMDsKCWlmICghUHlBcmdf
UGFyc2VUdXBsZShhcmdzLCAicyNPOnNlbmR0byIsICZidWYsICZsZW4sICZhZGRybykpIHsK
CQlQeUVycl9DbGVhcigpOwoJCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAicyNpTzpz
ZW5kdG8iLAoJCQkJICAgICAgJmJ1ZiwgJmxlbiwgJmZsYWdzLCAmYWRkcm8pKQoJCQlyZXR1
cm4gTlVMTDsKCX0KCglpZiAoIWdldHNvY2thZGRyYXJnKHMsIGFkZHJvLCAmYWRkciwgJmFk
ZHJsZW4pKQoJCXJldHVybiBOVUxMOwoKCWlmIChzLT5zb2NrX3RpbWVvdXQgPj0gMC4wKSB7
CgkJaWYgKHMtPnNvY2tfYmxvY2tpbmcpIHsKCQkJaWYgKGludGVybmFsX3NlbGVjdChzLCAx
KSA8PSAwKQoJCQkJcmV0dXJuIE5VTEw7CgkJfQoJfQoKCVB5X0JFR0lOX0FMTE9XX1RIUkVB
RFMKCW4gPSBzZW5kdG8ocy0+c29ja19mZCwgYnVmLCBsZW4sIGZsYWdzLCBhZGRyLCBhZGRy
bGVuKTsKCVB5X0VORF9BTExPV19USFJFQURTCgoJaWYgKG4gPCAwKQoJCXJldHVybiBzLT5l
cnJvcmhhbmRsZXIoKTsKCXJldHVybiBQeUludF9Gcm9tTG9uZygobG9uZyluKTsKfQoKc3Rh
dGljIGNoYXIgc2VuZHRvX2RvY1tdID0KInNlbmR0byhkYXRhWywgZmxhZ3NdLCBhZGRyZXNz
KVxuXApcblwKTGlrZSBzZW5kKGRhdGEsIGZsYWdzKSBidXQgYWxsb3dzIHNwZWNpZnlpbmcg
dGhlIGRlc3RpbmF0aW9uIGFkZHJlc3MuXG5cCkZvciBJUCBzb2NrZXRzLCB0aGUgYWRkcmVz
cyBpcyBhIHBhaXIgKGhvc3RhZGRyLCBwb3J0KS4iOwoKCi8qIHMuc2h1dGRvd24oaG93KSBt
ZXRob2QgKi8KCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tfc2h1dGRvd24oUHlTb2NrZXRTb2Nr
T2JqZWN0ICpzLCBQeU9iamVjdCAqYXJnKQp7CglpbnQgaG93OwoJaW50IHJlczsKCglob3cg
PSBQeUludF9Bc0xvbmcoYXJnKTsKCWlmIChob3cgPT0gLTEgJiYgUHlFcnJfT2NjdXJyZWQo
KSkKCQlyZXR1cm4gTlVMTDsKCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKCXJlcyA9IHNodXRk
b3duKHMtPnNvY2tfZmQsIGhvdyk7CglQeV9FTkRfQUxMT1dfVEhSRUFEUwoJaWYgKHJlcyA8
IDApCgkJcmV0dXJuIHMtPmVycm9yaGFuZGxlcigpOwoJUHlfSU5DUkVGKFB5X05vbmUpOwoJ
cmV0dXJuIFB5X05vbmU7Cn0KCnN0YXRpYyBjaGFyIHNodXRkb3duX2RvY1tdID0KInNodXRk
b3duKGZsYWcpXG5cClxuXApTaHV0IGRvd24gdGhlIHJlYWRpbmcgc2lkZSBvZiB0aGUgc29j
a2V0IChmbGFnID09IDApLCB0aGUgd3JpdGluZyBzaWRlXG5cCm9mIHRoZSBzb2NrZXQgKGZs
YWcgPT0gMSksIG9yIGJvdGggZW5kcyAoZmxhZyA9PSAyKS4iOwoKCi8qIExpc3Qgb2YgbWV0
aG9kcyBmb3Igc29ja2V0IG9iamVjdHMgKi8KCnN0YXRpYyBQeU1ldGhvZERlZiBzb2NrX21l
dGhvZHNbXSA9IHsKCXsiYWNjZXB0IiwJKFB5Q0Z1bmN0aW9uKXNvY2tfYWNjZXB0LCBNRVRI
X05PQVJHUywKCQkJYWNjZXB0X2RvY30sCgl7ImJpbmQiLAkoUHlDRnVuY3Rpb24pc29ja19i
aW5kLCBNRVRIX08sCgkJCWJpbmRfZG9jfSwKCXsiY2xvc2UiLAkoUHlDRnVuY3Rpb24pc29j
a19jbG9zZSwgTUVUSF9OT0FSR1MsCgkJCWNsb3NlX2RvY30sCgl7ImNvbm5lY3QiLAkoUHlD
RnVuY3Rpb24pc29ja19jb25uZWN0LCBNRVRIX08sCgkJCWNvbm5lY3RfZG9jfSwKCXsiY29u
bmVjdF9leCIsCShQeUNGdW5jdGlvbilzb2NrX2Nvbm5lY3RfZXgsIE1FVEhfTywKCQkJY29u
bmVjdF9leF9kb2N9LAojaWZuZGVmIE5PX0RVUAoJeyJkdXAiLAkJKFB5Q0Z1bmN0aW9uKXNv
Y2tfZHVwLCBNRVRIX05PQVJHUywKCQkJZHVwX2RvY30sCiNlbmRpZgoJeyJmaWxlbm8iLAko
UHlDRnVuY3Rpb24pc29ja19maWxlbm8sIE1FVEhfTk9BUkdTLAoJCQlmaWxlbm9fZG9jfSwK
I2lmZGVmIEhBVkVfR0VUUEVFUk5BTUUKCXsiZ2V0cGVlcm5hbWUiLAkoUHlDRnVuY3Rpb24p
c29ja19nZXRwZWVybmFtZSwKCQkJTUVUSF9OT0FSR1MsIGdldHBlZXJuYW1lX2RvY30sCiNl
bmRpZgoJeyJnZXRzb2NrbmFtZSIsCShQeUNGdW5jdGlvbilzb2NrX2dldHNvY2tuYW1lLAoJ
CQlNRVRIX05PQVJHUywgZ2V0c29ja25hbWVfZG9jfSwKCXsiZ2V0c29ja29wdCIsCShQeUNG
dW5jdGlvbilzb2NrX2dldHNvY2tvcHQsIE1FVEhfVkFSQVJHUywKCQkJZ2V0c29ja29wdF9k
b2N9LAoJeyJsaXN0ZW4iLAkoUHlDRnVuY3Rpb24pc29ja19saXN0ZW4sIE1FVEhfTywKCQkJ
bGlzdGVuX2RvY30sCiNpZm5kZWYgTk9fRFVQCgl7Im1ha2VmaWxlIiwJKFB5Q0Z1bmN0aW9u
KXNvY2tfbWFrZWZpbGUsIE1FVEhfVkFSQVJHUywKCQkJbWFrZWZpbGVfZG9jfSwKI2VuZGlm
Cgl7InJlY3YiLAkoUHlDRnVuY3Rpb24pc29ja19yZWN2LCBNRVRIX1ZBUkFSR1MsCgkJCXJl
Y3ZfZG9jfSwKCXsicmVjdmZyb20iLAkoUHlDRnVuY3Rpb24pc29ja19yZWN2ZnJvbSwgTUVU
SF9WQVJBUkdTLAoJCQlyZWN2ZnJvbV9kb2N9LAoJeyJzZW5kIiwJKFB5Q0Z1bmN0aW9uKXNv
Y2tfc2VuZCwgTUVUSF9WQVJBUkdTLAoJCQlzZW5kX2RvY30sCgl7InNlbmRhbGwiLAkoUHlD
RnVuY3Rpb24pc29ja19zZW5kYWxsLCBNRVRIX1ZBUkFSR1MsCgkJCXNlbmRhbGxfZG9jfSwK
CXsic2VuZHRvIiwJKFB5Q0Z1bmN0aW9uKXNvY2tfc2VuZHRvLCBNRVRIX1ZBUkFSR1MsCgkJ
CXNlbmR0b19kb2N9LAoJeyJzZXRibG9ja2luZyIsCShQeUNGdW5jdGlvbilzb2NrX3NldGJs
b2NraW5nLCBNRVRIX08sCgkJCXNldGJsb2NraW5nX2RvY30sCgl7InNldHRpbWVvdXQiLCAo
UHlDRnVuY3Rpb24pc29ja19zZXR0aW1lb3V0LCBNRVRIX08sCgkJCXNldHRpbWVvdXRfZG9j
fSwKCXsiZ2V0dGltZW91dCIsIChQeUNGdW5jdGlvbilzb2NrX2dldHRpbWVvdXQsIE1FVEhf
Tk9BUkdTLAoJCQlnZXR0aW1lb3V0X2RvY30sCgl7InNldHNvY2tvcHQiLAkoUHlDRnVuY3Rp
b24pc29ja19zZXRzb2Nrb3B0LCBNRVRIX1ZBUkFSR1MsCgkJCXNldHNvY2tvcHRfZG9jfSwK
CXsic2h1dGRvd24iLAkoUHlDRnVuY3Rpb24pc29ja19zaHV0ZG93biwgTUVUSF9PLAoJCQlz
aHV0ZG93bl9kb2N9LAojaWZkZWYgUklTQ09TCgl7InNsZWVwdGFza3ciLAkoUHlDRnVuY3Rp
b24pc29ja19zbGVlcHRhc2t3LCBNRVRIX1ZBUkFSR1MsCgkgCQlzbGVlcHRhc2t3X2RvY30s
CiNlbmRpZgoJe05VTEwsCQkJTlVMTH0JCS8qIHNlbnRpbmVsICovCn07CgoKLyogRGVhbGxv
Y2F0ZSBhIHNvY2tldCBvYmplY3QgaW4gcmVzcG9uc2UgdG8gdGhlIGxhc3QgUHlfREVDUkVG
KCkuCiAgIEZpcnN0IGNsb3NlIHRoZSBmaWxlIGRlc2NyaXB0aW9uLiAqLwoKc3RhdGljIHZv
aWQKc29ja19kZWFsbG9jKFB5U29ja2V0U29ja09iamVjdCAqcykKewoJaWYgKHMtPnNvY2tf
ZmQgIT0gLTEpCgkJKHZvaWQpIFNPQ0tFVENMT1NFKHMtPnNvY2tfZmQpOwoJcy0+b2JfdHlw
ZS0+dHBfZnJlZSgoUHlPYmplY3QgKilzKTsKfQoKCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tf
cmVwcihQeVNvY2tldFNvY2tPYmplY3QgKnMpCnsKCWNoYXIgYnVmWzUxMl07CiNpZiBTSVpF
T0ZfU09DS0VUX1QgPiBTSVpFT0ZfTE9ORwoJaWYgKHMtPnNvY2tfZmQgPiBMT05HX01BWCkg
ewoJCS8qIHRoaXMgY2FuIG9jY3VyIG9uIFdpbjY0LCBhbmQgYWN0dWFsbHkgdGhlcmUgaXMg
YSBzcGVjaWFsCgkJICAgdWdseSBwcmludGYgZm9ybWF0dGVyIGZvciBkZWNpbWFsIHBvaW50
ZXIgbGVuZ3RoIGludGVnZXIKCQkgICBwcmludGluZywgb25seSBib3RoZXIgaWYgbmVjZXNz
YXJ5Ki8KCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfT3ZlcmZsb3dFcnJvciwKCQkJCSJubyBw
cmludGYgZm9ybWF0dGVyIHRvIGRpc3BsYXkgIgoJCQkJInRoZSBzb2NrZXQgZGVzY3JpcHRv
ciBpbiBkZWNpbWFsIik7CgkJcmV0dXJuIE5VTEw7Cgl9CiNlbmRpZgoJUHlPU19zbnByaW50
ZigKCQlidWYsIHNpemVvZihidWYpLAoJCSI8c29ja2V0IG9iamVjdCwgZmQ9JWxkLCBmYW1p
bHk9JWQsIHR5cGU9JWQsIHByb3RvY29sPSVkPiIsCgkJKGxvbmcpcy0+c29ja19mZCwgcy0+
c29ja19mYW1pbHksCgkJcy0+c29ja190eXBlLAoJCXMtPnNvY2tfcHJvdG8pOwoJcmV0dXJu
IFB5U3RyaW5nX0Zyb21TdHJpbmcoYnVmKTsKfQoKCi8qIENyZWF0ZSBhIG5ldywgdW5pbml0
aWFsaXplZCBzb2NrZXQgb2JqZWN0LiAqLwoKc3RhdGljIFB5T2JqZWN0ICoKc29ja19uZXco
UHlUeXBlT2JqZWN0ICp0eXBlLCBQeU9iamVjdCAqYXJncywgUHlPYmplY3QgKmt3ZHMpCnsK
CVB5T2JqZWN0ICpuZXc7CgoJbmV3ID0gdHlwZS0+dHBfYWxsb2ModHlwZSwgMCk7CglpZiAo
bmV3ICE9IE5VTEwpIHsKCQkoKFB5U29ja2V0U29ja09iamVjdCAqKW5ldyktPnNvY2tfZmQg
PSAtMTsKCQkoKFB5U29ja2V0U29ja09iamVjdCAqKW5ldyktPnNvY2tfdGltZW91dCA9IC0x
LjA7CgkJKChQeVNvY2tldFNvY2tPYmplY3QgKiluZXcpLT5lcnJvcmhhbmRsZXIgPSAmc2V0
X2Vycm9yOwoJfQoJcmV0dXJuIG5ldzsKfQoKCi8qIEluaXRpYWxpemUgYSBuZXcgc29ja2V0
IG9iamVjdC4gKi8KCi8qQVJHU1VTRUQqLwpzdGF0aWMgaW50CnNvY2tfaW5pdChQeU9iamVj
dCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MsIFB5T2JqZWN0ICprd2RzKQp7CglQeVNvY2tldFNv
Y2tPYmplY3QgKnMgPSAoUHlTb2NrZXRTb2NrT2JqZWN0ICopc2VsZjsKCVNPQ0tFVF9UIGZk
OwoJaW50IGZhbWlseSA9IEFGX0lORVQsIHR5cGUgPSBTT0NLX1NUUkVBTSwgcHJvdG8gPSAw
OwoJc3RhdGljIGNoYXIgKmtleXdvcmRzW10gPSB7ImZhbWlseSIsICJ0eXBlIiwgInByb3Rv
IiwgMH07CgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlQW5kS2V5d29yZHMoYXJncywga3dkcywK
CQkJCQkgInxpaWk6c29ja2V0Iiwga2V5d29yZHMsCgkJCQkJICZmYW1pbHksICZ0eXBlLCAm
cHJvdG8pKQoJCXJldHVybiAtMTsKCglQeV9CRUdJTl9BTExPV19USFJFQURTCglmZCA9IHNv
Y2tldChmYW1pbHksIHR5cGUsIHByb3RvKTsKCVB5X0VORF9BTExPV19USFJFQURTCgojaWZk
ZWYgTVNfV0lORE9XUwoJaWYgKGZkID09IElOVkFMSURfU09DS0VUKQojZWxzZQoJaWYgKGZk
IDwgMCkKI2VuZGlmCgl7CgkJc2V0X2Vycm9yKCk7CgkJcmV0dXJuIC0xOwoJfQoJaW5pdF9z
b2Nrb2JqZWN0KHMsIGZkLCBmYW1pbHksIHR5cGUsIHByb3RvKTsKCS8qIEZyb20gbm93IG9u
LCBpZ25vcmUgU0lHUElQRSBhbmQgbGV0IHRoZSBlcnJvciBjaGVja2luZwoJICAgZG8gdGhl
IHdvcmsuICovCiNpZmRlZiBTSUdQSVBFCgkodm9pZCkgc2lnbmFsKFNJR1BJUEUsIFNJR19J
R04pOwojZW5kaWYKCglyZXR1cm4gMDsKCn0KCgovKiBUeXBlIG9iamVjdCBmb3Igc29ja2V0
IG9iamVjdHMuICovCgpzdGF0aWMgUHlUeXBlT2JqZWN0IHNvY2tfdHlwZSA9IHsKCVB5T2Jq
ZWN0X0hFQURfSU5JVCgwKQkvKiBNdXN0IGZpbGwgaW4gdHlwZSB2YWx1ZSBsYXRlciAqLwoJ
MCwJCQkJCS8qIG9iX3NpemUgKi8KCSJfc29ja2V0LnNvY2tldCIsCQkJLyogdHBfbmFtZSAq
LwoJc2l6ZW9mKFB5U29ja2V0U29ja09iamVjdCksCQkvKiB0cF9iYXNpY3NpemUgKi8KCTAs
CQkJCQkvKiB0cF9pdGVtc2l6ZSAqLwoJKGRlc3RydWN0b3Ipc29ja19kZWFsbG9jLAkJLyog
dHBfZGVhbGxvYyAqLwoJMCwJCQkJCS8qIHRwX3ByaW50ICovCgkwLAkJCQkJLyogdHBfZ2V0
YXR0ciAqLwoJMCwJCQkJCS8qIHRwX3NldGF0dHIgKi8KCTAsCQkJCQkvKiB0cF9jb21wYXJl
ICovCgkocmVwcmZ1bmMpc29ja19yZXByLAkJCS8qIHRwX3JlcHIgKi8KCTAsCQkJCQkvKiB0
cF9hc19udW1iZXIgKi8KCTAsCQkJCQkvKiB0cF9hc19zZXF1ZW5jZSAqLwoJMCwJCQkJCS8q
IHRwX2FzX21hcHBpbmcgKi8KCTAsCQkJCQkvKiB0cF9oYXNoICovCgkwLAkJCQkJLyogdHBf
Y2FsbCAqLwoJMCwJCQkJCS8qIHRwX3N0ciAqLwoJMCwJLyogc2V0IGJlbG93ICovCQkJLyog
dHBfZ2V0YXR0cm8gKi8KCTAsCQkJCQkvKiB0cF9zZXRhdHRybyAqLwoJMCwJCQkJCS8qIHRw
X2FzX2J1ZmZlciAqLwoJUHlfVFBGTEFHU19ERUZBVUxUIHwgUHlfVFBGTEFHU19CQVNFVFlQ
RSwgLyogdHBfZmxhZ3MgKi8KCXNvY2tfZG9jLAkJCQkvKiB0cF9kb2MgKi8KCTAsCQkJCQkv
KiB0cF90cmF2ZXJzZSAqLwoJMCwJCQkJCS8qIHRwX2NsZWFyICovCgkwLAkJCQkJLyogdHBf
cmljaGNvbXBhcmUgKi8KCTAsCQkJCQkvKiB0cF93ZWFrbGlzdG9mZnNldCAqLwoJMCwJCQkJ
CS8qIHRwX2l0ZXIgKi8KCTAsCQkJCQkvKiB0cF9pdGVybmV4dCAqLwoJc29ja19tZXRob2Rz
LAkJCQkvKiB0cF9tZXRob2RzICovCgkwLAkJCQkJLyogdHBfbWVtYmVycyAqLwoJMCwJCQkJ
CS8qIHRwX2dldHNldCAqLwoJMCwJCQkJCS8qIHRwX2Jhc2UgKi8KCTAsCQkJCQkvKiB0cF9k
aWN0ICovCgkwLAkJCQkJLyogdHBfZGVzY3JfZ2V0ICovCgkwLAkJCQkJLyogdHBfZGVzY3Jf
c2V0ICovCgkwLAkJCQkJLyogdHBfZGljdG9mZnNldCAqLwoJc29ja19pbml0LAkJCQkvKiB0
cF9pbml0ICovCgkwLAkvKiBzZXQgYmVsb3cgKi8JCQkvKiB0cF9hbGxvYyAqLwoJc29ja19u
ZXcsCQkJCS8qIHRwX25ldyAqLwoJMCwJLyogc2V0IGJlbG93ICovCQkJLyogdHBfZnJlZSAq
Lwp9OwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0aG9zdG5hbWUoKS4gKi8KCi8qQVJH
U1VTRUQqLwpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfZ2V0aG9zdG5hbWUoUHlPYmplY3Qg
KnNlbGYsIFB5T2JqZWN0ICphcmdzKQp7CgljaGFyIGJ1ZlsxMDI0XTsKCWludCByZXM7Cglp
ZiAoIVB5QXJnX1BhcnNlVHVwbGUoYXJncywgIjpnZXRob3N0bmFtZSIpKQoJCXJldHVybiBO
VUxMOwoJUHlfQkVHSU5fQUxMT1dfVEhSRUFEUwoJcmVzID0gZ2V0aG9zdG5hbWUoYnVmLCAo
aW50KSBzaXplb2YgYnVmIC0gMSk7CglQeV9FTkRfQUxMT1dfVEhSRUFEUwoJaWYgKHJlcyA8
IDApCgkJcmV0dXJuIHNldF9lcnJvcigpOwoJYnVmW3NpemVvZiBidWYgLSAxXSA9ICdcMCc7
CglyZXR1cm4gUHlTdHJpbmdfRnJvbVN0cmluZyhidWYpOwp9CgpzdGF0aWMgY2hhciBnZXRo
b3N0bmFtZV9kb2NbXSA9CiJnZXRob3N0bmFtZSgpIC0+IHN0cmluZ1xuXApcblwKUmV0dXJu
IHRoZSBjdXJyZW50IGhvc3QgbmFtZS4iOwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0
aG9zdGJ5bmFtZShuYW1lKS4gKi8KCi8qQVJHU1VTRUQqLwpzdGF0aWMgUHlPYmplY3QgKgpz
b2NrZXRfZ2V0aG9zdGJ5bmFtZShQeU9iamVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsK
CWNoYXIgKm5hbWU7CglzdHJ1Y3Qgc29ja2FkZHJfc3RvcmFnZSBhZGRyYnVmOwoKCWlmICgh
UHlBcmdfUGFyc2VUdXBsZShhcmdzLCAiczpnZXRob3N0YnluYW1lIiwgJm5hbWUpKQoJCXJl
dHVybiBOVUxMOwoJaWYgKHNldGlwYWRkcihuYW1lLCAoc3RydWN0IHNvY2thZGRyICopJmFk
ZHJidWYsIEFGX0lORVQpIDwgMCkKCQlyZXR1cm4gTlVMTDsKCXJldHVybiBtYWtlaXBhZGRy
KChzdHJ1Y3Qgc29ja2FkZHIgKikmYWRkcmJ1ZiwKCQlzaXplb2Yoc3RydWN0IHNvY2thZGRy
X2luKSk7Cn0KCnN0YXRpYyBjaGFyIGdldGhvc3RieW5hbWVfZG9jW10gPQoiZ2V0aG9zdGJ5
bmFtZShob3N0KSAtPiBhZGRyZXNzXG5cClxuXApSZXR1cm4gdGhlIElQIGFkZHJlc3MgKGEg
c3RyaW5nIG9mIHRoZSBmb3JtICcyNTUuMjU1LjI1NS4yNTUnKSBmb3IgYSBob3N0LiI7CgoK
LyogQ29udmVuaWVuY2UgZnVuY3Rpb24gY29tbW9uIHRvIGdldGhvc3RieW5hbWVfZXggYW5k
IGdldGhvc3RieWFkZHIgKi8KCnN0YXRpYyBQeU9iamVjdCAqCmdldGhvc3RfY29tbW9uKHN0
cnVjdCBob3N0ZW50ICpoLCBzdHJ1Y3Qgc29ja2FkZHIgKmFkZHIsIGludCBhbGVuLCBpbnQg
YWYpCnsKCWNoYXIgKipwY2g7CglQeU9iamVjdCAqcnRuX3R1cGxlID0gKFB5T2JqZWN0ICop
TlVMTDsKCVB5T2JqZWN0ICpuYW1lX2xpc3QgPSAoUHlPYmplY3QgKilOVUxMOwoJUHlPYmpl
Y3QgKmFkZHJfbGlzdCA9IChQeU9iamVjdCAqKU5VTEw7CglQeU9iamVjdCAqdG1wOwoKCWlm
IChoID09IE5VTEwpIHsKCQkvKiBMZXQncyBnZXQgcmVhbCBlcnJvciBtZXNzYWdlIHRvIHJl
dHVybiAqLwojaWZuZGVmIFJJU0NPUwoJCXNldF9oZXJyb3IoaF9lcnJubyk7CiNlbHNlCgkJ
UHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJvciwgImhvc3Qgbm90IGZvdW5kIik7CiNlbmRp
ZgoJCXJldHVybiBOVUxMOwoJfQoKCWlmIChoLT5oX2FkZHJ0eXBlICE9IGFmKSB7CiNpZmRl
ZiBIQVZFX1NUUkVSUk9SCgkJLyogTGV0J3MgZ2V0IHJlYWwgZXJyb3IgbWVzc2FnZSB0byBy
ZXR1cm4gKi8KCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkJKGNoYXIgKilz
dHJlcnJvcihFQUZOT1NVUFBPUlQpKTsKI2Vsc2UKCQlQeUVycl9TZXRTdHJpbmcoCgkJCXNv
Y2tldF9lcnJvciwKCQkJIkFkZHJlc3MgZmFtaWx5IG5vdCBzdXBwb3J0ZWQgYnkgcHJvdG9j
b2wgZmFtaWx5Iik7CiNlbmRpZgoJCXJldHVybiBOVUxMOwoJfQoKCXN3aXRjaCAoYWYpIHsK
CgljYXNlIEFGX0lORVQ6CgkJaWYgKGFsZW4gPCBzaXplb2Yoc3RydWN0IHNvY2thZGRyX2lu
KSkKCQkJcmV0dXJuIE5VTEw7CgkJYnJlYWs7CgojaWZkZWYgRU5BQkxFX0lQVjYKCWNhc2Ug
QUZfSU5FVDY6CgkJaWYgKGFsZW4gPCBzaXplb2Yoc3RydWN0IHNvY2thZGRyX2luNikpCgkJ
CXJldHVybiBOVUxMOwoJCWJyZWFrOwojZW5kaWYKCgl9CgoJaWYgKChuYW1lX2xpc3QgPSBQ
eUxpc3RfTmV3KDApKSA9PSBOVUxMKQoJCWdvdG8gZXJyOwoKCWlmICgoYWRkcl9saXN0ID0g
UHlMaXN0X05ldygwKSkgPT0gTlVMTCkKCQlnb3RvIGVycjsKCglmb3IgKHBjaCA9IGgtPmhf
YWxpYXNlczsgKnBjaCAhPSBOVUxMOyBwY2grKykgewoJCWludCBzdGF0dXM7CgkJdG1wID0g
UHlTdHJpbmdfRnJvbVN0cmluZygqcGNoKTsKCQlpZiAodG1wID09IE5VTEwpCgkJCWdvdG8g
ZXJyOwoKCQlzdGF0dXMgPSBQeUxpc3RfQXBwZW5kKG5hbWVfbGlzdCwgdG1wKTsKCQlQeV9E
RUNSRUYodG1wKTsKCgkJaWYgKHN0YXR1cykKCQkJZ290byBlcnI7Cgl9CgoJZm9yIChwY2gg
PSBoLT5oX2FkZHJfbGlzdDsgKnBjaCAhPSBOVUxMOyBwY2grKykgewoJCWludCBzdGF0dXM7
CgoJCXN3aXRjaCAoYWYpIHsKCgkJY2FzZSBBRl9JTkVUOgoJCSAgICB7CgkJCXN0cnVjdCBz
b2NrYWRkcl9pbiBzaW47CgkJCW1lbXNldCgmc2luLCAwLCBzaXplb2Yoc2luKSk7CgkJCXNp
bi5zaW5fZmFtaWx5ID0gYWY7CiNpZmRlZiBIQVZFX1NPQ0tBRERSX1NBX0xFTgoJCQlzaW4u
c2luX2xlbiA9IHNpemVvZihzaW4pOwojZW5kaWYKCQkJbWVtY3B5KCZzaW4uc2luX2FkZHIs
ICpwY2gsIHNpemVvZihzaW4uc2luX2FkZHIpKTsKCQkJdG1wID0gbWFrZWlwYWRkcigoc3Ry
dWN0IHNvY2thZGRyICopJnNpbiwgc2l6ZW9mKHNpbikpOwoKCQkJaWYgKHBjaCA9PSBoLT5o
X2FkZHJfbGlzdCAmJiBhbGVuID49IHNpemVvZihzaW4pKQoJCQkJbWVtY3B5KChjaGFyICop
IGFkZHIsICZzaW4sIHNpemVvZihzaW4pKTsKCQkJYnJlYWs7CgkJICAgIH0KCiNpZmRlZiBF
TkFCTEVfSVBWNgoJCWNhc2UgQUZfSU5FVDY6CgkJICAgIHsKCQkJc3RydWN0IHNvY2thZGRy
X2luNiBzaW42OwoJCQltZW1zZXQoJnNpbjYsIDAsIHNpemVvZihzaW42KSk7CgkJCXNpbjYu
c2luNl9mYW1pbHkgPSBhZjsKI2lmZGVmIEhBVkVfU09DS0FERFJfU0FfTEVOCgkJCXNpbjYu
c2luNl9sZW4gPSBzaXplb2Yoc2luNik7CiNlbmRpZgoJCQltZW1jcHkoJnNpbjYuc2luNl9h
ZGRyLCAqcGNoLCBzaXplb2Yoc2luNi5zaW42X2FkZHIpKTsKCQkJdG1wID0gbWFrZWlwYWRk
cigoc3RydWN0IHNvY2thZGRyICopJnNpbjYsCgkJCQlzaXplb2Yoc2luNikpOwoKCQkJaWYg
KHBjaCA9PSBoLT5oX2FkZHJfbGlzdCAmJiBhbGVuID49IHNpemVvZihzaW42KSkKCQkJCW1l
bWNweSgoY2hhciAqKSBhZGRyLCAmc2luNiwgc2l6ZW9mKHNpbjYpKTsKCQkJYnJlYWs7CgkJ
ICAgIH0KI2VuZGlmCgoJCWRlZmF1bHQ6CS8qIGNhbid0IGhhcHBlbiAqLwoJCQlQeUVycl9T
ZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkJCSJ1bnN1cHBvcnRlZCBhZGRyZXNzIGZhbWls
eSIpOwoJCQlyZXR1cm4gTlVMTDsKCQl9CgoJCWlmICh0bXAgPT0gTlVMTCkKCQkJZ290byBl
cnI7CgoJCXN0YXR1cyA9IFB5TGlzdF9BcHBlbmQoYWRkcl9saXN0LCB0bXApOwoJCVB5X0RF
Q1JFRih0bXApOwoKCQlpZiAoc3RhdHVzKQoJCQlnb3RvIGVycjsKCX0KCglydG5fdHVwbGUg
PSBQeV9CdWlsZFZhbHVlKCJzT08iLCBoLT5oX25hbWUsIG5hbWVfbGlzdCwgYWRkcl9saXN0
KTsKCiBlcnI6CglQeV9YREVDUkVGKG5hbWVfbGlzdCk7CglQeV9YREVDUkVGKGFkZHJfbGlz
dCk7CglyZXR1cm4gcnRuX3R1cGxlOwp9CgoKLyogUHl0aG9uIGludGVyZmFjZSB0byBnZXRo
b3N0YnluYW1lX2V4KG5hbWUpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq
CnNvY2tldF9nZXRob3N0YnluYW1lX2V4KFB5T2JqZWN0ICpzZWxmLCBQeU9iamVjdCAqYXJn
cykKewoJY2hhciAqbmFtZTsKCXN0cnVjdCBob3N0ZW50ICpoOwoJc3RydWN0IHNvY2thZGRy
X3N0b3JhZ2UgYWRkcjsKCXN0cnVjdCBzb2NrYWRkciAqc2E7CglQeU9iamVjdCAqcmV0Owoj
aWZkZWYgSEFWRV9HRVRIT1NUQllOQU1FX1IKCXN0cnVjdCBob3N0ZW50IGhwX2FsbG9jYXRl
ZDsKI2lmZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SXzNfQVJHCglzdHJ1Y3QgaG9zdGVudF9k
YXRhIGRhdGE7CiNlbHNlCgljaGFyIGJ1ZlsxNjM4NF07CglpbnQgYnVmX2xlbiA9IChzaXpl
b2YgYnVmKSAtIDE7CglpbnQgZXJybm9wOwojZW5kaWYKI2lmIGRlZmluZWQoSEFWRV9HRVRI
T1NUQllOQU1FX1JfM19BUkcpIHx8IGRlZmluZWQoSEFWRV9HRVRIT1NUQllOQU1FX1JfNl9B
UkcpCglpbnQgcmVzdWx0OwojZW5kaWYKI2VuZGlmIC8qIEhBVkVfR0VUSE9TVEJZTkFNRV9S
ICovCgoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFyZ3MsICJzOmdldGhvc3RieW5hbWVfZXgi
LCAmbmFtZSkpCgkJcmV0dXJuIE5VTEw7CglpZiAoc2V0aXBhZGRyKG5hbWUsIChzdHJ1Y3Qg
c29ja2FkZHIgKikmYWRkciwgUEZfSU5FVCkgPCAwKQoJCXJldHVybiBOVUxMOwoJUHlfQkVH
SU5fQUxMT1dfVEhSRUFEUwojaWZkZWYgSEFWRV9HRVRIT1NUQllOQU1FX1IKI2lmICAgZGVm
aW5lZChIQVZFX0dFVEhPU1RCWU5BTUVfUl82X0FSRykKCXJlc3VsdCA9IGdldGhvc3RieW5h
bWVfcihuYW1lLCAmaHBfYWxsb2NhdGVkLCBidWYsIGJ1Zl9sZW4sCgkJCQkgJmgsICZlcnJu
b3ApOwojZWxpZiBkZWZpbmVkKEhBVkVfR0VUSE9TVEJZTkFNRV9SXzVfQVJHKQoJaCA9IGdl
dGhvc3RieW5hbWVfcihuYW1lLCAmaHBfYWxsb2NhdGVkLCBidWYsIGJ1Zl9sZW4sICZlcnJu
b3ApOwojZWxzZSAvKiBIQVZFX0dFVEhPU1RCWU5BTUVfUl8zX0FSRyAqLwoJbWVtc2V0KCh2
b2lkICopICZkYXRhLCAnXDAnLCBzaXplb2YoZGF0YSkpOwoJcmVzdWx0ID0gZ2V0aG9zdGJ5
bmFtZV9yKG5hbWUsICZocF9hbGxvY2F0ZWQsICZkYXRhKTsKCWggPSAocmVzdWx0ICE9IDAp
ID8gTlVMTCA6ICZocF9hbGxvY2F0ZWQ7CiNlbmRpZgojZWxzZSAvKiBub3QgSEFWRV9HRVRI
T1NUQllOQU1FX1IgKi8KI2lmZGVmIFVTRV9HRVRIT1NUQllOQU1FX0xPQ0sKCVB5VGhyZWFk
X2FjcXVpcmVfbG9jayhnZXRob3N0YnluYW1lX2xvY2ssIDEpOwojZW5kaWYKCWggPSBnZXRo
b3N0YnluYW1lKG5hbWUpOwojZW5kaWYgLyogSEFWRV9HRVRIT1NUQllOQU1FX1IgKi8KCVB5
X0VORF9BTExPV19USFJFQURTCgkvKiBTb21lIEMgbGlicmFyaWVzIHdvdWxkIHJlcXVpcmUg
YWRkci5fX3NzX2ZhbWlseSBpbnN0ZWFkIG9mCgkgICBhZGRyLnNzX2ZhbWlseS4KCSAgIFRo
ZXJlZm9yZSwgd2UgY2FzdCB0aGUgc29ja2FkZHJfc3RvcmFnZSBpbnRvIHNvY2thZGRyIHRv
CgkgICBhY2Nlc3Mgc2FfZmFtaWx5LiAqLwoJc2EgPSAoc3RydWN0IHNvY2thZGRyKikmYWRk
cjsKCXJldCA9IGdldGhvc3RfY29tbW9uKGgsIChzdHJ1Y3Qgc29ja2FkZHIgKikmYWRkciwg
c2l6ZW9mKGFkZHIpLAoJCQkgICAgIHNhLT5zYV9mYW1pbHkpOwojaWZkZWYgVVNFX0dFVEhP
U1RCWU5BTUVfTE9DSwoJUHlUaHJlYWRfcmVsZWFzZV9sb2NrKGdldGhvc3RieW5hbWVfbG9j
ayk7CiNlbmRpZgoJcmV0dXJuIHJldDsKfQoKc3RhdGljIGNoYXIgZ2hibl9leF9kb2NbXSA9
CiJnZXRob3N0YnluYW1lX2V4KGhvc3QpIC0+IChuYW1lLCBhbGlhc2xpc3QsIGFkZHJlc3Ns
aXN0KVxuXApcblwKUmV0dXJuIHRoZSB0cnVlIGhvc3QgbmFtZSwgYSBsaXN0IG9mIGFsaWFz
ZXMsIGFuZCBhIGxpc3Qgb2YgSVAgYWRkcmVzc2VzLFxuXApmb3IgYSBob3N0LiAgVGhlIGhv
c3QgYXJndW1lbnQgaXMgYSBzdHJpbmcgZ2l2aW5nIGEgaG9zdCBuYW1lIG9yIElQIG51bWJl
ci4iOwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0aG9zdGJ5YWRkcihJUCkuICovCgov
KkFSR1NVU0VEKi8Kc3RhdGljIFB5T2JqZWN0ICoKc29ja2V0X2dldGhvc3RieWFkZHIoUHlP
YmplY3QgKnNlbGYsIFB5T2JqZWN0ICphcmdzKQp7CiNpZmRlZiBFTkFCTEVfSVBWNgoJc3Ry
dWN0IHNvY2thZGRyX3N0b3JhZ2UgYWRkcjsKI2Vsc2UKCXN0cnVjdCBzb2NrYWRkcl9pbiBh
ZGRyOwojZW5kaWYKCXN0cnVjdCBzb2NrYWRkciAqc2EgPSAoc3RydWN0IHNvY2thZGRyICop
JmFkZHI7CgljaGFyICppcF9udW07CglzdHJ1Y3QgaG9zdGVudCAqaDsKCVB5T2JqZWN0ICpy
ZXQ7CiNpZmRlZiBIQVZFX0dFVEhPU1RCWU5BTUVfUgoJc3RydWN0IGhvc3RlbnQgaHBfYWxs
b2NhdGVkOwojaWZkZWYgSEFWRV9HRVRIT1NUQllOQU1FX1JfM19BUkcKCXN0cnVjdCBob3N0
ZW50X2RhdGEgZGF0YTsKI2Vsc2UKCWNoYXIgYnVmWzE2Mzg0XTsKCWludCBidWZfbGVuID0g
KHNpemVvZiBidWYpIC0gMTsKCWludCBlcnJub3A7CiNlbmRpZgojaWYgZGVmaW5lZChIQVZF
X0dFVEhPU1RCWU5BTUVfUl8zX0FSRykgfHwgZGVmaW5lZChIQVZFX0dFVEhPU1RCWU5BTUVf
Ul82X0FSRykKCWludCByZXN1bHQ7CiNlbmRpZgojZW5kaWYgLyogSEFWRV9HRVRIT1NUQllO
QU1FX1IgKi8KCWNoYXIgKmFwOwoJaW50IGFsOwoJaW50IGFmOwoKCWlmICghUHlBcmdfUGFy
c2VUdXBsZShhcmdzLCAiczpnZXRob3N0YnlhZGRyIiwgJmlwX251bSkpCgkJcmV0dXJuIE5V
TEw7CglhZiA9IFBGX1VOU1BFQzsKCWlmIChzZXRpcGFkZHIoaXBfbnVtLCBzYSwgYWYpIDwg
MCkKCQlyZXR1cm4gTlVMTDsKCWFmID0gc2EtPnNhX2ZhbWlseTsKCWFwID0gTlVMTDsKCWFs
ID0gMDsKCXN3aXRjaCAoYWYpIHsKCWNhc2UgQUZfSU5FVDoKCQlhcCA9IChjaGFyICopJigo
c3RydWN0IHNvY2thZGRyX2luICopc2EpLT5zaW5fYWRkcjsKCQlhbCA9IHNpemVvZigoKHN0
cnVjdCBzb2NrYWRkcl9pbiAqKXNhKS0+c2luX2FkZHIpOwoJCWJyZWFrOwojaWZkZWYgRU5B
QkxFX0lQVjYKCWNhc2UgQUZfSU5FVDY6CgkJYXAgPSAoY2hhciAqKSYoKHN0cnVjdCBzb2Nr
YWRkcl9pbjYgKilzYSktPnNpbjZfYWRkcjsKCQlhbCA9IHNpemVvZigoKHN0cnVjdCBzb2Nr
YWRkcl9pbjYgKilzYSktPnNpbjZfYWRkcik7CgkJYnJlYWs7CiNlbmRpZgoJZGVmYXVsdDoK
CQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLCAidW5zdXBwb3J0ZWQgYWRkcmVzcyBm
YW1pbHkiKTsKCQlyZXR1cm4gTlVMTDsKCX0KCVB5X0JFR0lOX0FMTE9XX1RIUkVBRFMKI2lm
ZGVmIEhBVkVfR0VUSE9TVEJZTkFNRV9SCiNpZiAgIGRlZmluZWQoSEFWRV9HRVRIT1NUQllO
QU1FX1JfNl9BUkcpCglyZXN1bHQgPSBnZXRob3N0YnlhZGRyX3IoYXAsIGFsLCBhZiwKCQkm
aHBfYWxsb2NhdGVkLCBidWYsIGJ1Zl9sZW4sCgkJJmgsICZlcnJub3ApOwojZWxpZiBkZWZp
bmVkKEhBVkVfR0VUSE9TVEJZTkFNRV9SXzVfQVJHKQoJaCA9IGdldGhvc3RieWFkZHJfcihh
cCwgYWwsIGFmLAoJCQkgICAgJmhwX2FsbG9jYXRlZCwgYnVmLCBidWZfbGVuLCAmZXJybm9w
KTsKI2Vsc2UgLyogSEFWRV9HRVRIT1NUQllOQU1FX1JfM19BUkcgKi8KCW1lbXNldCgodm9p
ZCAqKSAmZGF0YSwgJ1wwJywgc2l6ZW9mKGRhdGEpKTsKCXJlc3VsdCA9IGdldGhvc3RieWFk
ZHJfcihhcCwgYWwsIGFmLCAmaHBfYWxsb2NhdGVkLCAmZGF0YSk7CgloID0gKHJlc3VsdCAh
PSAwKSA/IE5VTEwgOiAmaHBfYWxsb2NhdGVkOwojZW5kaWYKI2Vsc2UgLyogbm90IEhBVkVf
R0VUSE9TVEJZTkFNRV9SICovCiNpZmRlZiBVU0VfR0VUSE9TVEJZTkFNRV9MT0NLCglQeVRo
cmVhZF9hY3F1aXJlX2xvY2soZ2V0aG9zdGJ5bmFtZV9sb2NrLCAxKTsKI2VuZGlmCgloID0g
Z2V0aG9zdGJ5YWRkcihhcCwgYWwsIGFmKTsKI2VuZGlmIC8qIEhBVkVfR0VUSE9TVEJZTkFN
RV9SICovCglQeV9FTkRfQUxMT1dfVEhSRUFEUwoJcmV0ID0gZ2V0aG9zdF9jb21tb24oaCwg
KHN0cnVjdCBzb2NrYWRkciAqKSZhZGRyLCBzaXplb2YoYWRkciksIGFmKTsKI2lmZGVmIFVT
RV9HRVRIT1NUQllOQU1FX0xPQ0sKCVB5VGhyZWFkX3JlbGVhc2VfbG9jayhnZXRob3N0Ynlu
YW1lX2xvY2spOwojZW5kaWYKCXJldHVybiByZXQ7Cn0KCnN0YXRpYyBjaGFyIGdldGhvc3Ri
eWFkZHJfZG9jW10gPQoiZ2V0aG9zdGJ5YWRkcihob3N0KSAtPiAobmFtZSwgYWxpYXNsaXN0
LCBhZGRyZXNzbGlzdClcblwKXG5cClJldHVybiB0aGUgdHJ1ZSBob3N0IG5hbWUsIGEgbGlz
dCBvZiBhbGlhc2VzLCBhbmQgYSBsaXN0IG9mIElQIGFkZHJlc3NlcyxcblwKZm9yIGEgaG9z
dC4gIFRoZSBob3N0IGFyZ3VtZW50IGlzIGEgc3RyaW5nIGdpdmluZyBhIGhvc3QgbmFtZSBv
ciBJUCBudW1iZXIuIjsKCgovKiBQeXRob24gaW50ZXJmYWNlIHRvIGdldHNlcnZieW5hbWUo
bmFtZSkuCiAgIFRoaXMgb25seSByZXR1cm5zIHRoZSBwb3J0IG51bWJlciwgc2luY2UgdGhl
IG90aGVyIGluZm8gaXMgYWxyZWFkeQogICBrbm93biBvciBub3QgdXNlZnVsIChsaWtlIHRo
ZSBsaXN0IG9mIGFsaWFzZXMpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq
CnNvY2tldF9nZXRzZXJ2YnluYW1lKFB5T2JqZWN0ICpzZWxmLCBQeU9iamVjdCAqYXJncykK
ewoJY2hhciAqbmFtZSwgKnByb3RvOwoJc3RydWN0IHNlcnZlbnQgKnNwOwoJaWYgKCFQeUFy
Z19QYXJzZVR1cGxlKGFyZ3MsICJzczpnZXRzZXJ2YnluYW1lIiwgJm5hbWUsICZwcm90bykp
CgkJcmV0dXJuIE5VTEw7CglQeV9CRUdJTl9BTExPV19USFJFQURTCglzcCA9IGdldHNlcnZi
eW5hbWUobmFtZSwgcHJvdG8pOwoJUHlfRU5EX0FMTE9XX1RIUkVBRFMKCWlmIChzcCA9PSBO
VUxMKSB7CgkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJvciwgInNlcnZpY2UvcHJvdG8g
bm90IGZvdW5kIik7CgkJcmV0dXJuIE5VTEw7Cgl9CglyZXR1cm4gUHlJbnRfRnJvbUxvbmco
KGxvbmcpIG50b2hzKHNwLT5zX3BvcnQpKTsKfQoKc3RhdGljIGNoYXIgZ2V0c2VydmJ5bmFt
ZV9kb2NbXSA9CiJnZXRzZXJ2YnluYW1lKHNlcnZpY2VuYW1lLCBwcm90b2NvbG5hbWUpIC0+
IGludGVnZXJcblwKXG5cClJldHVybiBhIHBvcnQgbnVtYmVyIGZyb20gYSBzZXJ2aWNlIG5h
bWUgYW5kIHByb3RvY29sIG5hbWUuXG5cClRoZSBwcm90b2NvbCBuYW1lIHNob3VsZCBiZSAn
dGNwJyBvciAndWRwJy4iOwoKCi8qIFB5dGhvbiBpbnRlcmZhY2UgdG8gZ2V0cHJvdG9ieW5h
bWUobmFtZSkuCiAgIFRoaXMgb25seSByZXR1cm5zIHRoZSBwcm90b2NvbCBudW1iZXIsIHNp
bmNlIHRoZSBvdGhlciBpbmZvIGlzCiAgIGFscmVhZHkga25vd24gb3Igbm90IHVzZWZ1bCAo
bGlrZSB0aGUgbGlzdCBvZiBhbGlhc2VzKS4gKi8KCi8qQVJHU1VTRUQqLwpzdGF0aWMgUHlP
YmplY3QgKgpzb2NrZXRfZ2V0cHJvdG9ieW5hbWUoUHlPYmplY3QgKnNlbGYsIFB5T2JqZWN0
ICphcmdzKQp7CgljaGFyICpuYW1lOwoJc3RydWN0IHByb3RvZW50ICpzcDsKI2lmZGVmIF9f
QkVPU19fCi8qIE5vdCBhdmFpbGFibGUgaW4gQmVPUyB5ZXQuIC0gW2NqaF0gKi8KCVB5RXJy
X1NldFN0cmluZyhzb2NrZXRfZXJyb3IsICJnZXRwcm90b2J5bmFtZSBub3Qgc3VwcG9ydGVk
Iik7CglyZXR1cm4gTlVMTDsKI2Vsc2UKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAi
czpnZXRwcm90b2J5bmFtZSIsICZuYW1lKSkKCQlyZXR1cm4gTlVMTDsKCVB5X0JFR0lOX0FM
TE9XX1RIUkVBRFMKCXNwID0gZ2V0cHJvdG9ieW5hbWUobmFtZSk7CglQeV9FTkRfQUxMT1df
VEhSRUFEUwoJaWYgKHNwID09IE5VTEwpIHsKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vy
cm9yLCAicHJvdG9jb2wgbm90IGZvdW5kIik7CgkJcmV0dXJuIE5VTEw7Cgl9CglyZXR1cm4g
UHlJbnRfRnJvbUxvbmcoKGxvbmcpIHNwLT5wX3Byb3RvKTsKI2VuZGlmCn0KCnN0YXRpYyBj
aGFyIGdldHByb3RvYnluYW1lX2RvY1tdID0KImdldHByb3RvYnluYW1lKG5hbWUpIC0+IGlu
dGVnZXJcblwKXG5cClJldHVybiB0aGUgcHJvdG9jb2wgbnVtYmVyIGZvciB0aGUgbmFtZWQg
cHJvdG9jb2wuICAoUmFyZWx5IHVzZWQuKSI7CgoKI2lmbmRlZiBOT19EVVAKLyogQ3JlYXRl
IGEgc29ja2V0IG9iamVjdCBmcm9tIGEgbnVtZXJpYyBmaWxlIGRlc2NyaXB0aW9uLgogICBV
c2VmdWwgZS5nLiBpZiBzdGRpbiBpcyBhIHNvY2tldC4KICAgQWRkaXRpb25hbCBhcmd1bWVu
dHMgYXMgZm9yIHNvY2tldCgpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq
CnNvY2tldF9mcm9tZmQoUHlPYmplY3QgKnNlbGYsIFB5T2JqZWN0ICphcmdzKQp7CglQeVNv
Y2tldFNvY2tPYmplY3QgKnM7CglTT0NLRVRfVCBmZDsKCWludCBmYW1pbHksIHR5cGUsIHBy
b3RvID0gMDsKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAiaWlpfGk6ZnJvbWZkIiwK
CQkJICAgICAgJmZkLCAmZmFtaWx5LCAmdHlwZSwgJnByb3RvKSkKCQlyZXR1cm4gTlVMTDsK
CS8qIER1cCB0aGUgZmQgc28gaXQgYW5kIHRoZSBzb2NrZXQgY2FuIGJlIGNsb3NlZCBpbmRl
cGVuZGVudGx5ICovCglmZCA9IGR1cChmZCk7CglpZiAoZmQgPCAwKQoJCXJldHVybiBzZXRf
ZXJyb3IoKTsKCXMgPSBuZXdfc29ja29iamVjdChmZCwgZmFtaWx5LCB0eXBlLCBwcm90byk7
CgkvKiBGcm9tIG5vdyBvbiwgaWdub3JlIFNJR1BJUEUgYW5kIGxldCB0aGUgZXJyb3IgY2hl
Y2tpbmcKCSAgIGRvIHRoZSB3b3JrLiAqLwojaWZkZWYgU0lHUElQRQoJKHZvaWQpIHNpZ25h
bChTSUdQSVBFLCBTSUdfSUdOKTsKI2VuZGlmCglyZXR1cm4gKFB5T2JqZWN0ICopIHM7Cn0K
CnN0YXRpYyBjaGFyIGZyb21mZF9kb2NbXSA9CiJmcm9tZmQoZmQsIGZhbWlseSwgdHlwZVss
IHByb3RvXSkgLT4gc29ja2V0IG9iamVjdFxuXApcblwKQ3JlYXRlIGEgc29ja2V0IG9iamVj
dCBmcm9tIHRoZSBnaXZlbiBmaWxlIGRlc2NyaXB0b3IuXG5cClRoZSByZW1haW5pbmcgYXJn
dW1lbnRzIGFyZSB0aGUgc2FtZSBhcyBmb3Igc29ja2V0KCkuIjsKCiNlbmRpZiAvKiBOT19E
VVAgKi8KCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfbnRvaHMoUHlPYmplY3QgKnNlbGYs
IFB5T2JqZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VUdXBs
ZShhcmdzLCAiaTpudG9ocyIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0gKGlu
dCludG9ocygoc2hvcnQpeDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3Rh
dGljIGNoYXIgbnRvaHNfZG9jW10gPQoibnRvaHMoaW50ZWdlcikgLT4gaW50ZWdlclxuXApc
blwKQ29udmVydCBhIDE2LWJpdCBpbnRlZ2VyIGZyb20gbmV0d29yayB0byBob3N0IGJ5dGUg
b3JkZXIuIjsKCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfbnRvaGwoUHlPYmplY3QgKnNl
bGYsIFB5T2JqZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VU
dXBsZShhcmdzLCAiaTpudG9obCIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0g
bnRvaGwoeDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3RhdGljIGNoYXIg
bnRvaGxfZG9jW10gPQoibnRvaGwoaW50ZWdlcikgLT4gaW50ZWdlclxuXApcblwKQ29udmVy
dCBhIDMyLWJpdCBpbnRlZ2VyIGZyb20gbmV0d29yayB0byBob3N0IGJ5dGUgb3JkZXIuIjsK
CgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfaHRvbnMoUHlPYmplY3QgKnNlbGYsIFB5T2Jq
ZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VUdXBsZShhcmdz
LCAiaTpodG9ucyIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0gKGludClodG9u
cygoc2hvcnQpeDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3RhdGljIGNo
YXIgaHRvbnNfZG9jW10gPQoiaHRvbnMoaW50ZWdlcikgLT4gaW50ZWdlclxuXApcblwKQ29u
dmVydCBhIDE2LWJpdCBpbnRlZ2VyIGZyb20gaG9zdCB0byBuZXR3b3JrIGJ5dGUgb3JkZXIu
IjsKCgpzdGF0aWMgUHlPYmplY3QgKgpzb2NrZXRfaHRvbmwoUHlPYmplY3QgKnNlbGYsIFB5
T2JqZWN0ICphcmdzKQp7CglpbnQgeDEsIHgyOwoKCWlmICghUHlBcmdfUGFyc2VUdXBsZShh
cmdzLCAiaTpodG9ubCIsICZ4MSkpIHsKCQlyZXR1cm4gTlVMTDsKCX0KCXgyID0gaHRvbmwo
eDEpOwoJcmV0dXJuIFB5SW50X0Zyb21Mb25nKHgyKTsKfQoKc3RhdGljIGNoYXIgaHRvbmxf
ZG9jW10gPQoiaHRvbmwoaW50ZWdlcikgLT4gaW50ZWdlclxuXApcblwKQ29udmVydCBhIDMy
LWJpdCBpbnRlZ2VyIGZyb20gaG9zdCB0byBuZXR3b3JrIGJ5dGUgb3JkZXIuIjsKCi8qIHNv
Y2tldC5pbmV0X2F0b24oKSBhbmQgc29ja2V0LmluZXRfbnRvYSgpIGZ1bmN0aW9ucy4gKi8K
CnN0YXRpYyBjaGFyIGluZXRfYXRvbl9kb2NbXSA9CiJpbmV0X2F0b24oc3RyaW5nKSAtPiBw
YWNrZWQgMzItYml0IElQIHJlcHJlc2VudGF0aW9uXG5cClxuXApDb252ZXJ0IGFuIElQIGFk
ZHJlc3MgaW4gc3RyaW5nIGZvcm1hdCAoMTIzLjQ1LjY3Ljg5KSB0byB0aGUgMzItYml0IHBh
Y2tlZFxuXApiaW5hcnkgZm9ybWF0IHVzZWQgaW4gbG93LWxldmVsIG5ldHdvcmsgZnVuY3Rp
b25zLiI7CgpzdGF0aWMgUHlPYmplY3QqCnNvY2tldF9pbmV0X2F0b24oUHlPYmplY3QgKnNl
bGYsIFB5T2JqZWN0ICphcmdzKQp7CiNpZm5kZWYgSU5BRERSX05PTkUKI2RlZmluZSBJTkFE
RFJfTk9ORSAoLTEpCiNlbmRpZgoKCS8qIEhhdmUgdG8gdXNlIGluZXRfYWRkcigpIGluc3Rl
YWQgKi8KCWNoYXIgKmlwX2FkZHI7Cgl1bnNpZ25lZCBsb25nIHBhY2tlZF9hZGRyOwoKCWlm
ICghUHlBcmdfUGFyc2VUdXBsZShhcmdzLCAiczppbmV0X2F0b24iLCAmaXBfYWRkcikpIHsK
CQlyZXR1cm4gTlVMTDsKCX0KCXBhY2tlZF9hZGRyID0gaW5ldF9hZGRyKGlwX2FkZHIpOwoK
CWlmIChwYWNrZWRfYWRkciA9PSBJTkFERFJfTk9ORSkgewkvKiBpbnZhbGlkIGFkZHJlc3Mg
Ki8KCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJCQkiaWxsZWdhbCBJUCBhZGRy
ZXNzIHN0cmluZyBwYXNzZWQgdG8gaW5ldF9hdG9uIik7CgkJcmV0dXJuIE5VTEw7Cgl9CgoJ
cmV0dXJuIFB5U3RyaW5nX0Zyb21TdHJpbmdBbmRTaXplKChjaGFyICopICZwYWNrZWRfYWRk
ciwKCQkJCQkgIHNpemVvZihwYWNrZWRfYWRkcikpOwp9CgpzdGF0aWMgY2hhciBpbmV0X250
b2FfZG9jW10gPQoiaW5ldF9udG9hKHBhY2tlZF9pcCkgLT4gaXBfYWRkcmVzc19zdHJpbmdc
blwKXG5cCkNvbnZlcnQgYW4gSVAgYWRkcmVzcyBmcm9tIDMyLWJpdCBwYWNrZWQgYmluYXJ5
IGZvcm1hdCB0byBzdHJpbmcgZm9ybWF0IjsKCnN0YXRpYyBQeU9iamVjdCoKc29ja2V0X2lu
ZXRfbnRvYShQeU9iamVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsKCWNoYXIgKnBhY2tl
ZF9zdHI7CglpbnQgYWRkcl9sZW47CglzdHJ1Y3QgaW5fYWRkciBwYWNrZWRfYWRkcjsKCglp
ZiAoIVB5QXJnX1BhcnNlVHVwbGUoYXJncywgInMjOmluZXRfbnRvYSIsICZwYWNrZWRfc3Ry
LCAmYWRkcl9sZW4pKSB7CgkJcmV0dXJuIE5VTEw7Cgl9CgoJaWYgKGFkZHJfbGVuICE9IHNp
emVvZihwYWNrZWRfYWRkcikpIHsKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9yLAoJ
CQkicGFja2VkIElQIHdyb25nIGxlbmd0aCBmb3IgaW5ldF9udG9hIik7CgkJcmV0dXJuIE5V
TEw7Cgl9CgoJbWVtY3B5KCZwYWNrZWRfYWRkciwgcGFja2VkX3N0ciwgYWRkcl9sZW4pOwoK
CXJldHVybiBQeVN0cmluZ19Gcm9tU3RyaW5nKGluZXRfbnRvYShwYWNrZWRfYWRkcikpOwp9
CgovKiBQeXRob24gaW50ZXJmYWNlIHRvIGdldGFkZHJpbmZvKGhvc3QsIHBvcnQpLiAqLwoK
LypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAqCnNvY2tldF9nZXRhZGRyaW5mbyhQeU9i
amVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsKCXN0cnVjdCBhZGRyaW5mbyBoaW50cywg
KnJlczsKCXN0cnVjdCBhZGRyaW5mbyAqcmVzMCA9IE5VTEw7CglQeU9iamVjdCAqcG9iaiA9
IChQeU9iamVjdCAqKU5VTEw7CgljaGFyIHBidWZbMzBdOwoJY2hhciAqaHB0ciwgKnBwdHI7
CglpbnQgZmFtaWx5LCBzb2NrdHlwZSwgcHJvdG9jb2wsIGZsYWdzOwoJaW50IGVycm9yOwoJ
UHlPYmplY3QgKmFsbCA9IChQeU9iamVjdCAqKU5VTEw7CglQeU9iamVjdCAqc2luZ2xlID0g
KFB5T2JqZWN0ICopTlVMTDsKCglmYW1pbHkgPSBzb2NrdHlwZSA9IHByb3RvY29sID0gZmxh
Z3MgPSAwOwoJZmFtaWx5ID0gUEZfVU5TUEVDOwoJaWYgKCFQeUFyZ19QYXJzZVR1cGxlKGFy
Z3MsICJ6T3xpaWlpOmdldGFkZHJpbmZvIiwKCSAgICAmaHB0ciwgJnBvYmosICZmYW1pbHks
ICZzb2NrdHlwZSwKCQkJJnByb3RvY29sLCAmZmxhZ3MpKSB7CgkJcmV0dXJuIE5VTEw7Cgl9
CglpZiAoUHlJbnRfQ2hlY2socG9iaikpIHsKCQlQeU9TX3NucHJpbnRmKHBidWYsIHNpemVv
ZihwYnVmKSwgIiVsZCIsIFB5SW50X0FzTG9uZyhwb2JqKSk7CgkJcHB0ciA9IHBidWY7Cgl9
IGVsc2UgaWYgKFB5U3RyaW5nX0NoZWNrKHBvYmopKSB7CgkJcHB0ciA9IFB5U3RyaW5nX0Fz
U3RyaW5nKHBvYmopOwoJfSBlbHNlIGlmIChwb2JqID09IFB5X05vbmUpIHsKCQlwcHRyID0g
KGNoYXIgKilOVUxMOwoJfSBlbHNlIHsKCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vycm9y
LCAiSW50IG9yIFN0cmluZyBleHBlY3RlZCIpOwoJCXJldHVybiBOVUxMOwoJfQoJbWVtc2V0
KCZoaW50cywgMCwgc2l6ZW9mKGhpbnRzKSk7CgloaW50cy5haV9mYW1pbHkgPSBmYW1pbHk7
CgloaW50cy5haV9zb2NrdHlwZSA9IHNvY2t0eXBlOwoJaGludHMuYWlfcHJvdG9jb2wgPSBw
cm90b2NvbDsKCWhpbnRzLmFpX2ZsYWdzID0gZmxhZ3M7CgllcnJvciA9IGdldGFkZHJpbmZv
KGhwdHIsIHBwdHIsICZoaW50cywgJnJlczApOwoJaWYgKGVycm9yKSB7CgkJc2V0X2dhaWVy
cm9yKGVycm9yKTsKCQlyZXR1cm4gTlVMTDsKCX0KCglpZiAoKGFsbCA9IFB5TGlzdF9OZXco
MCkpID09IE5VTEwpCgkJZ290byBlcnI7Cglmb3IgKHJlcyA9IHJlczA7IHJlczsgcmVzID0g
cmVzLT5haV9uZXh0KSB7CgkJUHlPYmplY3QgKmFkZHIgPQoJCQltYWtlc29ja2FkZHIoLTEs
IHJlcy0+YWlfYWRkciwgcmVzLT5haV9hZGRybGVuKTsKCQlpZiAoYWRkciA9PSBOVUxMKQoJ
CQlnb3RvIGVycjsKCQlzaW5nbGUgPSBQeV9CdWlsZFZhbHVlKCJpaWlzTyIsIHJlcy0+YWlf
ZmFtaWx5LAoJCQlyZXMtPmFpX3NvY2t0eXBlLCByZXMtPmFpX3Byb3RvY29sLAoJCQlyZXMt
PmFpX2Nhbm9ubmFtZSA/IHJlcy0+YWlfY2Fub25uYW1lIDogIiIsCgkJCWFkZHIpOwoJCVB5
X0RFQ1JFRihhZGRyKTsKCQlpZiAoc2luZ2xlID09IE5VTEwpCgkJCWdvdG8gZXJyOwoKCQlp
ZiAoUHlMaXN0X0FwcGVuZChhbGwsIHNpbmdsZSkpCgkJCWdvdG8gZXJyOwoJCVB5X1hERUNS
RUYoc2luZ2xlKTsKCX0KCXJldHVybiBhbGw7CiBlcnI6CglQeV9YREVDUkVGKHNpbmdsZSk7
CglQeV9YREVDUkVGKGFsbCk7CglpZiAocmVzMCkKCQlmcmVlYWRkcmluZm8ocmVzMCk7Cgly
ZXR1cm4gKFB5T2JqZWN0ICopTlVMTDsKfQoKc3RhdGljIGNoYXIgZ2V0YWRkcmluZm9fZG9j
W10gPQoic29ja2V0LmdldGFkZHJpbmZvKGhvc3QsIHBvcnQgWywgZmFtaWx5LCBzb2NrdHlw
ZSwgcHJvdG8sIGZsYWdzXSlcblwKCS0tPiBMaXN0IG9mIChmYW1pbHksIHNvY2t0eXBlLCBw
cm90bywgY2Fub25uYW1lLCBzb2NrYWRkcilcblwKXG5cClJlc29sdmUgaG9zdCBhbmQgcG9y
dCBpbnRvIGFkZHJpbmZvIHN0cnVjdC4iOwoKLyogUHl0aG9uIGludGVyZmFjZSB0byBnZXRu
YW1laW5mbyhzYSwgZmxhZ3MpLiAqLwoKLypBUkdTVVNFRCovCnN0YXRpYyBQeU9iamVjdCAq
CnNvY2tldF9nZXRuYW1laW5mbyhQeU9iamVjdCAqc2VsZiwgUHlPYmplY3QgKmFyZ3MpCnsK
CVB5T2JqZWN0ICpzYSA9IChQeU9iamVjdCAqKU5VTEw7CglpbnQgZmxhZ3M7CgljaGFyICpo
b3N0cDsKCWludCBwb3J0LCBmbG93aW5mbywgc2NvcGVfaWQ7CgljaGFyIGhidWZbTklfTUFY
SE9TVF0sIHBidWZbTklfTUFYU0VSVl07CglzdHJ1Y3QgYWRkcmluZm8gaGludHMsICpyZXMg
PSBOVUxMOwoJaW50IGVycm9yOwoJUHlPYmplY3QgKnJldCA9IChQeU9iamVjdCAqKU5VTEw7
CgoJZmxhZ3MgPSBmbG93aW5mbyA9IHNjb3BlX2lkID0gMDsKCWlmICghUHlBcmdfUGFyc2VU
dXBsZShhcmdzLCAiT2k6Z2V0bmFtZWluZm8iLCAmc2EsICZmbGFncykpCgkJcmV0dXJuIE5V
TEw7CglpZiAgKCFQeUFyZ19QYXJzZVR1cGxlKHNhLCAic2l8aWkiLAoJCQkgICAgICAgJmhv
c3RwLCAmcG9ydCwgJmZsb3dpbmZvLCAmc2NvcGVfaWQpKQoJCXJldHVybiBOVUxMOwoJUHlP
U19zbnByaW50ZihwYnVmLCBzaXplb2YocGJ1ZiksICIlZCIsIHBvcnQpOwoJbWVtc2V0KCZo
aW50cywgMCwgc2l6ZW9mKGhpbnRzKSk7CgloaW50cy5haV9mYW1pbHkgPSBQRl9VTlNQRUM7
CgloaW50cy5haV9zb2NrdHlwZSA9IFNPQ0tfREdSQU07CS8qIG1ha2UgbnVtZXJpYyBwb3J0
IGhhcHB5ICovCgllcnJvciA9IGdldGFkZHJpbmZvKGhvc3RwLCBwYnVmLCAmaGludHMsICZy
ZXMpOwoJaWYgKGVycm9yKSB7CgkJc2V0X2dhaWVycm9yKGVycm9yKTsKCQlnb3RvIGZhaWw7
Cgl9CglpZiAocmVzLT5haV9uZXh0KSB7CgkJUHlFcnJfU2V0U3RyaW5nKHNvY2tldF9lcnJv
ciwKCQkJInNvY2thZGRyIHJlc29sdmVkIHRvIG11bHRpcGxlIGFkZHJlc3NlcyIpOwoJCWdv
dG8gZmFpbDsKCX0KCXN3aXRjaCAocmVzLT5haV9mYW1pbHkpIHsKCWNhc2UgQUZfSU5FVDoK
CSAgICB7CgkJY2hhciAqdDE7CgkJaW50IHQyOwoJCWlmIChQeUFyZ19QYXJzZVR1cGxlKHNh
LCAic2kiLCAmdDEsICZ0MikgPT0gMCkgewoJCQlQeUVycl9TZXRTdHJpbmcoc29ja2V0X2Vy
cm9yLAoJCQkJIklQdjQgc29ja2FkZHIgbXVzdCBiZSAyIHR1cGxlIik7CgkJCWdvdG8gZmFp
bDsKCQl9CgkJYnJlYWs7CgkgICAgfQojaWZkZWYgRU5BQkxFX0lQVjYKCWNhc2UgQUZfSU5F
VDY6CgkgICAgewoJCXN0cnVjdCBzb2NrYWRkcl9pbjYgKnNpbjY7CgkJc2luNiA9IChzdHJ1
Y3Qgc29ja2FkZHJfaW42ICopcmVzLT5haV9hZGRyOwoJCXNpbjYtPnNpbjZfZmxvd2luZm8g
PSBmbG93aW5mbzsKCQlzaW42LT5zaW42X3Njb3BlX2lkID0gc2NvcGVfaWQ7CgkJYnJlYWs7
CgkgICAgfQojZW5kaWYKCX0KCWVycm9yID0gZ2V0bmFtZWluZm8ocmVzLT5haV9hZGRyLCBy
ZXMtPmFpX2FkZHJsZW4sCgkJCWhidWYsIHNpemVvZihoYnVmKSwgcGJ1Ziwgc2l6ZW9mKHBi
dWYpLCBmbGFncyk7CglpZiAoZXJyb3IpIHsKCQlzZXRfZ2FpZXJyb3IoZXJyb3IpOwoJCWdv
dG8gZmFpbDsKCX0KCXJldCA9IFB5X0J1aWxkVmFsdWUoInNzIiwgaGJ1ZiwgcGJ1Zik7Cgpm
YWlsOgoJaWYgKHJlcykKCQlmcmVlYWRkcmluZm8ocmVzKTsKCXJldHVybiByZXQ7Cn0KCnN0
YXRpYyBjaGFyIGdldG5hbWVpbmZvX2RvY1tdID0KInNvY2tldC5nZXRuYW1laW5mbyhzb2Nr
YWRkciwgZmxhZ3MpIC0tPiAoaG9zdCwgcG9ydClcblwKXG5cCkdldCBob3N0IGFuZCBwb3J0
IGZvciBhIHNvY2thZGRyLiI7CgovKiBMaXN0IG9mIGZ1bmN0aW9ucyBleHBvcnRlZCBieSB0
aGlzIG1vZHVsZS4gKi8KCnN0YXRpYyBQeU1ldGhvZERlZiBzb2NrZXRfbWV0aG9kc1tdID0g
ewoJeyJnZXRob3N0YnluYW1lIiwJc29ja2V0X2dldGhvc3RieW5hbWUsCgkgTUVUSF9WQVJB
UkdTLCBnZXRob3N0YnluYW1lX2RvY30sCgl7ImdldGhvc3RieW5hbWVfZXgiLAlzb2NrZXRf
Z2V0aG9zdGJ5bmFtZV9leCwKCSBNRVRIX1ZBUkFSR1MsIGdoYm5fZXhfZG9jfSwKCXsiZ2V0
aG9zdGJ5YWRkciIsCXNvY2tldF9nZXRob3N0YnlhZGRyLAoJIE1FVEhfVkFSQVJHUywgZ2V0
aG9zdGJ5YWRkcl9kb2N9LAoJeyJnZXRob3N0bmFtZSIsCQlzb2NrZXRfZ2V0aG9zdG5hbWUs
CgkgTUVUSF9WQVJBUkdTLCBnZXRob3N0bmFtZV9kb2N9LAoJeyJnZXRzZXJ2YnluYW1lIiwJ
c29ja2V0X2dldHNlcnZieW5hbWUsCgkgTUVUSF9WQVJBUkdTLCBnZXRzZXJ2YnluYW1lX2Rv
Y30sCgl7ImdldHByb3RvYnluYW1lIiwJc29ja2V0X2dldHByb3RvYnluYW1lLAoJIE1FVEhf
VkFSQVJHUyxnZXRwcm90b2J5bmFtZV9kb2N9LAojaWZuZGVmIE5PX0RVUAoJeyJmcm9tZmQi
LAkJc29ja2V0X2Zyb21mZCwKCSBNRVRIX1ZBUkFSR1MsIGZyb21mZF9kb2N9LAojZW5kaWYK
CXsibnRvaHMiLAkJc29ja2V0X250b2hzLAoJIE1FVEhfVkFSQVJHUywgbnRvaHNfZG9jfSwK
CXsibnRvaGwiLAkJc29ja2V0X250b2hsLAoJIE1FVEhfVkFSQVJHUywgbnRvaGxfZG9jfSwK
CXsiaHRvbnMiLAkJc29ja2V0X2h0b25zLAoJIE1FVEhfVkFSQVJHUywgaHRvbnNfZG9jfSwK
CXsiaHRvbmwiLAkJc29ja2V0X2h0b25sLAoJIE1FVEhfVkFSQVJHUywgaHRvbmxfZG9jfSwK
CXsiaW5ldF9hdG9uIiwJCXNvY2tldF9pbmV0X2F0b24sCgkgTUVUSF9WQVJBUkdTLCBpbmV0
X2F0b25fZG9jfSwKCXsiaW5ldF9udG9hIiwJCXNvY2tldF9pbmV0X250b2EsCgkgTUVUSF9W
QVJBUkdTLCBpbmV0X250b2FfZG9jfSwKCXsiZ2V0YWRkcmluZm8iLAkJc29ja2V0X2dldGFk
ZHJpbmZvLAoJIE1FVEhfVkFSQVJHUywgZ2V0YWRkcmluZm9fZG9jfSwKCXsiZ2V0bmFtZWlu
Zm8iLAkJc29ja2V0X2dldG5hbWVpbmZvLAoJIE1FVEhfVkFSQVJHUywgZ2V0bmFtZWluZm9f
ZG9jfSwKCXtOVUxMLAkJCU5VTEx9CQkgLyogU2VudGluZWwgKi8KfTsKCgojaWZkZWYgUklT
Q09TCiNkZWZpbmUgT1NfSU5JVF9ERUZJTkVECgpzdGF0aWMgaW50Cm9zX2luaXQodm9pZCkK
ewoJX2tlcm5lbF9zd2lfcmVncyByOwoKCXIuclswXSA9IDA7Cglfa2VybmVsX3N3aSgweDQz
MzgwLCAmciwgJnIpOwoJdGFza3dpbmRvdyA9IHIuclswXTsKCglyZXR1cm4gMDsKfQoKI2Vu
ZGlmIC8qIFJJU0NPUyAqLwoKCiNpZmRlZiBNU19XSU5ET1dTCiNkZWZpbmUgT1NfSU5JVF9E
RUZJTkVECgovKiBBZGRpdGlvbmFsIGluaXRpYWxpemF0aW9uIGFuZCBjbGVhbnVwIGZvciBX
aW5kb3dzICovCgpzdGF0aWMgdm9pZApvc19jbGVhbnVwKHZvaWQpCnsKCVdTQUNsZWFudXAo
KTsKfQoKc3RhdGljIGludApvc19pbml0KHZvaWQpCnsKCVdTQURBVEEgV1NBRGF0YTsKCWlu
dCByZXQ7CgljaGFyIGJ1ZlsxMDBdOwoJcmV0ID0gV1NBU3RhcnR1cCgweDAxMDEsICZXU0FE
YXRhKTsKCXN3aXRjaCAocmV0KSB7CgljYXNlIDA6CS8qIE5vIGVycm9yICovCgkJYXRleGl0
KG9zX2NsZWFudXApOwoJCXJldHVybiAxOyAvKiBTdWNjZXNzICovCgljYXNlIFdTQVNZU05P
VFJFQURZOgoJCVB5RXJyX1NldFN0cmluZyhQeUV4Y19JbXBvcnRFcnJvciwKCQkJCSJXU0FT
dGFydHVwIGZhaWxlZDogbmV0d29yayBub3QgcmVhZHkiKTsKCQlicmVhazsKCWNhc2UgV1NB
VkVSTk9UU1VQUE9SVEVEOgoJY2FzZSBXU0FFSU5WQUw6CgkJUHlFcnJfU2V0U3RyaW5nKAoJ
CQlQeUV4Y19JbXBvcnRFcnJvciwKCQkJIldTQVN0YXJ0dXAgZmFpbGVkOiByZXF1ZXN0ZWQg
dmVyc2lvbiBub3Qgc3VwcG9ydGVkIik7CgkJYnJlYWs7CglkZWZhdWx0OgoJCVB5T1Nfc25w
cmludGYoYnVmLCBzaXplb2YoYnVmKSwKCQkJICAgICAgIldTQVN0YXJ0dXAgZmFpbGVkOiBl
cnJvciBjb2RlICVkIiwgcmV0KTsKCQlQeUVycl9TZXRTdHJpbmcoUHlFeGNfSW1wb3J0RXJy
b3IsIGJ1Zik7CgkJYnJlYWs7Cgl9CglyZXR1cm4gMDsgLyogRmFpbHVyZSAqLwp9CgojZW5k
aWYgLyogTVNfV0lORE9XUyAqLwoKCiNpZmRlZiBQWU9TX09TMgojZGVmaW5lIE9TX0lOSVRf
REVGSU5FRAoKLyogQWRkaXRpb25hbCBpbml0aWFsaXphdGlvbiBmb3IgT1MvMiAqLwoKc3Rh
dGljIGludApvc19pbml0KHZvaWQpCnsKI2lmbmRlZiBQWUNDX0dDQwoJY2hhciByZWFzb25b
NjRdOwoJaW50IHJjID0gc29ja19pbml0KCk7CgoJaWYgKHJjID09IDApIHsKCQlyZXR1cm4g
MTsgLyogU3VjY2VzcyAqLwoJfQoKCVB5T1Nfc25wcmludGYocmVhc29uLCBzaXplb2YocmVh
c29uKSwKCQkgICAgICAiT1MvMiBUQ1AvSVAgRXJyb3IjICVkIiwgc29ja19lcnJubygpKTsK
CVB5RXJyX1NldFN0cmluZyhQeUV4Y19JbXBvcnRFcnJvciwgcmVhc29uKTsKCglyZXR1cm4g
MDsgIC8qIEZhaWx1cmUgKi8KI2Vsc2UKCS8qIE5vIG5lZWQgdG8gaW5pdGlhbGlzZSBzb2Nr
ZXRzIHdpdGggR0NDL0VNWCAqLwoJcmV0dXJuIDE7IC8qIFN1Y2Nlc3MgKi8KI2VuZGlmCn0K
CiNlbmRpZiAvKiBQWU9TX09TMiAqLwoKCiNpZm5kZWYgT1NfSU5JVF9ERUZJTkVECnN0YXRp
YyBpbnQKb3NfaW5pdCh2b2lkKQp7CglyZXR1cm4gMTsgLyogU3VjY2VzcyAqLwp9CiNlbmRp
ZgoKCi8qIEMgQVBJIHRhYmxlIC0gYWx3YXlzIGFkZCBuZXcgdGhpbmdzIHRvIHRoZSBlbmQg
Zm9yIGJpbmFyeQogICBjb21wYXRpYmlsaXR5LiAqLwpzdGF0aWMKUHlTb2NrZXRNb2R1bGVf
QVBJT2JqZWN0IFB5U29ja2V0TW9kdWxlQVBJID0KewoJJnNvY2tfdHlwZSwKfTsKCgovKiBJ
bml0aWFsaXplIHRoZSBfc29ja2V0IG1vZHVsZS4KCiAgIFRoaXMgbW9kdWxlIGlzIGFjdHVh
bGx5IGNhbGxlZCAiX3NvY2tldCIsIGFuZCB0aGVyZSdzIGEgd3JhcHBlcgogICAic29ja2V0
LnB5IiB3aGljaCBpbXBsZW1lbnRzIHNvbWUgYWRkaXRpb25hbCBmdW5jdGlvbmFsaXR5LiAg
T24gc29tZQogICBwbGF0Zm9ybXMgKGUuZy4gV2luZG93cyBhbmQgT1MvMiksIHNvY2tldC5w
eSBhbHNvIGltcGxlbWVudHMgYQogICB3cmFwcGVyIGZvciB0aGUgc29ja2V0IHR5cGUgdGhh
dCBwcm92aWRlcyBtaXNzaW5nIGZ1bmN0aW9uYWxpdHkgc3VjaAogICBhcyBtYWtlZmlsZSgp
LCBkdXAoKSBhbmQgZnJvbWZkKCkuICBUaGUgaW1wb3J0IG9mICJfc29ja2V0IiBtYXkgZmFp
bAogICB3aXRoIGFuIEltcG9ydEVycm9yIGV4Y2VwdGlvbiBpZiBvcy1zcGVjaWZpYyBpbml0
aWFsaXphdGlvbiBmYWlscy4KICAgT24gV2luZG93cywgdGhpcyBkb2VzIFdJTlNPQ0sgaW5p
dGlhbGl6YXRpb24uICBXaGVuIFdJTlNPQ0sgaXMKICAgaW5pdGlhbGl6ZWQgc3VjY2VzZnVs
bHksIGEgY2FsbCB0byBXU0FDbGVhbnVwKCkgaXMgc2NoZWR1bGVkIHRvIGJlCiAgIG1hZGUg
YXQgZXhpdCB0aW1lLgoqLwoKc3RhdGljIGNoYXIgc29ja2V0X2RvY1tdID0KIkltcGxlbWVu
dGF0aW9uIG1vZHVsZSBmb3Igc29ja2V0IG9wZXJhdGlvbnMuICBTZWUgdGhlIHNvY2tldCBt
b2R1bGVcblwKZm9yIGRvY3VtZW50YXRpb24uIjsKCkRMX0VYUE9SVCh2b2lkKQppbml0X3Nv
Y2tldCh2b2lkKQp7CglQeU9iamVjdCAqbTsKCglpZiAoIW9zX2luaXQoKSkKCQlyZXR1cm47
CgoJc29ja190eXBlLm9iX3R5cGUgPSAmUHlUeXBlX1R5cGU7Cglzb2NrX3R5cGUudHBfZ2V0
YXR0cm8gPSBQeU9iamVjdF9HZW5lcmljR2V0QXR0cjsKCXNvY2tfdHlwZS50cF9hbGxvYyA9
IFB5VHlwZV9HZW5lcmljQWxsb2M7Cglzb2NrX3R5cGUudHBfZnJlZSA9IFB5T2JqZWN0X0Rl
bDsKCW0gPSBQeV9Jbml0TW9kdWxlMyhQeVNvY2tldF9NT0RVTEVfTkFNRSwKCQkJICAgc29j
a2V0X21ldGhvZHMsCgkJCSAgIHNvY2tldF9kb2MpOwoKCXNvY2tldF9lcnJvciA9IFB5RXJy
X05ld0V4Y2VwdGlvbigic29ja2V0LmVycm9yIiwgTlVMTCwgTlVMTCk7CglpZiAoc29ja2V0
X2Vycm9yID09IE5VTEwpCgkJcmV0dXJuOwoJUHlfSU5DUkVGKHNvY2tldF9lcnJvcik7CglQ
eU1vZHVsZV9BZGRPYmplY3QobSwgImVycm9yIiwgc29ja2V0X2Vycm9yKTsKCXNvY2tldF9o
ZXJyb3IgPSBQeUVycl9OZXdFeGNlcHRpb24oInNvY2tldC5oZXJyb3IiLAoJCQkJCSAgIHNv
Y2tldF9lcnJvciwgTlVMTCk7CglpZiAoc29ja2V0X2hlcnJvciA9PSBOVUxMKQoJCXJldHVy
bjsKCVB5X0lOQ1JFRihzb2NrZXRfaGVycm9yKTsKCVB5TW9kdWxlX0FkZE9iamVjdChtLCAi
aGVycm9yIiwgc29ja2V0X2hlcnJvcik7Cglzb2NrZXRfZ2FpZXJyb3IgPSBQeUVycl9OZXdF
eGNlcHRpb24oInNvY2tldC5nYWllcnJvciIsIHNvY2tldF9lcnJvciwKCSAgICBOVUxMKTsK
CWlmIChzb2NrZXRfZ2FpZXJyb3IgPT0gTlVMTCkKCQlyZXR1cm47CglQeV9JTkNSRUYoc29j
a2V0X2dhaWVycm9yKTsKCVB5TW9kdWxlX0FkZE9iamVjdChtLCAiZ2FpZXJyb3IiLCBzb2Nr
ZXRfZ2FpZXJyb3IpOwoJUHlfSU5DUkVGKChQeU9iamVjdCAqKSZzb2NrX3R5cGUpOwoJaWYg
KFB5TW9kdWxlX0FkZE9iamVjdChtLCAiU29ja2V0VHlwZSIsCgkJCSAgICAgICAoUHlPYmpl
Y3QgKikmc29ja190eXBlKSAhPSAwKQoJCXJldHVybjsKCVB5X0lOQ1JFRigoUHlPYmplY3Qg
Kikmc29ja190eXBlKTsKCWlmIChQeU1vZHVsZV9BZGRPYmplY3QobSwgInNvY2tldCIsCgkJ
CSAgICAgICAoUHlPYmplY3QgKikmc29ja190eXBlKSAhPSAwKQoJCXJldHVybjsKCgkvKiBF
eHBvcnQgQyBBUEkgKi8KCWlmIChQeU1vZHVsZV9BZGRPYmplY3QobSwgUHlTb2NrZXRfQ0FQ
SV9OQU1FLAoJICAgICAgIFB5Q09iamVjdF9Gcm9tVm9pZFB0cigodm9pZCAqKSZQeVNvY2tl
dE1vZHVsZUFQSSwgTlVMTCkKCQkJCSApICE9IDApCgkJcmV0dXJuOwoKCS8qIEFkZHJlc3Mg
ZmFtaWxpZXMgKHdlIG9ubHkgc3VwcG9ydCBBRl9JTkVUIGFuZCBBRl9VTklYKSAqLwojaWZk
ZWYgQUZfVU5TUEVDCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUZfVU5TUEVDIiwg
QUZfVU5TUEVDKTsKI2VuZGlmCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUZfSU5F
VCIsIEFGX0lORVQpOwojaWZkZWYgQUZfSU5FVDYKCVB5TW9kdWxlX0FkZEludENvbnN0YW50
KG0sICJBRl9JTkVUNiIsIEFGX0lORVQ2KTsKI2VuZGlmIC8qIEFGX0lORVQ2ICovCiNpZmRl
ZiBBRl9VTklYCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUZfVU5JWCIsIEFGX1VO
SVgpOwojZW5kaWYgLyogQUZfVU5JWCAqLwojaWZkZWYgQUZfQVgyNQoJLyogQW1hdGV1ciBS
YWRpbyBBWC4yNSAqLwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkFGX0FYMjUiLCBB
Rl9BWDI1KTsKI2VuZGlmCiNpZmRlZiBBRl9JUFgKCVB5TW9kdWxlX0FkZEludENvbnN0YW50
KG0sICJBRl9JUFgiLCBBRl9JUFgpOyAvKiBOb3ZlbGwgSVBYICovCiNlbmRpZgojaWZkZWYg
QUZfQVBQTEVUQUxLCgkvKiBBcHBsZXRhbGsgRERQICovCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiQUZfQVBQTEVUQUxLIiwgQUZfQVBQTEVUQUxLKTsKI2VuZGlmCiNpZmRlZiBB
Rl9ORVRST00KCS8qIEFtYXRldXIgcmFkaW8gTmV0Uk9NICovCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiQUZfTkVUUk9NIiwgQUZfTkVUUk9NKTsKI2VuZGlmCiNpZmRlZiBBRl9C
UklER0UKCS8qIE11bHRpcHJvdG9jb2wgYnJpZGdlICovCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiQUZfQlJJREdFIiwgQUZfQlJJREdFKTsKI2VuZGlmCiNpZmRlZiBBRl9BQUw1
CgkvKiBSZXNlcnZlZCBmb3IgV2VybmVyJ3MgQVRNICovCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiQUZfQUFMNSIsIEFGX0FBTDUpOwojZW5kaWYKI2lmZGVmIEFGX1gyNQoJLyog
UmVzZXJ2ZWQgZm9yIFguMjUgcHJvamVjdCAqLwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQo
bSwgIkFGX1gyNSIsIEFGX1gyNSk7CiNlbmRpZgojaWZkZWYgQUZfSU5FVDYKCVB5TW9kdWxl
X0FkZEludENvbnN0YW50KG0sICJBRl9JTkVUNiIsIEFGX0lORVQ2KTsgLyogSVAgdmVyc2lv
biA2ICovCiNlbmRpZgojaWZkZWYgQUZfUk9TRQoJLyogQW1hdGV1ciBSYWRpbyBYLjI1IFBM
UCAqLwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkFGX1JPU0UiLCBBRl9ST1NFKTsK
I2VuZGlmCiNpZmRlZiBIQVZFX05FVFBBQ0tFVF9QQUNLRVRfSAoJUHlNb2R1bGVfQWRkSW50
Q29uc3RhbnQobSwgIkFGX1BBQ0tFVCIsIEFGX1BBQ0tFVCk7CglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiUEZfUEFDS0VUIiwgUEZfUEFDS0VUKTsKCVB5TW9kdWxlX0FkZEludENv
bnN0YW50KG0sICJQQUNLRVRfSE9TVCIsIFBBQ0tFVF9IT1NUKTsKCVB5TW9kdWxlX0FkZElu
dENvbnN0YW50KG0sICJQQUNLRVRfQlJPQURDQVNUIiwgUEFDS0VUX0JST0FEQ0FTVCk7CglQ
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiUEFDS0VUX01VTFRJQ0FTVCIsIFBBQ0tFVF9N
VUxUSUNBU1QpOwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlBBQ0tFVF9PVEhFUkhP
U1QiLCBQQUNLRVRfT1RIRVJIT1NUKTsKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJQ
QUNLRVRfT1VUR09JTkciLCBQQUNLRVRfT1VUR09JTkcpOwoJUHlNb2R1bGVfQWRkSW50Q29u
c3RhbnQobSwgIlBBQ0tFVF9MT09QQkFDSyIsIFBBQ0tFVF9MT09QQkFDSyk7CglQeU1vZHVs
ZV9BZGRJbnRDb25zdGFudChtLCAiUEFDS0VUX0ZBU1RST1VURSIsIFBBQ0tFVF9GQVNUUk9V
VEUpOwojZW5kaWYKCgkvKiBTb2NrZXQgdHlwZXMgKi8KCVB5TW9kdWxlX0FkZEludENvbnN0
YW50KG0sICJTT0NLX1NUUkVBTSIsIFNPQ0tfU1RSRUFNKTsKCVB5TW9kdWxlX0FkZEludENv
bnN0YW50KG0sICJTT0NLX0RHUkFNIiwgU09DS19ER1JBTSk7CiNpZm5kZWYgX19CRU9TX18K
LyogV2UgaGF2ZSBpbmNvbXBsZXRlIHNvY2tldCBzdXBwb3J0LiAqLwoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIlNPQ0tfUkFXIiwgU09DS19SQVcpOwoJUHlNb2R1bGVfQWRkSW50
Q29uc3RhbnQobSwgIlNPQ0tfU0VRUEFDS0VUIiwgU09DS19TRVFQQUNLRVQpOwoJUHlNb2R1
bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPQ0tfUkRNIiwgU09DS19SRE0pOwojZW5kaWYKCiNp
ZmRlZglTT19ERUJVRwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX0RFQlVHIiwg
U09fREVCVUcpOwojZW5kaWYKI2lmZGVmCVNPX0FDQ0VQVENPTk4KCVB5TW9kdWxlX0FkZElu
dENvbnN0YW50KG0sICJTT19BQ0NFUFRDT05OIiwgU09fQUNDRVBUQ09OTik7CiNlbmRpZgoj
aWZkZWYJU09fUkVVU0VBRERSCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09fUkVV
U0VBRERSIiwgU09fUkVVU0VBRERSKTsKI2VuZGlmCiNpZmRlZglTT19LRUVQQUxJVkUKCVB5
TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT19LRUVQQUxJVkUiLCBTT19LRUVQQUxJVkUp
OwojZW5kaWYKI2lmZGVmCVNPX0RPTlRST1VURQoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQo
bSwgIlNPX0RPTlRST1VURSIsIFNPX0RPTlRST1VURSk7CiNlbmRpZgojaWZkZWYJU09fQlJP
QURDQVNUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09fQlJPQURDQVNUIiwgU09f
QlJPQURDQVNUKTsKI2VuZGlmCiNpZmRlZglTT19VU0VMT09QQkFDSwoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIlNPX1VTRUxPT1BCQUNLIiwgU09fVVNFTE9PUEJBQ0spOwojZW5k
aWYKI2lmZGVmCVNPX0xJTkdFUgoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX0xJ
TkdFUiIsIFNPX0xJTkdFUik7CiNlbmRpZgojaWZkZWYJU09fT09CSU5MSU5FCglQeU1vZHVs
ZV9BZGRJbnRDb25zdGFudChtLCAiU09fT09CSU5MSU5FIiwgU09fT09CSU5MSU5FKTsKI2Vu
ZGlmCiNpZmRlZglTT19SRVVTRVBPUlQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJT
T19SRVVTRVBPUlQiLCBTT19SRVVTRVBPUlQpOwojZW5kaWYKI2lmZGVmCVNPX1NOREJVRgoJ
UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX1NOREJVRiIsIFNPX1NOREJVRik7CiNl
bmRpZgojaWZkZWYJU09fUkNWQlVGCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09f
UkNWQlVGIiwgU09fUkNWQlVGKTsKI2VuZGlmCiNpZmRlZglTT19TTkRMT1dBVAoJUHlNb2R1
bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX1NORExPV0FUIiwgU09fU05ETE9XQVQpOwojZW5k
aWYKI2lmZGVmCVNPX1JDVkxPV0FUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09f
UkNWTE9XQVQiLCBTT19SQ1ZMT1dBVCk7CiNlbmRpZgojaWZkZWYJU09fU05EVElNRU8KCVB5
TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT19TTkRUSU1FTyIsIFNPX1NORFRJTUVPKTsK
I2VuZGlmCiNpZmRlZglTT19SQ1ZUSU1FTwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwg
IlNPX1JDVlRJTUVPIiwgU09fUkNWVElNRU8pOwojZW5kaWYKI2lmZGVmCVNPX0VSUk9SCglQ
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09fRVJST1IiLCBTT19FUlJPUik7CiNlbmRp
ZgojaWZkZWYJU09fVFlQRQoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPX1RZUEUi
LCBTT19UWVBFKTsKI2VuZGlmCgoJLyogTWF4aW11bSBudW1iZXIgb2YgY29ubmVjdGlvbnMg
Zm9yICJsaXN0ZW4iICovCiNpZmRlZglTT01BWENPTk4KCVB5TW9kdWxlX0FkZEludENvbnN0
YW50KG0sICJTT01BWENPTk4iLCBTT01BWENPTk4pOwojZWxzZQoJUHlNb2R1bGVfQWRkSW50
Q29uc3RhbnQobSwgIlNPTUFYQ09OTiIsIDUpOyAvKiBDb21tb24gdmFsdWUgKi8KI2VuZGlm
CgoJLyogRmxhZ3MgZm9yIHNlbmQsIHJlY3YgKi8KI2lmZGVmCU1TR19PT0IKCVB5TW9kdWxl
X0FkZEludENvbnN0YW50KG0sICJNU0dfT09CIiwgTVNHX09PQik7CiNlbmRpZgojaWZkZWYJ
TVNHX1BFRUsKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJNU0dfUEVFSyIsIE1TR19Q
RUVLKTsKI2VuZGlmCiNpZmRlZglNU0dfRE9OVFJPVVRFCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiTVNHX0RPTlRST1VURSIsIE1TR19ET05UUk9VVEUpOwojZW5kaWYKI2lmZGVm
CU1TR19ET05UV0FJVAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIk1TR19ET05UV0FJ
VCIsIE1TR19ET05UV0FJVCk7CiNlbmRpZgojaWZkZWYJTVNHX0VPUgoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIk1TR19FT1IiLCBNU0dfRU9SKTsKI2VuZGlmCiNpZmRlZglNU0df
VFJVTkMKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJNU0dfVFJVTkMiLCBNU0dfVFJV
TkMpOwojZW5kaWYKI2lmZGVmCU1TR19DVFJVTkMKCVB5TW9kdWxlX0FkZEludENvbnN0YW50
KG0sICJNU0dfQ1RSVU5DIiwgTVNHX0NUUlVOQyk7CiNlbmRpZgojaWZkZWYJTVNHX1dBSVRB
TEwKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJNU0dfV0FJVEFMTCIsIE1TR19XQUlU
QUxMKTsKI2VuZGlmCiNpZmRlZglNU0dfQlRBRwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQo
bSwgIk1TR19CVEFHIiwgTVNHX0JUQUcpOwojZW5kaWYKI2lmZGVmCU1TR19FVEFHCglQeU1v
ZHVsZV9BZGRJbnRDb25zdGFudChtLCAiTVNHX0VUQUciLCBNU0dfRVRBRyk7CiNlbmRpZgoK
CS8qIFByb3RvY29sIGxldmVsIGFuZCBudW1iZXJzLCB1c2FibGUgZm9yIFtnc11ldHNvY2tv
cHQgKi8KI2lmZGVmCVNPTF9TT0NLRVQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJT
T0xfU09DS0VUIiwgU09MX1NPQ0tFVCk7CiNlbmRpZgojaWZkZWYJU09MX0lQCglQeU1vZHVs
ZV9BZGRJbnRDb25zdGFudChtLCAiU09MX0lQIiwgU09MX0lQKTsKI2Vsc2UKCVB5TW9kdWxl
X0FkZEludENvbnN0YW50KG0sICJTT0xfSVAiLCAwKTsKI2VuZGlmCiNpZmRlZglTT0xfSVBY
CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09MX0lQWCIsIFNPTF9JUFgpOwojZW5k
aWYKI2lmZGVmCVNPTF9BWDI1CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09MX0FY
MjUiLCBTT0xfQVgyNSk7CiNlbmRpZgojaWZkZWYJU09MX0FUQUxLCglQeU1vZHVsZV9BZGRJ
bnRDb25zdGFudChtLCAiU09MX0FUQUxLIiwgU09MX0FUQUxLKTsKI2VuZGlmCiNpZmRlZglT
T0xfTkVUUk9NCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiU09MX05FVFJPTSIsIFNP
TF9ORVRST00pOwojZW5kaWYKI2lmZGVmCVNPTF9ST1NFCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiU09MX1JPU0UiLCBTT0xfUk9TRSk7CiNlbmRpZgojaWZkZWYJU09MX1RDUAoJ
UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPTF9UQ1AiLCBTT0xfVENQKTsKI2Vsc2UK
CVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT0xfVENQIiwgNik7CiNlbmRpZgojaWZk
ZWYJU09MX1VEUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlNPTF9VRFAiLCBTT0xf
VURQKTsKI2Vsc2UKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJTT0xfVURQIiwgMTcp
OwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fSVAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0s
ICJJUFBST1RPX0lQIiwgSVBQUk9UT19JUCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiSVBQUk9UT19JUCIsIDApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fSE9QT1BU
UwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fSE9QT1BUUyIsIElQUFJP
VE9fSE9QT1BUUyk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19JQ01QCglQeU1vZHVsZV9BZGRJ
bnRDb25zdGFudChtLCAiSVBQUk9UT19JQ01QIiwgSVBQUk9UT19JQ01QKTsKI2Vsc2UKCVB5
TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX0lDTVAiLCAxKTsKI2VuZGlmCiNp
ZmRlZglJUFBST1RPX0lHTVAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RP
X0lHTVAiLCBJUFBST1RPX0lHTVApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fR0dQCglQeU1v
ZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19HR1AiLCBJUFBST1RPX0dHUCk7CiNl
bmRpZgojaWZkZWYJSVBQUk9UT19JUFY0CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAi
SVBQUk9UT19JUFY0IiwgSVBQUk9UT19JUFY0KTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lQ
SVAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX0lQSVAiLCBJUFBST1RP
X0lQSVApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fVENQCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiSVBQUk9UT19UQ1AiLCBJUFBST1RPX1RDUCk7CiNlbHNlCglQeU1vZHVsZV9B
ZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19UQ1AiLCA2KTsKI2VuZGlmCiNpZmRlZglJUFBS
T1RPX0VHUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fRUdQIiwgSVBQ
Uk9UT19FR1ApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fUFVQCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiSVBQUk9UT19QVVAiLCBJUFBST1RPX1BVUCk7CiNlbmRpZgojaWZkZWYJ
SVBQUk9UT19VRFAKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX1VEUCIs
IElQUFJPVE9fVURQKTsKI2Vsc2UKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBS
T1RPX1VEUCIsIDE3KTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lEUAoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIklQUFJPVE9fSURQIiwgSVBQUk9UT19JRFApOwojZW5kaWYKI2lm
ZGVmCUlQUFJPVE9fSEVMTE8KCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RP
X0hFTExPIiwgSVBQUk9UT19IRUxMTyk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19ORAoJUHlN
b2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fTkQiLCBJUFBST1RPX05EKTsKI2Vu
ZGlmCiNpZmRlZglJUFBST1RPX1RQCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQ
Uk9UT19UUCIsIElQUFJPVE9fVFApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fSVBWNgoJUHlN
b2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fSVBWNiIsIElQUFJPVE9fSVBWNik7
CiNlbmRpZgojaWZkZWYJSVBQUk9UT19ST1VUSU5HCglQeU1vZHVsZV9BZGRJbnRDb25zdGFu
dChtLCAiSVBQUk9UT19ST1VUSU5HIiwgSVBQUk9UT19ST1VUSU5HKTsKI2VuZGlmCiNpZmRl
ZglJUFBST1RPX0ZSQUdNRU5UCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9U
T19GUkFHTUVOVCIsIElQUFJPVE9fRlJBR01FTlQpOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9f
UlNWUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fUlNWUCIsIElQUFJP
VE9fUlNWUCk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19HUkUKCVB5TW9kdWxlX0FkZEludENv
bnN0YW50KG0sICJJUFBST1RPX0dSRSIsIElQUFJPVE9fR1JFKTsKI2VuZGlmCiNpZmRlZglJ
UFBST1RPX0VTUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fRVNQIiwg
SVBQUk9UT19FU1ApOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fQUgKCVB5TW9kdWxlX0FkZElu
dENvbnN0YW50KG0sICJJUFBST1RPX0FIIiwgSVBQUk9UT19BSCk7CiNlbmRpZgojaWZkZWYJ
SVBQUk9UT19NT0JJTEUKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX01P
QklMRSIsIElQUFJPVE9fTU9CSUxFKTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lDTVBWNgoJ
UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fSUNNUFY2IiwgSVBQUk9UT19J
Q01QVjYpOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fTk9ORQoJUHlNb2R1bGVfQWRkSW50Q29u
c3RhbnQobSwgIklQUFJPVE9fTk9ORSIsIElQUFJPVE9fTk9ORSk7CiNlbmRpZgojaWZkZWYJ
SVBQUk9UT19EU1RPUFRTCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19E
U1RPUFRTIiwgSVBQUk9UT19EU1RPUFRTKTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX1hUUAoJ
UHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9fWFRQIiwgSVBQUk9UT19YVFAp
OwojZW5kaWYKI2lmZGVmCUlQUFJPVE9fRU9OCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht
LCAiSVBQUk9UT19FT04iLCBJUFBST1RPX0VPTik7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19Q
SU0KCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX1BJTSIsIElQUFJPVE9f
UElNKTsKI2VuZGlmCiNpZmRlZglJUFBST1RPX0lQQ09NUAoJUHlNb2R1bGVfQWRkSW50Q29u
c3RhbnQobSwgIklQUFJPVE9fSVBDT01QIiwgSVBQUk9UT19JUENPTVApOwojZW5kaWYKI2lm
ZGVmCUlQUFJPVE9fVlJSUAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQUFJPVE9f
VlJSUCIsIElQUFJPVE9fVlJSUCk7CiNlbmRpZgojaWZkZWYJSVBQUk9UT19CSVAKCVB5TW9k
dWxlX0FkZEludENvbnN0YW50KG0sICJJUFBST1RPX0JJUCIsIElQUFJPVE9fQklQKTsKI2Vu
ZGlmCi8qKi8KI2lmZGVmCUlQUFJPVE9fUkFXCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht
LCAiSVBQUk9UT19SQVciLCBJUFBST1RPX1JBVyk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiSVBQUk9UT19SQVciLCAyNTUpOwojZW5kaWYKI2lmZGVmCUlQUFJPVE9f
TUFYCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBQUk9UT19NQVgiLCBJUFBST1RP
X01BWCk7CiNlbmRpZgoKCS8qIFNvbWUgcG9ydCBjb25maWd1cmF0aW9uICovCiNpZmRlZglJ
UFBPUlRfUkVTRVJWRUQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBPUlRfUkVT
RVJWRUQiLCBJUFBPUlRfUkVTRVJWRUQpOwojZWxzZQoJUHlNb2R1bGVfQWRkSW50Q29uc3Rh
bnQobSwgIklQUE9SVF9SRVNFUlZFRCIsIDEwMjQpOwojZW5kaWYKI2lmZGVmCUlQUE9SVF9V
U0VSUkVTRVJWRUQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUFBPUlRfVVNFUlJF
U0VSVkVEIiwgSVBQT1JUX1VTRVJSRVNFUlZFRCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiSVBQT1JUX1VTRVJSRVNFUlZFRCIsIDUwMDApOwojZW5kaWYKCgkvKiBT
b21lIHJlc2VydmVkIElQIHYuNCBhZGRyZXNzZXMgKi8KI2lmZGVmCUlOQUREUl9BTlkKCVB5
TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJTkFERFJfQU5ZIiwgSU5BRERSX0FOWSk7CiNl
bHNlCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0FOWSIsIDB4MDAwMDAw
MDApOwojZW5kaWYKI2lmZGVmCUlOQUREUl9CUk9BRENBU1QKCVB5TW9kdWxlX0FkZEludENv
bnN0YW50KG0sICJJTkFERFJfQlJPQURDQVNUIiwgSU5BRERSX0JST0FEQ0FTVCk7CiNlbHNl
CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0JST0FEQ0FTVCIsIDB4ZmZm
ZmZmZmYpOwojZW5kaWYKI2lmZGVmCUlOQUREUl9MT09QQkFDSwoJUHlNb2R1bGVfQWRkSW50
Q29uc3RhbnQobSwgIklOQUREUl9MT09QQkFDSyIsIElOQUREUl9MT09QQkFDSyk7CiNlbHNl
CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0xPT1BCQUNLIiwgMHg3RjAw
MDAwMSk7CiNlbmRpZgojaWZkZWYJSU5BRERSX1VOU1BFQ19HUk9VUAoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIklOQUREUl9VTlNQRUNfR1JPVVAiLCBJTkFERFJfVU5TUEVDX0dS
T1VQKTsKI2Vsc2UKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJTkFERFJfVU5TUEVD
X0dST1VQIiwgMHhlMDAwMDAwMCk7CiNlbmRpZgojaWZkZWYJSU5BRERSX0FMTEhPU1RTX0dS
T1VQCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX0FMTEhPU1RTX0dST1VQ
IiwKCQkJCUlOQUREUl9BTExIT1NUU19HUk9VUCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiSU5BRERSX0FMTEhPU1RTX0dST1VQIiwgMHhlMDAwMDAwMSk7CiNlbmRp
ZgojaWZkZWYJSU5BRERSX01BWF9MT0NBTF9HUk9VUAoJUHlNb2R1bGVfQWRkSW50Q29uc3Rh
bnQobSwgIklOQUREUl9NQVhfTE9DQUxfR1JPVVAiLAoJCQkJSU5BRERSX01BWF9MT0NBTF9H
Uk9VUCk7CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX01BWF9M
T0NBTF9HUk9VUCIsIDB4ZTAwMDAwZmYpOwojZW5kaWYKI2lmZGVmCUlOQUREUl9OT05FCglQ
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX05PTkUiLCBJTkFERFJfTk9ORSk7
CiNlbHNlCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSU5BRERSX05PTkUiLCAweGZm
ZmZmZmZmKTsKI2VuZGlmCgoJLyogSVB2NCBbZ3NdZXRzb2Nrb3B0IG9wdGlvbnMgKi8KI2lm
ZGVmCUlQX09QVElPTlMKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9PUFRJT05T
IiwgSVBfT1BUSU9OUyk7CiNlbmRpZgojaWZkZWYJSVBfSERSSU5DTAoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIklQX0hEUklOQ0wiLCBJUF9IRFJJTkNMKTsKI2VuZGlmCiNpZmRl
ZglJUF9UT1MKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9UT1MiLCBJUF9UT1Mp
OwojZW5kaWYKI2lmZGVmCUlQX1RUTAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQ
X1RUTCIsIElQX1RUTCk7CiNlbmRpZgojaWZkZWYJSVBfUkVDVk9QVFMKCVB5TW9kdWxlX0Fk
ZEludENvbnN0YW50KG0sICJJUF9SRUNWT1BUUyIsIElQX1JFQ1ZPUFRTKTsKI2VuZGlmCiNp
ZmRlZglJUF9SRUNWUkVUT1BUUwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQX1JF
Q1ZSRVRPUFRTIiwgSVBfUkVDVlJFVE9QVFMpOwojZW5kaWYKI2lmZGVmCUlQX1JFQ1ZEU1RB
RERSCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBfUkVDVkRTVEFERFIiLCBJUF9S
RUNWRFNUQUREUik7CiNlbmRpZgojaWZkZWYJSVBfUkVUT1BUUwoJUHlNb2R1bGVfQWRkSW50
Q29uc3RhbnQobSwgIklQX1JFVE9QVFMiLCBJUF9SRVRPUFRTKTsKI2VuZGlmCiNpZmRlZglJ
UF9NVUxUSUNBU1RfSUYKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9NVUxUSUNB
U1RfSUYiLCBJUF9NVUxUSUNBU1RfSUYpOwojZW5kaWYKI2lmZGVmCUlQX01VTFRJQ0FTVF9U
VEwKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJJUF9NVUxUSUNBU1RfVFRMIiwgSVBf
TVVMVElDQVNUX1RUTCk7CiNlbmRpZgojaWZkZWYJSVBfTVVMVElDQVNUX0xPT1AKCVB5TW9k
dWxlX0FkZEludENvbnN0YW50KG0sICJJUF9NVUxUSUNBU1RfTE9PUCIsIElQX01VTFRJQ0FT
VF9MT09QKTsKI2VuZGlmCiNpZmRlZglJUF9BRERfTUVNQkVSU0hJUAoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIklQX0FERF9NRU1CRVJTSElQIiwgSVBfQUREX01FTUJFUlNISVAp
OwojZW5kaWYKI2lmZGVmCUlQX0RST1BfTUVNQkVSU0hJUAoJUHlNb2R1bGVfQWRkSW50Q29u
c3RhbnQobSwgIklQX0RST1BfTUVNQkVSU0hJUCIsIElQX0RST1BfTUVNQkVSU0hJUCk7CiNl
bmRpZgojaWZkZWYJSVBfREVGQVVMVF9NVUxUSUNBU1RfVFRMCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiSVBfREVGQVVMVF9NVUxUSUNBU1RfVFRMIiwKCQkJCUlQX0RFRkFVTFRf
TVVMVElDQVNUX1RUTCk7CiNlbmRpZgojaWZkZWYJSVBfREVGQVVMVF9NVUxUSUNBU1RfTE9P
UAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIklQX0RFRkFVTFRfTVVMVElDQVNUX0xP
T1AiLAoJCQkJSVBfREVGQVVMVF9NVUxUSUNBU1RfTE9PUCk7CiNlbmRpZgojaWZkZWYJSVBf
TUFYX01FTUJFUlNISVBTCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBfTUFYX01F
TUJFUlNISVBTIiwgSVBfTUFYX01FTUJFUlNISVBTKTsKI2VuZGlmCgoJLyogSVB2NiBbZ3Nd
ZXRzb2Nrb3B0IG9wdGlvbnMsIGRlZmluZWQgaW4gUkZDMjU1MyAqLwojaWZkZWYJSVBWNl9K
T0lOX0dST1VQCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBWNl9KT0lOX0dST1VQ
IiwgSVBWNl9KT0lOX0dST1VQKTsKI2VuZGlmCiNpZmRlZglJUFY2X0xFQVZFX0dST1VQCglQ
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiSVBWNl9MRUFWRV9HUk9VUCIsIElQVjZfTEVB
VkVfR1JPVVApOwojZW5kaWYKI2lmZGVmCUlQVjZfTVVMVElDQVNUX0hPUFMKCVB5TW9kdWxl
X0FkZEludENvbnN0YW50KG0sICJJUFY2X01VTFRJQ0FTVF9IT1BTIiwgSVBWNl9NVUxUSUNB
U1RfSE9QUyk7CiNlbmRpZgojaWZkZWYJSVBWNl9NVUxUSUNBU1RfSUYKCVB5TW9kdWxlX0Fk
ZEludENvbnN0YW50KG0sICJJUFY2X01VTFRJQ0FTVF9JRiIsIElQVjZfTVVMVElDQVNUX0lG
KTsKI2VuZGlmCiNpZmRlZglJUFY2X01VTFRJQ0FTVF9MT09QCglQeU1vZHVsZV9BZGRJbnRD
b25zdGFudChtLCAiSVBWNl9NVUxUSUNBU1RfTE9PUCIsIElQVjZfTVVMVElDQVNUX0xPT1Ap
OwojZW5kaWYKI2lmZGVmCUlQVjZfVU5JQ0FTVF9IT1BTCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiSVBWNl9VTklDQVNUX0hPUFMiLCBJUFY2X1VOSUNBU1RfSE9QUyk7CiNlbmRp
ZgoKCS8qIFRDUCBvcHRpb25zICovCiNpZmRlZglUQ1BfTk9ERUxBWQoJUHlNb2R1bGVfQWRk
SW50Q29uc3RhbnQobSwgIlRDUF9OT0RFTEFZIiwgVENQX05PREVMQVkpOwojZW5kaWYKI2lm
ZGVmCVRDUF9NQVhTRUcKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJUQ1BfTUFYU0VH
IiwgVENQX01BWFNFRyk7CiNlbmRpZgojaWZkZWYJVENQX0NPUksKCVB5TW9kdWxlX0FkZElu
dENvbnN0YW50KG0sICJUQ1BfQ09SSyIsIFRDUF9DT1JLKTsKI2VuZGlmCiNpZmRlZglUQ1Bf
S0VFUElETEUKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJUQ1BfS0VFUElETEUiLCBU
Q1BfS0VFUElETEUpOwojZW5kaWYKI2lmZGVmCVRDUF9LRUVQSU5UVkwKCVB5TW9kdWxlX0Fk
ZEludENvbnN0YW50KG0sICJUQ1BfS0VFUElOVFZMIiwgVENQX0tFRVBJTlRWTCk7CiNlbmRp
ZgojaWZkZWYJVENQX0tFRVBDTlQKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJUQ1Bf
S0VFUENOVCIsIFRDUF9LRUVQQ05UKTsKI2VuZGlmCiNpZmRlZglUQ1BfU1lOQ05UCglQeU1v
ZHVsZV9BZGRJbnRDb25zdGFudChtLCAiVENQX1NZTkNOVCIsIFRDUF9TWU5DTlQpOwojZW5k
aWYKI2lmZGVmCVRDUF9MSU5HRVIyCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiVENQ
X0xJTkdFUjIiLCBUQ1BfTElOR0VSMik7CiNlbmRpZgojaWZkZWYJVENQX0RFRkVSX0FDQ0VQ
VAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIlRDUF9ERUZFUl9BQ0NFUFQiLCBUQ1Bf
REVGRVJfQUNDRVBUKTsKI2VuZGlmCiNpZmRlZglUQ1BfV0lORE9XX0NMQU1QCglQeU1vZHVs
ZV9BZGRJbnRDb25zdGFudChtLCAiVENQX1dJTkRPV19DTEFNUCIsIFRDUF9XSU5ET1dfQ0xB
TVApOwojZW5kaWYKI2lmZGVmCVRDUF9JTkZPCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht
LCAiVENQX0lORk8iLCBUQ1BfSU5GTyk7CiNlbmRpZgojaWZkZWYJVENQX1FVSUNLQUNLCglQ
eU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiVENQX1FVSUNLQUNLIiwgVENQX1FVSUNLQUNL
KTsKI2VuZGlmCgoKCS8qIElQWCBvcHRpb25zICovCiNpZmRlZglJUFhfVFlQRQoJUHlNb2R1
bGVfQWRkSW50Q29uc3RhbnQobSwgIklQWF9UWVBFIiwgSVBYX1RZUEUpOwojZW5kaWYKCgkv
KiBnZXR7YWRkcixuYW1lfWluZm8gcGFyYW1ldGVycyAqLwojaWZkZWYgRUFJX0FERFJGQU1J
TFkKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJFQUlfQUREUkZBTUlMWSIsIEVBSV9B
RERSRkFNSUxZKTsKI2VuZGlmCiNpZmRlZiBFQUlfQUdBSU4KCVB5TW9kdWxlX0FkZEludENv
bnN0YW50KG0sICJFQUlfQUdBSU4iLCBFQUlfQUdBSU4pOwojZW5kaWYKI2lmZGVmIEVBSV9C
QURGTEFHUwoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkVBSV9CQURGTEFHUyIsIEVB
SV9CQURGTEFHUyk7CiNlbmRpZgojaWZkZWYgRUFJX0ZBSUwKCVB5TW9kdWxlX0FkZEludENv
bnN0YW50KG0sICJFQUlfRkFJTCIsIEVBSV9GQUlMKTsKI2VuZGlmCiNpZmRlZiBFQUlfRkFN
SUxZCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX0ZBTUlMWSIsIEVBSV9GQU1J
TFkpOwojZW5kaWYKI2lmZGVmIEVBSV9NRU1PUlkKCVB5TW9kdWxlX0FkZEludENvbnN0YW50
KG0sICJFQUlfTUVNT1JZIiwgRUFJX01FTU9SWSk7CiNlbmRpZgojaWZkZWYgRUFJX05PREFU
QQoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkVBSV9OT0RBVEEiLCBFQUlfTk9EQVRB
KTsKI2VuZGlmCiNpZmRlZiBFQUlfTk9OQU1FCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudCht
LCAiRUFJX05PTkFNRSIsIEVBSV9OT05BTUUpOwojZW5kaWYKI2lmZGVmIEVBSV9TRVJWSUNF
CglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX1NFUlZJQ0UiLCBFQUlfU0VSVklD
RSk7CiNlbmRpZgojaWZkZWYgRUFJX1NPQ0tUWVBFCglQeU1vZHVsZV9BZGRJbnRDb25zdGFu
dChtLCAiRUFJX1NPQ0tUWVBFIiwgRUFJX1NPQ0tUWVBFKTsKI2VuZGlmCiNpZmRlZiBFQUlf
U1lTVEVNCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX1NZU1RFTSIsIEVBSV9T
WVNURU0pOwojZW5kaWYKI2lmZGVmIEVBSV9CQURISU5UUwoJUHlNb2R1bGVfQWRkSW50Q29u
c3RhbnQobSwgIkVBSV9CQURISU5UUyIsIEVBSV9CQURISU5UUyk7CiNlbmRpZgojaWZkZWYg
RUFJX1BST1RPQ09MCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiRUFJX1BST1RPQ09M
IiwgRUFJX1BST1RPQ09MKTsKI2VuZGlmCiNpZmRlZiBFQUlfTUFYCglQeU1vZHVsZV9BZGRJ
bnRDb25zdGFudChtLCAiRUFJX01BWCIsIEVBSV9NQVgpOwojZW5kaWYKI2lmZGVmIEFJX1BB
U1NJVkUKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJBSV9QQVNTSVZFIiwgQUlfUEFT
U0lWRSk7CiNlbmRpZgojaWZkZWYgQUlfQ0FOT05OQU1FCglQeU1vZHVsZV9BZGRJbnRDb25z
dGFudChtLCAiQUlfQ0FOT05OQU1FIiwgQUlfQ0FOT05OQU1FKTsKI2VuZGlmCiNpZmRlZiBB
SV9OVU1FUklDSE9TVAoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIkFJX05VTUVSSUNI
T1NUIiwgQUlfTlVNRVJJQ0hPU1QpOwojZW5kaWYKI2lmZGVmIEFJX01BU0sKCVB5TW9kdWxl
X0FkZEludENvbnN0YW50KG0sICJBSV9NQVNLIiwgQUlfTUFTSyk7CiNlbmRpZgojaWZkZWYg
QUlfQUxMCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUlfQUxMIiwgQUlfQUxMKTsK
I2VuZGlmCiNpZmRlZiBBSV9WNE1BUFBFRF9DRkcKCVB5TW9kdWxlX0FkZEludENvbnN0YW50
KG0sICJBSV9WNE1BUFBFRF9DRkciLCBBSV9WNE1BUFBFRF9DRkcpOwojZW5kaWYKI2lmZGVm
IEFJX0FERFJDT05GSUcKCVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJBSV9BRERSQ09O
RklHIiwgQUlfQUREUkNPTkZJRyk7CiNlbmRpZgojaWZkZWYgQUlfVjRNQVBQRUQKCVB5TW9k
dWxlX0FkZEludENvbnN0YW50KG0sICJBSV9WNE1BUFBFRCIsIEFJX1Y0TUFQUEVEKTsKI2Vu
ZGlmCiNpZmRlZiBBSV9ERUZBVUxUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiQUlf
REVGQVVMVCIsIEFJX0RFRkFVTFQpOwojZW5kaWYKI2lmZGVmIE5JX01BWEhPU1QKCVB5TW9k
dWxlX0FkZEludENvbnN0YW50KG0sICJOSV9NQVhIT1NUIiwgTklfTUFYSE9TVCk7CiNlbmRp
ZgojaWZkZWYgTklfTUFYU0VSVgoJUHlNb2R1bGVfQWRkSW50Q29uc3RhbnQobSwgIk5JX01B
WFNFUlYiLCBOSV9NQVhTRVJWKTsKI2VuZGlmCiNpZmRlZiBOSV9OT0ZRRE4KCVB5TW9kdWxl
X0FkZEludENvbnN0YW50KG0sICJOSV9OT0ZRRE4iLCBOSV9OT0ZRRE4pOwojZW5kaWYKI2lm
ZGVmIE5JX05VTUVSSUNIT1NUCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiTklfTlVN
RVJJQ0hPU1QiLCBOSV9OVU1FUklDSE9TVCk7CiNlbmRpZgojaWZkZWYgTklfTkFNRVJFUUQK
CVB5TW9kdWxlX0FkZEludENvbnN0YW50KG0sICJOSV9OQU1FUkVRRCIsIE5JX05BTUVSRVFE
KTsKI2VuZGlmCiNpZmRlZiBOSV9OVU1FUklDU0VSVgoJUHlNb2R1bGVfQWRkSW50Q29uc3Rh
bnQobSwgIk5JX05VTUVSSUNTRVJWIiwgTklfTlVNRVJJQ1NFUlYpOwojZW5kaWYKI2lmZGVm
IE5JX0RHUkFNCglQeU1vZHVsZV9BZGRJbnRDb25zdGFudChtLCAiTklfREdSQU0iLCBOSV9E
R1JBTSk7CiNlbmRpZgoKCS8qIEluaXRpYWxpemUgZ2V0aG9zdGJ5bmFtZSBsb2NrICovCiNp
ZmRlZiBVU0VfR0VUSE9TVEJZTkFNRV9MT0NLCglnZXRob3N0YnluYW1lX2xvY2sgPSBQeVRo
cmVhZF9hbGxvY2F0ZV9sb2NrKCk7CiNlbmRpZgp9CgoKI2lmbmRlZiBIQVZFX0lORVRfUFRP
TgoKLyogU2ltcGxpc3RpYyBlbXVsYXRpb24gY29kZSBmb3IgaW5ldF9wdG9uIHRoYXQgb25s
eSB3b3JrcyBmb3IgSVB2NCAqLwoKaW50CmluZXRfcHRvbihpbnQgYWYsIGNvbnN0IGNoYXIg
KnNyYywgdm9pZCAqZHN0KQp7CglpZiAoYWYgPT0gQUZfSU5FVCkgewoJCWxvbmcgcGFja2Vk
X2FkZHI7CgkJcGFja2VkX2FkZHIgPSBpbmV0X2FkZHIoc3JjKTsKCQlpZiAocGFja2VkX2Fk
ZHIgPT0gSU5BRERSX05PTkUpCgkJCXJldHVybiAwOwoJCW1lbWNweShkc3QsICZwYWNrZWRf
YWRkciwgNCk7CgkJcmV0dXJuIDE7Cgl9CgkvKiBTaG91bGQgc2V0IGVycm5vIHRvIEVBRk5P
U1VQUE9SVCAqLwoJcmV0dXJuIC0xOwp9Cgpjb25zdCBjaGFyICoKaW5ldF9udG9wKGludCBh
ZiwgY29uc3Qgdm9pZCAqc3JjLCBjaGFyICpkc3QsIHNvY2tsZW5fdCBzaXplKQp7CglpZiAo
YWYgPT0gQUZfSU5FVCkgewoJCXN0cnVjdCBpbl9hZGRyIHBhY2tlZF9hZGRyOwoJCWlmIChz
aXplIDwgMTYpCgkJCS8qIFNob3VsZCBzZXQgZXJybm8gdG8gRU5PU1BDLiAqLwoJCQlyZXR1
cm4gTlVMTDsKCQltZW1jcHkoJnBhY2tlZF9hZGRyLCBzcmMsIHNpemVvZihwYWNrZWRfYWRk
cikpOwoJCXJldHVybiBzdHJuY3B5KGRzdCwgaW5ldF9udG9hKHBhY2tlZF9hZGRyKSwgc2l6
ZSk7Cgl9CgkvKiBTaG91bGQgc2V0IGVycm5vIHRvIEVBRk5PU1VQUE9SVCAqLwoJcmV0dXJu
IE5VTEw7Cn0KCiNlbmRpZgo=
--------------67BDF2B7F2453AAB6129FCFC--




From drifty@bigfoot.com  Sun Jul 14 20:57:31 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sun, 14 Jul 2002 12:57:31 -0700 (PDT)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020714110513.GA2280@hishome.net>
Message-ID: <Pine.SOL.4.44.0207141226480.1816-100000@death.OCF.Berkeley.EDU>

[Oren Tirosh]

> On Sun, Jul 14, 2002 at 01:16:10AM -0700, Brett Cannon wrote:
> > [Oren Tirosh]
>
> This anthropomorphic description has too many irrelevant associations.  Let's
> leave the parents out of this.

=) OK.

> The logic is simple:  StopIteration is not an error. It's not even a warning,
> it's a normal part of program operation. It uses the exception mechanism
> because it is the most convenient form of out-of-band signalling.  The
> hypothetical IteratorExhausted is an error. The fact that both of them
> happen to be exceptions is almost a coincidence.
>

It still doesn't "feel" right.  I completely understand this explanation
for what StopIteration is supposed to be viewed as, but I just don't
naturally view it that way.

And it isn't because it is an exception.  I am sure all of us have used
exceptions to break out of some deeply nested loops or pull off some other
fancy control flow.  I think my view stems from what it is saying "I have
reached what I believe to be the end or what you have requested to be the
end".  I don't think that should be some notice.

> > definitely will be.  I know I thought that StopIteration was continuously
> > raised until the emails on this subject started.
>
> For most Python iterators it is.  This behavior is OK but it could be changed
<snip>

And if most are already like that, then maybe it should be the default
behavior.  Unless my understanding is faulty, two-arg iterators are a
convenience to make an iterator out of a callable function by specifying
where the iterator will raise StopIteration.  My view is that if you
don't want to stop where the iterator says you said to signal, then an
explicit 'if' would be better.  I mean you don't see iterators on lists
raising IndexError if you keep calling .next().
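
For readers who have not used the two-argument form, here is a minimal
sketch in present-day Python 3 syntax (not the 2.2 code under discussion).
The StringIO stream and the variable names are stand-ins invented for
illustration; the only real API relied on is iter(callable, sentinel) and
next(iterator, default).

```python
import io

# Hypothetical input: any callable that eventually returns the sentinel works.
stream = io.StringIO("alpha\nbeta\n")

# iter(callable, sentinel) keeps calling stream.readline() and raises
# StopIteration the first time the call returns the sentinel "".
lines = iter(stream.readline, "")

collected = [line.rstrip("\n") for line in lines]
print(collected)

# Asking again after exhaustion: in current CPython the callable iterator
# stays exhausted, so the default is returned instead of a new value.
after_end = next(lines, "<exhausted>")
print(after_end)
```

Whether that last call is guaranteed to behave this way is exactly the
question being argued in this thread.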

I am obviously +1 on forcing StopIteration to be permanently raised.  If
you need to go beyond the signalled end, you can do the old-fashioned way
before we had iterators.  I say make iterators so that they have the least
chance of causing errors.  They are supposed to simplify our lives not
cause us to have a new possible bug to watch out for in code.

Anyway, since I am not about to come up with some clever code chunk that
will show why the current state of affairs is bad beyond my opinion I will
leave it up to Guido to make a choice.  At least, based on the tone of
these emails on this topic, this is not going to be a decision that is
going to ruffle some feathers.

-Brett C.




From tim.one@comcast.net  Sun Jul 14 21:42:13 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 16:42:13 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <15665.37242.446627.141013@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKDAEAB.tim.one@comcast.net>

[Barry]
> I think it would be fine to leave the situation as is
> (i.e. undefined).

As is, the PEP makes promises the implementation doesn't keep, so that part
isn't fine.  If we don't want to change the implementation(s), then that
part of the PEP should be changed to, e.g.,

      Resolution:  once StopIteration is raised, the effect of calling
      it.next() again isn't defined by the iteration protocol.  Code
      manipulating arbitrary iterators must therefore not rely on any
      particular behavior in this case.  For example, a given iterator
      may choose to raise StopIteration again, raise some other
      exception, return a value, play the theme music for Monty Python's
      Flying Circus, or decline to define the effect.

That's a start.  Then another round of decisions needs to be made for each
iterator Python supplies:  should it define the effect or not, and if so
what is it, or if not should it be explicit about not defining it?
Generators already do:

    If an unhandled exception-- including, but not limited to,
    StopIteration --is raised by, or passes through, a generator function,
    then the exception is passed on to the caller in the usual way, and
    subsequent attempts to resume the generator function raise
    StopIteration.

The current docs for two-argument iter() also tell the truth about what
happens.  I'm not sure anything else does, unless we take the absence of
docs as implying that an iterator explicitly refuses to define what happens.

So from Fred's POV <wink>, it would be easier to change the implementations
to match the current PEP wording.

I'll note one pragmatic concern.  This idiom is becoming mildly popular:

for x in someiterator:
    if is_boundary_marker(x):
        break
    else:
        do_something_with(x)

followed by (in time, not necessarily in a physically distinct loop):

for x in someiterator:
    # and we expect this to pick up where the last loop left off

If StopIteration isn't a sink state, this falls under the "code manipulating
arbitrary iterators must therefore not rely on any particular behavior in
this case" warning in the reworded docs.  That is, if the first loop
terminated via iterator exhaustion, the obvious intent is that the second
loop never enter its body.  This is reliably true if and only if
StopIteration is guaranteed to be a sink state.  The more I ponder that, the
more I'm inclined to believe that the PEP made the right decision the first
time:  guaranteeing *something* makes it possible to write a larger class of
generic code.
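
A rough sketch of the two-loop idiom described above, written in
present-day Python 3 and assuming iterators for which StopIteration is a
sink state (generators and list iterators are). The function name
split_at_boundary and the "|" marker are made up for the example.

```python
def tokens():
    # Hypothetical data source containing one boundary marker.
    for item in ["a", "b", "|", "c", "d"]:
        yield item


def split_at_boundary(it, marker="|"):
    before, after = [], []
    # First loop: consume items up to (and including) the marker.
    for x in it:
        if x == marker:
            break
        before.append(x)
    # Second loop: pick up where the first one left off. If the first
    # loop ended by exhaustion, a sticky StopIteration means this body
    # never runs.
    for x in it:
        after.append(x)
    return before, after


with_marker = split_at_boundary(tokens())
without_marker = split_at_boundary(iter(["a", "b"]))
print(with_marker)
print(without_marker)
```

With an iterator that could "come back to life" after StopIteration, the
second loop in the no-marker case would have no such guarantee.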




From mal@lemburg.com  Sun Jul 14 21:44:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 22:44:26 +0200
Subject: [Python-Dev] python package
References: <3D319681.9530.100DED6A@localhost>
Message-ID: <3D31E2AA.4030804@lemburg.com>

Gordon McMillan wrote:
>>You haven't commented on the sys.modules trick yet.
>>This one doesn't even use the __path__ hackery :-)
>>
>>DateTime.py:
>>import sys
>>import mx.DateTime
>>sys.modules[__name__] = mx.DateTime
> 
> [...]
> 
>>See: it's the same module :-)
> 
> 
> Anytime x != sys.modules[x].__name__,
> someone, sometime will suffer.
> 
> Installer and (I believe) py2exe have hooks
> so that this gets analyzed properly. The hook
> is keyed by "DateTime".
> 
> If you really find it intolerable to stick your
> users with making a one line change in their
> code, you might consider contributing hooks 
> to Installer (or patches to py2exe).

I don't. I'm just using my package series as
example of how moving a set of top-level modules/packages
to a single package can be accomplished. That's
all.

I told my users to upgrade their applications
from 1.x to 2.0 by switching from 'import DateTime'
to 'from mx import DateTime' when I made the move
and indeed, only one user complained -- which is
why I provided him with a backwards compatibility
package along the lines of what I've posted here.
He only needed it to be able to read back pickled
data, BTW.
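
A self-contained sketch of the aliasing trick quoted above, in
present-day Python 3. It does not use the real mx package: the module
name "newpkg.DateTime" and its now() function are placeholders, and the
alias is registered directly rather than from inside a DateTime.py file.

```python
import sys
import types

# Stand-in for the relocated module (hypothetical name and contents).
new_mod = types.ModuleType("newpkg.DateTime")
new_mod.now = lambda: "2002-07-14"
sys.modules["newpkg.DateTime"] = new_mod

# Backwards-compatibility alias: the old top-level name points at the
# very same module object.
sys.modules["DateTime"] = new_mod

import DateTime  # satisfied from sys.modules, no file lookup needed

same_object = DateTime is new_mod
# The caveat raised in the quoted reply: the name used to import the
# module no longer matches the module's own __name__.
name_mismatch = DateTime.__name__ != "DateTime"
print(same_object, name_mismatch)
```

This mismatch is what tools that analyze imports by name have to be
taught about.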

> Particularly for your non-free packages, since
> I'm not going to download those and reverse-engineer 
> them.

Hmm, I don't understand this comment.

> Or perhaps you could do like Pmw, and
> include a "bundle" script.

py2exe works just fine with the mx stuff. I suppose
your installer does too.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Sun Jul 14 22:09:12 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jul 2002 23:09:12 +0200
Subject: [Python-Dev] PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>	<018901c22aa4$e0f41190$ced241d5@hagrid>	<m37kjyohyw.fsf@mira.informatik.hu-berlin.de>	<3D31A50E.4050800@lemburg.com>	<m3it3i1den.fsf@mira.informatik.hu-berlin.de>	<3D31AE95.6070804@lemburg.com> <m3sn2myuls.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D31E878.3020304@lemburg.com>

Martin v. Loewis wrote:
> "M.-A. Lemburg" <mal@lemburg.com> writes:
>>Of course, we no longer need to convert the tokenizer to
>>work on Py_UNICODE, so the updated text should mention
>>that compile() encodes Unicode input to UTF-8 to the continue
>>with the usual processing.
> 
> 
> The PEP currently does not say that.

I know, it should be updated to the solution found by
Hisao.

>>>2. convert to byte string using "utf-8" encoding,
>>
> [...]
> 
>>Option 2. 
> 
> 
> I think this contradicts the current wording of the PEP. It says
> 
> "5. ... and creating string objects from the Unicode literal data by
> first reencoding the UTF-8 data into 8-bit string data using the given
> file encoding"
> 
> The phrasing "the given file encoding" is a bit lax, but given the
> string
> 
> u"""
> # -*- coding: iso-8859-1 -*-
> s = 'some latin-1 text'
> """
> 
> I would expect that the encoding "given" is iso-8859-1, not utf-8.
> Now, I interpret your message to mean that s will be encoded in
> utf-8. Correct?

Hmm, good point. 8-bit string literals will have to be reencoded
using the encoding stated in the coding comment... skipping that
comment for Unicode argument to compile() would break this.

> If so, I think Fredrik is right, and
> 
>   compile(unicode(script, extract_encoding(script)))
> 
> does indeed something different than
> 
>   compile(script)
> 
> as the latter would give the string value assigned to s in its
> original encoding, i.e. latin-1.

Right. We don't want that.

compile(unicode(script, extract_encoding(script)))
should be the same as
compile(script)

>>Ideal would be to have the tokenizer skip the encoding declaration
>>detection and start directly with the UTF-8 string 
> 
> 
> "skip the encoding declaration" can't really work; you have to parse
> the source code line by line. You can tell the implementation to
> ignore the encoding declaration, if desired.

No, this wouldn't be right. I withdraw that comment :-)

>>(this also solves the problems you'd run into in case the Unicode
>>source code has a source code encoding comment).
> 
> 
> Well, that is precisely the issue that I'm trying to address here. I
> still believe that the resulting behaviour is not specified in the PEP
> at the moment (which is no big deal, since the current implementation
> does not touch compile() at all).

I'll try to come up with a proper wording tomorrow.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From gward@python.net  Sun Jul 14 22:26:20 2002
From: gward@python.net (Greg Ward)
Date: Sun, 14 Jul 2002 17:26:20 -0400
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
References: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20020714212620.GA3192@cthulhu.gerg.ca>

On 13 July 2002, Martin v. Loewis said:
> PEP: 11
> Title: Unsupported Platforms

The only feedback I have is to consider changing the name to "Removing
Support for Obsolete Platforms", since that's what most of the PEP is
about.  However, since it also includes a list of those obsolete
platforms, your title is not without merit.

        Greg
-- 
Greg Ward - Linux nerd                                  gward@python.net
http://starship.python.net/~gward/
I love ROCK 'N ROLL!  I memorized all the WORDS to "WIPE-OUT" in 1965!!



From guido@python.org  Sun Jul 14 23:19:36 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jul 2002 18:19:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sun, 14 Jul 2002 16:42:13 EDT."
 <LNBBLJKPBEHFEDALKOLCGEKDAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCGEKDAEAB.tim.one@comcast.net>
Message-ID: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net>

> I'll note one pragmatic concern.  This idiom is becoming mildly popular:
> 
> for x in someiterator:
>     if is_boundary_marker(x):
>         break
>     else:
>         do_something_with(x)
> 
> followed by (in time, not necessarily in a physically distinct loop):
> 
> for x in someiterator:
>     # and we expect this to pick up where the last loop left off
> 
> If StopIteration isn't a sink state, this falls under the "code
> manipulating arbitrary iterators must therefore not rely on any
> particular behavior in this case" warning in the reworded docs.
> That is, if the first loop terminated via iterator exhaustion, the
> obvious intent is that the second loop never enter its body.  This
> is reliably true if and only if StopIteration is guaranteed to be a
> sink state.  The more I ponder that, the more I'm inclined to
> believe that the PEP made the right decision the first time:
> guaranteeing *something* makes it possible to write a larger class
> of generic code.

But if you fall through the end of the first loop, i.e. you exhaust
the iterator prematurely, you should do something else in your logic.
An else clause on the for loop might be a good place to do something
appropriate.

I haven't used this idiom often enough to know whether that places an
undue burden on the programmer.  I think the reported cases fall
mostly in the category "I didn't know it could do that and it took me
a long time to track it down."

I also note that even if the PEP specifies that StopIteration is a
sink state and we fix all built-in iterators to make it so, it's easy
for an iterator implementation to do the wrong thing (especially since
often an extra state bit is necessary to implement the sink-state
property).
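
To illustrate the "extra state bit" remark, here is a small sketch in
present-day Python 3 (where the method is __next__ rather than next).
The class QueueIterator and its refillable list are invented for the
example; they stand for any iterator whose underlying source can grow
after it has run dry.

```python
class QueueIterator:
    """Iterate over a list used as a queue; the list may be refilled later."""

    def __init__(self, source):
        self._source = source
        self._done = False  # the extra state bit

    def __iter__(self):
        return self

    def __next__(self):
        # Once exhausted, stay exhausted, even if the source has grown.
        if self._done or not self._source:
            self._done = True
            raise StopIteration
        return self._source.pop(0)


queue = [1, 2]
it = QueueIterator(queue)
first_pass = list(it)
queue.append(3)          # the source "comes back to life"
second_pass = list(it)   # but the iterator does not
print(first_pass, second_pass)
```

Dropping the _done flag would make the second pass yield 3, which is the
easy-to-miss behaviour described above.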

The question is, should we place the burden on iterator users to avoid
calling next() after the first StopIteration, or should we place the
burden on iterator implementations?  Since by far the most common
iterator use case is still a single for loop, which already does the
right thing, it's not at all clear to me which is worse.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tdelaney@avaya.com  Sun Jul 14 23:46:01 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 15 Jul 2002 08:46:01 +1000
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A421@natasha.auslabs.avaya.com>

> From: David Abrahams [mailto:david.abrahams@rcn.com]
> 
> One other possibility: if x.__iter__() is x, it's a 
> single-pass sequence. I
> realize this involves actually invoking the __iter__ method 
> and conjuring
> up a new iterator, but that's generally a lightweight operation...

Definitely not reliable - it will fail for a file object ... (even with the
changes currently going in).

What would be more reliable (but still not infallible) would be:

if iter(x) == iter(x):
    # this is *definitely* a single-pass iterable

All iterators are by definition single-pass iterables, and with the changes
being made to the file object, the above code would work for all builtin
iterable types as well.

Tim Delaney



From tdelaney@avaya.com  Sun Jul 14 23:51:36 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 15 Jul 2002 08:51:36 +1000
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A422@natasha.auslabs.avaya.com>

> From: Delaney, Timothy [mailto:tdelaney@avaya.com]
> 
> if iter(x) == iter(x):
>     # this is *definitely* a single-pass iterable

Of course, that should have been

if iter(x) is iter(x):
     # this is *definitely* a single-pass iterable

Too much damned Java at the moment :(
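
As a rough sketch of that identity test in present-day Python (the
helper name looks_single_pass is an assumption made for illustration,
not something proposed in this thread):

```python
import io


def looks_single_pass(x):
    """Return True when iter(x) hands back the same object twice.

    Iterators return themselves from __iter__, so both calls give the
    identical object; containers such as lists build a fresh iterator
    on every call.
    """
    return iter(x) is iter(x)


numbers = [1, 2, 3]
list_result = looks_single_pass(numbers)            # container
iter_result = looks_single_pass(iter(numbers))      # iterator over it
file_result = looks_single_pass(io.StringIO("a\n")) # file-like object
```

A True result is strong evidence of a single-pass object.  A False
result proves less, since an object could hand out a new iterator on
each call and still be consumable only once.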

Tim Delaney



From tim.one@comcast.net  Mon Jul 15 01:03:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 14 Jul 2002 20:03:33 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKMAEAB.tim.one@comcast.net>

[Guido]
> But if you fall through the end of the first loop, i.e. you exhaust
> the iterator prematurely, you should do something else in your logic.

I'm not clear on why falling through should be considered premature
termination of the iterator.  If you're looking for a boundary, it may be
normal for it not to be there.  For example, here's something that
suppresses #if 0 blocks, copying everything else to stdout; there's really
nothing special about an input file that doesn't have any #if 0 blocks.

"""
f = file("some_file")
get = iter(f.readline, "")

depth = 0
while True:
    # Copy lines until #if 0.
    for line in get:
        if line == "#if 0\n":
            depth = 1
            break
        else:
            print line,

    # Ignore lines through matching #endif.
    for line in get:
        if line.startswith("#if "):
            depth += 1
        elif line == "#endif\n":
            depth -= 1
            if depth == 0:
                break
    else:
        break

if depth:
    raise SyntaxError("%d unclosed #if blocks" % depth)
"""

This is quite natural -- even elegant.

> An else clause on the for loop might be a good place to do something
> appropriate.

It is, but doing it on more than one of the loops is clutter provided that
StopIteration is sticky (if it is, either loop can detect EOF, and there's
no need for both to).
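
For reference, the same loop structure as a self-contained function in
present-day Python.  This is only a sketch: the name strip_if0 and the
use of io.StringIO as a stand-in for a real file are assumptions.  It
relies on the exhausted two-argument iter() continuing to raise
StopIteration, so only the second loop carries an else clause.

```python
import io


def strip_if0(text):
    """Return text with '#if 0' ... '#endif' blocks removed."""
    get = iter(io.StringIO(text).readline, "")
    kept = []
    depth = 0
    while True:
        # Copy lines until #if 0.
        for line in get:
            if line == "#if 0\n":
                depth = 1
                break
            kept.append(line)

        # Ignore lines through the matching #endif.
        for line in get:
            if line.startswith("#if "):
                depth += 1
            elif line == "#endif\n":
                depth -= 1
                if depth == 0:
                    break
        else:
            # Input ran out, in either loop: stop.
            break

    if depth:
        raise SyntaxError("%d unclosed #if blocks" % depth)
    return "".join(kept)
```

If the first loop runs off the end of the input, the second loop finds
the iterator already exhausted and its else clause ends the while loop.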

> I haven't used this idiom often enough to know whether that places an
> undue burden on the programmer.  I think the reported cases fall
> mostly in the category "I didn't know it could do that and it took me
> a long time to track it down."

If you can't guess what .next() might do after raising StopIteration the
first time, that can't make things easier to track down <wink>.

> I also note that even if the PEP specifies that StopIteration is a
> sink state and we fix all built-in iterators to make it so, it's easy
> for an iterator implementation to do the wrong thing (especially since
> often an extra state bit is necessary to implement the sink state
> property).

I agree, although if the docs are clear about the requirement, it's not
beyond ordinary skill to implement it.
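
A minimal sketch of the "extra state bit", written for present-day
Python (where the method is spelled __next__ rather than next).  Both
class names are hypothetical: Drain stands for an iterator that can come
back to life after stopping, and Sticky is a wrapper that adds the sink
state.

```python
class Drain:
    """Non-sticky iterator: pops items from a shared list."""

    def __init__(self, items):
        self.items = items

    def __iter__(self):
        return self

    def __next__(self):
        if not self.items:
            raise StopIteration
        return self.items.pop(0)


class Sticky:
    """Wrap any iterable so StopIteration becomes a sink state."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False  # the extra state bit

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            self._done = True
            raise
```

Refilling the list revives a bare Drain, while a Sticky(Drain(...))
keeps raising StopIteration once it has stopped.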

> The question is, should we place the burden on iterator users to avoid
> calling next() after the first StopIteration, or should we place the
> burden on iterator implementations?

I don't think that's the real choice.  If it's left undefined by the
protocol, then some iterators will be deliberately defined to "do something
useful" if called after their first StopIteration.  Then the burden isn't on
the user to avoid it, but to keep track of which iterators do and don't "do
something useful" after they said they stopped.

> Since by far the most common iterator use case is still a single for
> loop, which already does the right thing, it's not at all clear to me
> which is worse.

Well, there are more users of iterators than implementers.  Or if there
aren't, we screwed up <wink>.




From greg@cosc.canterbury.ac.nz  Mon Jul 15 01:45:11 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 12:45:11 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020712171626.A2253@hishome.net>
Message-ID: <200207150045.g6F0jBb11587@oma.cosc.canterbury.ac.nz>

Oren Tirosh <oren-py-d@hishome.net>:

> A file object is an iterator pretending to be a container. For historical
> reasons it uses 'readline' instead of 'next'

I think it's more complicated than that. If the file
object were to become an object obeying the iterator
protocol, its next() method should really return the
next *byte* of the file. Then you'd still want methods
like read(), readline() etc. for reading in larger
chunks.
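
The two granularities can be compared using the two-argument iter()
discussed elsewhere in this thread.  A small sketch, assuming
present-day Python and an in-memory io.BytesIO in place of a real file:

```python
import io

data = io.BytesIO(b"ab\ncd\n")

# Byte at a time: call read(1) until it returns the empty sentinel.
by_byte = list(iter(lambda: data.read(1), b""))

data.seek(0)

# Line at a time: call readline() until it returns the empty sentinel.
by_line = list(iter(data.readline, b""))
```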

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 15 02:40:41 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:40:41 +1200 (NZST)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020714110513.GA2280@hishome.net>
Message-ID: <200207150140.g6F1efK11783@oma.cosc.canterbury.ac.nz>

Oren Tirosh <oren-py-d@hishome.net>:

> The hypothetical IteratorExhausted is an error.

Calling it IteratorExhaustedError would make this clearer.

But I'm not sure it would be a good idea to complexify
the iterator protocol any more than absolutely necessary,
and thus place an extra burden on all iterator implementors.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 15 02:51:56 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:51:56 +1200 (NZST)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207140042.g6E0gEp19165@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207150151.g6F1puv11830@oma.cosc.canterbury.ac.nz>

> What about this example?
> >>> l = []
> >>> li = iter(l)
> >>> li.next()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> StopIteration
> >>> l.extend([1, 2, 3])
> >>> li.next()
> 1
> 
> does the list iterator violate the proposed behavior?

Perhaps the docs should say something like "The next()
method raises StopIteration if there are no more items
remaining in the sequence at the time of the call."

This would both imply the repeated raising of
StopIteration in the case where the sequence hasn't
been modified in the meantime, and also allow the
above behaviour (which seems entirely logical, to
my way of thinking).
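
A sketch of an iterator that follows this wording literally, in
present-day Python.  The class name LiveListIter is made up for
illustration; it mimics the behaviour shown in the quoted session and
makes no claim about how the built-in list iterator is implemented.

```python
class LiveListIter:
    """Index-based iterator that re-checks the list on every call."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Decide based on the sequence as it is at the time of the call.
        if self.index >= len(self.seq):
            raise StopIteration
        item = self.seq[self.index]
        self.index += 1
        return item
```

With an unmodified list, repeated calls keep raising StopIteration;
after the list is extended, the same iterator resumes.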

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 15 02:05:10 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:05:10 +1200 (NZST)
Subject: [Python-Dev] python package
In-Reply-To: <200207121842.g6CIgQo13399@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207150105.g6F15Au11634@oma.cosc.canterbury.ac.nz>

Guido:
> [Michael]
> > I've read the entire thread and still do not understand why you are
> > suggesting the new standard package hierarchy should be named
> > "new".
> 
> Uh?  Who is proposing to name it "new"?

Maybe he's getting it mixed up with the thread about
the "new" module?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 15 02:12:41 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:12:41 +1200 (NZST)
Subject: Further suggestion (RE: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEDDAEAB.tim.one@comcast.net>
Message-ID: <200207150112.g6F1Cfv11646@oma.cosc.canterbury.ac.nz>

Tim:
> [Aahz]
> > I've used "%20s" * 5 frequently enough in the past to do crude tables.
> > That's not a feature I'd like to lose.
> 
> So has Guido -- he'll remember that before it's too late <wink>.  Ditto "-"
> to switch string justification.

Addendum to my suggestion: The "{...}" plays the role of
the "s" in a normal string format, so that you can do

  %-20{foo}

etc.
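
One way to try the idea out without touching the interpreter is to
translate the proposed %{foo} spelling, with its optional flags and
width, into the existing %(foo)s form before formatting.  The function
name expand and the regular expression below are assumptions for
illustration only:

```python
import re

# Either a literal "%%", or "%", optional flags/width/precision, "{name}".
_PATTERN = re.compile(r"%%|%([-+ #0-9.]*)\{(\w+)\}")


def _to_classic(match):
    if match.group(0) == "%%":
        return "%%"  # leave escaped percent signs alone
    flags, name = match.group(1), match.group(2)
    return "%(" + name + ")" + flags + "s"


def expand(template, mapping):
    """Format template, treating %{name} as shorthand for %(name)s."""
    return _PATTERN.sub(_to_classic, template) % mapping
```

Under this sketch, expand("[%-6{foo}]", {"foo": "ab"}) formats exactly
as "[%(foo)-6s]" % {"foo": "ab"} would.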

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 15 01:48:44 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 12:48:44 +1200 (NZST)
Subject: Suggestion for fixing %(foo)s (Re: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting)
In-Reply-To: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207150048.g6F0mi111593@oma.cosc.canterbury.ac.nz>

Guido:
> Somebody else:
> > Guido, can you please, for our enlightenment, tell us what are the
> > reasons you feel %(foo)s was a mistake?
> 
> Because of the trailing 's'.  It's very easy to leave it out by
> mistake

How about introducing a new format

  %{foo}

which is defined to be the same as %(foo)s.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 15 02:00:03 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 15 Jul 2002 13:00:03 +1200 (NZST)
Subject: [Python-Dev] PEP 294: Type Names in the types Module
In-Reply-To: <20020712194151.A6406@hishome.net>
Message-ID: <200207150100.g6F103f11614@oma.cosc.canterbury.ac.nz>

Oren Tirosh <oren-py-d@hishome.net>:

> The primary objection was that the documentation for the types module
> says that names exported by future versions will all end in "Type".

Suggestion: Introduce a new module called "newtypes". You can
interpret this name two ways: the module containing all the
new names for the types, and the module you use when you
want to create a new instance of a type!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From barry@zope.com  Mon Jul 15 03:41:16 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 14 Jul 2002 22:41:16 -0400
Subject: [Python-Dev] Termination of two-arg iter()
References: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU>
 <20020714023729.Y79323-100000@mail.allcaps.org>
 <20020714112745.GB2280@hishome.net>
 <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net>
 <15665.37242.446627.141013@anthem.wooz.org>
 <20020714160611.GA25950@hishome.net>
Message-ID: <15666.13900.581653.909094@anthem.wooz.org>

The problem that Jeff Epler brought up (extending the list after
StopIteration was raised, and having a subsequent .next() not raise
StopIteration) has a precedent in dict iterators:

-------------------- snip snip --------------------
>>> d = {1:2, 3:4}
>>> it = iter(d)
>>> for x in d: print x
... 
1
3
>>> d[5] = 6
>>> it.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
RuntimeError: dictionary changed size during iteration
-------------------- snip snip --------------------

So why doesn't that last .next() also raise StopIteration? <wink>.
StopIteration is a sink state for dict iterators if I don't change the
size of the dict.  Shouldn't list and dict iterators behave
similarly for mutation (or at least resizing) between .next() calls?

-Barry



From cce@clarkevans.com  Mon Jul 15 05:06:51 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Mon, 15 Jul 2002 00:06:51 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17T36k-0003xt-00@mail.python.org>; from aleax@aleax.it on Fri, Jul 12, 2002 at 06:16:54PM +0200
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T0ku-0008Ap-00@mail.python.org> <15662.65488.741894.155099@anthem.wooz.org> <E17T36k-0003xt-00@mail.python.org>
Message-ID: <20020715000651.A35319@doublegemini.com>

On Fri, Jul 12, 2002 at 06:16:54PM +0200, Alex Martelli wrote:
| On Friday 12 July 2002 06:12 pm, Barry A. Warsaw wrote:
| > >>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:
| >     >> I think Alex is in a great position to become co-author of PEP
| >     >> 246.
| >
| >     AM> Aye aye, cap'n.  What's the procedure for "becoming co-author"
| >     AM> -- edit python/nondist/peps/pep-0246.txt and send the cvs diff
| >     AM> to Barry, or ... ?
| >
| > That would work fine, although I would like to get /some/
| > acknowledgement from Clark Evans that passing the torch (or sharing
| > the flame as it were) was okay with him.
| 
| Makes sense (& thanks to the others who suggested the same thing).
| I mailed Clark and I'll wait to hear from him.

Wow!  I'm thrilled to hear that this PEP hasn't died of 
neglect.  When I wrote it I was relatively new to Python.
Python makes you think differently about a great many things.
What it means to be a particular "type" is one of those 
mind-bending experiences I had.  If it looks like a file, 
acts like a file, it's a file.   I love this straightforward
mentality, and this clarity of thought, which carries through
all of Python, makes it a true pleasure to code with.

This PEP was written just by listening to people (and getting
their feedback) on the interfaces list.  It just seemed to me
that people wanted a way to ask Python:  

    Hey is this object a Thymagig?

Although this is a nice question to ask; as a programmer with 10+
years of building components, and using others' components to 
build larger applications I often ask a similar but related question:

   Well, if it ain't a Thymagig, where is the
   wrapper so I can treat it like one?

It is on this second question that the Object Adaptation PEP is based.
To me, this is the stuff FAQs are made for ... and I wonder, why
can't the language do this for me?  Why not have a language where
the library writers (who usually know each other) can build in
the glue to connect their components in such a way that the application
builder doesn't have to do the "interface hunt"?
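
To make the "where is the wrapper?" question concrete, here is a small
registry-based sketch in present-day Python.  It is not the mechanism
the PEP specifies; the names register_adapter, adapt and LineSource are
all hypothetical and chosen only for illustration.

```python
_adapters = {}


def register_adapter(source_type, protocol, factory):
    """Record that factory(obj) turns a source_type into a protocol."""
    _adapters[(source_type, protocol)] = factory


def adapt(obj, protocol):
    """Return obj itself, or a registered wrapper that acts like protocol."""
    if isinstance(obj, protocol):
        return obj
    for klass in type(obj).__mro__:
        factory = _adapters.get((klass, protocol))
        if factory is not None:
            return factory(obj)
    raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))


class LineSource:
    """Hypothetical protocol: anything offering lines()."""

    def __init__(self, text):
        self.text = text

    def lines(self):
        return self.text.splitlines()


# The library writer supplies the glue once...
register_adapter(str, LineSource, LineSource)

# ...and the application builder just asks for what is needed.
source = adapt("first\nsecond", LineSource)
```

The application code never has to know which wrapper class bridges str
and LineSource; it only names the protocol it wants.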

Speaking of which, I personally don't feel that interfaces
are the way to go...  there are many reasons why I'm using Python
and not using Java.  Interfaces are too inflexible and often times
can cause more headaches than they save with additional typing.
Frankly, I think that the whole "interface paradigm" brings with
it a lot of extra baggage to the "Is this object a Thymagig?" 
question; and I think this extra baggage is just not needed --
especially for Python.   For example, interface inheritance is
one of those bits of baggage (that others may disagree with me on).
Interface inheritance is one of those "givens" that one must do to
do interfaces right.  Interface inheritance isn't needed.  Why?
Mix-ins are a far more powerful mechanism as they make you think about
operations which are orthogonal to each other.  You think that
interface inheritance helps, but in my experience it just screws
with your thought process... ;)

Anyway, I'm so glad that Alex has taken up the cause; I'm not all 
that actively involved in Python internals... but as a user I can't
advocate more for something like this.  Alex, I'm delighted if 
you would take ownership of the PEP.  

ON A RELATED NOTE, if you have not otherwise found out, YAML
(YAML Ain't Markup Language) is doing wonderfully.  It is a pythonish
serialization format for native data structures of Python/Perl/Ruby/Etc.
You should check it out... http://yaml.org ; we will have a last
call for our working draft on Sept 1st.   YAML is progressing nicely
and feedback from the core Python team would be wonderful.  There is
a pure Python implementation written by Steve Howell and a "C" 
library written by Neil Watkiss with python glue written by yours-truly.
The "C" library is still private for another few weeks, but the pure
Python one is available now as a work-in-progress.   The specification
itself is very near the finish line.


Kind Regards,

Clark
Yo! Try YAML on fer size.  YAML is
serialization for the masses.

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From tim.one@comcast.net  Mon Jul 15 05:22:19 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 00:22:19 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <15666.13900.581653.909094@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELEAEAB.tim.one@comcast.net>

[Barry]
> The problem that Jeff Epler brought up (extending the list after
> StopIteration was raised, and having a subsequent .next() not raise
> StopIteration) has a precedent in dict iterators:
>
> -------------------- snip snip --------------------
> >>> d = {1:2, 3:4}
> >>> it = iter(d)
> >>> for x in d: print x
> ...
> 1
> 3
> >>> d[5] = 6
> >>> it.next()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> RuntimeError: dictionary changed size during iteration
> -------------------- snip snip --------------------
>
> So why doesn't that last .next() also raise StopIteration? <wink>.

According to the PEP as it exists, it should.

> StopIteration is a sink state for dict iterators if I don't change the
> size of the dict.

That's an illusion <wink>.  See below for why.

> Shouldn't list and dict iterators behave similarly for
> mutation (or at least resizing) between .next() calls?

Within a single for-loop, list iterators are constrained to be compatible
with their previous implementation via the __getitem__ protocol.  So, for
example, this must work:

>>> x = [1]
>>> for y in x:
...     print y
...     x.append(y+1)
...     if y == 5:
...         break
...
1
2
3
4
5
>>>

because that's the way "for elt in list" has always worked.  Nothing about
that violates what the PEP says, though (in particular, StopIteration isn't
an issue there, as it's never raised).  It's too difficult to do something
similar for dict iterators, and that's why they raise an exception if they
detect a size change.  However, they *really* want to raise an exception if
the dict mutates, but that's also too hard to do -- checking for a size
change is a cheap & easy but vulnerable approximation.  Ponder the output
from this on a run, and then across several runs:

from random import random

for limit in range(1, 100):
    d = {}
    for i in range(limit):
        d[random()] = 1

    i = d.iterkeys()
    x = list(i)  # exhausts the iterator
    d.popitem()
    d[random()] = 1  # probably mutated, but # of elements is the same
    try:
        print i.next()  # tries poking the iterator again
        print limit, list(i)
    except StopIteration:
        pass

You'll find that this *usually* raises StopIteration on the lone i.next()
call (and you don't get output in those cases).  However, for *some* dict
sizes, it's quite likely that poking the iterator again not only produces
another value, but that it can produce several more values.  There's no
predicting how many, which or when, though.

It's a bit of a stretch to call that "a feature" too <wink>.
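
For contrast, a sketch of what detecting *any* mutation could look like
if each dict carried a mutation counter.  This is purely illustrative
present-day Python: VersionedDict and strict_keys are hypothetical
names, only item assignment and deletion are counted (update(), pop()
and friends are ignored), and it iterates over a snapshot of the keys
rather than the live table.

```python
class VersionedDict(dict):
    """Hypothetical dict that counts mutations made via [] and del."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1

    def strict_keys(self):
        """Yield keys, complaining if the dict was mutated meanwhile."""
        seen = self.version
        for key in list(self):  # snapshot of the keys
            yield key
            if self.version != seen:
                raise RuntimeError("dictionary mutated during iteration")
```

A delete followed by an insert leaves the size unchanged, so a size
check alone would miss it, but the counter does not.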




From tim.one@comcast.net  Mon Jul 15 05:54:17 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 00:54:17 -0400
Subject: [Python-Dev] AtExit Functions
In-Reply-To: <3D2AFA97.2030402@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGELGAEAB.tim.one@comcast.net>

[Guido]
>> I think you may be making a wrong use of Py_AtExit().  The docs state
>> (since 1998):
>>
>>   Since Python's internal finalization will have completed before the
>>   cleanup function, no Python APIs should be called by *func*.

[M.-A. Lemburg]
> Hmm, and that includes Py_DECREF() and PyObject_Del() ?

Certainly.  In particular, Py_DECREF() can end up calling any Python code at
all, via __del__ methods.

> In that case, I have a problem since I'm using those
> two to clean up caches and free lists in the mx tools.

We have two sets of exit-function gimmicks, one that runs at the very start
of Py_Finalize(), and the other at the very end.  If you need to clean up
Python objects, you have to get into the first group.  The interpreter has
been torn down beyond usefulness by the time we get to the second group
(that's only useful for low-level OS and external non-Python C library
cleanup).

>> You may want to use the atexit.py module instead to schedule your
>> module's cleanup action; these exit functions are called much earlier.

> That's difficult to get right since I have to register such a
> function from C.

? You know how to write Python-callable C functions.  I'm not sure why you
would need to call atexit.register from C, but if you must then that's easy
too (PyObject_Call).

> Also, atexit.py is not present in Python 1.5.2.

What's that <wink>?




From oren-py-d@hishome.net  Mon Jul 15 06:27:09 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 15 Jul 2002 01:27:09 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207150140.g6F1efK11783@oma.cosc.canterbury.ac.nz>
References: <20020714110513.GA2280@hishome.net> <200207150140.g6F1efK11783@oma.cosc.canterbury.ac.nz>
Message-ID: <20020715052709.GA13426@hishome.net>

On Mon, Jul 15, 2002 at 01:40:41PM +1200, Greg Ewing wrote:
> Oren Tirosh <oren-py-d@hishome.net>:
> 
> > The hypothetical IteratorExhausted is an error.
> 
> Calling it IteratorExhaustedError would make this clearer.
> 
> But I'm not sure it would be a good idea to complexify
> the iterator protocol any more than absolutely necessary,
> and thus place an extra burden on all iterator implementors.

Let's look at the options:  (are there any I forgot?)

1. Define StopIteration as a sticky state.  People will write code that 
relies on this behavior. The code will sometimes fail when run on 2.2.x or
with certain existing user iterators. It's probably the worst possible 
combination: you have to implement this in your iterators but you can't 
rely on it in code that may run on 2.2 or get iterators from libraries
written before this was made into a requirement.

2. Leave things the way they are.  Since *almost* all builtin iterators 
behave this way people will continue to write code that relies on this.  
It will silently fail for some builtin iterators and user iterators.

3. Silently fix all iterators to be in a StopIteration sink state.  Even
worse than #2. It looks like version 2.2 is going to live a long time.
This will cause subtle and hard-to-find differences in behavior between 2.2
and 2.3.

4. Require iterators to raise an exception.  Places an extra burden on all
iterator implementors.  A lot of existing code will suddenly be redefined
as not kosher.

5. Leave it officially undefined but raise an exception for all or even some
builtin iterators. Raising an exception for even one popular type (listiter) 
would be more than enough to discourage code that relies on this behavior.
No extra burden is placed on iterator implementers. No change to iterator
protocol definition. No existing code is suddenly non-conforming. A small
amount of code may break but at least it will raise a meaningful exception.
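
A sketch of what option 5 would feel like to a user, as a wrapper in
present-day Python.  The exception name follows the
IteratorExhaustedError suggestion made earlier in this thread; the
Strict class is a hypothetical illustration, not a proposed patch.

```python
class IteratorExhaustedError(Exception):
    """Raised when next() is called again after StopIteration."""


class Strict:
    """Wrap an iterable so poking it after the end is a loud error."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise IteratorExhaustedError("next() called after StopIteration")
        try:
            return next(self._it)
        except StopIteration:
            self._done = True
            raise
```

A single for loop never notices the difference; only code that calls
next() again after the first StopIteration sees the new exception.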

  silent-errors-delenda-est-ly yours,

	Oren




From aleax@aleax.it  Mon Jul 15 07:15:11 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 08:15:11 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020715000651.A35319@doublegemini.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T36k-0003xt-00@mail.python.org> <20020715000651.A35319@doublegemini.com>
Message-ID: <E17Tz95-0003Cz-00@mail.python.org>

On Monday 15 July 2002 06:06 am, Clark C . Evans wrote:
	...
> Although this is a nice question to ask; as a programmer with 10+
> years of building components, and using others' components to
> build larger applications I often ask a similar but related question:
>
>    Well, if it ain't a Thymagig, where is the
>    wrapper so I can treat it like one?
>
> It is on this second question that the Object Adaptation PEP is based.

Right.  The concept of adaptation definitely comes from the Components
world, where "is-A" is a dirty word.  High time the Objects world listened:-).


> Speaking of which, I personally don't feel that interfaces
> are the way to go...  there are many reasons why I'm using Python
> and not using Java.  Interfaces are too inflexible and often times
> can cause more headaches than they save with additional typing.

A suitably Pythonic approach to interfaces won't cause any more
headaches than, say, booleans (check Google for the firestorm that
the Booleans PEP caused...:-).  We do need the CONCEPT of an
interface, just as we need the concept of true and false values.

Whether reifying the concept into a type or language-blessed
category buys you more than it costs -- much depends on how it's
done.  Personally, I'd rather have that superset of "interface" that
is known in Haskell as a "typeclass" -- and I'd settle for that middle
ground that is known in C++ or Java as an "abstract class".  If
interfaces/typeclasses/&c came with Eiffelish 'contracts', so much
the better.  But it's all in the narrow range between -0 and +0 for me.

> especially for Python.   For example, interface inheritance is
> one of those bits of baggage (that others may disagree with me on).
> Interface inheritance is one of those "givens" that one must do to

I'm not sure what you mean by "interface inheritance".  The ability
to define an interface by adding some stuff to another interface --
that's the only sense in which one could possibly speak of such
a thing in Java, say -- is extremely convenient but not a must-do...
COM manages without, not conveniently but OK all the same.  I
suspect you mean something quite different.

> do interfaces right.  Interface inheritance isn't needed.  Why?
> Mix-ins are a far more powerful mechanism as they make you think about

Mix-in inheritance is basically about _implementation_.  Just like
most inheritance in Python most of the time -- issubclass, isinstance
and exception handling being the exceptions to the 'most'.  Nobody's
planning to take it away from Python, anyway.

> operations which are orthogonal to each other.  You think that
> interface inheritance helps, but in my experience it just screws
> with your thought process... ;)

I've had no particular trouble designing interfaces either in Java,
with interfaces able to inherit from each other, or in COM, without
such convenience.  In the latter case one ends up with a bit of
copy and paste, not pleasant, but WTH.  I see it mostly as a bug
in COM's MIDL and supporting tools/wizards -- they let you ask for
interface inheritance even though the underlying object model does
not support it (by doing the copy-and-paste implicitly) but then don't
go all the way (e.g., they'll freely and erroneously reuse in the
inheriting interface some method dispatch-IDs that the interface
inherited-from has already assigned -- eccch).


The big question is rather: given that Isub inherits from Isuper,
does any object implementing Isub also implicitly implement Isuper?

That's the object-philosophy, where inheritance is thought to reflect
deep IS-A relations.  It's NOT the component-philosophy, where
inheritance is an implementation-convenience detail.  I much prefer
the component-approach, where my component has full control, if
it wants to, on exactly what interfaces it exposes.  This lets you
factor out any commonality between interfaces without giving any
actual reality to the factored-out common subset -- it can even stay
an "abstract interface" that no object actually supplies, if you want.

I do think it's a respectable thesis, though I don't agree with it, that
the OO rather than component approach to inheritance is, while less 
flexible for advanced uses, easier to use -- made simpler by conflating
different concerns (how is this interface exactly -- what set of interfaces
can I get from this object) into one powerful general concept.  I think
it's harder when you have to learn that said "one powerful general
concept" has several rather separate uses, and the ability to have some
specific uses fall in the gray zone between the typical use cases does
not exactly help learning and understanding, either.  But I do think that
quite a reasonable debate could be held about this.


> Anyway, I'm so glad that Alex has taken up the cause; I'm not all
> that actively involved in Python internals... but as a user I can't
> advocate more for something like this.  Alex, I'm delighted if
> you would take ownership of the PEP.

OK, thanks!


Alex



From tim.one@comcast.net  Mon Jul 15 07:18:32 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 02:18:32 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020715052709.GA13426@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net>

[Oren Tirosh]
> Let's look at the options:  (are there any I forgot?)

We can always pretend the issue was never raised <0.9 wink>.

> 1. Define StopIteration as a sticky state.  People will write code that
> relies on this behavior.

You say that like it's a bad thing.

> The code will sometimes fail when run on 2.2.x or with certain existing
> user iterators.

But it's unlikely.  Most people stick to "for x in object:".  Those who
don't and rely on anything other than StopIteration being sticky are relying
on things explicitly documented as not kosher.  If we did decide to enforce
the PEP, everything in the core that doesn't follow it is "a bug", and fixes
would get backported to the 2.2 line.

> It's probably the worst possible combination: you have to implement this
> in your iterators but you can't rely on it in code that may run on 2.2 or
> get iterators from libraries written before this was made into a
> requirement.

But it's already a requirement, according to the PEP.  Regardless of what
the PEP says or what things do, the safest course for users is not to
provoke the issue, i.e. never to write code that pokes an iterator after it
raises StopIteration.  All these choices become irrelevant to code doing so.
Your code may be an exception, but I'm sure the vast bulk of 2.2 code
already plays that way; for example, I doubt there's any code in the std
distribution that cares.

> 2. Leave things the way they are.  Since *almost* all builtin iterators
> behave this way people will continue to write code that relies on this.

It appears that more than half of the builtin iterators don't arrange to
make StopIteration sticky (sequence iterators and three flavors of dict
iterators and two-argument iter() iterators definitely do not; generator
iterators definitely do; Zope3 BTree iterators definitely do, but they're
not part of the Python core; the meta-rule here is that an iterator follows
the PEP if and only if I wrote it <wink>).

> It will silently fail for some builtin iterators and user iterators.

I'm not sure what "fail" means here.

> 3. Silently fix all iterators to be in a StopIteration sink state.  Even
> worse than #2. It looks like version 2.2 is going to live a long time.
> This will cause subtle and hard-to-find differences in behavior
> between 2.2 and 2.3.

We actively backport bugfixes to the 2.2 line.

> 4. Require iterators to raise an exception.  Places an extra burden on all
> iterator implementors.  A lot of existing code will suddenly be redefined
> as not kosher.

Raising any exception other than StopIteration is going to be a very hard
sell.

> 5. Leave it officially undefined but raise an exception for all
> or even some builtin iterators. Raising an exception for even one
> popular type (listiter)  would be more than enough to discourage
> code that relies on this behavior.

But not to stop it, and then users can't predict what will happen.

> No extra burden is placed on iterator implementers.

Didn't you just propose raising exceptions in "all or even some" builtin
iterators?  They weren't implemented by elves <wink>.

> No change to iterator protocol definition.

The only way to achieve that is your #1:  the current definition *is* sticky
state, albeit honored mostly in the breach.  If Guido doesn't want that now,
the definition has to change.

> No existing code is suddenly non-conforming.

Any existing code that relies on, or supplies, anything other than sticky
state is non-conforming right now.  You could turn that into "an advantage"
by flipping the claim to:

    Some non-conforming existing code would suddenly become officially
    blessed.
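
A minimal sketch of what "sticky" means here, written in present-day
Python 3 syntax (__next__ rather than the 2.2 next() method).  The names
QueueIterator and StickyIterator are hypothetical; neither comes from the
PEP or from the core.  QueueIterator is deliberately non-sticky: it starts
yielding again if its backing list is refilled after it has raised
StopIteration.  StickyIterator wraps any iterator and remembers that it
has finished.

```python
class QueueIterator:
    """Non-sticky: resumes if the backing list is refilled after exhaustion."""

    def __init__(self, items):
        self.items = items

    def __iter__(self):
        return self

    def __next__(self):
        if not self.items:
            raise StopIteration
        return self.items.pop(0)


class StickyIterator:
    """Wrapper that keeps raising StopIteration once the inner iterator has."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        try:
            return next(self._it)
        except StopIteration:
            # Remember the exhausted state so later calls cannot resume.
            self._done = True
            raise


backing = [1]
sticky = StickyIterator(QueueIterator(backing))
seen = list(sticky)        # drains the single item and hits StopIteration
backing.append(2)          # refill the source after exhaustion
after = next(sticky, "exhausted")
```

Code that never pokes an iterator after StopIteration behaves the same
with or without the wrapper, which is the "safest course" described above.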




From martin@v.loewis.de  Mon Jul 15 08:04:54 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jul 2002 09:04:54 +0200
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To: <20020714212620.GA3192@cthulhu.gerg.ca>
References: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
 <20020714212620.GA3192@cthulhu.gerg.ca>
Message-ID: <m3u1n1ii95.fsf@mira.informatik.hu-berlin.de>

Greg Ward <gward@python.net> writes:

> The only feedback I have is consider changing the name to "Removing
> Support for Obsolete Platforms", since that's what most of the PEP is
> about.  However, since it also includes a list of those obsolete
> platforms, your title is not without merit.

I deliberately did not choose the word "obsolete platform", since this
PEP does not judge the obsoleteness of the platform: we do not
recommend using other platforms instead, and so forth. Instead, all
this PEP says is that we won't support Python anymore on those platforms,
as we believe that nobody is interested in Python on those systems
(for whatever reasons - mostly because the platform itself is dead).

Regards,
Martin




From fredrik@pythonware.com  Mon Jul 15 08:37:21 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 15 Jul 2002 09:37:21 +0200
Subject: [Python-Dev] PEP 11: unsupported platforms
References: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de><20020714212620.GA3192@cthulhu.gerg.ca> <m3u1n1ii95.fsf@mira.informatik.hu-berlin.de>
Message-ID: <00b501c22bd2$77fbea80$ced241d5@hagrid>

martin wrote:

> I deliberately did not choose the word "obsolete platform", since this
> PEP does not judge the obsoleteness of the platform: we do not
> recommend using other platforms instead, and so forth. Instead, all
> this PEP says is that we won't support Python anymore on those platforms,

well, the title "unsupported platforms" sort of implies that
if my favourite oddball platform is not mentioned in there,
it is supported.

wouldn't something like "no longer supported platforms" or
"removing support for little used platforms" be more accurate?

</F>




From mal@lemburg.com  Mon Jul 15 08:52:55 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jul 2002 09:52:55 +0200
Subject: [Python-Dev] AtExit Functions
References: <LNBBLJKPBEHFEDALKOLCGELGAEAB.tim.one@comcast.net>
Message-ID: <3D327F57.4040705@lemburg.com>

Tim Peters wrote:
> [Guido]
> 
>>>I think you may be making a wrong use of Py_AtExit().  The docs state
>>>(since 1998):
>>>
>>>  Since Python's internal finalization will have completed before the
>>>  cleanup function, no Python APIs should be called by *func*.
>>
> 
> [Marc]
> 
>>Hmm, and that includes Py_DECREF() and PyObject_Del() ?
> 
> 
> Certainly.  In particular, Py_DECREF() can end up calling any Python code at
> all, via __del__ methods.

PyObject_Del() as well ?

>>In that case, I have a problem since I'm using those
>>two to clean up caches and free lists in the mx tools.
> 
> 
> We have two sets of exit-function gimmicks, one that runs at the very start
> of Py_Finalize(), and the other at the very end.  If you need to clean up
> Python objects, you have to get into the first group.  The interpreter has
> been torn down beyond usefulness by the time we get to the second group
> (that's only useful for low-level OS and external non-Python C library
> cleanup).

I suppose the first one is what the atexit module exposes
in Python 2.0+, right ?

The problem with that approach is that there may still be some
references to objects left in lists and dicts which are cleaned
up after having called the atexit functions. This is not so
much a problem in my cases, but something to watch out for in other
applications which use C level Python objects as globals.

>>>You may want to use the atexit.py module instead to schedule your
>>>module's cleanup action; these exit functions are called much earlier.
>>
> 
>>That's difficult to get right since I have to register such a
>>function from C.
> 
> 
> ? You know how to write Python-callable C functions.  I'm not sure why you
> would need to call atexit.register from C, but if you must then that's easy
> too (PyObject_Call).

Well, yeah :-)

>>Also, atexit.py is not present in Python 1.5.2.
> 
> 
> What's that <wink>?

That's the Python version which was brand new just 3 years
ago. I know... in US terms that's for history books ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mhammond@skippinet.com.au  Mon Jul 15 09:37:05 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Mon, 15 Jul 2002 18:37:05 +1000
Subject: [Python-Dev] threads, SIGINT and time.sleep()
Message-ID: <LCEPIIGDJPKCOIHOBJEPAEMAFPAA.mhammond@skippinet.com.au>

Tim and I have been thrashing around in http://python.org/sf/581232 trying
to make time.sleep() interruptible for Windows.  This turns out to be quite
simple, but has unearthed some questions about thread interactions, and
seems to have changed semantics on Linux.

While I understand that the docs make almost no guarantees WRT threads and
signals, I am wondering what the "most desirable" semantics would be
assuming the platform supports it.

Consider a Python program with a main thread + 2 extra threads.  The 2 extra
threads are both in time.sleep().  When Ctrl+C is pressed, the docs seem to
clearly state that only the main thread should see a KeyboardInterrupt.  My
question is: what should happen to the time.sleep() threads?

It seems that Python 1.5.2 on Linux (as supplied by RedHat) would interrupt
the 2 threads with IOError(EINTR).  CVS Python currently seems to not
interrupt the threads at all, allowing the sleep() to continue the full
period.  (A time.sleep() in the main thread *is* interrupted in both
versions)

For Windows I can do either.  However, the Python 1.5.2 semantics seem to
make the most sense to me.  Was this change in behaviour post 1.5
intentional?  The code does not imply the new behaviour is intended (but the
code doesn't imply much at all!)

Test code and results below.  All clues welcomed!

Thanks,

Mark.

Test code:
----------
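import time, threading

# Start two worker threads that each block in time.sleep(30).
threads = []
for i in range(2):
    t = threading.Thread(target=time.sleep, args=(30,))
    t.start()
    threads.append(t)
# The main thread waits here; press Ctrl+C while the workers sleep.
for t in threads:
    t.join()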

Python 1.5.2 on Linux:
----------------------
Exception in thread Thread-1:
Traceback (innermost last):
...
  File "/usr/lib/python1.5/site-packages/threading.py", line 364, in run
    apply(self.__target, self.__args, self.__kwargs)
IOError: [Errno 4] Interrupted system call

Exception in thread Thread-2:
Traceback (innermost last):
...
  File "/usr/lib/python1.5/site-packages/threading.py", line 364, in run
    apply(self.__target, self.__args, self.__kwargs)
IOError: [Errno 4] Interrupted system call

Traceback (innermost last):
...
  File "/usr/lib/python1.5/threading.py", line 189, in wait
    waiter.acquire()
KeyboardInterrupt

Current CVS on Linux:
---------------------
[Pressing Ctrl+C has no effect - sleep() period expires, then...]
Traceback (most recent call last):
...
  File "/home/skip/src/python/dist/src/Lib/threading.py", line 190, in wait
    waiter.acquire()
KeyboardInterrupt




From mwh@python.net  Mon Jul 15 10:36:12 2002
From: mwh@python.net (Michael Hudson)
Date: 15 Jul 2002 10:36:12 +0100
Subject: [Python-Dev] Python version of PySlice_GetIndicesEx
In-Reply-To: Guido van Rossum's message of "Fri, 12 Jul 2002 14:38:31 -0400"
References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> <20020712212105.A8666@hishome.net> <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2m8z4ds583.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> (I changed the subject)
> 
> > When I was going through the sources of sliceobject.c I found the function 
> > PySlice_GetIndicesEx.  It performs the magic of trimming a slice into the 
> > range of indices of a sequence, including negative indices and intervals 
> > with None as start or stop value.  A comment in this function says:
> > 
> >  /* this is harder to get right than you might think */

And it is.

> > Wouldn't it be a good idea to expose this nontrivial functionality to 
> > Python code as a method of slice objects?
> 
> I dunno.  It seems that most code that actually uses slices is written
> in C anyway.
> 
> > The method would take an integer argument (length) and return an
> > xrange object.
> 
> Why an xrange object?  That's not inspectable.  *If* we were to do
> this (which I doubt) it should return a tuple of three ints.

Yes.

> > It should make it much 
> > easier to implement user types that support extended slicing:
> > 
> >     def __getitem__(self, index):
> >         if isinstance(index, slice):
> >             return [get_item_at(i) for i in index.trim(len(self))]
> >         else:
> >             return get_item_at(index)
> > 
> > Suggestions for a better name than trim?
> 
> getindices()

When I was debugging this function, I wrote a method called indices().
Actually, I think I'm probably in favour of adding this method, if
only to make writing clearer test cases easier.
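
A sketch of the __getitem__ idea from the quoted text, using the
indices() method as it exists on slice objects in current Python (it
takes a length and returns a (start, stop, step) tuple of ints).  At the
time of this message the method was only a proposal.  The Squares class
and its _item() helper are hypothetical, made up for the illustration.

```python
class Squares:
    """Hypothetical read-only sequence whose item i is i * i."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def _item(self, i):
        # Plain integer indexing, with negative indices counted from the end.
        if i < 0:
            i += self.n
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i

    def __getitem__(self, index):
        if isinstance(index, slice):
            # indices() clips start/stop to the sequence length and fills in
            # None values, so range() can be driven directly from the result.
            start, stop, step = index.indices(len(self))
            return [self._item(i) for i in range(start, stop, step)]
        return self._item(index)


sq = Squares(10)
forward = sq[2:5]
tail = sq[-3:]
clipped = sq[5:100]
backward = sq[::-3]
```

Returning a plain tuple keeps the result inspectable, which is the
objection raised against returning an xrange object.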

[...]

[Tim]
> Just to be helpfully irritating, I'll note that Zope's C
> implementation of slice index normalization for BTreeItems objects
> was off in nearly every way possible, until a few weeks ago.  It
> really is difficult to get this right.

No kidding.

Cheers,
M.

-- 
  I think perhaps we should have electoral collages and construct
  our representatives entirely of little bits of cloth and papier 
  mache.
                  -- Owen Dunn, ucam.chat, from his review of the year



From mwh@python.net  Mon Jul 15 11:03:33 2002
From: mwh@python.net (Michael Hudson)
Date: 15 Jul 2002 11:03:33 +0100
Subject: [Python-Dev] threads, SIGINT and time.sleep()
In-Reply-To: "Mark Hammond"'s message of "Mon, 15 Jul 2002 18:37:05 +1000"
References: <LCEPIIGDJPKCOIHOBJEPAEMAFPAA.mhammond@skippinet.com.au>
Message-ID: <2m3culs3yi.fsf@starship.python.net>

"Mark Hammond" <mhammond@skippinet.com.au> writes:

> Tim and I have been thrashing around in http://python.org/sf/581232 trying
> to make time.sleep() interruptible for Windows.  This turns out to be quite
> simple, but has unearthed some questions about thread interactions, and
> seems to have changed semantics on Linux.
> 
> While I understand that the docs make almost no guarantees WRT threads and
> signals, 

Don't go there.

> I am wondering what the "most desirable" semantics would be assuming
> the platform supports it.
> 
> Consider a Python program with a main thread + 2 extra threads.  The 2 extra
> threads are both in time.sleep().  When Ctrl+C is pressed, the docs seem to
> clearly state that only the main thread should see a KeyboardInterrupt.  My
> question is: what should happen to the time.sleep() threads?
> 
> It seems that Python 1.5.2 on Linux (as supplied by RedHat) would interrupt
> the 2 threads with IOError(EINTR).  CVS Python currently seems to not
> interrupt the threads at all, allowing the sleep() to continue the full
> period.  (A time.sleep() in the main thread *is* interrupted in both
> versions)

Are you saying that your patch changes behaviour, or that behaviour
changed somewhere between 1.5.2 and current CVS?  Or between 2.2 and
current CVS?

These lines:

        /* Mask all signals in the current thread before creating the new
         * thread.  This causes the new thread to start with all signals
         * blocked.
         */
        sigfillset(&newmask);
        SET_THREAD_SIGMASK(SIG_BLOCK, &newmask, &oldmask);

might have something to do with it.  Does anyone know where they come from?

> For Windows I can do either.  However, the Python 1.5.2 semantics seem to
> make the most sense to me.  Was this change in behaviour post 1.5
> intentional?  The code does not imply the new behaviour is intended (but the
> code doesn't imply much at all!)

I'd expect the 1.5.2 semantics, but ...

> Test code and results below.  All clues welcomed!

Well, the behaviour is probably (pairwise) different on FreeBSD,
Darwin and Solaris.

"Cheers",
M.

-- 
  Gullible editorial staff continues to post links to any and all
  articles that vaguely criticize Linux in any way.
         -- Reason #4 for quitting slashdot today, from
            http://www.cs.washington.edu/homes/klee/misc/slashdot.html



From cce@clarkevans.com  Mon Jul 15 12:24:00 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Mon, 15 Jul 2002 07:24:00 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17Tz95-0003Cz-00@mail.python.org>; from aleax@aleax.it on Mon, Jul 15, 2002 at 08:15:11AM +0200
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T36k-0003xt-00@mail.python.org> <20020715000651.A35319@doublegemini.com> <E17Tz95-0003Cz-00@mail.python.org>
Message-ID: <20020715072400.B40101@doublegemini.com>

On Mon, Jul 15, 2002 at 08:15:11AM +0200, Alex Martelli wrote:
| >
| >    Well, if it ain't a Thymagig, where is the
| >    wrapper so I can treat it like one?
| >
| 
| Right.  The concept of adaptation definitely comes from the Components
| world, where "is-A" is a dirty word.  High time the Objects world 
| listened:-).

*grins*

| Whether reifying the concept into a type or language-blessed
| category buys you more than it costs -- much depends on how it's
| done.  Personally, I'd rather have that superset of "interface" that
| is known in Haskell as a "typeclass" -- and I'd settle for that middle
| ground that is known in C++ or Java as an "abstract class".  If
| interfaces/typeclasses/&c came with Eiffelish 'contracts', so much
| the better.  But it's all in the narrow range between -0 and +0 for me.

I could see where something like Eiffel's "contracts" would be a
neat addition to Python, but I must confess I don't have any serious
experience with them. 

| I'm not sure what you mean by "interface inheritance".  The ability
| to define an interface by adding some stuff to another interface --
| that's the only sense in which one could possibly speak of such
| a thing in Java, say -- is extremely convenient but not a must-do...

My exposition was awful, sorry.   Perhaps a specific (albeit contrived)
example would better reflect the intent.   Suppose that I have an iterator
with one method, next().   Now suppose that I want a "mutable iterator",
one which adds the change() method.   This is well and good, but the
concept of something being mutable is quite orthogonal to iteration and
perhaps should have its own interface rather than using inheritance.
So, what I'm asserting is that once the inheritance "feature" is
there... people use it even though other approaches are available.

Once headed down this path, people start defining these massively
ugly interfaces (see XML's DOM) where "lazy" implementations throw
a NotImplemented error if the object doesn't support particular
methods of the interface.   Yikes.  The other thing that interface
inheritance implies is substitutability; yet in practice this isn't
always practical (or even needed).

But, alas we are digressing here.  The point of the PEP wasn't to
confront interfaces; as people who believe in them won't be swayed.
But frankly... I love python without interfaces.   

| The big question is rather: given that Isub inherits from Isuper,
| does any object implementing Isub also implicitly implement Isuper?
| 
| That's the object-philosophy, where inheritance is thought to reflect
| deep IS-A relations.  It's NOT the component-philosophy, where
| inheritance is an implementation-convenience detail.  I much prefer
| the component-approach, where my component has full control, if
| it wants to, on exactly what interfaces it exposes.  This lets you
| factor out any commonality between interfaces without giving any
| actual reality to the factored-out common subset -- it can even stay
| an "abstract interface" that no object actually supplies, if you want.

Yes, this is the bigger question.  And I'd rather see python swing
more towards the component/delegation model; it isn't really strictly
object oriented as it is, and I'd hate to see it become that way.

Best,

Clark
Yo! Check out YAML!  http://yaml.org
Serialization for the masses



From skip@pobox.com  Mon Jul 15 13:43:37 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 15 Jul 2002 07:43:37 -0500
Subject: [Python-Dev] AtExit Functions
In-Reply-To: <3D327F57.4040705@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGELGAEAB.tim.one@comcast.net>
 <3D327F57.4040705@lemburg.com>
Message-ID: <15666.50041.362691.287914@localhost.localdomain>

    mal> I suppose the first one is what the atexit module exposes in Python
    mal> 2.0+, right ?

Not really.  The atexit module is just a wrapper around sys.exitfunc which
provides a standard protocol for registering more than one function to be
called at exit.  You should be able to easily backport it to 1.5.2 and
deliver it with your package for installation on systems still running
1.5.2.  Or, just deal directly with sys.exitfunc.  Before 2.0 there was no
rational way to use sys.exitfunc.  The application, libraries, and core code
had no rules about who could or couldn't set sys.exitfunc.
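
To make the "standard protocol" concrete, here is a rough sketch of the
idea: one shared list of callbacks, run in reverse order of registration
when the process winds down.  This is not the atexit source; the
ExitRegistry class and its run() method are hypothetical stand-ins, and
the sketch is written for present-day Python, where sys.exitfunc no
longer exists.

```python
class ExitRegistry:
    """Hypothetical stand-in for a shared exit-handler list."""

    def __init__(self):
        self._handlers = []

    def register(self, func, *args, **kwargs):
        # Anyone may add a handler; nobody overwrites anybody else's.
        self._handlers.append((func, args, kwargs))

    def run(self):
        # Last registered, first called, so later set-up is torn down first.
        while self._handlers:
            func, args, kwargs = self._handlers.pop()
            func(*args, **kwargs)


log = []
registry = ExitRegistry()
registry.register(log.append, "release caches")
registry.register(log.append, "release free lists")
registry.run()
```

With the real module the registration call would be
atexit.register(func, *args), and the interpreter takes care of running
the handlers at exit.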

Skip



From mhammond@skippinet.com.au  Mon Jul 15 14:03:23 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Mon, 15 Jul 2002 23:03:23 +1000
Subject: [Python-Dev] threads, SIGINT and time.sleep()
In-Reply-To: <2m3culs3yi.fsf@starship.python.net>
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEMDFPAA.mhammond@skippinet.com.au>

[Michael]
> > While I understand that the docs make almost no guarantees WRT
> > threads and signals,
>
> Don't go there.

Didn't mean to <wink>

> Are you saying that your patch changes behaviour, or that behaviour
> changed somewhere between 1.5.2 and current CVS?  Or between 2.2 and
> current CVS?

My patch is for Windows only.  While examining my Linux builds to seek out
the most desirable behaviour, I stumbled across the difference between my
Linux 1.5.2 and CVS builds.  I have no other Linux builds to try, but if
someone has a few versions handy it would be interesting to know exactly
where it changes (or indeed if others can even repro this behaviour).

It appears that cygwin on Windows aborts the 2 threads without error - ie,
sleep() silently returns early.

> I'd expect the 1.5.2 semantics, but ...
>
> > Test code and results below.  All clues welcomed!
>
> Well, the behaviour is probably (pairwise) different on FreeBSD,
> Darwin and Solaris.

Yeah, I appreciate that there will always be platform differences - but I
still wouldn't mind knowing a "most desirable" behaviour should the platform
support it and anyone be bothered - if for no better reason than for me to
know what behaviour to check in for Windows!

Thanks,

Mark.




From guido@python.org  Mon Jul 15 14:05:29 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 09:05:29 -0400
Subject: Suggestion for fixing %(foo)s (Re: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting)
In-Reply-To: Your message of "Mon, 15 Jul 2002 12:48:44 +1200."
 <200207150048.g6F0mi111593@oma.cosc.canterbury.ac.nz>
References: <200207150048.g6F0mi111593@oma.cosc.canterbury.ac.nz>
Message-ID: <200207151305.g6FD5Tp30343@pcp02138704pcs.reston01.va.comcast.net>

> > > Guido, can you please, for our enlightenment, tell us what are the
> > > reasons you feel %(foo)s was a mistake?
> > 
> > Because of the trailing 's'.  It's very easy to leave it out by
> > mistake
> 
> How about introducing a new format
> 
>   %{foo}
> 
> which is defined to be the same as %(foo)s.

Maybe too subtle (you'd really have to explain the history to make
people understand why there's both %() and %{}), and doesn't solve the
compile time / run time issue IMO.
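
A small illustration of the trailing-'s' trap, in present-day Python.
The variable names are invented for the example.  string.Template is the
standard-library class that later grew out of PEP 292; it is shown only
for contrast and is not something this message proposes.

```python
from string import Template

values = {"name": "Guido", "n": 3}

# Correct: every %(key) is followed by a conversion character.
good = "%(name)s has %(n)s items" % values

# Forgotten 's': the formatter does not complain here.  It reads the
# space as a flag and the 'i' of "items" as an integer conversion.
broken = "%(n) items" % values

# $-style substitution has no conversion character to forget.
templated = Template("$name has $n items").substitute(values)
```

Both forms still parse the format string at run time, so neither
addresses the compile time / run time issue mentioned above.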

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 14:27:53 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 09:27:53 -0400
Subject: [Python-Dev] Python version of PySlice_GetIndicesEx
In-Reply-To: Your message of "Mon, 15 Jul 2002 10:36:12 BST."
 <2m8z4ds583.fsf@starship.python.net>
References: <000d01c21cdb$eb03b720$91d8accf@othello> <20020630173903.GA37045@hishome.net> <200207121709.g6CH9Wb12714@pcp02138704pcs.reston01.va.comcast.net> <20020712212105.A8666@hishome.net> <200207121838.g6CIcV813352@pcp02138704pcs.reston01.va.comcast.net>
 <2m8z4ds583.fsf@starship.python.net>
Message-ID: <200207151327.g6FDRrL30498@pcp02138704pcs.reston01.va.comcast.net>

> > > Suggestions for a better name than trim?
> > 
> > getindices()
> 
> When I was debugging this function, I wrote a method called indices().
> Actually, I think I'm probably in favour of adding this method, if
> only to make writing clearer test cases easier.

OK.  Michael, if you want to check in indices(), go ahead.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 14:28:54 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 09:28:54 -0400
Subject: [Python-Dev] Python version of PySlice_GetIndicesEx
In-Reply-To: Your message of "Mon, 15 Jul 2002 09:27:53 EDT."
Message-ID: <200207151328.g6FDSsI30513@pcp02138704pcs.reston01.va.comcast.net>

> OK.  Michael, if you want to check in indices(), go ahead.

Of course, the possibility exists that indices() fails, when one of
the indices is not an int or None.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 14:38:45 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 09:38:45 -0400
Subject: [Python-Dev] threads, SIGINT and time.sleep()
In-Reply-To: Your message of "Mon, 15 Jul 2002 18:37:05 +1000."
 <LCEPIIGDJPKCOIHOBJEPAEMAFPAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPAEMAFPAA.mhammond@skippinet.com.au>
Message-ID: <200207151338.g6FDcjm30615@pcp02138704pcs.reston01.va.comcast.net>

Python has always documented that signals go only to the main thread.
Apparently in 2.1 and before this wasn't implemented properly (for
Linux; I don't know about other platforms and this is notoriously
platform-dependent).

I think that since ^C doesn't interrupt regular Python code running in
a thread, it's strange that time.sleep() (and presumably other I/O!)
would be interrupted.

So I'd like to see the CVS behavior.
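
One way to observe the main-thread rule from Python code, as a sketch
against current CPython (the helper name try_install is made up): the
signal module refuses to install a handler from any thread other than
the main one and raises ValueError.

```python
import signal
import threading

result = {}


def try_install():
    # Attempt to install a SIGINT handler from a worker thread.
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        result["outcome"] = "installed"
    except ValueError:
        # CPython only allows signal.signal() in the main thread.
        result["outcome"] = "ValueError"


worker = threading.Thread(target=try_install)
worker.start()
worker.join()
```

This shows where handlers may be installed and run; it does not by itself
say whether a blocking call in another thread gets interrupted, which is
the platform-dependent part under discussion.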

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 15:04:54 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 10:04:54 -0400
Subject: [Python-Dev] PEP 246 - Object Adaptation
In-Reply-To: Your message of "Mon, 15 Jul 2002 08:15:11 +0200."
 <E17Tz95-0003Cz-00@mail.python.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T36k-0003xt-00@mail.python.org> <20020715000651.A35319@doublegemini.com>
 <E17Tz95-0003Cz-00@mail.python.org>
Message-ID: <200207151404.g6FE4sa30738@pcp02138704pcs.reston01.va.comcast.net>

(Changing the subject)

> The big question is rather: given that Isub inherits from Isuper,
> does any object implementing Isub also implicitly implement Isuper?

This probably shows my naivete more than anything else...

I'd say "of course", based on an example where Isuper is
FileOpenForReading and Isub is FileOpenForReadingAndWriting.
It would be strange if a file open for reading and writing was not
acceptable in a place where a file open for reading is accepted
(because it implements all the right methods).  Or is the fact that it
implements *more* the problem?

Am I missing something?

I also thought that there's a different dimension of interface
inheritance: if class C implements interface I, and class D derives
from class C, does D implicitly implement I also?  Again, I'd say
yes.  But I believe Jim Fulton disagrees with me.  And again, I
haven't tried to use interfaces enough to understand what problems you
could get into by this assumption.
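
For what it's worth, here is how both "yes" answers look when spelled
with abstract base classes, which are used purely as a stand-in for
interfaces (an assumption of this sketch; Zope's Interface package makes
its own choices).  All class names are hypothetical.

```python
from abc import ABC, abstractmethod


class Readable(ABC):                 # plays the role of Isuper
    @abstractmethod
    def read(self):
        ...


class ReadWritable(Readable):        # plays the role of Isub
    @abstractmethod
    def write(self, data):
        ...


class MemoryFile(ReadWritable):      # class C, implements Isub
    def __init__(self):
        self._data = ""

    def read(self):
        return self._data

    def write(self, data):
        self._data += data


class LoggingFile(MemoryFile):       # class D, derives from C
    pass


def dump(source):
    # Accepts anything that counts as Readable.
    if not isinstance(source, Readable):
        raise TypeError("source must be Readable")
    return source.read()


f = LoggingFile()
f.write("spam")
contents = dump(f)
```

Under this model an object implementing the sub-interface is accepted
wherever the super-interface is, and a subclass inherits the declaration.
The component view argued for elsewhere in the thread would make each of
those steps an explicit choice instead.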

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 15:15:28 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 10:15:28 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Mon, 15 Jul 2002 12:45:11 +1200."
 <200207150045.g6F0jBb11587@oma.cosc.canterbury.ac.nz>
References: <200207150045.g6F0jBb11587@oma.cosc.canterbury.ac.nz>
Message-ID: <200207151415.g6FEFSP30815@pcp02138704pcs.reston01.va.comcast.net>

> I think it's more complicated than that. If the file
> object were to become an object obeying the iterator
> protocol, its next() method should really return the
> next *byte* of the file. Then you'd still want methods
> like read(), readline() etc. for reading in larger
> chunks.

I don't think so.  We should pick the most convenient chunking for the
default iterator, and provide explicit ways to ask for other iterators
(like dict iterators).  Also, since "for line in file" already works,
there's a strong precedent for iterating by line by default.
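
A short sketch of the two styles in present-day Python, using in-memory
files so it is self-contained (the variable names are invented): the
default iterator yields lines, and a different chunking is requested
explicitly, here with the two-argument form of iter().

```python
import io
from functools import partial

# Default iteration over a file-like object goes line by line.
text = io.StringIO("one\ntwo\nthree\n")
lines = [line for line in text]

# Asking explicitly for another chunking: call read(1) until it
# returns the sentinel b"" at end of file.
raw = io.BytesIO(b"abc")
single_bytes = list(iter(partial(raw.read, 1), b""))
```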

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@zope.com  Mon Jul 15 15:19:33 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 15 Jul 2002 10:19:33 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17T36k-0003xt-00@mail.python.org>
 <20020715000651.A35319@doublegemini.com>
 <200207150615.g6F6FJq28099@smtp.zope.com>
Message-ID: <15666.55797.351811.317428@anthem.wooz.org>

>>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:

    AM> The big question is rather: given that Isub inherits from
    AM> Isuper, does any object implementing Isub also implicitly
    AM> implement Isuper?

There's another issue that Jim Fulton likes to bring up, IIRC.  If
class Super implements IInterface, does class Sub(Super) also
(automatically) implement IInterface?

I could be totally misremembering, but I believe that Jim would say
"no".  Class Sub would have to explicitly declare that it also
implements IInterface.

-Barry



From aahz@pythoncraft.com  Mon Jul 15 15:22:25 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 10:22:25 -0400
Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
In-Reply-To: <20020715072400.B40101@doublegemini.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T36k-0003xt-00@mail.python.org> <20020715000651.A35319@doublegemini.com> <E17Tz95-0003Cz-00@mail.python.org> <20020715072400.B40101@doublegemini.com>
Message-ID: <20020715142225.GA9006@panix.com>

On Mon, Jul 15, 2002, Clark C . Evans wrote:
>
> My exposition was awful, sorry.   Perhaps a specific (albeit contrived)
> example would better reflect the intent.   Suppose that I have an iterator
> with one method, next().   Now suppose that I want a "mutable iterator",
> one which adds the change() method.   This is well and good, but the
> concept of something being mutable is quite orthogonal to iteration and
> perhaps should have its own interface rather than using inheritance.
> So, what I'm asserting is that once the inheritance "feature" is
> there... people use it even though other approaches are available.

On the whole, I'd say that Python is actually *less* prone to this
problem (because using an object doesn't generally require inheritance,
just protocol); in fact, it suffers from the obverse problem.  Consider
this:

    class C:
        def open(self, name, flags=None): pass
        def read(self): pass
        def write(self, value): pass
        def close(self): pass

Can instances of C be used where a file object is expected?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From barry@zope.com  Mon Jul 15 15:32:18 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 15 Jul 2002 10:32:18 -0400
Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17T36k-0003xt-00@mail.python.org>
 <20020715000651.A35319@doublegemini.com>
 <E17Tz95-0003Cz-00@mail.python.org>
 <20020715072400.B40101@doublegemini.com>
 <20020715142225.GA9006@panix.com>
Message-ID: <15666.56562.63130.943725@anthem.wooz.org>

>>>>> "A" == Aahz  <aahz@pythoncraft.com> writes:

    A> On the whole, I'd say that Python is actually *less* prone to
    A> this problem (because using an object doesn't generally require
    A> inheritance, just protocol); in fact, it suffers from the
    A> obverse problem.  Consider this:

    |     class C:
    |         def open(self, name, flags=None): pass
    |         def read(self): pass
    |         def write(self, value): pass
    |         def close(self): pass

    A> Can instances of C be used where a file object is expected?

Maybe <wink>.

That's why you tend to see things described like: "argument f must
have a write() method that accepts a string."  WIBNI we could define a
protocol/interface/thingie that encapsulated that requirement?  I'd
even be happy to start out with no officially blessed interfaces, to
give time to see what cream rises to the top.  Zope's Interface and
component model stuff is a good way to get some real experience with
using these concepts in Python.
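
As a sketch of what such a "thingie" can look like in present-day
Python: typing.Protocol with runtime_checkable, which did not exist when
this was written and is used here as an assumption, not as what Zope's
Interface package does.  SupportsWrite, emit and Collector are
hypothetical names.

```python
import io
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsWrite(Protocol):
    """Anything with a write() method that accepts a string."""

    def write(self, s: str) -> object:
        ...


def emit(f, message):
    # The runtime check only confirms that a write attribute exists; it
    # cannot verify that the method really accepts a string.
    if not isinstance(f, SupportsWrite):
        raise TypeError("argument f must have a write() method")
    f.write(message)


class Collector:
    """Not a file and not a subclass of anything, but it has write()."""

    def __init__(self):
        self.parts = []

    def write(self, s):
        self.parts.append(s)


sink = Collector()
emit(sink, "hello")
buffer = io.StringIO()
emit(buffer, "hello")
```

No inheritance is involved: Collector qualifies by protocol alone, which
is also why the answer to the question about class C stays "maybe".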

-Barry



From guido@python.org  Mon Jul 15 15:39:51 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 10:39:51 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 12 Jul 2002 00:59:28 +0300."
 <20020712005928.A9833@hishome.net>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <E17SbpT-0002yd-00@mail.python.org> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net>
 <20020712005928.A9833@hishome.net>
Message-ID: <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>

> http://www.python.org/sf/580331
> 
> No, it's not a complete rewrite of file buffering.  This patch
> implements Just's idea of xreadlines caching in the file object.  It
> also makes a file into an iterator: __iter__ returns self and next
> calls the next method of the cached xreadlines object.

Hm.  What happens to the xreadlines object when you do a seek() on the
file?

With the old semantics, you could do f.seek(0) and get another
iterator (assuming it's a seekable file of course).  With the new
semantics, the cached iterator keeps getting in the way.

Maybe the xreadlines object could grow a flush() method that throws
away its buffer, and f.seek() could call that if there's a cached
xreadlines iterator?
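
A minimal sketch of that idea in present-day Python, not the patch itself: the wrapper class LineCachingFile and its attribute names are hypothetical, and an in-memory io.StringIO stands in for a real file.  The cached read-ahead iterator is simply thrown away whenever seek() is called.

```python
import io


class LineCachingFile:
    """Hypothetical wrapper: caches a read-ahead line iterator, drops it on seek()."""

    def __init__(self, raw):
        self._raw = raw
        self._lines = None  # cached read-ahead iterator, created lazily

    def __iter__(self):
        return self

    def __next__(self):
        if self._lines is None:
            # Read ahead in one go, the way a buffering line reader would.
            self._lines = iter(self._raw.readlines())
        return next(self._lines)

    def seek(self, pos):
        # The "flush" step: discard buffered lines so they cannot go stale.
        self._lines = None
        return self._raw.seek(pos)


f = LineCachingFile(io.StringIO("a\nb\nc\n"))
first = next(f)        # fills the cache and consumes one line
f.seek(0)              # cache is discarded here
after_seek = list(f)   # iteration starts over from the new position
```

Without the reset in seek(), the second pass would continue from the stale cache instead of from the start of the file.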

> See my previous postings for why I think a file should be an iterator.

Haven't seen them but I would agree that this makes sense.

> With this patch any combination of multiple xreadlines and iterator
> protocol operations on a file object is safe. Using
> xreadlines/iterator followed by regular readline has the same
> buffering problem as before.

Agreed.

I just realized that the (existing) file_xreadlines() function has a
subtle bug.  It uses a local static variable to cache the function
xreadlines imported from the module xreadlines.  But if there are
multiple interpreters or Py_Finalize() is called and then
Py_Initialize() again, the cache is invalid.  Would you mind fixing
this?  I think the caching just isn't worth it -- just do the import
every time (it's fast enough if sys.modules['xreadlines'] already
exists).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jmiller@stsci.edu  Mon Jul 15 15:41:58 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Mon, 15 Jul 2002 10:41:58 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net>
Message-ID: <3D32DF36.5080906@stsci.edu>

Here's the numarray perspective on things.  

Tim Peters wrote:

>[Tim]
>
>>Fredrik pressed for details, but we haven't seen any concrete use cases.
>>In the absence of the latter, it's impossible to guess what would be
>>backward compatible for MAL's purposes.
>>
I updated my CVS copy of Python and tried out MAL's patch with numarray. 
 Nothing broke as far as I can tell.  I guess it probably doesn't matter 
anyway given that both buffer() and MAL's patch are headed to oblivion.

>>
>
>[M.-A. Lemburg]
>
>>For my purposes, the strategy buffer slice returns a buffer
>>would be more appropriate because it would save the buffer type
>>information across the slicing operation... I mean, you don't
>>want to get bananas when you slice an apple in real life either ;-)
>>
>>I use buffers to mean: this is a chunk of binary data. The purpose
>>is to recognize this type of data for pickling via xml-rpc,
>>soap and other rpc mechanisms etc.
>>
>
>How do you use buffers?  
>
We use buffers in numarray to store our array data.   We use readinto to 
load array buffers efficiently from a file.    We operate on the buffer 
data in-place.  Since numarrays are Python class instances, buffers 
provide a place for the data to live.

>Do you stick to their C API?  
>
We use the C-API, and currently use the buffer object too.   Using the 
buffer object has always seemed like a necessary evil, but having 
reviewed numarray usage of buffer(), ditching it sounds good to me.

>Do you use the
>Python-level buffer() function?  
>
Yes.  We go one step further, and expose writeable buffers using our own 
extension function.  I had a feeling I was on thin ice when I did this.

>If the latter, what do you do in Python
>code with a buffer object after you get one?  The only use I've seen made of
>a buffer object in Python code is as a way to trick the interpreter into
>crashing (via recycling the memory the buffer object points to).
>
I'm getting the following things by using the buffer object:

1.  Knowledge that the C-type the buffer refers to meets the buffer C-API.

2.  Mutable string behavior for any object which meets the buffer C-API.

3.  Storage.  At least we used to get storage until we found out that 
there's no guarantee on double alignment.

I plan to work around each of these uses as  follows:

1.  Create an extension function which determines whether an object 
meets the buffer C-API.

2.  Create an extension function which copies from one buffer region to 
another buffer region.  

3. We already have our own memory object which is now typically 
referenced by a buffer object.  With the above extensions, I don't need 
a buffer "wrapper" object around it anymore.
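
For comparison, here is roughly what work-arounds 1 and 2 look like in present-day Python, where memoryview gives access to the buffer protocol.  This is only a sketch: supports_buffer and copy_bytes are hypothetical helper names, not numarray functions, and nothing like memoryview existed when this message was written.

```python
def supports_buffer(obj):
    """Return True if obj exposes its memory through the buffer protocol."""
    try:
        memoryview(obj).release()
    except TypeError:
        return False
    return True


def copy_bytes(src, dst, dst_offset=0):
    """Copy all bytes of src into the writable buffer dst at dst_offset."""
    source = memoryview(src).cast("B")
    target = memoryview(dst).cast("B")
    if target.readonly:
        raise TypeError("destination buffer is read-only")
    end = dst_offset + len(source)
    if end > len(target):
        raise ValueError("destination buffer is too small")
    target[dst_offset:end] = source  # in-place copy, no intermediate string


storage = bytearray(8)             # plain zero-filled, writable memory
copy_bytes(b"abcd", storage, 2)
```

A bytearray plays the role of the raw storage here; an mmap object would work the same way as the destination.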

>
>
>And from where do you get a buffer?  There are darned few types in Python
>
We get ours from mmap and our own homegrown memory object.

>
>that buffer() accepts as an argument.  Do your extension types implement
>tp_as_buffer?  I'm blindly casting for a reason why your appreciation of the
>
>
>buffer object seems unique.
>
Numarray uses buffer() too, but dumping it sounds OK.

Todd

-- 
Todd Miller 
Space Telescope Science Institute






From guido@python.org  Mon Jul 15 15:50:32 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 10:50:32 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Mon, 15 Jul 2002 10:41:58 EDT."
 <3D32DF36.5080906@stsci.edu>
References: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net>
 <3D32DF36.5080906@stsci.edu>
Message-ID: <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net>

> >How do you use buffers?  

> We use buffers in numarray to store our array data.   We use readinto to 
> load array buffers efficiently from a file.    We operate on the buffer 
> data in-place.  Since numarrays are Python class instances, buffers 
> provide a place for the data to live.

AFAIK the buffer() function can only create read-only buffers.  How do
you create your buffers?  If you're just using the C buffer API,
that's not going away.

> >Do you stick to their C API?  
> >
> We use the C-API, and currently use the buffer object too.   Using the 
> buffer object has always seemed like a necessary evil, but having 
> reviewed numarray usage of buffer(), ditching it sounds good to me.

Good.

> >And from where do you get a buffer?  There are darned few types in Python

> We get ours from mmap and our own homegrown memory object.

Maybe instead of the buffer() function/type, there should be a way to
allocate raw memory?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 16:15:58 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 11:15:58 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Mon, 15 Jul 2002 02:18:32 EDT."
 <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net>
Message-ID: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>

I'm still only considering two options:

  (a) leave the status quo, or
  (b) implement (and document!) the "sink-state" rule from the PEP.

If we end up adopting (b), what can we do to Python 2.2 that doesn't
break the "bug-fixes-only" promise of that branch?

If there's code that depends on the extendibility of list iterators,
are we breaking our promise by breaking that code?

OTOH, I just noticed that the general sequence iterator has a
different behavior than the list iterator (both in 2.3 and in 2.2):
the general sequence iterator increments its index before checking for
IndexError, while the list iterator only increments when it knows
it's got a valid item.  That means that if you use the general sequence
iterator over an extensible sequence, you miss an item!

-----------------------------
class Seq:
    def __init__(self, n):
        self.n = n
    def __getitem__(self, i):
        if 0 <= i < self.n:
            return i
        else:
            raise IndexError
a = Seq(3)
it = iter(a)
for i in it:
    print i,
a.n = 5
for i in it:
    print i,
-----------------------------

This prints "0 1 2 4".  This is sufficiently braindead that we can
assume that *if* this behavior is relied upon, it's only for lists.

Still, the question is, could "fixing" the list iterator in 2.2.2
become a problem?

I'd like to think that more people are surprised by this behavior than
rely on it, but I'm not sure.

A simple fix for the sequence and dict iterators is to let a negative
index signal exhaustion.  A simple fix for the callable iter is to set
the callable to NULL to signal exhaustion.  (Setting the main object
to NULL could also work for the others, actually, and has the added --
minuscule -- advantage of releasing a reference early.)
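
The "set the main object to NULL" variant can be sketched in pure Python (written here in Python 3 syntax).  SinkSeqIter is a hypothetical class for illustration only; the real iterators are implemented in C.

```python
class SinkSeqIter:
    """Sequence iterator that stays exhausted once it has raised StopIteration."""

    def __init__(self, seq):
        self._seq = seq
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._seq is None:        # sink state: already exhausted
            raise StopIteration
        try:
            item = self._seq[self._index]
        except IndexError:
            self._seq = None         # drop the reference, enter the sink state
            raise StopIteration from None
        self._index += 1             # advance only after a successful fetch
        return item


data = [0, 1, 2]
it = SinkSeqIter(data)
first_pass = list(it)     # exhausts the iterator
data.extend([3, 4])       # growing the sequence afterwards...
second_pass = list(it)    # ...does not revive it
```

Because the index is advanced only after a successful fetch and the sequence reference is dropped on exhaustion, no item is skipped and the iterator cannot restart.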

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jmiller@stsci.edu  Mon Jul 15 16:17:37 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Mon, 15 Jul 2002 11:17:37 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net>              <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D32E791.40809@stsci.edu>

Guido van Rossum wrote:

>>>How do you use buffers?  
>>>
>
>>We use buffers in numarray to store our array data.   We use readinto to 
>>load array buffers efficiently from a file.    We operate on the buffer 
>>data in-place.  Since numarrays are Python class instances, buffers 
>>provide a place for the data to live.
>>
>
>AFAIK the buffer() function can only create read-only buffers.  How do you...
>
We have a very small extension function which creates writeable buffer 
objects using the buffer type C-API.  

We also wrap suitable type instances with a "buffer object wrapper". 
 I'm slowly gathering that this is unsafe.   :-(

>
>you create your buffers?  If you're just using the C buffer API,
>that's not going away.
>
>>>Do you stick to their C API?  
>>>
>>We use the C-API, and currently use the buffer object too.   Using the 
>>buffer object has always seemed like a necessary evil, but having 
>>reviewed numarray usage of buffer(), ditching it sounds good to me.
>>
>
>Good.
>
>>>And from where do you get a buffer?  There are darned few types in Python
>>>
>
>>We get ours from mmap and our own homegrown memory object.
>>
>
>Maybe instead of the buffer() function/type, there should be a way to
>allocate raw memory?
>
Yes.    It would also be nice to be able to:

1.  Know (at the python level) that a type supports the buffer C-API.

2.  Copy bytes from one buffer to another (writeable buffer).  

>
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
>

Todd

-- 
Todd Miller 			
Space Telescope Science Institute






From guido@python.org  Mon Jul 15 16:18:59 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 11:18:59 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Sun, 14 Jul 2002 22:41:16 EDT."
 <15666.13900.581653.909094@anthem.wooz.org>
References: <Pine.SOL.4.44.0207140057270.15930-100000@death.OCF.Berkeley.EDU> <20020714023729.Y79323-100000@mail.allcaps.org> <20020714112745.GB2280@hishome.net> <200207141320.g6EDKpJ27752@pcp02138704pcs.reston01.va.comcast.net> <15665.37242.446627.141013@anthem.wooz.org> <20020714160611.GA25950@hishome.net>
 <15666.13900.581653.909094@anthem.wooz.org>
Message-ID: <200207151518.g6FFIxp31610@pcp02138704pcs.reston01.va.comcast.net>

> StopIteration is a sink state for dict iterators if I don't change the
> size of the dict.  Shouldn't list and dict iterators behave
> similarly for mutation (or at least resizing) between .next() calls?

No, mutating a list while the iterator is not exhausted is perfectly
well defined: the iterator's state has the next index to try.  This is
totally predictable, and useful or not depending on what you're trying
to do.  The dict iterator tests for mutating the dict because the
rehashing possibility makes this unpredictable.
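
The difference is easy to observe.  The following sketch uses Python 3 syntax; the variable names are arbitrary.

```python
items = [1, 2, 3]
seen = []
for x in items:
    seen.append(x)
    if x == 2:
        items.append(4)   # the list iterator simply reaches index 3 later

table = {"a": 1, "b": 2}
dict_error = None
try:
    for key in table:
        table[key + "2"] = 0   # adding a key resizes the dict mid-iteration
except RuntimeError as exc:
    dict_error = str(exc)      # the dict iterator refuses to continue
```

The list loop visits the appended item, while the dict loop stops with a RuntimeError as soon as it notices the size change.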

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 16:34:50 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 11:34:50 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Mon, 15 Jul 2002 11:17:37 EDT."
 <3D32E791.40809@stsci.edu>
References: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net> <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net>
 <3D32E791.40809@stsci.edu>
Message-ID: <200207151534.g6FFYoE31790@pcp02138704pcs.reston01.va.comcast.net>

> We have a very small extension function which creates writeable buffer 
> objects using the buffer type C-API.  

That's how the buffer API was supposed to be used.

> We also wrap suitable type instances with a "buffer object wrapper". 
>  I'm slowly gathering that this is unsafe.   :-(

I don't understand what you say, but I believe you.

> >Maybe instead of the buffer() function/type, there should be a way to
> >allocate raw memory?

> Yes.    It would also be nice to be able to:
> 
> 1.  Know (at the python level) that a type supports the buffer C-API.

Good idea.  (I guess right now you can see if calling buffer() with an
instance as argument works. :-)

> 2.  Copy bytes from one buffer to another (writeable buffer).  

Maybe you would like to work on a requirements gathering for a memory
object?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cce@clarkevans.com  Mon Jul 15 16:58:37 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Mon, 15 Jul 2002 11:58:37 -0400
Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
In-Reply-To: <20020715142225.GA9006@panix.com>; from aahz@pythoncraft.com on Mon, Jul 15, 2002 at 10:22:25AM -0400
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T36k-0003xt-00@mail.python.org> <20020715000651.A35319@doublegemini.com> <E17Tz95-0003Cz-00@mail.python.org> <20020715072400.B40101@doublegemini.com> <20020715142225.GA9006@panix.com>
Message-ID: <20020715115837.A45200@doublegemini.com>

On Mon, Jul 15, 2002 at 10:22:25AM -0400, Aahz wrote:
| in fact, it suffers from the obverse problem.  Consider this:
| 
|     class C:
|         def open(self, name, flags=None):
|         def read(self):
|         def write(self, value):
|         def close(self):
| 
| Can instances of C be used where a file object is expected?

From the "Object Adaptation" perspective, you would have a file
protocol (perhaps the built-in File object works).    And then
you could call "check()" or "adapt()" built-in functions.

  check() at a high level, this built-in function first asks 
          the object itself directly: "Hey are you a File?"
          if the response is affirmative or negative, then the
          search is done.   If the object doesn't respond (either
          it lacks __check or __check returns None) then the 
          built-in then goes and asks the protocol object if the
          file complies.   When all else fails, the built-in
          could use some default logic of its own.

  adapt() returns the object itself if check() is true; otherwise
          it asks the object and then the protocol to provide
          a wrapper.  If neither provides the wrapper, then an 
          error is thrown.
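
A rough sketch of that lookup order, folding check() into adapt().  The hook names __conform__ (on the object) and __adapt__ (on the protocol) follow PEP 246 as I read it; AdaptationError, Readable and Lines are hypothetical names made up for this example, and the isinstance() fallback is just one possible "default logic".

```python
import io


class AdaptationError(TypeError):
    """Raised when neither the object nor the protocol can adapt."""


def adapt(obj, protocol):
    # 1. Ask the object itself.
    conform = getattr(obj, "__conform__", None)
    if conform is not None:
        result = conform(protocol)
        if result is not None:
            return result
    # 2. Ask the protocol.
    adapter = getattr(protocol, "__adapt__", None)
    if adapter is not None:
        result = adapter(obj)
        if result is not None:
            return result
    # 3. Default logic of the built-in.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    raise AdaptationError("cannot adapt %r to %r" % (obj, protocol))


class Readable:
    """Hypothetical protocol: anything with a callable read()."""

    @staticmethod
    def __adapt__(obj):
        return obj if callable(getattr(obj, "read", None)) else None


class Lines:
    """Not readable itself, but knows how to wrap itself."""

    def __init__(self, text):
        self.text = text

    def __conform__(self, protocol):
        if protocol is Readable:
            return io.StringIO(self.text)
        return None


stream = io.StringIO("x")
same = adapt(stream, Readable)          # already complies: returned as-is
wrapped = adapt(Lines("hi"), Readable)  # the object supplies a wrapper
```

What "complies" means is decided entirely inside __conform__ and __adapt__, which is the flexibility the proposal is after.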

The key thing about the Object Adaptation proposal is that it 
leaves wide open what it means to comply.  This flexibility is
necessary since the methods for determining compliance may vary
from situation to situation; no size fits all.  With this proposal, 
both the Object and the Protocol can use whatever methods are at 
their disposal to gauge compliance and/or create an adaptive wrapper.   

That said, which built-in compliance systems Python chooses to 
integrate into the core is orthogonal.  Python could even offer
multiple compliance mechanisms: an Eiffelish contract-based
mechanism for those of that school of thought, or a "type-safe"
interface-based compliance for those who think that is the best approach.
This proposal leaves all of those options open and favors no one.

So, I'm sorry if I diverted this into "Are interfaces good or bad".
Clearly the idea of a protocol is good, interfaces are OK, but I have
my doubts about them striking a good enough balance between power and
complexity.   It's nice to see a simple yet powerful mechanism like
this being considered... thanks!

Best,

Clark



From cce@clarkevans.com  Mon Jul 15 17:01:33 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Mon, 15 Jul 2002 12:01:33 -0400
Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
In-Reply-To: <15666.56562.63130.943725@anthem.wooz.org>; from barry@zope.com on Mon, Jul 15, 2002 at 10:32:18AM -0400
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17T36k-0003xt-00@mail.python.org> <20020715000651.A35319@doublegemini.com> <E17Tz95-0003Cz-00@mail.python.org> <20020715072400.B40101@doublegemini.com> <20020715142225.GA9006@panix.com> <15666.56562.63130.943725@anthem.wooz.org>
Message-ID: <20020715120133.B45200@doublegemini.com>

On Mon, Jul 15, 2002 at 10:32:18AM -0400, Barry A. Warsaw wrote:
|     |     class C:
|     |         def open(self, name, flags=None):
|     |         def read(self):
|     |         def write(self, value):
|     |         def close(self):
| 
|     A> Can instances of C be used where a file object is expected?
| 
| Maybe <wink>.
| 
| That's why you tend to see things described like: "argument f must
| have a write() method that accepts a string."  WIBNI we could define a
| protocol/interface/thingie that encapsulated that requirement?

Even if write accepts a string it may not do what you expect.  *grin*

| I'd even be happy to start out with no officially blessed interfaces, to
| give time to see what cream rises to the top.  Zope's Interface and
| component model stuff is a good way to get some real experience with
| using these concepts in Python.

Well, with the Object Adaptation proposal you don't even need to bless
a particular compliance mechanism.   For ease of use, the built-in
may want to use a few mechanisms; but this is a distinct difference.
Whatever is chosen, I hope the core syntax isn't mucked with!

Best,

Clark

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From aahz@pythoncraft.com  Mon Jul 15 16:54:05 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 11:54:05 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020715155405.GA7009@panix.com>

On Mon, Jul 15, 2002, Guido van Rossum wrote:
> 
>   (b) implement (and document!) the "sink-state" rule from the PEP.
> 
> If we end up adopting (b), what can we do to Python 2.2 that doesn't
> break the "bug-fixes-only" promise of that branch?

Well, from my POV, given that the PEP is mostly clear about the intent,
fixing the implementation to match the PEP precisely matches the "bug-fix
only" rule.  We've been trying to move away from "reference defined by
implementation", and this seems like a perfect opportunity to exercise it.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aleax@aleax.it  Mon Jul 15 16:56:56 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 17:56:56 +0200
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17U8E8-00059g-00@mail.python.org>

On Monday 15 July 2002 05:15 pm, Guido van Rossum wrote:
> I'm still only considering two options:
>
>   (a) leave the status quo, or
>   (b) implement (and document!) the "sink-state" rule from the PEP.

For what it's worth, I strongly prefer (b).

> If we end up adopting (b), what can we do to Python 2.2 that doesn't
> break the "bug-fixes-only" promise of that branch?
>
> If there's code that depends on the extendibility of list iterators,
> are we breaking our promise by breaking that code?

I have no opinion on this specific issue.  Every other iterator could
surely be made to implement the sink behavior, but I do not know
if the empirically observed behavior of iterators on list could be
classified as a bug (I sure wish it could).


Alex



From aleax@aleax.it  Mon Jul 15 17:08:08 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 18:08:08 +0200
Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
In-Reply-To: <20020715115837.A45200@doublegemini.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715142225.GA9006@panix.com> <20020715115837.A45200@doublegemini.com>
Message-ID: <E17U8OW-0006zm-00@mail.python.org>

On Monday 15 July 2002 05:58 pm, Clark C . Evans wrote:
> On Mon, Jul 15, 2002 at 10:22:25AM -0400, Aahz wrote:
> | in fact, it suffers from the obverse problem.  Consider this:
> |
> |     class C:
> |         def open(self, name, flags=None):
> |         def read(self):
> |         def write(self, value):
> |         def close(self):
> |
> | Can instances of C be used where a file object is expected?
>
> From the "Object Adaptation" perspective, you would have a file
> protocol (perhaps the built-in File object works).    And then

If so, then presumably the answer is "no", since the built-in
file object has many more important methods such as seek and
tell.  If the file type itself serves as the protocol, surely that should
mean "implement all of the methods" rather than just some of
them.

Moreover, a file's read method accepts an optional integer.  Class C's
read method does not.  So, even the methods that C does supply
are not compliant with those of a file object.

Some, but not all, current uses of "file-like objects" may be satisfied
with just a .read method that must be called without arguments --
others would need the argument to be accepted, others yet would
need readline instead, not to speak of seeking behavior (which all
file objects expose, but not all _implement_...).

To use adaptation, we may need to be more precise than just saying
"a file object is expected" -- IF only a SUBSET of the file object's
methods (or a subset of their signatures) is indeed expected.
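
One way to state such a subset precisely is to check each expected call against the method's signature.  A minimal Python 3 sketch; accepts_call is a hypothetical helper, and it only works for callables that inspect.signature() can introspect.

```python
import inspect


def accepts_call(obj, method_name, *args, **kwargs):
    """Return True if obj has the method and it can be called with these arguments."""
    method = getattr(obj, method_name, None)
    if not callable(method):
        return False
    try:
        # bind() matches arguments to parameters without calling the method.
        inspect.signature(method).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


class C:
    def read(self):
        return ""


c = C()
plain_read = accepts_call(c, "read")         # read() with no arguments
sized_read = accepts_call(c, "read", 1024)   # read(size), as file objects allow
can_seek = accepts_call(c, "seek", 0)        # method is missing entirely
```

A consumer that only ever calls read() with no arguments could accept C; one that passes a size, or needs seek, could not.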


> you could call "check()" or "adapt()" built-in functions.
>
>   check() at a high level, this built-in function first asks
>           the object itself directly: "Hey are you a File?"
>           if the response is affirmative or negative, then the
>           search is done.   If the object doesn't respond (either
>           it lacks __check or __check returns None) then the
>           built-in then goes and asks the protocol object if the
>           file complies.   When all else fails, the built-in
>           could use some default logic of its own.
>
>   adapt() returns the object itself if check() is true; otherwise
>           it asks the object and then the protocol to provide
>           a wrapper.  If neither provide the wrapper, then an
>           error is thrown.

I don't see the need or opportunity to have a check() that
is separate from adapt().  COM's QueryInterface only has the
equivalent of adapt(), and that's quite enough.  PEP 246 does
not specify a check() built-in, either.  


> The key thing about the Object Adaptation proposal is that it
> leaves wide open what it means to comply.  This flexibility is

Yes, but I see it as a minimum that a "compliant" object has
a set of methods callable with given signatures.  If a protocol is
represented by a type, the set should comprise the type's methods.

While it WOULD be nice to extend this further, we can see just
from examining file objects that this is probably impractical -- they
all do have (e.g.) methods write and seek, but if you call those
methods on a given file object f, f may raise exceptions because
it's not really writable or seekable.  So "having a method" is not
a sufficient condition for REALLY having it, if you see what I mean.
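
The same point in present-day Python 3, as a small sketch: a file opened for reading still has a write attribute, but calling it fails.  The temporary file is only there to make the example self-contained.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "r") as f:
        has_write = hasattr(f, "write")   # the method is exposed...
        is_writable = f.writable()        # ...but the object says it cannot write
        try:
            f.write("x")
            write_failed = False
        except io.UnsupportedOperation:
            write_failed = True           # ...and the call itself is rejected
finally:
    os.remove(path)
```

So a check based purely on the presence of method names would accept this object as writable even though it is not.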


Alex



From barry@zope.com  Mon Jul 15 17:09:33 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 15 Jul 2002 12:09:33 -0400
Subject: [Python-Dev] PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17T36k-0003xt-00@mail.python.org>
 <20020715000651.A35319@doublegemini.com>
 <E17Tz95-0003Cz-00@mail.python.org>
 <20020715072400.B40101@doublegemini.com>
 <20020715142225.GA9006@panix.com>
 <15666.56562.63130.943725@anthem.wooz.org>
 <20020715120133.B45200@doublegemini.com>
Message-ID: <15666.62397.411994.862638@anthem.wooz.org>

>>>>> "CC" == Clark C <cce@clarkevans.com> writes:

    CC> Even if write accepts a string it may not do what you expect.
    CC> *grin*

Very true.  It's the best we can do now, but we can do better by
<surprise> being more explicit. :)

    CC> Well, with the Object Adaptation proposal you don't even need
    CC> to bless a particular compliance mechanism.  For ease of use,
    CC> the built-in may want to use a few mechanisms; but this is a
    CC> distinct difference.  Whatever is chosen, I hope the core
    CC> syntax isn't mucked with!

Me too!  I need to go re-read that PEP now.
-Barry



From guido@python.org  Mon Jul 15 17:12:23 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 12:12:23 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Mon, 15 Jul 2002 11:54:05 EDT."
 <20020715155405.GA7009@panix.com>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
 <20020715155405.GA7009@panix.com>
Message-ID: <200207151612.g6FGCNb32176@pcp02138704pcs.reston01.va.comcast.net>

> > If we end up adopting (b), what can we do to Python 2.2 that doesn't
> > break the "bug-fixes-only" promise of that branch?
> 
> Well, from my POV, given that the PEP is mostly clear about the
> intent, fixing the implementation to match the PEP precisely matches
> the "bug-fix only" rule.  We've been trying to move away from
> "reference defined by implementation", and this seems like a perfect
> opportunity to exercise it.

Um, our docs are scattered enough that we prefer not to break anything
(at least not in a bugfix release) that might have been useful
before.  Given that even Tim didn't find this in the PEP upon his
first two readings, and that simple experimentation with the
implementation shows otherwise, and that I at first misremembered my
own ruling before I found it in the PEP, I'd say that *if* there's a
useful use of this, we shouldn't break that in the 2.2 branch.  2.3 is
a different issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From nas@python.ca  Mon Jul 15 17:26:09 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 15 Jul 2002 09:26:09 -0700
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <E17U8E8-00059g-00@mail.python.org>; from aleax@aleax.it on Mon, Jul 15, 2002 at 05:56:56PM +0200
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> <E17U8E8-00059g-00@mail.python.org>
Message-ID: <20020715092608.A1879@glacier.arctrix.com>

Alex Martelli wrote:
> On Monday 15 July 2002 05:15 pm, Guido van Rossum wrote:
> > I'm still only considering two options:
> >
> >   (a) leave the status quo, or
> >   (b) implement (and document!) the "sink-state" rule from the PEP.
> 
> For what it's worth, I strongly prefer (b).

Me too.  I think option (b) is simpler for users to understand.

  Neil



From cce@clarkevans.com  Mon Jul 15 17:30:07 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Mon, 15 Jul 2002 12:30:07 -0400
Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
In-Reply-To: <E17U8W4-000BwW-00@cauchy.clarkevans.com>; from aleax@aleax.it on Mon, Jul 15, 2002 at 06:08:08PM +0200
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <20020715142225.GA9006@panix.com> <20020715115837.A45200@doublegemini.com> <E17U8W4-000BwW-00@cauchy.clarkevans.com>
Message-ID: <20020715123007.A45942@doublegemini.com>

| > From the "Object Adaptation" perspective, you would have a file
| > protocol (perhaps the built-in File object works).    And then
| 
| If so, then presumably the answer is "no", since the built-in
| file object has many more important methods such as seek and
| tell.  If the file type itself serves as the protocol, surely
| that should mean "implement all of the methods" rather than 
| just some of them.

Yes.

| Some, but not all, current uses of "file-like objects" may be satisfied
| with just a .read method that must be called without arguments --
| others would need the argument to be accepted, others yet would
| need readline instead, not to speak of seeking behavior (which all
| file objects expose, but not all _implement_...).
| 
| To use adaptation, we may need to be more precise than just saying
| "a file object is expected" -- IF only a SUBSET of the file object's
| methods (or a subset of their signatures) is indeed expected.

Exactly.


| 
| 
| > you could call "check()" or "adapt()" built-in functions.
| >
| >   check() at a high level, this built-in function first asks
| >           the object itself directly: "Hey are you a File?"
| >           if the response is affirmative or negative, then the
| >           search is done.   If the object doesn't respond (either
| >           it lacks __check or __check returns None) then the
| >           built-in then goes and asks the protocol object if the
| >           file complies.   When all else fails, the built-in
| >           could use some default logic of its own.
| >
| >   adapt() returns the object itself if check() is true; otherwise
| >           it asks the object and then the protocol to provide
| >           a wrapper.  If neither provide the wrapper, then an
| >           error is thrown.
| 
| I don't see the need or opportunity to have a check() that
| is separate from adapt().  COM's QueryInterface only has the
| equivalent of adapt(), and that's quite enough.  PEP 246 does
| not specify a check() built-in, either.

I agree here; having two methods doubles the complication 
without giving much additional value.   adapt() is more than
adequate, although I use check() to help explain the innards.
If someone really insists on having check() exposed, then I 
don't see the harm... only that it makes the proposal seem more
complicated than it is.

| > The key thing about the Object Adaptation proposal is that it
| > leaves wide open what it means to comply.  This flexibility is
| 
| Yes, but I see it as a minimum that a "compliant" object has
| a set of methods callable with given signatures.  If a protocol is
| represented by a type, the set should comprise the type's methods.

Yes.  This would be an improvement of the proposal.  How do we
express this so that the protocols of core types can do this
sort of enforcement?  Perhaps by giving the Protocol the ability
to "veto" the final result?

| While it WOULD be nice to extend this further, we can see just
| from examining file objects that this is probably impractical -- they
| all do have (e.g.) methods write and seek, but if you call those
| methods on a given file object f, f may raise exceptions because
| it's not really writable or seekable.  So "having a method" is not
| a sufficient condition for REALLY having it, if you see what I mean.

*nods*  

Clark
Yo!  Check out YAML.  http://yaml.org
YAML is language independent readable object serialization.

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From jmiller@stsci.edu  Mon Jul 15 17:36:29 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Mon, 15 Jul 2002 12:36:29 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net> <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net>              <3D32E791.40809@stsci.edu> <200207151534.g6FFYoE31790@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D32FA0D.6020200@stsci.edu>

Guido van Rossum wrote:

>>We have a very small extension function which creates writeable buffer 
>>objects using the buffer type C-API.  
>>
>
>That's how the buffer API was supposed to be used.
>
>>We also wrap suitable type instances with a "buffer object wrapper". 
>> I'm slowly gathering that this is unsafe.   :-(
>>
>
>I don't understand what you say, but I believe you.
>
I meant we call  PyBuffer_FromReadWriteObject and the resulting buffer 
lives longer than the extension function call that created it.   I have 
heard that it is possible for the original object to "move" leaving the 
buffer object pointer to it dangling.
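
For comparison only: the sketch below uses the memoryview type of current
Python 3, not the 2002 buffer() object or PyBuffer_FromReadWriteObject.
It shows one way the "original object moves" hazard can be ruled out
(the exporter refuses to resize while a view is alive), plus a
buffer-to-buffer copy of the kind asked for further down.

```python
data = bytearray(b"abcdef")
view = memoryview(data)

# While the view exists, the bytearray may not be resized, so the
# memory the view points at cannot move out from under it.
try:
    data.extend(b"ghi")
    resized = True
except BufferError:
    resized = False

view.release()
data.extend(b"ghi")          # allowed again once the view is released

# Copying bytes from one buffer into another, writable, buffer.
src = memoryview(b"hello")
dst = bytearray(5)
memoryview(dst)[:] = src
```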

>
>
>>>Maybe instead of the buffer() function/type, there should be a way to
>>>allocate raw memory?
>>>
>
>>Yes.    It would also be nice to be able to:
>>
>>1.  Know (at the python level) that a type supports the buffer C-API.
>>
>
>Good idea.  (I guess right now you can see if calling buffer() with an
>instance as argument works. :-)
>
>>2.  Copy bytes from one buffer to another (writeable buffer).  
>>
>
>Maybe you would like to work on a requirements gathering for a memory
>object
>
Sure.  I'd be willing to poll comp.lang.python (python-list?) and 
collate the results of any discussion that ensues.  Is that what you had 
in mind?

>
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
>
>

Todd 







From pinard@iro.umontreal.ca  Mon Jul 15 18:22:08 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 15 Jul 2002 13:22:08 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> There's a full implementation for PEP 263.  Martin von Loewis is ready
> to commit it.  It's of course possible to let him do this and deal with
> the consequences once they're in CVS [...]

There is one thing which bothers me in the `Concepts' section:

       Note that Python identifiers are restricted to the ASCII
       subset of the encoding, and thus need no further conversion
       after step 4.

Could identifiers be produced according to the usual syntax (letters or
underscore, then letters, digits and underscore), but without going to
ASCII first?  The fact that I can now interactively (but not in batch) do:

---------------------------------------------------------------------->
12:24 0 pinard@titan:~ $ python
Python 2.2.1 (#1, Apr 29 2002, 14:27:21) 
[GCC 2.95.3 20010315 (SuSE)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> >>> >>> >>> >>> 
>>> élève = 3
>>> print élève
3
>>> 
----------------------------------------------------------------------<

surely lets people dream.  Other members in our local development team are
even more excited than me about this!  They keep asking me if and when this
will become available for real, dependably, in Python! :-) They are eagerly
(and understandably) hoping to start spelling identifiers correctly.

We should try not to miss the opportunity, if it happens to exist now.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From guido@python.org  Mon Jul 15 18:32:43 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 13:32:43 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: Your message of "Mon, 15 Jul 2002 13:22:08 EDT."
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
Message-ID: <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>

>        Note that Python identifiers are restricted to the ASCII
>        subset of the encoding, and thus need no further conversion
>        after step 4.
> 
> Could identifiers be produced according to the usual syntax (letters or
> underscore, then letters, digits and underscore), but without going to
> ASCII first?
[...]
> We should try not to miss the opportunity, if it happens to exist now.

To the contrary, I wish GNU readline didn't call setlocale().
Allowing non-ASCII identifiers may eventually happen, but there are
lots of reasons why it's a bad idea (such as source code portability),
and tying such a proposal to this PEP is definitely the wrong thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 18:29:08 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 13:29:08 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Mon, 15 Jul 2002 12:36:29 EDT."
 <3D32FA0D.6020200@stsci.edu>
References: <LNBBLJKPBEHFEDALKOLCAEEPAEAB.tim.one@comcast.net> <3D32DF36.5080906@stsci.edu> <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net> <3D32E791.40809@stsci.edu> <200207151534.g6FFYoE31790@pcp02138704pcs.reston01.va.comcast.net>
 <3D32FA0D.6020200@stsci.edu>
Message-ID: <200207151729.g6FHT8A07987@pcp02138704pcs.reston01.va.comcast.net>

> I meant we call  PyBuffer_FromReadWriteObject and the resulting buffer 
> lives longer than the extension function call that created it.   I have 
> heard that it is possible for the original object to "move" leaving the 
> buffer object pointer to it dangling.

Yes, that can happen (depending on what kind of object it is).

> Sure.  I'd be willing to poll comp.lang.python (python-list?) and 
> collate the results of any discussion that ensues.  Is that what you had 
> in mind?

Yes, but beware that you will have to decide which requirements make
sense and which ones don't -- the community is so large these days
that you can't get agreement any more. :-)

Feel free to come back with results to python-dev any time.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From xscottg@yahoo.com  Mon Jul 15 18:37:09 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 15 Jul 2002 10:37:09 -0700 (PDT)
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207151450.g6FEoW031433@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020715173709.20711.qmail@web40106.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
> 
> Maybe instead of the buffer() function/type, there should be a way to
> allocate raw memory?
> 

This is a part of my soon to be issued PEP.  I've looked at their memory
object, and Numarray is one of the use cases that I'm catering to.







From aleax@aleax.it  Mon Jul 15 18:40:11 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 19:40:11 +0200
Subject: [Python-Dev] Re: PEP 246, Object Adaptation (was Re: Single- vs. Multi-pass iterability)
In-Reply-To: <20020715123007.A45942@doublegemini.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17U8W4-000BwW-00@cauchy.clarkevans.com> <20020715123007.A45942@doublegemini.com>
Message-ID: <E17U9pX-0001Wh-00@mail.python.org>

On Monday 15 July 2002 06:30 pm, Clark C . Evans wrote:
	...
> If someone really insists on having check() exposed, then I
> don't see the harm... only that it makes the proposal seem more
> complicated than it is.

The harm of exposing check is encouraging the "look before you
leap" (LBYL) idiom:

    if allconditionsgreenformetodothis():
        dothis()
    else:
        print "oops, cant dothis"

rather than the generally more effective "it's easier to ask
forgiveness than permission" (EAFP) idiom:

    try:
        dothis()
    except DoingThisWasWrongError:
        print "oops, cant dothis"

With LBYL one more easily gets into duplication of work (the
effort of checking duplicates the effort of actually doing the
work) and multiprogramming issues (a check passes, but then
immediately afterwards the situation has changed...).
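
Both problems can be shown in a few lines of present-day Python 3.  This
is only a sketch: read_lbyl and read_eafp are made-up helpers, and the
"between" hook stands in for whatever another process or thread might do
after the check has passed.

```python
import os
import tempfile


def read_lbyl(path, between=lambda: None):
    if os.path.exists(path):        # look...
        between()                   # the world may change right here
        with open(path) as f:       # ...then leap
            return f.read()
    return None


def read_eafp(path):
    try:
        with open(path) as f:       # just do it
            return f.read()
    except FileNotFoundError:       # and ask forgiveness
        return None


fd, path = tempfile.mkstemp()
os.close(fd)

# The check passes, the file disappears, and the open fails anyway.
try:
    read_lbyl(path, between=lambda: os.remove(path))
    lbyl_outcome = "ok"
except FileNotFoundError:
    lbyl_outcome = "check passed, open failed"

# The EAFP version copes with the now-missing file without extra checks.
eafp_outcome = read_eafp(path)
```

The LBYL version still needs a try/except to be correct, at which point
the up-front check is pure duplication.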


> | > The key thing about the Object Adaptation proposal is that it
> | > leaves wide open what it means to comply.  This flexibility is
> |
> | Yes, but I see it as a minimum that a "compliant" object has
> | a set of methods callable with given signatures.  If a protocol is
> | represented by a type, the set should comprise the type's methods.
>
> Yes.  This would be an improvement to the proposal.  How do we
> express this so that the protocol of core Types can do this
> sort of enforcement?  Perhaps by giving the Protocol the ability
> to "veto" the final result?

Dunno.  I do plan to devote substantial concentrated effort to
rewriting the PEP, and that's incompatible with my current
situation wrt finishing the Nutshell.  Further delay should be
little problem given the time PEP 246 has already waited AND
the BDFL's indication that it's not going to get into 2.3 anyway,
so there's definitely no hurry.


Alex



From guido@python.org  Mon Jul 15 18:43:40 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 13:43:40 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Mon, 15 Jul 2002 10:37:09 PDT."
 <20020715173709.20711.qmail@web40106.mail.yahoo.com>
References: <20020715173709.20711.qmail@web40106.mail.yahoo.com>
Message-ID: <200207151743.g6FHheg08123@pcp02138704pcs.reston01.va.comcast.net>

> This is a part of my soon to be issued PEP.  I've looked at their memory
> object, and Numarray is one of the use cases that I'm catering to.

OK, then I guess Todd doesn't have to go to c.l.py for requirements.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From xscottg@yahoo.com  Mon Jul 15 19:00:48 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 15 Jul 2002 11:00:48 -0700 (PDT)
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207151743.g6FHheg08123@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020715180048.32619.qmail@web40103.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
> > This is a part of my soon to be issued PEP.  I've looked at their
> memory
> > object, and Numarray is one of the use cases that I'm catering to.
> 
> OK, then I guess Todd doesn't have to go to c.l.py for requirements.
> 

More information couldn't hurt too much, and since Todd Miller volunteered*
to herd the information, I'll be interested to see if any new perspectives
come out.


* - Actually it looked like you volunteered him, but he seemed willing
enough.  :-)







From aahz@pythoncraft.com  Mon Jul 15 19:15:55 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 14:15:55 -0400 (EDT)
Subject: [Python-Dev] OSCON: Community dinner Weds 7/24 6pm
Message-ID: <200207151815.g6FIFtQ09567@panix1.panix.com>

[posted to c.l.py with cc to c.l.py.announce and python-dev]

I'm proposing a Python community dinner at OSCON next week, for Weds
7/24 at 6pm.  Is there anyone familiar with the San Diego area who wants
to suggest a location near the Sheraton?  If I don't get any
recommendations, we'll probably just have the dinner at the Sheraton.

If you're interested, please send me an e-mail so I have some idea of
the number of people.  Also, please include a way of getting in touch
with you at OSCON in case plans change (phone numbers accepted, but
e-mail addresses preferred).

(There's a meeting for PSF members at 8pm, so some of us will likely
have to skip out early.)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/
-- 



From martin@v.loewis.de  Mon Jul 15 19:27:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jul 2002 20:27:03 +0200
Subject: [Python-Dev] PEP 11: unsupported platforms
In-Reply-To: <00b501c22bd2$77fbea80$ced241d5@hagrid>
References: <m3ele89dyj.fsf@mira.informatik.hu-berlin.de>
 <20020714212620.GA3192@cthulhu.gerg.ca>
 <m3u1n1ii95.fsf@mira.informatik.hu-berlin.de>
 <00b501c22bd2$77fbea80$ced241d5@hagrid>
Message-ID: <m3lm8czw20.fsf@mira.informatik.hu-berlin.de>

"Fredrik Lundh" <fredrik@pythonware.com> writes:

> wouldn't something like "no longer supported platforms" or
> "removing support for little used platforms" be more accurate?

Indeed. I'll go for "Removing support for little used platforms".

Regards,
Martin



From aleax@aleax.it  Mon Jul 15 20:04:59 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 21:04:59 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17UBA5-0006uZ-00@mail.python.org>

On Monday 15 July 2002 04:39 pm, Guido van Rossum wrote:
	...
> Maybe the xreadlines object could grow a flush() method that throws
> away its buffer, and f.seek() could call that if there's a cached
> xreadlines iterator?

Couldn't f.seek just decref the xreadlines object and put a NULL
into f's pointer to the xreadlines object?


Alex



From aleax@aleax.it  Mon Jul 15 20:12:54 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 21:12:54 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <15666.55797.351811.317428@anthem.wooz.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207150615.g6F6FJq28099@smtp.zope.com> <15666.55797.351811.317428@anthem.wooz.org>
Message-ID: <E17UBHE-0000E3-00@mail.python.org>

On Monday 15 July 2002 04:19 pm, Barry A. Warsaw wrote:
> >>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:
>
>     AM> The big question is rather: given that Isub inherits from
>     AM> Isuper, does any object implementing Isub also implicitly
>     AM> implement Isuper?
>
> There's another issue that Jim Fulton likes to bring up, IIRC.  If
> class Super implements IInterface, does class Sub(Super) also
> (automatically) implement IInterface?
>
> I could be totally misremembering, but I believe that Jim would say
> "no".  Class Sub would have to explicitly declare that it also
> implements IInterface.

I fully agree with Jim.  Inheritance is often the handiest way to
_implement_ some things, but not if it comes with a mandatory
contract that you have to respect (specifically, supplying some
interfaces because your superclasses supply them).

In C++, you distinguish by using private inheritance when you
are inheriting just to get implementation, public inheritance to
signify that you're also accepting the IS-A obligations (and then
it gets messy because private affects accessibility and not
visibility, but that's C++'s specific problem:-).

I _like_ to use inheritance of implementation exactly for that --
implementation purposes -- without mystical IS-A obligations.

It _may_ be because most of my experience is with COM, which
does not expose implementation inheritance and gives each
object full control on what interfaces it wants to supply -- behind
the scenes, the object's implementation is free to use inheritance,
delegation, or, as far as COM's concerned, bat wings and newt
blood -- that's the object's business.  But I do have enough
experience in bare (no-COM) C++ and Java to know that I
found the COM approach distinctly preferable (at least when the
tools offered easy ways to get typical behavior while still leaving
enough hooks and handles for me to get fine-grained control
when needed -- Microsoft's ATL library was quite good for that).


Alex



From pinard@iro.umontreal.ca  Mon Jul 15 20:18:34 2002
From: pinard@iro.umontreal.ca (François Pinard)
Date: 15 Jul 2002 15:18:34 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
 <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> To the contrary, I wish GNU readline didn't call setlocale().

So, that's why it works interactively, and not in batch, then!

> and tying such a proposal to this PEP is definitely the wrong thing.

Expected and understood.  I merely hope that this PEP's "concept" will
not later be quoted as rigid.  Going back from Unicode through ASCII may
be today's way for implementing PEP 263, but not necessarily the only one.

> Allowing non-ASCII identifiers may eventually happen, but there are
> lots of reasons why it's a bad idea (such as source code portability),

If Unicode letters eventually get accepted in Python identifiers, I do not
much see what portability problems it would create.  I mean, not more than
with generators, or any other Python feature.  It's forward compatible.

Unless you mean that by encouraging non-English writings, _this_ creates
a threat to portability, where English is the planetary computer language
for exchanges.  The point is that many Python programs, unlike
the Python distribution itself, are not aimed at the whole planet, and for
some communities or teams would be more useful and comfortable if not in English.

Each thing in its proper time, of course.  Let's just keep the doors open.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From aleax@aleax.it  Mon Jul 15 20:20:25 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 21:20:25 +0200
Subject: [Python-Dev] PEP 246 - Object Adaptation
In-Reply-To: <200207151404.g6FE4sa30738@pcp02138704pcs.reston01.va.comcast.net>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17Tz95-0003Cz-00@mail.python.org> <200207151404.g6FE4sa30738@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17UBOy-0004tE-00@mail.python.org>

On Monday 15 July 2002 04:04 pm, Guido van Rossum wrote:
> (Changing the subject)
>
> > The big question is rather: given that Isub inherits from Isuper,
> > does any object implementing Isub also implicitly implement Isuper?
>
> This probably shows my naivete more than anything else...
>
> I'd say "of course", based on an example where Isuper is
> FileOpenForReading and Isub is FileOpenForReadingAndWriting.
> It would be strange if a file open for reading and writing was not
> acceptable in a place where a file open for reading is accepted
> (because it implements all the right methods).  Or is the fact that it
> implements *more* the problem?

It often does look like a R/W container "implements more" than
the corresponding R/O container, but in many cases the R/W
"subclass" can guarantee fewer invariants -- and they're often
invariants quite hard to express even in languages that do
support contracts.  To see that in the case of a file, imagine
the file interface having a rewind method.  With a R/O file, I
know that:

def firstbyte(f):
    f.rewind()
    return f.read(1)

always returns the same byte for a given f.  If f is R/W, then I
can't be certain any more, which may change the caching
strategy I need to use, for example.

(Of course, the same uncertainty might be present if, while
f is R/O, the OS/whatever also allows other processes to
open the underlying file for R/W at the same time, so in the
case of files this only goes so far).
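
The lost invariant is easy to demonstrate with an in-memory file in
current Python 3.  This sketch substitutes seek(0) for the imagined
rewind() method and uses io.BytesIO as a stand-in for a R/W file.

```python
import io


def firstbyte(f):
    f.seek(0)                # stands in for the hypothetical f.rewind()
    return f.read(1)


f = io.BytesIO(b"spam")      # behaves like a file open for R/W
before = firstbyte(f)

f.seek(0)
f.write(b"S")                # something a R/O file would never allow

after = firstbyte(f)
```

A caller that cached "before" on the assumption that firstbyte(f) is
constant now holds a stale value, which is exactly the guarantee the R/W
variant cannot give.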

More generally, it's _nice_ to be able to use inheritance just
for implementation purposes, without necessarily having to
worry about IS-A.  When I have two interfaces with, say,
three methods in common, I can refactor those three methods
up to a common base-interface -- even if no object actually
deigns to supply that base-interface.  This simply avoids a
little nasty copy-and-paste coding -- not an earth-shaking
concern, admittedly.  But, still, nice.


Alex



From guido@python.org  Mon Jul 15 20:33:06 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 15:33:06 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Mon, 15 Jul 2002 21:12:54 +0200."
 <E17UBHE-0000E3-00@mail.python.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <200207150615.g6F6FJq28099@smtp.zope.com> <15666.55797.351811.317428@anthem.wooz.org>
 <E17UBHE-0000E3-00@mail.python.org>
Message-ID: <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net>

> > There's another issue that Jim Fulton likes to bring up, IIRC.  If
> > class Super implements IInterface, does class Sub(Super) also
> > (automatically) implement IInterface?
> >
> > I could be totally misremembering, but I believe that Jim would say
> > "no".  Class Sub would have to explicitly declare that it also
> > implements IInterface.
> 
> I fully agree with Jim.  Inheritance is often the handiest way to
> _implement_ some things, but not if it comes with a mandatory
> contract that you have to respect (specifically, supplying some
> interfaces because your superclasses supply them).

I'm happy to allow for a way to state explicitly that Sub doesn't
implement IInterface, despite deriving from Super which does.  But I
think it ought to inherit this property by default (this is in fact
what Zope does AFAIK).  Otherwise creating minor variations on a class
would be quite a pain -- you'd have to repeat all the interfaces
implemented by the base class; and what if a later version of Super
implements more interfaces?  I would think that it's much more common
to extend a class while maintaining its contract than to inherit for
implementation only, even though there are important examples of the
latter.
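
A rough sketch of that default, in Python 3 syntax: interfaces are
inherited unless a subclass explicitly opts out.  The attribute names
__implements__ and __excludes__ and the helper implemented_by are
assumptions made for this illustration; they are not meant to describe
what Zope or any interface PEP actually specifies.

```python
def implemented_by(cls):
    """Collect interfaces declared along the MRO, minus explicit opt-outs."""
    found = []
    excluded = set()
    for klass in cls.__mro__:                       # most derived class first
        excluded.update(vars(klass).get("__excludes__", ()))
        for iface in vars(klass).get("__implements__", ()):
            if iface not in excluded and iface not in found:
                found.append(iface)
    return found


class IReadable:
    pass


class IWritable:
    pass


class Super:
    __implements__ = (IReadable, IWritable)


class Sub(Super):                 # minor variation: nothing to repeat
    pass


class ReadOnlySub(Super):         # inherits implementation, drops a contract
    __excludes__ = (IWritable,)
```

With this default, Sub picks up any interface a later version of Super
adds, while ReadOnlySub has to name each interface it does not want.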

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 20:38:01 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 15:38:01 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: Your message of "Mon, 15 Jul 2002 15:18:34 EDT."
 <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
 <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>
Message-ID: <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>

> > and tying such a proposal to this PEP is definitely the wrong thing.
> 
> Expected and understood.  I merely hope that this PEP's "concept" will
> not later be quoted as rigid.  Going back from Unicode through ASCII may
> be today's way for implementing PEP 263, but not necessarily the only one.

Sure.

> > Allowing non-ASCII identifiers may eventually happen, but there are
> > lots of reasons why it's a bad idea (such as source code portability),
> 
> If Unicode letters eventually get accepted in Python identifiers, I do not
> much see what portability problems it would create.  I mean, not more than
> with generators, or any other Python feature.  It's forward compatible.

Well, for one, not everybody has an easy way to edit Unicode files.  I
expect I'd have to spend half a day downloading new stuff before I
could.

There's an issue with 8-bit encodings that is hopefully resolved by
the encoding cookie proposed by this PEP -- but we'll have to see how
well the PEP gets adopted.  Not only Python itself needs to recognize
these cookies -- also all tools that scan Python sources.
(E.g. pyclbr.py and tokenize.py in the standard library.)
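
As an illustration of what such a tool has to do, the sketch below uses
tokenize.detect_encoding() from current Python 3 (it did not exist when
this was written) to find the cookie before decoding the source.  The
sample source bytes are made up.

```python
import io
import tokenize

# A two-line source file in Latin-1 with a PEP 263 style cookie.
source = b"# -*- coding: latin-1 -*-\nname = '\xe9l\xe8ve'\n"

# detect_encoding() takes a readline callable over the raw bytes and
# returns the declared encoding plus the lines it had to read.
encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)

# Only now is it safe to turn the bytes into text.
text = source.decode(encoding)
```

Any source-scanning tool that skips this step and assumes a fixed
encoding will misread the string literal on the second line.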

> Unless you mean that by encouraging non-English writings, _this_ creates
> a threat to portability, where English is the planetary computer language
> for exchanges.  The point is that many Python programs, unlike
> the Python distribution itself, are not aimed at the whole planet, and for
> some communities or teams would be more useful and comfortable if not in English.

This is exactly what the Chinese are already doing.  I'm just worried
that sooner or later they'll write something that's useful outside
China.  I hope that English will remain the language for libraries
shared within the Python community at large.

> Each thing in its proper time, of course.  Let's just keep the doors open.

Always.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 20:47:26 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 15:47:26 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Mon, 15 Jul 2002 21:04:59 +0200."
 <E17UBA5-0006uZ-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>
 <E17UBA5-0006uZ-00@mail.python.org>
Message-ID: <200207151947.g6FJlQ310369@pcp02138704pcs.reston01.va.comcast.net>

> > Maybe the xreadlines object could grow a flush() method that throws
> > away its buffer, and f.seek() could call that if there's a cached
> > xreadlines iterator?
> 
> Couldn't f.seek just decref the xreadlines object and put a NULL
> into f's pointer to the xreadlines object?

Well, that wouldn't help for code that's hanging on to the iterator.

I also just realized that having the file object point to the
xreadlines object creates a cycle, since the xreadlines object already
points to the file.  And neither participates in GC.  I guess the
xreadlines object could drop the pointer to the file once it's raised
StopIteration, as a way to ensure that this is a sink state.  Or we
could add GC support to file objects and xreadlines objects (sigh).
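
The "drop the pointer once StopIteration is raised" idea, sketched in
pure Python 3 rather than in the C of the xreadlines module.
LineIterator is a made-up stand-in; it only illustrates how releasing
the file reference both breaks the cycle and makes exhaustion a sink
state.

```python
import io


class LineIterator:
    def __init__(self, f):
        self._f = f                  # the reference that closes the cycle

    def __iter__(self):
        return self

    def __next__(self):
        if self._f is None:          # sink state: stay exhausted forever
            raise StopIteration
        line = self._f.readline()
        if not line:
            self._f = None           # drop the file, breaking the cycle
            raise StopIteration
        return line


f = io.StringIO("a\nb\n")
it = LineIterator(f)
lines = list(it)

# Even if the file is rewound, the exhausted iterator does not revive.
f.seek(0)
revived = next(it, "still exhausted")
```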

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Mon Jul 15 20:49:11 2002
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 15 Jul 2002 21:49:11 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17UBHE-0000E3-00@mail.python.org> <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17UBqp-0000Ji-00@mail.python.org>

On Monday 15 July 2002 09:33 pm, Guido van Rossum wrote:
	...
> I'm happy to allow for a way to state explicitly that Sub doesn't
> implement IInterface, despite deriving from Super which does.  But I
> think it ought to inherit this property by default (this is in fact

Yes, such a default would probably be handier than the opposite one
in most cases.

> what Zope does AFAIK).  Otherwise creating minor variations on a class
> would be quite a pain -- you'd have to repeat all the interfaces
> implemented by the base class; and what if a later version of Super
> implements more interfaces?  

This is actually a difficult point.  If I have to explicitly state all the
interfaces of Super that I want to _exclude_, and Super adds some
more interfaces tomorrow, then it's quite possible that my class is
suddenly broken -- it doesn't guarantee the invariants that it says it
guarantees, any more -- and I don't even know about it.

At this point I'm thinking of my class as "a component", used by
client code for its interfaces and contracts.  Implementation
inheritance is iffy enough in the component-world -- if it carries
a baggage of exposing an a priori unknown set of interfaces
it becomes basically unfeasible.

> I would think that it's much more common
> to extend a class while maintaining its contract than to inherit for
> implementation only, even though there are important examples of the
> latter.

This is probably true.  But maybe the explicitness we want is not
per-interface: it's suddenly become an explicitness of "inheriting for
implementation" vs "inheriting to extend" (with IS-A), just like C++'s
private vs public inheritance (except, one hopes, done right -- i.e.
with effect on visibility, not on accessibility).


Alex



From martin@v.loewis.de  Mon Jul 15 20:50:33 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jul 2002 21:50:33 +0200
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
 <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
 <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>
 <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Well, for one, not everybody has an easy way to edit Unicode files.  I
> expect I'd have to spend half a day downloading new stuff before I
> could.

If the PEP is implemented, IDLE will be able to honor the encoding
declarations. As a side effect, this will allow you to edit UTF-8
files in IDLE.

Allowing arbitrary Unicode in identifiers is no challenge, either,
except that __dict__ dictionaries may suddenly find Unicode as keys.
I'm not sure what other implications this would have, so it definitely
is a separate issue.

Another issue with allowing Unicode is that a good definition of
"letter" must be given (it clearly should not depend on the
locale). The Unicode consortium gives guidelines, but those depend on
the Unicode version.

Regards,
Martin



From tim.one@comcast.net  Mon Jul 15 20:50:32 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 15:50:32 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207151612.g6FGCNb32176@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEONAEAB.tim.one@comcast.net>

[Guido]
> ...
> Given that even Tim didn't find this in the PEP upon his first two
> readings,

While agreeing that caution is prudent, this specific reason is a poor one:
I didn't read the PEP like "a user", but like a standards geek.  It never
occurred to me that a once-open issue would be resolved *only* in an
addendum without the resolution also being reflected back into the main
text.  So I read the main text carefully, but barely even noticed the
existence of the rest.  On a Bell curve, I expect that way of reading a PEP
is hugging a tail.

> ...
> I'd say that *if* there's a useful use of this, we shouldn't break
> that in the 2.2 branch.  2.3 is a different issue.

If Marc-Andre hasn't complained yet, there's no use at all for it <wink>.




From guido@python.org  Mon Jul 15 21:00:12 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 16:00:12 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: Your message of "Mon, 15 Jul 2002 21:50:33 +0200."
 <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
 <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net>

> If the PEP is implemented, IDLE will be able to honor the encoding
> declarations. As a side effect, this will allow you to edit UTF-8
> files in IDLE.

Who's gonna make the necessary changes to IDLE?

> Allowing arbitrary Unicode in identifiers is no challenge, either,
> except that __dict__ dictionaries may suddenly find Unicode as keys.
> I'm not sure what other implications this would have, so it definitely
> is a separate issue.

As long as the only use of 8-bit strings is to contain pure ASCII,
this shouldn't be a problem.

> Another issue with allowing Unicode is that a good definition of
> "letter" must be given (it clearly should not depend on the
> locale). The Unicode consortium gives guidelines, but those depend on
> the Unicode version.

I'd just use the isalpha() method of Unicode string objects.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Mon Jul 15 21:06:42 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 16:06:42 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Mon, 15 Jul 2002 15:50:32 EDT."
 <LNBBLJKPBEHFEDALKOLCAEONAEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEONAEAB.tim.one@comcast.net>
Message-ID: <200207152006.g6FK6gE10521@pcp02138704pcs.reston01.va.comcast.net>

> While agreeing that caution is prudent, this specific reason is a poor one:
> I didn't read the PEP like "a user", but like a standards geek.  It never
> occurred to me that a once-open issue would be resolved *only* in an
> addendum without the resolution also being reflected back into the main
> text.  So I read the main text carefully, but barely even noticed the
> existence of the rest.  On a Bell curve, I expect that way of reading a PEP
> is hugging a tail.

I guess the true meaning of the term "BDFL Pronouncement" hasn't quite
sunk in with you. :-)

> > ...
> > I'd say that *if* there's a useful use of this, we shouldn't break
> > that in the 2.2 branch.  2.3 is a different issue.
> 
> If Marc-Andre hasn't complained yet, there's no use at all for it <wink>.

OK, on to practicalities.

While preparing a patch, I discovered something strange: despite the
fact that listiter_next() never raises StopIteration when it returns
NULL, and despite the fact that it is used as the implementation for
the next() method, calling iter(list()).next() *does* raise
StopIteration, rather than a complaint about NULL without setting an
exception condition.  It took a brief debugging session to discover
that in the presence of a tp_iternext function, the type machinery
adds a next method that wraps tp_iternext.  Cute, though unexpected!
It means that the implementation of various iterators can be a little
simpler, because no next() implementation needs to be given.
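
A minimal sketch of the observation above, assuming a current CPython 3
interpreter, where the method is spelled __next__() rather than next().
The list iterator exposes a callable next-style method, and calling it
on an empty list raises StopIteration, as described:

```python
# Assumption: CPython 3, where the wrapped method is named __next__.
it = iter(list())

try:
    it.__next__()          # written it.next() in Python 2.2
except StopIteration:
    outcome = "StopIteration"
else:
    outcome = "no exception"

print(outcome)

# The attribute lives on the type and is provided by the type machinery;
# its exact repr is an implementation detail, so it is only printed here.
print(type(it).__next__)
```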

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Mon Jul 15 21:14:26 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jul 2002 22:14:26 +0200
Subject: [Python-Dev] Termination of two-arg iter()
References: <LNBBLJKPBEHFEDALKOLCAEONAEAB.tim.one@comcast.net> <200207152006.g6FK6gE10521@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D332D22.9020103@lemburg.com>

>>>I'd say that *if* there's a useful use of this, we shouldn't break
>>>that in the 2.2 branch.  2.3 is a different issue.
>>
>>If Marc-Andre hasn't complained yet, there's no use at all for it <wink>.

I'm not following this thread... perhaps that's why ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From martin@v.loewis.de  Mon Jul 15 21:32:08 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jul 2002 22:32:08 +0200
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
 <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
 <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>
 <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
 <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de>
 <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3ele4ybp3.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> > If the PEP is implemented, IDLE will be able to honor the encoding
> > declarations. As a side effect, this will allow you to edit UTF-8
> > files in IDLE.
> 
> Who's gonna make the necessary changes to IDLE?

I am. idlefork patch #508973 implements most of that, but doesn't
support UTF-8 signatures. It also doesn't give good diagnostics if the
user did not declare an encoding but uses non-ASCII.

> > Allowing arbitrary Unicode in identifiers is no challenge, either,
> > except that __dict__ dictionaries may suddenly find Unicode as keys.
> > I'm not sure what other implications this would have, so it definitely
> > is a separate issue.
> 
> As long as the only use of 8-bit strings is to contain pure ASCII,
> this shouldn't be a problem.

I thought we were talking about non-ASCII in identifiers.

> > Another issue with allowing Unicode is that a good definition of
> > "letter" must be given (it clearly should not depend on the
> > locale). The Unicode consortium gives guidelines, but those depend on
> > the Unicode version.
> 
> I'd just use the isalpha() method of Unicode string objects.

That might vary across platforms (which I consider a bug) and across
Python releases.

Regards,
Martin



From guido@python.org  Mon Jul 15 21:41:15 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 16:41:15 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: Your message of "Mon, 15 Jul 2002 22:32:08 +0200."
 <m3ele4ybp3.fsf@mira.informatik.hu-berlin.de>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de> <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net>
 <m3ele4ybp3.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net>

> > Who's gonna make the necessary changes to IDLE?
> 
> I am. idlefork patch #508973 implements most of that, but doesn't
> support UTF-8 signatures. It also doesn't give good diagnostics if the
> user did not declare an encoding but uses non-ASCII.

Cool.

> > > Allowing arbitrary Unicode in identifiers is no challenge, either,
> > > except that __dict__ dictionaries may suddenly find Unicode as keys.
> > > I'm not sure what other implications this would have, so it definitely
> > > is a separate issue.
> > 
> > As long as the only use of 8-bit strings is to contain pure ASCII,
> > this shouldn't be a problem.
> 
> I thought we were talking about non-ASCII in identifiers.

Yes, but all the non-ASCII has to be represented as Unicode strings.
I.e. no Latin-1 in 8-bit strings!

> > > Another issue with allowing Unicode is that a good definition of
> > > "letter" must be given (it clearly should not depend on the
> > > locale). The Unicode consortium gives guidelines, but those depend on
> > > the Unicode version.
> > 
> > I'd just use the isalpha() method of Unicode string objects.
> 
> That might vary across platforms (which I consider a bug) and across
> Python releases.

Really?  I thought Unicode's isalpha() was built on the Unicode text
database?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Mon Jul 15 22:42:21 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jul 2002 23:42:21 +0200
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net> <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca> <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net> <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca> <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net> <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de> <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net>              <m3ele4ybp3.fsf@mira.informatik.hu-berlin.de> <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D3341BD.30309@lemburg.com>

Guido van Rossum wrote:
>>>>Another issue with allowing Unicode is that a good definition of
>>>>"letter" must be given (it clearly should not depend on the
>>>>locale). The Unicode consortium gives guidelines, but those depend on
>>>>the Unicode version.
>>>
>>>I'd just use the isalpha() method of Unicode string objects.
>>
>>That might vary across platforms (which I consider a bug) and across
>>Python releases.
> 
> Really?  I thought Unicode's isalpha() was built on the Unicode text
> database?

It is, but on some platforms, the user can configure Python to use
the C lib's versions instead of the Python provided ones
(--with-ctype-functions).

Also note that the Unicode database in Python was created
from Unicode 3.0. Unicode 3.1 adds lots more characters and
also changed a few character properties.
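
To see which Unicode database a given interpreter carries, and what
isalpha() says about a few characters, something like the following can
be used.  This is only a sketch for current Python 3 (where every str is
a Unicode string); the sample characters are arbitrary choices, and the
reported database version will differ between Python releases:

```python
import unicodedata

# Version of the Unicode Character Database bundled with this interpreter.
ucd_version = unicodedata.unidata_version
print("Unicode database:", ucd_version)

# Arbitrary sample: ASCII letter, Latin-1 letter, Greek letter, digit, underscore.
samples = ["a", "\u00e9", "\u03bb", "1", "_"]
report = [(ch, ch.isalpha(), unicodedata.category(ch)) for ch in samples]

for ch, alpha, category in report:
    print("%r isalpha=%s category=%s" % (ch, alpha, category))
```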

I'd consider the case academic, though... I am not aware of any
editor which can display the full Unicode 3.1 character set.
The most complete font currently around seems to be the MS font
for Arial (both cover Unicode 2.0):

    http://www.unicode.org/unicode/onlinedat/products.html

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Mon Jul 15 22:47:28 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 17:47:28 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Mon, 15 Jul 2002 16:06:42 EDT."
Message-ID: <200207152147.g6FLlTQ12212@pcp02138704pcs.reston01.va.comcast.net>

I've placed a patch for this on SF: http://python.org/sf/581944 .

Comments please?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Mon Jul 15 22:16:52 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 15 Jul 2002 23:16:52 +0200
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
 <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
 <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>
 <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
 <m3r8i4ydme.fsf@mira.informatik.hu-berlin.de>
 <200207152000.g6FK0Ce10496@pcp02138704pcs.reston01.va.comcast.net>
 <m3ele4ybp3.fsf@mira.informatik.hu-berlin.de>
 <200207152041.g6FKfFV10971@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3k7nwadyz.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Yes, but all the non-ASCII has to be represented as Unicode strings.
> I.e. no Latin-1 in 8-bit strings!

Exactly. This might still cause problems for inspect and other
introspective tools.

For ASCII identifiers, I agree that using byte strings is sensible,
for best backwards compatibility.

> Really?  I thought Unicode's isalpha() was built on the Unicode text
> database?

It isn't if it has a "usable wchar_t", see unicodeobject.h:

#if defined(HAVE_USABLE_WCHAR_T) && defined(WANT_WCTYPE_FUNCTIONS)

#include <wctype.h>

#define Py_UNICODE_ISSPACE(ch) iswspace(ch)

...

I was missing the part that it also requires active selection of
wctype functions - that is probably a feature that is never used.  So
it is better than I thought: isletter might vary across builds on the
same platform, but likely never varies in practice.

Regards,
Martin



From pinard@iro.umontreal.ca  Mon Jul 15 23:33:37 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 15 Jul 2002 18:33:37 -0400
Subject: [Python-Dev] Re: PEP 263 - Defining Python Source Code Encodings
In-Reply-To: <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
References: <200207131918.g6DJIa718383@pcp02138704pcs.reston01.va.comcast.net>
 <oqvg7g6h4v.fsf@titan.progiciels-bpi.ca>
 <200207151732.g6FHWh708030@pcp02138704pcs.reston01.va.comcast.net>
 <oqlm8c6bqt.fsf@titan.progiciels-bpi.ca>
 <200207151938.g6FJc1010259@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oq8z4c62pq.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> This is exactly what the Chinese are already doing.  I'm just worried
> that sooner or later they'll write something that's useful outside China.
> I hope that English will remain the language for libraries shared within
> the Python community at large.

You know, people are already quite aware that if they want to contribute
to the whole community, English is their best bet.  If that is not evident
enough already, it may be put in writing within the Python style guidelines:
anything contributed to Python has to be documented and commented in English.
(On the other hand, the Python project might be kind enough to allow
various contributors to write their own names the way they like best.)

> Well, for one, not everybody has an easy way to edit Unicode files.

In practice, from your viewpoint, it is unlikely that you'll have much to
do with non-English and non-ASCII Python sources, if ever.  And if it
happens nevertheless, you are even in a position to request that modules be
translated before you look at them.  In many years on other projects,
I never saw this become a real problem in practice.

Of course, closed shops will take good care and make sure they have the
proper tools.  No need to protect them against themselves! :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From greg@cosc.canterbury.ac.nz  Mon Jul 15 23:48:28 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 16 Jul 2002 10:48:28 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207151415.g6FEFSP30815@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz>

Guido:
> Me:
> > If the file
> > object were to become an object obeying the iterator
> > protocol, its next() method should really return the
> > next *byte* of the file.
> 
> I don't think so.  We should pick the most convenient chunking for the
> default iterator

But we're talking here about making the file object
*be* an iterator itself, not just have a "default
iterator". If that's to happen, all the
other ways of iterating over a file ought to be
implemented on top of the basic iteration facility
provided by the file object -- lest we get unfortunate
interactions between the different iteration methods
a la xreadlines(). To me, this implies that the file
object must iterate by bytes.

I'm not necessarily advocating this, just following
the idea to its logical conclusion. If the conclusion
is distasteful, maybe that's a sign that the idea
(i.e. making file objects into iterators)
isn't so good in the first place.
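
A rough sketch of what that conclusion would look like, written for
current Python 3 and using io.BytesIO as a stand-in for a file.  The
helper names iter_bytes and iter_lines are made up for illustration;
nothing like them is being proposed as an API:

```python
import io

def iter_bytes(f):
    """Basic iteration: yield the contents one byte at a time."""
    while True:
        b = f.read(1)
        if not b:
            return
        yield b

def iter_lines(byte_iter):
    """Line iteration layered on top of any byte iterator."""
    line = bytearray()
    for b in byte_iter:
        line += b
        if b == b"\n":
            yield bytes(line)
            line.clear()
    if line:                      # final line without a trailing newline
        yield bytes(line)

f = io.BytesIO(b"one\ntwo\nthree")
byte_iter = iter_bytes(f)

first_byte = next(byte_iter)              # consume one byte directly
lines = list(iter_lines(byte_iter))       # then switch to lines
print(first_byte, lines)
```

Because both styles draw on the same byte iterator, switching between
them loses no data, which is the interaction problem the layering is
meant to avoid.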

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From aahz@pythoncraft.com  Tue Jul 16 00:06:19 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 19:06:19 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz>
References: <200207151415.g6FEFSP30815@pcp02138704pcs.reston01.va.comcast.net> <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz>
Message-ID: <20020715230619.GA12513@panix.com>

On Tue, Jul 16, 2002, Greg Ewing wrote:
> Guido:
>> Greg:
>>>
>>> If the file object were to become an object obeying the iterator
>>> protocol, its next() method should really return the next *byte* of
>>> the file.
>>
>> I don't think so.  We should pick the most convenient chunking for the
>> default iterator
> 
> But we're talking here about making the file object *be* an iterator
> itself, not just have a "default iterator". If that's to happen, all
> the other ways of iterating over a file ought to be implemented on top
> of the basic iteration facility provided by the file object -- lest we
> get unfortunate interactions between the different iteration methods a
> la xreadlines(). To me, this implies that the file object must iterate
> by bytes.

"Practicality beats purity"
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From ping@zesty.ca  Tue Jul 16 00:16:51 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 15 Jul 2002 16:16:51 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207121336.g6CDaMD07592@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.44.0207151604350.17524-100000@ziggy>

On Fri, 12 Jul 2002, Guido van Rossum wrote:
> I don't see what's wrong with the file object.  Iterating over a file
> changes the file's state, that's just a fact of life.

That's exactly the point.  Iterators and containers are different.
Walking over a container shouldn't mutate it, whereas an iterator
has mutable state independent of the container.

The key problem is that the file's __iter__ method returns something
whose state depends on the file, thus breaking this expectation.
Either __iter__ should be implemented to fulfill its commitment, or
there shouldn't be an __iter__ method on files at all.

I'm not suggesting that the semantics of files themselves are "broken"
or have a "wart" that needs to be fixed -- merely that we should decide
on a place for files to live in our world of containers and iterators,
so we can set and maintain consistent expectations.
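
The difference in expectations can be shown in a few lines.  This is a
sketch for current Python 3, with io.StringIO standing in for a real
file object:

```python
import io

# A container: walking it twice gives the same items both times.
numbers = [1, 2, 3]
first_pass = list(numbers)
second_pass = list(numbers)

# A file-like object: the second walk finds nothing left.
f = io.StringIO("a\nb\nc\n")
file_first = list(f)
file_second = list(f)

# In Python 3 the file-like object is its own iterator.
same_object = iter(f) is f

print(first_pass, second_pass)
print(file_first, file_second, same_object)
```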


-- ?!ng




From ping@zesty.ca  Tue Jul 16 00:16:54 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 15 Jul 2002 16:16:54 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207111616.g6BGGFB16385@europa.research.att.com>
Message-ID: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy>

On Thu, 11 Jul 2002, Andrew Koenig wrote:
> More seriously, I can imagine distinguishing a multiple iterator by
> the presence of __copy__, but I can't imagine using the presence of
> __copy__ to determine whether a *container* supports multiple
> iteration.  For example, there surely exist containers today that
> support __copy__ but whose __iter__ methods yield iterators that do
> not themselves support __copy__.

Just fetch the iterator from the container and look for __copy__ on that.

Or, what if there is no container to begin with, but the iterator is still
copyable?  You can't flag that by putting __multiter__ on anything; again
it makes more sense to just provide __copy__ on the iterator.

All that's really necessary here is to document the convention about what
__copy__ is supposed to mean if it's available on an iterator.  If we
all agree that __copy__ should preserve an independent copy of the
current state of the iterator, we're all set.
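
As a sketch of that convention in current Python 3: CountUp below is a
made-up iterator class whose __copy__ returns an independent iterator
positioned at the same point.  copy.copy() picks up __copy__ when the
class defines it:

```python
import copy

class CountUp:
    """Hypothetical copyable iterator over range(start, stop)."""

    def __init__(self, start, stop):
        self.current = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    def __copy__(self):
        # Independent copy of the current iteration state.
        return CountUp(self.current, self.stop)

it = CountUp(0, 5)
next(it)
next(it)
saved = copy.copy(it)      # snapshot after two items

rest = list(it)            # exhausts the original
replay = list(saved)       # the copy is unaffected
print(rest, replay)
```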

> Another reason is that I can imagine this idea extended to encompass,
> say, ambidextrous iterators that support prev() as well as next(),
> and I would want to use __ambiter__ as a marker for those rather
> than having to create an iterator and see if it has prev().

I think a proliferation of iterator-fetching methods would be a messy
and unpleasant prospect.  After __iter__, __multiter__, and __ambiter__,
what next?  __mutableiter__?  __depthfirstiter__?  __breadthfirstiter__?


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler





From ping@zesty.ca  Tue Jul 16 00:16:56 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 15 Jul 2002 16:16:56 -0700 (PDT)
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFOADAB.tim_one@email.msn.com>
Message-ID: <Pine.LNX.4.44.0207151521400.17524-100000@ziggy>

On Tue, 9 Jul 2002, Tim Peters wrote:
> > The context does not make it clear that the iterator's __iter__ method is
> > *only* required whenever one *also* wants to use an iterator as an
> > iterable.
>
> That's not how the iteration protocol is defined, and isn't how it should be
> defined either.  Requiring *some* method with a reserved name is an aid to
> introspection

This is a terrible reason for the existence of an __iter__ method, because
(a) it's a bad way to do type-checking and (b) it doesn't even work.

(a) If we followed this logic, we'd insist on having a useless __dict__
method on dictionaries, a useless __list__ method on lists, etc. etc.
just so we could check types by looking for these methods.

As i understood it, the Python way is to let the protocol speak for itself.
Something that wants to give out keys can implement the keys() method,
something that wants to act like a container can implement __getitem__,
and so on.  There's no need to make an additional declaration of dict-ness
by adding a dummy __dict__ method -- indeed, sometimes we don't *want* to
make that kind of commitment, and Python allows that flexibility.

It seems to me that dictionaries are to keys() as iterators are to next().

(b) Looking for __iter__ is not a valid test for iterator-ness.  Files
and other iterable objects supply __iter__, but they are not iterators.
So it doesn't work as a type test.

I agree with Oren that it makes more sense for iterator-fetching to be
a convenience handled by the implementations of "for" and "in", rather
than foisting the extra hassle of "def __iter__(self): return self" on
every individual iterator implementation.
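
For reference, this is how the requirement plays out in current
Python 3, where no such convenience was added.  BareIterator and
FullIterator are made-up classes for illustration only:

```python
class BareIterator:
    """Has a next method but no __iter__."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        item = self._items[self._pos]
        self._pos += 1
        return item

class FullIterator(BareIterator):
    def __iter__(self):          # the extra line under discussion
        return self

# Calling next() by hand works without __iter__ ...
bare = BareIterator("ab")
manual = [next(bare), next(bare)]

# ... but a for loop refuses the object.
try:
    for _ in BareIterator("ab"):
        pass
except TypeError:
    bare_in_for = "TypeError"
else:
    bare_in_for = "ok"

full = list(FullIterator("ab"))
print(manual, bare_in_for, full)
```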


-- ?!ng





From ark@research.att.com  Tue Jul 16 00:25:13 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 15 Jul 2002 19:25:13 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> (message from
 Ka-Ping Yee on Mon, 15 Jul 2002 16:16:54 -0700 (PDT))
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy>
Message-ID: <200207152325.g6FNPD128921@europa.research.att.com>

Ping> Just fetch the iterator from the container and look for __copy__ on that.

Yes, that's an alternative.

However the purpose of my suggestion of __multiter__ was not to use it to
test for multiple iteration, but to enable a container to be able to
yield either a single or a multiple iterator on request.

Ping> Or, what if there is no container to begin with, but the iterator is still
Ping> copyable?  You can't flag that by putting __multiter__ on anything; again
Ping> it makes more sense to just provide __copy__ on the iterator.

You could flag it by putting __multiter__ on the iterator, just as iterators
presently have __iter__.

Ping> All that's really necessary here is to document the convention about what
Ping> __copy__ is supposed to mean if it's available on an iterator.  If we
Ping> all agree that __copy__ should preserve an independent copy of the
Ping> current state of the iterator, we're all set.

Not quite.  We also need an agreement that calling __iter__ on a container
is not a destructive operation unless you call next() on the iterator that
you get back.

>> Another reason is that I can imagine this idea extended to encompass,
>> say, ambidextrous iterators that support prev() as well as next(),
>> and I would want to use __ambiter__ as a marker for those rather
>> than having to create an iterator and see if it has prev().

Ping> I think a proliferation of iterator-fetching methods would be a
Ping> messy and unpleasant prospect.  After __iter__, __multiter__,
Ping> and __ambiter__, what next?  __mutableiter__?
Ping> __depthfirstiter__?  __breadthfirstiter__?

A data structure that supports several different kinds of iteration
has to provide that support somehow.  What's your suggestion?



From barry@zope.com  Tue Jul 16 00:46:08 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 15 Jul 2002 19:46:08 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17UBHE-0000E3-00@mail.python.org>
 <200207151933.g6FJX6f10238@pcp02138704pcs.reston01.va.comcast.net>
 <E17UBqp-0000Ji-00@mail.python.org>
Message-ID: <15667.24256.621942.555107@anthem.wooz.org>

>>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:

    >> what Zope does AFAIK).  Otherwise creating minor variations on
    >> a class would be quite a pain -- you'd have to repeat all the
    >> interfaces implemented by the base class; and what if a later
    >> version of Super implements more interfaces?

    AM> This is actually a difficult point.  If I have to explicitly
    AM> state all the interfaces of Super that I want to _exclude_,
    AM> and Super adds some more interfaces tomorrow, then it's quite
    AM> possible that my class is suddenly broken -- it doesn't
    AM> guarantee the invariants that it says it guarantees, any more --
    AM> and I don't even know about it.

You'd need a way to explicitly state that you implement /none/ of the
interfaces of your superclass, and then explicitly add back the ones
you do implement.

-Barry



From guido@python.org  Tue Jul 16 00:48:02 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 19:48:02 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 16 Jul 2002 10:48:28 +1200."
 <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz>
References: <200207152248.g6FMmSf19013@oma.cosc.canterbury.ac.nz>
Message-ID: <200207152348.g6FNm8C12883@pcp02138704pcs.reston01.va.comcast.net>

> > I don't think so.  We should pick the most convenient chunking for the
> > default iterator
> 
> But we're talking here about making the file object
> *be* an iterator itself, not just have a "default
> iterator". If that's to happen, all the
> other ways of iterating over a file ought to be
> implemented on top of the basic iteration facility
> provided by the file object -- lest we get unfortunate
> interactions between the different iteration methods
> a la xreadlines(). To me, this implies that the file
> object must iterate by bytes.

Why should all iteration over a file be defined in terms of its basic
iteration?  I don't see that as dogma.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping@zesty.ca  Tue Jul 16 00:51:23 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 15 Jul 2002 16:51:23 -0700 (PDT)
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.44.0207151634200.17524-100000@ziggy>

On Sun, 14 Jul 2002, Guido van Rossum wrote:
> The question is, should we place the burden on iterator users to avoid
> calling next() after the first StopIteration, or should we place the
> burden on iterator implementations?  Since by far the most common
> iterator use case is still a single for loop, which already does the
> right thing, it's not at all clear to me which is worse.

As a general design philosophy question, my vote would be for
placing the burden on the implementations.  If code reuse is all
it's cracked up to be, you're going to use the iterator more times
than you implemented it.  Moreover, the more consistent the
implementation is, the more widely it can be used.  (Tim just said this.)

As for the specifics of the iterator protocol, there seem to be
two separate issues here:

1.  After StopIteration, should iterators be allowed to keep going?

2.  Should an empty iterator be distinguishable from an exhausted iterator?


For 1, i don't think i've seen anyone come down too strongly on
the "yes" side.  There have been a couple of examples as to why
this might be cute, but i don't think they are compelling.  My
opinion is that, if you are trying to make an iterator keep going
after it has stopped, it's just a way of abusing the iterator to
represent a sequence of sequences.

You can always get the behaviour you want by explicitly describing
both kinds of sequence.  Tim's example of getting paragraphs out
of a file demonstrates exactly why we don't want to encourage the
abuse of one iterator to represent a sequence of sequences: you're
going to be in trouble if you can't distinguish between the
termination conditions for the two kinds of sequences.
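
A small sketch of question 1 using two-arg iter(), as observed in
current CPython 3 (the behaviour in 2.2 is what this thread is
debating).  The data and the read_one helper are made up; 0 plays the
role of the sentinel:

```python
readings = iter([3, 1, 0, 7, 0, 9])

def read_one():
    """Hypothetical data source: hand out the next reading."""
    return next(readings)

it = iter(read_one, 0)       # stop when read_one() returns 0

first_run = list(it)         # runs up to the first sentinel

# read_one() could still produce more values, but the iterator that
# has already stopped does not start up again.
second_run = list(it)

leftover = next(readings)    # the underlying source was not consumed further
print(first_run, second_run, leftover)
```

Getting at the later values means asking for a new iter(read_one, 0),
which makes the "sequence of sequences" explicit instead of hiding it
inside one iterator.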


For 2, i believe Andrew and Oren want the answer to be "yes",
but Guido and Aahz want the answer to be "no".  I think the answer
should be "yes".  An exhausted iterator is not the same thing as
a freshly-created iterator on an empty sequence, and allowing one
to silently pass for the other is going to lead to problems.

I'm not going to insist that IndexError should be the effect, as
Guido's preference to keep IndexError for randomly-indexable
sequences seems reasonable; anything distinguishable from
StopIteration is fine.
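
One way a "yes" answer to question 2 could look, sketched in current
Python 3.  StrictIterator and ExhaustedError are hypothetical names
invented here; nothing like them exists in the standard library:

```python
class ExhaustedError(Exception):
    """Hypothetical: next() was called after StopIteration was raised."""

class StrictIterator:
    """Wrap an iterable; complain if used again after exhaustion."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise ExhaustedError("iterator already exhausted")
        try:
            return next(self._it)
        except StopIteration:
            self._done = True    # remember that we have stopped once
            raise

empty = StrictIterator([])
fresh_result = list(empty)       # a fresh iterator on an empty sequence

try:
    next(empty)                  # the same object, now exhausted
except ExhaustedError:
    reuse_result = "ExhaustedError"
else:
    reuse_result = "no error"

print(fresh_result, reuse_result)
```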


-- ?!ng




From guido@python.org  Tue Jul 16 00:53:52 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jul 2002 19:53:52 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Mon, 15 Jul 2002 16:16:51 PDT."
 <Pine.LNX.4.44.0207151604350.17524-100000@ziggy>
References: <Pine.LNX.4.44.0207151604350.17524-100000@ziggy>
Message-ID: <200207152353.g6FNrqm13200@pcp02138704pcs.reston01.va.comcast.net>

> On Fri, 12 Jul 2002, Guido van Rossum wrote:
> > I don't see what's wrong with the file object.  Iterating over a file
> > changes the file's state, that's just a fact of life.

[Ping]
> That's exactly the point.  Iterators and containers are different.
> Walking over a container shouldn't mutate it, whereas an iterator
> has mutable state independent of the container.
> 
> The key problem is that the file's __iter__ method returns something
> whose state depends on the file, thus breaking this expectation.
> Either __iter__ should be implemented to fulfill its commitment, or
> there shouldn't be an __iter__ method on files at all.

What commitment?

Iterators don't have to have an underlying container!  (E.g. generators.)

> I'm not suggesting that the semantics of files themselves are "broken"
> or have a "wart" that needs to be fixed -- merely that we should decide
> on a place for files to live in our world of containers and iterators,
> so we can set and maintain consistent expectations.

What are your expectations?  I think that both file.__iter__()
returning file (as it does with Oren's patch) or file.__iter__()
returning an xreadlines object (as it still does in CVS) are fine as
far as reasonable expectations for iterators go.
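
A tiny illustration of the generator case, as it behaves in current
Python 3 (squares is just a made-up example function):

```python
def squares(limit):
    """Generate n*n for n below limit; no container is ever built."""
    n = 0
    while n < limit:
        yield n * n
        n += 1

gen = squares(4)

is_own_iterator = iter(gen) is gen   # __iter__ returns the generator itself
first = list(gen)                    # consumes the generator
again = list(gen)                    # nothing left to iterate over

print(is_own_iterator, first, again)
```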

--Guido van Rossum (home page: http://www.python.org/~guido/)



From david.abrahams@rcn.com  Tue Jul 16 01:25:38 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Mon, 15 Jul 2002 20:25:38 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com>
Message-ID: <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com>

From: "Andrew Koenig" <ark@research.att.com>


> Ping> Just fetch the iterator from the container and look for __copy__ on
> Ping> that.
>
> Yes, that's an alternative.
>
> However the purpose of my suggestion of __multiter__ was not to use it to
> test for multiple iteration, but to enable a container to be able to
> yield either a single or a multiple iterator on request.

Why would you want that? Seems like a corner case at best.

> A data structure that supports several different kinds of iteration
> has to provide that support somehow.  What's your suggestion?

class DataStructure(object):
    def __init__(self):
        self._numbers = range(10);
        self._names = [ str(x) for x in range(10) ];

    names = property(lambda self: iter(self._names))
    numbers = property(lambda self: iter(self._numbers))

x = DataStructure();
for y in x.names:
    print repr(y),

print

for y in x.numbers:
    print repr(y),

[Y'know, Python is great. That worked the first time I ran it.]

-Dave







From ark@research.att.com  Tue Jul 16 01:32:36 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 15 Jul 2002 20:32:36 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com>
 (david.abrahams@rcn.com)
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com>
Message-ID: <200207160032.g6G0Wac29265@europa.research.att.com>

>> However the purpose my suggestion of __multiter__ was not to use it to
>> test for multiple iteration, but to enable a container to be able to
>> yield either a single or a multiple iterator on request.

David> Why would you want that? Seems like a corner case at best.

You're right -- I wasn't thinking clearly.

What I meant to say was that I would like a program that expects
to be able to use a multiple iterator to be able to say so simply
and efficiently in code.  For example:

    for i in multiter(x):
        pass  # whatever

I would like this to fail cleanly if x does not support multiple
iterators.
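
One rough way such a multiter() could behave, sketched in Python 3
syntax.  This is not the __multiter__ protocol discussed in this thread:
it relies on a heuristic assumption that an object which is its own
iterator cannot be restarted, and treats that as "single-pass".

```python
def multiter(x):
    """Return an iterator over x, or raise TypeError if x looks single-pass.

    Heuristic (an assumption, not a protocol from this thread): an object
    that is its own iterator cannot be restarted, so it is treated as
    single-pass.
    """
    it = iter(x)
    if it is x:
        raise TypeError("%s object does not support multiple iteration"
                        % type(x).__name__)
    return it


data = [1, 2, 3]
total = sum(multiter(data)) + sum(multiter(data))   # two independent passes

gen = (n for n in data)
try:
    multiter(gen)
    failed_cleanly = False
except TypeError:
    failed_cleanly = True
```

The check costs one iter() call, so it is only reasonable when creating
an iterator is itself non-destructive.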

>> A data structure that supports several different kinds of iteration
>> has to provide that support somehow.  What's your suggestion?

David> class DataStructure(object):
David>     def __init__(self):
David>         self._numbers = range(10);
David>         self._names = [ str(x) for x in range(10) ];

David>     names = property(lambda self: iter(self._names))
David>     numbers = property(lambda self: iter(self._numbers))

David> x = DataStructure();
David> for y in x.names:
David>     print repr(y),

David> print

David> for y in x.numbers:
David>     print repr(y),

David> [Y'know, Python is great. That worked the first time I ran it.]

I don't understand how this code answers my question.
You've asked for iterators over two different data structures.
What I was asking was, for example, how one might arrange for a single
tree to yield either a depth-first or breadth-first iterator.



From David Abrahams" <david.abrahams@rcn.com  Tue Jul 16 01:39:22 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 15 Jul 2002 20:39:22 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com>
Message-ID: <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com>

From: "Andrew Koenig" <ark@research.att.com>

> >> A data structure that supports several different kinds of iteration
> >> has to provide that support somehow.  What's your suggestion?
> 
> David> class DataStructure(object):
> David>     def __init__(self):
> David>         self._numbers = range(10);
> David>         self._names = [ str(x) for x in range(10) ];
> 
> David>     names = property(lambda self: iter(self._names))
> David>     numbers = property(lambda self: iter(self._numbers))
> 
> David> x = DataStructure();
> David> for y in x.names:
> David>     print repr(y),
> 
> David> print
> 
> David> for y in x.numbers:
> David>     print repr(y),
> 
> David> [Y'know, Python is great. That worked the first time I ran it.]
> 
> I don't understand how this code answers my question.
> You've asked for iterators over two different data structures.
> What I was asking was, for example, how one might arrange for a single
> tree to yield either a depth-first or breadth-first iterator.

Just replace 'names' by breadth_first and 'numbers' by depth_first.
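
Spelled out as a sketch (Python 3 syntax; the Tree class and its
attribute names are hypothetical, made up for illustration), the same
property trick gives one tree two traversal views:

```python
from collections import deque


class Tree(object):
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    def _depth_first(self):
        # Pre-order: a node, then each subtree in turn.
        yield self.value
        for child in self.children:
            for v in child._depth_first():
                yield v

    def _breadth_first(self):
        # Level by level, using a FIFO queue of pending nodes.
        queue = deque([self])
        while queue:
            node = queue.popleft()
            yield node.value
            queue.extend(node.children)

    # Each attribute access hands back a fresh iterator over the same tree.
    depth_first = property(_depth_first)
    breadth_first = property(_breadth_first)


t = Tree(1, [Tree(2, [Tree(4)]), Tree(3)])
dfs = list(t.depth_first)      # [1, 2, 4, 3]
bfs = list(t.breadth_first)    # [1, 2, 3, 4]
```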

or-vice-versa-ly y'rs,
dave




From aahz@pythoncraft.com  Tue Jul 16 01:46:36 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 20:46:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <Pine.LNX.4.44.0207151634200.17524-100000@ziggy>
References: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net> <Pine.LNX.4.44.0207151634200.17524-100000@ziggy>
Message-ID: <20020716004635.GA2384@panix.com>

On Mon, Jul 15, 2002, Ka-Ping Yee wrote:
>
> 2.  Should an empty iterator be distinguishable from an exhausted iterator?
> 
> For 2, i believe Andrew and Oren want the answer to be "yes",
> but Guido and Aahz want the answer to be "no".  I think the answer
> should be "yes".  An exhausted iterator is not the same thing as
> a freshly-created iterator on an empty sequence, and allowing one
> to silently pass for the other is going to lead to problems.

I don't think I expressed an opinion on this, and if you think I did,
either you misunderstood me or I misunderstood what I was expounding on.
I also think that's the wrong question, given the nature of iterators;
before you can ask that question, you need to demonstrate that there is
in fact a difference between an empty iterator and an exhausted iterator.
I think that you can't demonstrate that, but I'm certainly willing to be
convinced.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tdelaney@avaya.com  Tue Jul 16 02:10:16 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Tue, 16 Jul 2002 11:10:16 +1000
Subject: [Python-Dev] Termination of two-arg iter()
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A42C@natasha.auslabs.avaya.com>

> From: Aahz [mailto:aahz@pythoncraft.com]
> On Mon, Jul 15, 2002, Ka-Ping Yee wrote:
> >
> I also think that's the wrong question, given the nature of iterators;
> before you can ask that question, you need to demonstrate 
> that there is
> in fact a difference between an empty iterator and an 
> exhausted iterator.
> I think that you can't demonstrate that, but I'm certainly 
> willing to be
> convinced.

I think the definition that some people are using is:

An exhausted iterator is one for which StopIteration has already been
raised.

An empty iterator OTOH is one which will raise StopIteration the next time
next() is called. An iterator for an empty list is the simplest example of
this, although it should be applied to any iterator.

FWIW I think the "best" behaviour for iterators is that once an iterator
begins raising StopIteration it must continue to do so under any
circumstances.  Given that, I don't see a lot of point in distinguishing
between the two above cases.

One way this could be enforced (and the burden removed from iterator
writers) is to have iter() always return a wrapper around an iterator:

class EnforcementIterator:

    __slots__ = ('iterator', 'exhausted',)

    def __init__(self, iterator):
        self.iterator = iterator
        self.exhausted = False

    # getattr, setattr, delattr delegate to self.iterator

    def __iter__(self):
        return self

    def next (self):

        if self.exhausted:
            raise StopIteration()

        try:
            return self.iterator.next()
        except StopIteration:
            self.exhausted = True
            raise

def iter (iterable):
    # testing for type - optimisation ;)
    if isinstance(iterable, EnforcementIterator):
        return iterable
    else:
        return EnforcementIterator(iterable.__iter__())
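
For comparison, a runnable sketch of the same wrapper in present-day
Python 3 spelling, where the method is __next__ and the next() builtin
exists.  The attribute delegation is left out, and Flaky is a made-up,
deliberately badly-behaved iterator used only to show the difference.

```python
class EnforcementIterator:
    """Wrap an iterator so StopIteration, once raised, stays raised."""

    def __init__(self, iterator):
        self.iterator = iterator
        self.exhausted = False

    def __iter__(self):
        return self

    def __next__(self):
        if self.exhausted:
            raise StopIteration
        try:
            return next(self.iterator)
        except StopIteration:
            self.exhausted = True
            raise


class Flaky:
    """Hypothetical badly-behaved iterator: stops, then resumes."""

    def __init__(self):
        self.calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.calls += 1
        if self.calls == 2:
            raise StopIteration
        return self.calls


raw = Flaky()
raw_results = [next(raw, "stop") for _ in range(3)]          # [1, "stop", 3]

wrapped = EnforcementIterator(Flaky())
wrapped_results = [next(wrapped, "stop") for _ in range(3)]  # [1, "stop", "stop"]
```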

Tim Delaney



From tim.one@comcast.net  Tue Jul 16 02:10:31 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 21:10:31 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <Pine.LNX.4.44.0207151634200.17524-100000@ziggy>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEAIAFAB.tim.one@comcast.net>

[Ping]
> As a general design philosophy question, my vote would be for
> placing the burden on the implementations.  If code reuse is all
> it's cracked up to be, you're going to use the iterator more times
> than you implemented it.  Moreover, the more consistent the
> implementation is, the more widely it can be used.  (Tim just said this.)

OTOH, the less the protocol defines, the more open it is to unforeseen uses.
Tim just said that too <wink>.

> As for the specifics of the iterator protocol, there seem to be
> two separate issues here:
>
> 1.  After StopIteration, should iterators be allowed to keep going?
>
> 2.  Should an empty iterator be distinguishable from an exhausted
> iterator?
>
> For 1, i don't think i've seen anyone come down too strongly on
> the "yes" side.  There have been a couple of examples as to why
> this might be cute, but i don't think they are compelling.

I haven't seen an example of why it might be useful, although I could have made
some up, and have been pleasantly surprised all along that nobody else made
one up either <wink>.  We saw a few examples illustrating that StopIteration
is in fact not sticky today, but nobody claimed such uses "were features".
Jeff Epler made one up to get clarification, and I showed a dict iter
example that demonstrated how unpredictable it can get now.

> My opinion is that, if you are trying to make an iterator keep going
> after it has stopped, it's just a way of abusing the iterator to
> represent a sequence of sequences.
>
> You can always get the behaviour you want by explicitly describing
> both kinds of sequence.  Tim's example of getting paragraphs out
> of a file demonstrates exactly why we don't want to encourage the
> abuse of one iterator to represent a sequence of sequences: you're
> going to be in trouble if you can't distinguish between the
> termination conditions for the two kinds of sequences.

That example relied on StopIteration being sticky (which it already happens
to be for the specific iter(file.readline, "") case), not on iteration doing
"something useful" after StopIteration had been raised.  A sequence is
either empty, or an element followed by a sequence.  Sticky StopIteration
makes the "empty" case at the end reliably empty, and, I think, for much the
same reason Python has always kept returning "" from file.read() after it
reaches EOF.  There's simply nothing erroneous about reaching the end of a
sequence, or about probing it again to determine emptiness instead of
carrying around fiddly flags in parallel.
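
A small sketch of that "probe again at the end" behaviour, in Python 3
syntax.  io.StringIO is an assumed stand-in for a real file; the point is
only that both the two-arg iter() form and read() keep answering
"nothing left" once the end has been reached.

```python
import io

f = io.StringIO("first\nsecond\n")   # stand-in for a real file (assumption)

lines = iter(f.readline, "")         # two-arg form: stop when readline returns ""
seen = list(lines)

# Probing again after the end keeps reporting "nothing left".
again = next(lines, None)
eof_reads = [f.read(), f.read()]
```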

> For 2, i believe Andrew and Oren want the answer to be "yes",
> but Guido and Aahz want the answer to be "no".  I think the answer
> should be "yes".  An exhausted iterator is not the same thing as
> a freshly-created iterator on an empty sequence, and allowing one
> to silently pass for the other is going to lead to problems.

I'm on the "no" side there -- an empty sequence is no more error-prone than
that range(10, 10) returns an empty list, or string[i:i] an empty string, or
that file("some_empty_file").read() returns an empty string.  An
iterator-based algorithm works on some prefix of the elements "from here
until the end":  an exhausted sequence and an empty sequence are indeed
indistinguishable from that view.  Indeed, I'm having a hard time imagining
*wanting* to distinguish the two.
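
To illustrate with a sketch (Python 3 syntax; consume() is a hypothetical
stand-in for any iterator-based algorithm), a consumer that works "from
here until the end" gets the same answer from both kinds of iterator:

```python
def consume(it):
    """Process whatever remains in `it`, from here until the end."""
    return [x * 2 for x in it]


empty = iter([])

exhausted = iter([1, 2, 3])
for _ in exhausted:      # run it to the end
    pass

from_empty = consume(empty)
from_exhausted = consume(exhausted)

# The same "nothing here" results as the non-iterator cases listed above:
empty_range = list(range(10, 10))
empty_slice = "string"[3:3]
```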

> I'm not going to insist that IndexError should be the effect, as
> Guido's preference to keep IndexError for randomly-indexable
> sequences seems reasonable; anything distinguishable from
> StopIteration is fine.

OK, if we have to do this, let's call it StopIteration2 and make it a
subclass of StopIteration so my code won't have to know it exists <wink>.




From aahz@pythoncraft.com  Tue Jul 16 02:39:42 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 21:39:42 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A42C@natasha.auslabs.avaya.com>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A42C@natasha.auslabs.avaya.com>
Message-ID: <20020716013942.GA11513@panix.com>

On Tue, Jul 16, 2002, Delaney, Timothy wrote:
> From: Aahz [mailto:aahz@pythoncraft.com]
>> 
>> I also think that's the wrong question, given the nature of
>> iterators; before you can ask that question, you need to demonstrate
>> that there is in fact a difference between an empty iterator and an
>> exhausted iterator.  I think that you can't demonstrate that, but I'm
>> certainly willing to be convinced.
>
> I think the definition that some people are using is:
> 
> An exhausted iterator is one for which StopIteration has already been
> raised.
> 
> An empty iterator OTOH is one which will raise StopIteration the next time
> next() is called. An iterator for an empty list is the simplest example of
> this, although it should be applied to any iterator.

In order to draw this distinction, you have to change the definition of
"iterator" that we've been using.  The sole protocol of iterator to date
has been the existence of a next() method that either returns an item or
raises StopIteration.  Making statements about what an iterator *will*
do counts as abuse IMO.  If you want a feature like that, go use
something else -- don't break the simplicity of iterators.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tdelaney@avaya.com  Tue Jul 16 02:47:57 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Tue, 16 Jul 2002 11:47:57 +1000
Subject: [Python-Dev] Termination of two-arg iter()
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A42E@natasha.auslabs.avaya.com>

> From: Aahz [mailto:aahz@pythoncraft.com]
> 
> On Tue, Jul 16, 2002, Delaney, Timothy wrote:
> > From: Aahz [mailto:aahz@pythoncraft.com]
> >
> > I think the definition that some people are using is:
      ^^^^^
> > 
> > An exhausted iterator is one for which StopIteration has 
> > 
> > An empty iterator OTOH is one which will raise 
> 
> In order to draw this distinction, you have to change the 
> definition of
> "iterator" that we've been using.  The sole protocol of 
> iterator to date
> has been the existence of a next() method that either returns 
> an item or
> raises StopIteration.  Making statements about what an iterator *will*
> do counts as abuse IMO.  If you want a feature like that, go use
> something else -- don't break the simplicity of iterators.

Aahz - you did read the next paragraph didn't you.

"... I don't see a lot of point in distinguishing
between the two above cases."

I'm *against* distinguishing between the two - I do not want a feature like
that.

Tim Delaney



From ark@research.att.com  Tue Jul 16 03:05:33 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 15 Jul 2002 22:05:33 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com>
 (david.abrahams@rcn.com)
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com>
Message-ID: <200207160205.g6G25XC29723@europa.research.att.com>

David> Just replace 'names' by breadth_first and 'numbers' by depth_first.

David> or-vice-versa-ly y'rs,

which doesn't address the question of a uniform convention.






From aahz@pythoncraft.com  Tue Jul 16 03:06:12 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 15 Jul 2002 22:06:12 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A42E@natasha.auslabs.avaya.com>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A42E@natasha.auslabs.avaya.com>
Message-ID: <20020716020612.GA16827@panix.com>

On Tue, Jul 16, 2002, Delaney, Timothy wrote:
> From: Aahz [mailto:aahz@pythoncraft.com]
>> On Tue, Jul 16, 2002, Delaney, Timothy wrote:
>>> 
>>> I think the definition that some people are using is:
>       ^^^^^
>>> 
>>> An exhausted iterator is one for which StopIteration has 
>>> 
>>> An empty iterator OTOH is one which will raise 
>> 
>> In order to draw this distinction, you have to change the definition
>> of "iterator" that we've been using.  The sole protocol of iterator
>> to date has been the existence of a next() method that either returns
>> an item or raises StopIteration.  Making statements about what an
>> iterator *will* do counts as abuse IMO.  If you want a feature
>> like that, go use something else -- don't break the simplicity of
>> iterators.
>
> Aahz - you did read the next paragraph didn't you.
>
> "... I don't see a lot of point in distinguishing between the two
> above cases."

Sorry for being unclear; that was the generic "you", not pointing at you
(Tim) specifically.  s/you/one/
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tim.one@comcast.net  Tue Jul 16 03:25:08 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 15 Jul 2002 22:25:08 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAMAFAB.tim.one@comcast.net>

[David Abrahams]
> [Y'know, Python is great. That worked the first time I ran it.]

Oops!  Please file a bug report on SourceForge <wink>.



From David Abrahams" <david.abrahams@rcn.com  Tue Jul 16 04:02:52 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 15 Jul 2002 23:02:52 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> <200207160205.g6G25XC29723@europa.research.att.com>
Message-ID: <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com>

From: "Andrew Koenig" <ark@research.att.com>


> David> Just replace 'names' by breadth_first and 'numbers' by
> David> depth_first.
>
> David> or-vice-versa-ly y'rs,
>
> which doesn't address the question of a uniform convention.

I'm with you on the desire to have a way to get a
multipass-iterator-or-error in one swell foop. That says "I want to iterate
over this thing without changing it". I still think that hasattr(iter(x),
'__copy__') is a pretty clean way to do that, despite the fact that it
potentially creates an iterator (which some people apparently view as too
heavyweight as an introspection step).
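
A sketch of that check in Python 3 syntax.  Countdown is a hypothetical
iterator invented for the example, and is_multipass() is an assumed
helper name; nothing here claims that any particular built-in iterator
defines __copy__.

```python
import copy


class Countdown:
    """Hypothetical iterator that advertises multi-pass support via __copy__."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

    def __copy__(self):
        # A copy continues from the current position, independently.
        return Countdown(self.n)


def is_multipass(x):
    return hasattr(iter(x), "__copy__")


it = Countdown(3)
next(it)                      # 3
saved = copy.copy(it)
first_pass = list(it)         # [2, 1]
second_pass = list(saved)     # [2, 1]

generator_ok = is_multipass(n for n in range(3))   # generators lack __copy__
```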

However, I don't see any point in trying to define a protocol for every
different possible iteration view of a thing. Dicts have keys, values, and
items. Trees have breadth-first, depth-first, inorder, postorder, blah,
blah, blah. There are just too many of these, and they're all different.

-Dave





From ark@research.att.com  Tue Jul 16 04:22:46 2002
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 15 Jul 2002 23:22:46 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com>
 (david.abrahams@rcn.com)
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> <200207160205.g6G25XC29723@europa.research.att.com> <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com>
Message-ID: <200207160322.g6G3Mkh00041@europa.research.att.com>

David> I'm with you on the desire to have a way to get a
David> multipass-iterator-or-error in one swell foop. That says "I
David> want to iterate over this thing without changing it". I still
David> think that hasattr(iter(x), '__copy__') is a pretty clean way
David> to do that, despite the fact that it potentially creates an
David> iterator (which some people apparently view as too heavyweight
David> as an introspection step).

In particular, creating an iterator had better not be a destructive operation.

David> However, I don't see any point in trying to define a protocol
David> for every different possible iteration view of a thing. Dicts
David> have keys, values, and items. Trees have breadth-first,
David> depth-first, inorder, postorder, blah, blah, blah. There are
David> just too many of these, and they're all different.

I wasn't suggesting defining a protocol for every possible iteration
view.  I was raising the question of whether multi-pass iteration
was likely to be a common enough operation that it is worth defining
a protocol for it, while leaving the door open to defining protocols
for others should it turn out to be desirable to do so.







From David Abrahams" <david.abrahams@rcn.com  Tue Jul 16 04:30:07 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Mon, 15 Jul 2002 23:30:07 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207151532410.17524-100000@ziggy> <200207152325.g6FNPD128921@europa.research.att.com> <243301c22c5f$50f7fd60$6601a8c0@boostconsulting.com> <200207160032.g6G0Wac29265@europa.research.att.com> <247901c22c61$3c8c9050$6601a8c0@boostconsulting.com> <200207160205.g6G25XC29723@europa.research.att.com> <251d01c22c75$581bfc20$6601a8c0@boostconsulting.com> <200207160322.g6G3Mkh00041@europa.research.att.com>
Message-ID: <254001c22c79$155d82b0$6601a8c0@boostconsulting.com>

From: "Andrew Koenig" <ark@research.att.com>

> I wasn't suggesting defining a protocol for every possible iteration
> view.  I was raising the question of whether multi-pass iteration
> was likely to be a common enough operation that it is worth defining
> a protocol for it, while leaving the door open to defining protocols
> for others should it turn out to be desirable to do so.

I think your examples are confusing different beasts, then. Multipass
(=copyable in these examples) should be a capability of iterators, just as
bidirectional or random-access would be. Breadth-first/depth-first is not a
capability of the iterator in that sense, but an implementation detail --
from the POV of the iterator's user, there's no way to tell what the
traversal order is.

-Dave






From oren-py-d@hishome.net  Tue Jul 16 06:25:03 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 16 Jul 2002 08:25:03 +0300
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 15, 2002 at 10:39:51AM -0400
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <E17SbpT-0002yd-00@mail.python.org> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net> <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020716082503.A20992@hishome.net>

On Mon, Jul 15, 2002 at 10:39:51AM -0400, Guido van Rossum wrote:
> > http://www.python.org/sf/580331
> > 
> > No, it's not a complete rewrite of file buffering.  This patch
> > implements Just's idea of xreadlines caching in the file object.  It
> > also makes a file into an iterator: __iter__ returns self and next
> > calls the next method of the cached xreadlines object.
> 
> Hm.  What happens to the xreadlines object when you do a seek() on the
> file?
> With the old semantics, you could do f.seek(0) and get another
> iterator (assuming it's a seekable file of course).  With the new
> semantics, the cached iterator keeps getting in the way.

On the new version of patch #580331 the cache is invalidated on a seek. 

> Maybe the xreadlines object could grow a flush() method that throws
> away its buffer, and f.seek() could call that if there's a cached
> xreadlines iterator?

The behavior of an xreadlines object is already undefined after a seek on 
the file.  This patch doesn't try to fix that.  The invalidation makes sure 
that the next iter() call will produce a fresh xreadlines, though.
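
The user-visible effect aimed for can be sketched as follows (Python 3
syntax, with io.StringIO as an assumed stand-in for a seekable file).
This shows only the expected behaviour after a seek, not the xreadlines
caching inside the patch.

```python
import io

f = io.StringIO("a\nb\nc\n")   # stand-in for a seekable file (assumption)

first = next(f)                # "a\n"
f.seek(0)                      # rewind
after_seek = list(f)           # iteration starts again from the top
```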

Flushing would be too much work for this little hack. The right solution 
would be to fully integrate buffering into the file object and get rid of 
the dependency on the xreadlines module. The xreadlines method will then be 
equivalent to __iter__ (i.e. return self).  I assume that after this rewrite
the xreadlines module would be deprecated.

> > See my previous postings for why I think a file should be an iterator.
> 
> Haven't seen them but I would agree that this makes sense.

For some reason I got the impression that you disagreed.
 
> I just realized that the (existing) file_xreadlines() function has a
> subtle bug.  It uses a local static variable to cache the function
> xreadlines imported from the module xreadlines.  But if there are
> multiple interpreters or Py_Finalize() is called and then
> Py_Initialize() again, the cache is invalid.  Would you mind fixing
> this?  I think the caching just isn't worth it -- just do the import
> every time (it's fast enough if sys.modules['xreadlines'] already
> exists).

Done.

	Oren



From tim.one@comcast.net  Tue Jul 16 06:24:23 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 16 Jul 2002 01:24:23 -0400
Subject: [Python-Dev] AtExit Functions
In-Reply-To: <3D327F57.4040705@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBEAFAB.tim.one@comcast.net>

[MAL, on Py_AtExit()]
> PyObject_Del() [must be avoided] as well ?

I don't think that one's a problem.  The Py{Object,Mem}_{Del,DEL,Free,FREE}
spellings have resolved to plain free().  Under pymalloc that's different,
but pymalloc never tears itself down so it's safe there too.


>> We have two sets of exit-function gimmicks, one that runs at
>> the very start of Py_Finalize(), and the other at the very end.
>> If you need to clean up Python objects, you have to get into
>> the first group.

> I suppose the first one is what the atexit module exposes
> in Python 2.0+, right ?

Yes, but do read Skip's message too.  atexit.py wraps a primitive gimmick
that was also in 1.5.2.  See the docs for sys.exitfunc at

    http://www.python.org/doc/1.5.2p2/lib/module-sys.html

The docs are pretty much the same now, except atexit.py provides a rational
way to register multiple exit functions now.  Hmm!  The logic in atexit.py
looks wrong if sys.exitfunc already exists:  then atexit appends it to the
module's own list of exit functions, but then forgets to do anything to
ensure that its own list gets run at the end.  I conclude that nobody has
tried mixing these gimmicks.

In any case, you can do something safe across all versions via:

import sys
try:
    inherited = sys.exitfunc
except AttributeError:
    def inherited():
        pass

def myexitfunc():
    clean_up_my_stuff()
    inherited()

sys.exitfunc = myexitfunc
del sys

You can get screwed then if somebody else sets sys.exitfunc later without
being as considerate of your hook as the code above is of a pre-existing
hook, but then that's why atexit.py was created and you should move to a
later Python if you want saner behavior <wink>.
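
For contrast, a sketch of the atexit route in Python 3 syntax (where
sys.exitfunc no longer exists).  The two cleanup functions and the log
list are hypothetical names for the example.  Registered functions are
documented to run in last-in, first-out order at interpreter exit.

```python
import atexit

log = []


def clean_up_my_stuff():
    log.append("mine")


def clean_up_other_stuff():
    log.append("other")


# register() returns the function it was given, so hooks simply
# accumulate; nobody has to chain to a previously installed hook by hand.
first = atexit.register(clean_up_other_stuff)
second = atexit.register(clean_up_my_stuff)
```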

> The problem with that approach is that there may still be some
> references to objects left in lists and dicts which are cleaned
> up after having called the atexit functions. This is not so
> much a problem in my cases, but something to watch out in other
> applications which use C level Python objects as globals.

I don't know specifically what you have in mind there, but I expect that it
would kick off another round of the every-18-months discussion of what kind
of module finalization protocol Python should start to support.  A PEP for
that is long overdue.

>>> Also, atexit.py is not present in Python 1.5.2.

>> What's that <wink>?

> That's the Python version which was brand new just 3 years
> ago. I know... in US terms that's for history books ;-)

Oh, 3 years ago is sooooo 20th century!  Goodness, they didn't even have
cellophane sleeping tubes back then.  May as well go back to worshipping
cats while you're at it.




From tim.one@comcast.net  Tue Jul 16 06:56:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 16 Jul 2002 01:56:33 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207152006.g6FK6gE10521@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBFAFAB.tim.one@comcast.net>

[Guido]
> While preparing a patch, I discovered something strange: despite the
> fact that listiter_next() never raises StopIteration when it returns
> NULL,

That much is a documented part of the tp_iternext protocol.

> and despite the fact that it is used as the implementation for
> the next() method,

Oops.

> calling iter(list()).next() *does* raise StopIteration, rather than a
> complaint about NULL without setting an exception condition.

That is surprising!

> It took a brief debugging session to discover that in the presence
> of a tp_iternext function, the type machinery adds a next method that
> wraps tp_iternext.  Cute, though unexpected!

It also explains an old mystery I never got around to investigating:

>>> print iter([]).next.__doc__
x.next() -> the next value, or raise StopIteration
>>>

That was a mystery because that's not the docstring attached to the list
iterator's next() method:

static PyMethodDef listiter_methods[] = {
        {"next",        (PyCFunction)listiter_next,     METH_NOARGS,
         "it.next() -- get the next value, or raise StopIteration"},

> It means that the implementation of various iterators can be a little
> simpler, because no next() implementation needs to be given.

I'm not sure that's a feature we always want to use.  Going thru a wrapper
function (a) adds another layer of function call, and (b) adds a

	if (!PyArg_ParseTuple(args, ""))
		return NULL;

call via wrap_next().  Both expenses could be avoided if an existing next
method were left alone.  I suppose only the second expense is actually
"real", though, as most explicit xyz_next methods naturally call the
tp_iternext slot function anyway.  Still, when the body of a "next" method
is as simple as it is for lists, a call to PyArg_ParseTuple is a significant
overhead.




From oren-py-d@hishome.net  Tue Jul 16 07:08:26 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 16 Jul 2002 02:08:26 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020716060826.GA92278@hishome.net>

On Mon, Jul 15, 2002 at 11:15:58AM -0400, Guido van Rossum wrote:
> I'm still only considering two options:
> 
>   (a) leave the status quo, or
>   (b) implement (and document!) the "sink-state" rule from the PEP.

(c) leave it officially undefined but make all builtin iterators behave
consistently.

Implementing consistent post-StopIteration behavior for builtin iterators 
is not too hard and doesn't require adding flags and special cases - when 
the iterator is exhausted it can clean up and decref any referenced objects 
and change its type to a StoppedIterator type.  I can write a patch.  

I would prefer this StoppedIterator type to raise a new exception when its
next() is called.  I assume you would want it to be a StopIteration sink.
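
A rough pure-Python sketch of the behaviour I have in mind, in Python 3
syntax.  It uses a flag instead of changing the object's type, so it is
not the implementation proposed above, and both class names are
hypothetical.

```python
class IteratorExhaustedError(Exception):
    """Hypothetical exception for next() after the iterator has finished."""


class StrictIterator:
    """Raise StopIteration once, then a distinct error on any later next()."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._stopped = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._stopped:
            raise IteratorExhaustedError("iterator already exhausted")
        try:
            return next(self._it)
        except StopIteration:
            self._stopped = True
            self._it = None      # drop the reference once exhausted
            raise


it = StrictIterator([1, 2])
items = list(it)                 # [1, 2]; list() sees the first StopIteration

try:
    list(it)                     # a second pass is now a loud error
    second_pass_failed = False
except IteratorExhaustedError:
    second_pass_failed = True
```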

At the risk of sounding like a broken record I will repeat my position:
I consider the StopIteration sink state to be a silent error. It makes an
exhausted iterator behave just like an iterator of an empty sequence. 
Because iterators and iterables can be mixed freely it results in silent 
failures when a function that requires a re-iterable object gets an iterator. 
Iterables can serve as a replacement for sequences in most cases.  When 
they are not I'd like to get an error, please.
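
The silent failure described above can be sketched roughly as follows.
This is only an illustration in modern Python 3 syntax; the variance()
function and the sample data are made up for the example and come from
nowhere in the patch under discussion.

```python
def variance(values):
    """Two-pass variance: needs to iterate over `values` twice."""
    count = 0
    total = 0.0
    for v in values:            # first pass
        count += 1
        total += v
    mean = total / count
    squares = 0.0
    for v in values:            # second pass: yields nothing if `values`
        squares += (v - mean) ** 2   # is an already-exhausted iterator
    return squares / count

data = [1.0, 2.0, 3.0, 4.0]
from_list = variance(data)          # re-iterable: both passes see the data
from_iter = variance(iter(data))    # iterator: second pass is silently empty
```

With the list the result is 1.25; with the iterator the function returns
0.0 and raises nothing, which is the kind of quiet wrong answer at issue.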

When I pass a popened pipe to a function that expects a real file I will get 
an error if the function tries to perform a seek. I wouldn't want the seek 
operation to fail silently but that's more-or-less the equivalent of what 
iterators currently do. 

silent errors delenda est

	Oren



From aleax@aleax.it  Tue Jul 16 08:51:11 2002
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 16 Jul 2002 09:51:11 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <15667.24256.621942.555107@anthem.wooz.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17UBqp-0000Ji-00@mail.python.org> <15667.24256.621942.555107@anthem.wooz.org>
Message-ID: <E17UN7X-0001Up-00@mail.python.org>

On Tuesday 16 July 2002 01:46 am, Barry A. Warsaw wrote:
> >>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:
>     >> what Zope does AFAIK).  Otherwise creating minor variations on
>     >> a class would be quite a pain -- you'd have to repeat all the
>     >> interfaces implemented by the base class; and what if a later
>     >> version of Super implements more interfaces?
>
>     AM> This is actually a difficult point.  If I have to explicitly
>     AM> state all the interfaces of Super that I want to _exclude_,
>     AM> and Super adds some more interfaces tomorrow, then it's quite
>     AM> possible that my class is suddenly broken -- it doesn't
>     AM> guarantee the invariants that says it guarantees, any more --
>     AM> and I don't even know about it.
>
> You'd need a way to explicitly state that you implement /none/ of the
> interfaces of your superclass, and then explicitly add back the ones
> you do implement.

Right -- "i inherit the implementation but none of the interfaces".  You
can express this either by appropriately tagging the "I inherit" part, as
C++ does (private inheritance -- the default, but that's yet _another_
C++ issue... defaults that may or may not be appropriate for typical
use!-), or with a variation of "exclude-interfaces" however you spell that.

Alternatively, "I inherit" could default to "not the interfaces", and, if
needed, one might add a clause "oh, and all the interfaces too, please"
when that is positively desired.  Maybe the default would best be
chosen on the basis of "what is good for them" rather than on "what appears
most desirable intuitively", as is currently done for module imports.

Defaulting to "i inherit all" is roughly as convenient as defaulting to
"from amodule import *" would be felt to be by naive users unfamiliar
with the issues of namespace pollution.  You know, "convenience" is
getting to be something of a dirty word:-).  Knuth said "premature
optimization is the root of all evil in programming", and no doubt he
was right for HIS generation -- people who grew up on machines with
a few KB of memory and small fractions of MIPs were warped for life
by the need to squeeze every possible drop of optimization.  We still
have some of that, no doubt.  But current generations of programmers
grew up on machines of overwhelming power -- gradually, the pitfall of
premature optimization becomes less pervasive.  OTOH, the same
programmers grew up on machines overburdened to the gills with a
surfeit of "convenient" features... a new "root of some evil" is
emerging, and it's spelled "convenience".  Perl's surfeit of ad-hoc,
context-dependent, highly-"convenient" surprises just waiting to trip
you at every step should be an object lesson in "convenience".

Simple, clean, orthogonal, predictable, clear, unsurprising, regular.

Now THESE are the buzzwords I long for... "convenient", OTOH, makes
me wary.  Convenience has its place, just as does optimization, and
Python has traditionally done a great job of supplying just enough
optimization and just enough convenience without compromising the
really important buzzwords above listed.  OTOH, the BDFL does say
that he's not very experienced with "components" (interface-based
programming); and few can claim extensive experience with many
different nuances of that (on introspection, I _would_ claim for myself
extensive experience with production use of COM, but a bit lesser with
"bare C++" [a la Lakos, say] and definitely not "extensive" for Java, Haskell 
and others).

Would we REALLY like to have:
	import foo
do the equivalent of today's
	from foo import *
and have to explicitly say
	import foo dont_pollute_my_namespace
to get today's "import foo" behavior?  I'm sure many beginners would love
it -- you have to pound their heads with a mallet to wean them off the
"import *" even today, particularly if they come from languages which
offer the equivalent facility (e.g., C++'s "using namespace" -- back in 
think3, I got hoarse from having to repeat over and over again that making
"using namespace std;" a standard prologue of every source file was NOT
clever -- and I'm talking about able, mature programmers, quite used to
large-scale programming in C++... but namespaces were new, and were
perceived as "inconvenient"...!).  Now, the amount of desirable separation
between components may be lesser than the high separation that is most
desirable between modules / namespaces -- but it IS higher than that most
desirable between "ordinary" objects under inheritance.  I surely don't know
"the solution", but I just as surely do feel there ARE issues here that are
worth pondering about.


Just my two Eurocents (now worth slightly MORE than 2 cents of US$...!-)

Alex



From mal@lemburg.com  Tue Jul 16 08:57:41 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jul 2002 09:57:41 +0200
Subject: [Python-Dev] AtExit Functions
References: <LNBBLJKPBEHFEDALKOLCMEBEAFAB.tim.one@comcast.net>
Message-ID: <3D33D1F5.3030302@lemburg.com>

Tim Peters wrote:
> [MAL, on Py_AtExit()]
> 
>>PyObject_Del() [must be avoided] as well ?
> 
> 
> I don't think that one's a problem.  The Py{Object,Mem}_{Del,DEL,Free,FREE}
> spellings have resolved to plain free().  Under pymalloc that's different,
> but pymalloc never tears itself down so it's safe there too.

Good, because I've been using that particular API for years
in Py_AtExit() functions.

>>>We have two sets of exit-function gimmicks, one that runs at
>>>the very start of Py_Finalize(), and the other at the very end.
>>>If you need to clean up Python objects, you have to get into
>>>the first group.
>>
> 
>>I suppose the first one is what the atexit module exposes
>>in Python 2.0+, right ?
> 
> 
> Yes, but do read Skip's message too.  atexit.py wraps a primitive gimmick
> that was also in 1.5.2.  See the docs for sys.exitfunc at
> 
>     http://www.python.org/doc/1.5.2p2/lib/module-sys.html
> 
> The docs are pretty much the same now, except atexit.py provides a rational
> way to register multiple exit functions now. 

The problem is not finding code to work in Python 1.5.2.
I have my own ExitFunctions.py module which did pretty much
the same as atexit.py for Python 1.5.2. The problem is that
if there's no standard for managing atexit functions available
in Python 1.5.2, then it is likely that others will have used
a similar method and maybe failed to play nice with other
such hooks.

> Hmm!  The logic in atexit.py
> looks wrong if sys.exitfunc already exists:  then atexit appends it to the
> module's own list of exit functions, but then forgets to do anything to
> ensure that its own list gets run at the end.  I conclude that nobody has
> tried mixing these gimmicks.

Indeed.

try:
     x = sys.exitfunc
except AttributeError:
     sys.exitfunc = _run_exitfuncs
else:
     # if x isn't our own exit func executive, assume it's another
     # registered exit function - append it to our list...
     if x != _run_exitfuncs:
         register(x)

The logic seems a bit wrong for that case: how could
x possibly be _run_exitfuncs ? I think this code should
look something like this:

try:
     x = sys.exitfunc
except AttributeError:
     pass
else:
     # if x isn't our own exit func executive, assume it's another
     # registered exit function - append it to our list...
     register(x)
sys.exitfunc = _run_exitfuncs
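
A minimal sketch of that corrected logic, written in modern Python 3
syntax.  sys.exitfunc no longer exists in Python 3, so a stand-in
namespace object plays its role here; fake_sys, install_exit_hook and
registry are hypothetical names for illustration, not the actual
atexit.py code.

```python
import types

def install_exit_hook(fake_sys, registry):
    """Install a run-everything hook on fake_sys.exitfunc, keeping any earlier hook."""
    def run_exitfuncs():
        # Run registered functions last-in, first-out.
        while registry:
            func = registry.pop()
            func()

    try:
        previous = fake_sys.exitfunc
    except AttributeError:
        pass
    else:
        # Assume a pre-existing hook is someone else's exit function.
        registry.append(previous)
    # Always install our own runner, whether or not a hook existed.
    fake_sys.exitfunc = run_exitfuncs

calls = []
fake_sys = types.SimpleNamespace(exitfunc=lambda: calls.append("inherited"))
registry = []
install_exit_hook(fake_sys, registry)
registry.append(lambda: calls.append("mine"))
fake_sys.exitfunc()   # runs "mine" first, then the inherited hook
```

The point of the unconditional assignment at the end is that the
inherited hook still runs, but only through the module's own list.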

> In any case, you can do something safe across all versions via:
> 
> import sys
> try:
>     inherited = sys.exitfunc
> except AttributeError:
>     def inherited():
>         pass
> 
> def myexitfunc():
>     clean_up_my_stuff()
>     inherited()
> 
> sys.exitfunc = myexitfunc
> del sys
> 
> You can get screwed then if somebody else sets sys.exitfunc later without
> being as considerate of your hook as the code above is of a pre-existing
> hook, but then that's why atexit.py was created and you should move to a
> later Python if you want saner behavior <wink>.

Right. OTOH, if someone screws up here, the worst that can happen
is a memory leak. Not all that much to lose nowadays with GBs of
RAM ;-)

>>The problem with that approach is that there may still be some
>>references to objects left in lists and dicts which are cleaned
>>up after having called the atexit functions. This is not so
>>much a problem in my cases, but something to watch out in other
>>applications which use C level Python objects as globals.
> 
> 
> I don't know specifically what you have in mind there, but I expect that it
> would kick off another round of the every-18-months discussion of what kind
> of module finalization protocol Python should start to support.  A PEP for
> that is long overdue.
>
>>>>Also, atexit.py is not present in Python 1.5.2.
>>>
> 
>>>What's that <wink>?
>>
> 
>>That's the Python version which was brand new just 3 years
>>ago. I know... in US terms that's for history books ;-)
> 
> 
> Oh, 3 years ago is sooooo 20th century!  Goodness, they didn't even have
> cellophane sleeping tubes back then.  May as well go back to worshipping
> cats while you're at it.

Naa, I'll stick to the Python 1.2 tutorial I still keep under
my pillow as per instructions from Guido at the time.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From gmcm@hypernet.com  Tue Jul 16 12:46:19 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Tue, 16 Jul 2002 07:46:19 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <Pine.LNX.4.44.0207151634200.17524-100000@ziggy>
References: <200207142219.g6EMJaJ28788@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D33CF4B.5890.18BBC3E8@localhost>

On 15 Jul 2002 at 16:51, Ka-Ping Yee wrote:

> ...  An exhausted
> iterator is not the same thing as a freshly-created
> iterator on an empty sequence, 

Right. But a freshly-created iterator on an empty
sequence is exactly like an iterator which will be
exhausted on the next next.

This is something the calling code can easily
detect if it desires, so having the iterator track
it is needless complication.

-- Gordon
http://www.mcmillan-inc.com/

PS to ?ing: Seen Aaron & Cindy lately?



From guido@python.org  Tue Jul 16 12:51:20 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 07:51:20 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 16 Jul 2002 09:51:11 +0200."
 <E17UN7X-0001Up-00@mail.python.org>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com> <E17UBqp-0000Ji-00@mail.python.org> <15667.24256.621942.555107@anthem.wooz.org>
 <E17UN7X-0001Up-00@mail.python.org>
Message-ID: <200207161151.g6GBpKs14485@pcp02138704pcs.reston01.va.comcast.net>

> But current generations of programmers grew up on machines of
> overwhelming power -- gradually, the pitfall of premature
> optimization becomes less pervasive.

I don't see that yet.  We get an awful number of contributions that
are broken or obfuscated by premature attempts at optimization.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jul 16 13:07:54 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 08:07:54 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Tue, 16 Jul 2002 02:08:26 EDT."
 <20020716060826.GA92278@hishome.net>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net>
 <20020716060826.GA92278@hishome.net>
Message-ID: <200207161207.g6GC7s414566@pcp02138704pcs.reston01.va.comcast.net>

> On Mon, Jul 15, 2002 at 11:15:58AM -0400, Guido van Rossum wrote:
> > I'm still only considering two options:
> > 
> >   (a) leave the status quo, or
> >   (b) implement (and document!) the "sink-state" rule from the PEP.
> 
> (c) leave it officially undefined but make all builtin iterators behave
> consistently.

What would be the point of that?  Since we can't enforce the
sink-state rule for 3rd party iterators, this is no different from (b)
except that 3rd party implementers have less of an incentive to fix
their implementations.

> Implementing consistent post-StopIteration behavior for builtin
> iterators is not too hard and doesn't require adding flags and
> special cases - when the iterator is exhausted it can clean up and
> decref any referenced objects and change its type to a
> StoppedIterator type.  I can write a patch.

Don't bother.  I already wrote a patch, SF patch 580331.

Changing the type is evil (you can't change the type unless the memory
deallocation policies are the same), so I won't do that.

> I would prefer this StoppedIterator type to raise a new exception
> when its next() is called.  I assume you would want it to be a
> StopIteration sink.

You got that right, buddy.

> At the risk of sounding like a broken record I will repeat my
> position: I consider the StopIteration sink state to be a silent
> error. It makes an exhausted iterator behave just like an iterator
> of an empty sequence.  Because iterators and iterables can be mixed
> freely it results in silent failures when a function that requires a
> re-iterable object gets an iterator.  Iterables can serve as a
> replacement for sequences in most cases.  When they are not I'd like
> to get an error, please.

This is inconsistent with your position (c) above, which gives you no
guarantees in this case.

I also think you're mistaken in your desire.  Iterables do *not* serve
as sequence replacements.

> When I pass a popened pipe to a function that expects a real file I
> will get an error if the function tries to perform a seek. I
> wouldn't want the seek operation to fail silently but that's
> more-or-less the equivalent of what iterators currently do.

It would be an error to try to use __getitem__ on an iterator.

Please give up this line of request -- I'm tired of this argument.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jul 16 13:49:36 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 08:49:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: Your message of "Tue, 16 Jul 2002 01:56:33 EDT."
 <LNBBLJKPBEHFEDALKOLCOEBFAFAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEBFAFAB.tim.one@comcast.net>
Message-ID: <200207161249.g6GCnbN14616@pcp02138704pcs.reston01.va.comcast.net>

> > It means that the implementation of various iterators can be a little
> > simpler, because no next() implementation needs to be given.
> 
> I'm not sure that's a feature we always want to use.  Going thru a wrapper
> function (a) adds another layer of function call, and (b) adds a
> 
> 	if (!PyArg_ParseTuple(args, ""))
> 		return NULL;
> 
> call via wrap_next().  Both expenses could be avoided if an existing
> next method were left alone.  I suppose only the second expense is
> actually "real", though, as most explicit xyz_next methods naturally
> call the tp_iternext slot function anyway.  Still, when the body of
> a "next" method is as simple as it is for lists, a call to
> PyArg_ParseTuple is a significant overhead.

I'm not worried.  There's considerable expense in the attribute lookup
for the next method too, if you call it from Python, which probably
drowns the PyArg_ParseTuple overhead.  The whole idea is that usually
the tp_iternext slot will be used directly, and as long as that's
fast, I'm happy.  (If you're not, we can always write faster code to
check for no arguments here.)

The code in typeobject.c maintains a correspondence between
tp_iternext and the next() method, just like it does for tp_iter and
__iter__(), and for tp_hash and __hash__().  It goes both ways: if you
assign to C.next, the tp_iternext slot will be set to a wrapper that
calls C.next().
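
The method-to-slot direction can be sketched from the Python side.  This
is an illustration in modern Python 3, where next() is spelled
__next__(); the Countdown class is a made-up example, and the behaviour
shown is CPython's as far as I know.

```python
class Countdown:
    def __init__(self, start):
        self.n = start

    def __iter__(self):
        return self

def countdown_next(self):
    if self.n <= 0:
        raise StopIteration
    self.n -= 1
    return self.n + 1

# Without a __next__ method the slot is empty, so iter() rejects the object.
try:
    iter(Countdown(3))
    rejected_before = False
except TypeError:
    rejected_before = True

# Assigning the method on the class fills the slot with a wrapper,
# so the for-loop machinery (which uses the slot) now works.
Countdown.__next__ = countdown_next
values = list(Countdown(3))
```

After the assignment, list() drives the iterator through the slot and
collects 3, 2, 1.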

This also prevents inconsistency between next() and tp_iternext.
(Both the list iterator and hotshot had broken next() implementations,
BTW.)

It needs to be documented, though!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jul 16 13:52:56 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 08:52:56 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 16 Jul 2002 08:25:03 +0300."
 <20020716082503.A20992@hishome.net>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207111047.g6BAlri29897@pcp02138704pcs.reston01.va.comcast.net> <E17SbpT-0002yd-00@mail.python.org> <200207111219.g6BCJVU30095@pcp02138704pcs.reston01.va.comcast.net> <20020712005928.A9833@hishome.net> <200207151439.g6FEdpM31288@pcp02138704pcs.reston01.va.comcast.net>
 <20020716082503.A20992@hishome.net>
Message-ID: <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net>

> On the new version of patch #580331 the cache is invalidated on a seek. 
> 
> > Maybe the xreadlines object could grow a flush() method that throws
> > away its buffer, and f.seek() could call that if there's a cached
> > xreadlines iterator?
> 
> The behavior of an xreadlines object is already undefined after a seek on 
> the file.  This patch doesn't try to fix that.  The invalidation makes sure 
> that the next iter() call will produce a fresh xreadlines, though.

OK, good enough.

> Flushing would be too much work for this little hack. The right solution 
> would be to fully integrate buffering into the file object and get rid of 
> the dependency on the xreadlines module. The xreadlines method will then be 
> equivalent to __iter__ (i.e. return self).  I assume that after this rewrite
> the xreadlines module would be deprecated.

Yes, that's what I've called rewriting the I/O system. :-)

> > > See my previous postings for why I think a file should be an iterator.
> > 
> > Haven't seen them but I would agree that this makes sense.
> 
> For some reason I got the impression that you disagreed.

I disagreed with making next simply point to readline, because that
would defeat the speedup we get from using the file iterator.  The
solution in your patch doesn't have this problem.  (Though one *could*
argue that making the file object its own iterator is only confusing;
given that I'm also not sure what problem it solves, I'm at best +0 on
it.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Tue Jul 16 14:29:17 2002
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 16 Jul 2002 15:29:17 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <20020716082503.A20992@hishome.net> <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17USOm-0005RC-00@mail.python.org>

On Tuesday 16 July 2002 02:52 pm, Guido van Rossum wrote:
	...
> argue that making the file object its own iterator is only confusing;
> given that I'm also not sure what problem it solves, I'm at best +0 on

Personally, I think it solves at least a teaching problem -- it helps me
teach the difference between iterators and iterables.  In the Europython
tutorial I had to gloss a bit over the fact that the difference was rather
blurred.  According to the principles I mentioned, as easiest for the
audience to understand and apply, the file object SHOULD have been
an iterator, not an iterable -- i.e. it SHOULD have been the case that
f is iter(f) when f is a file object -- but it wasn't.  When it IS, that's one
less micro-wart I need to mention when teaching or writing about it.

I don't see any downside to having this micro-wart removed.  In
particular, I don't see what's confusing.  Things that respond to
iter(x) fall in two categories:
    iterators: also have x.next(), and iter(x) is x
    iterables: iter(x) is not x, so you can presumably get another
        iterator out of x at some later point in time if needed.
It's not QUITE as simple as this, but moving file objects from
the second category to the first seems to _simplify_ things a bit.

E.g.:

def useIterable(x):
    try: 
        it = iter(x)
    except TypeError:
        raise TypeError, "Need iterable object, not %s" % type(x)
    if it is x:
        raise TypeError, "Need iterable object, not iterator"
    # keep happily using it and/or x as needed, and in particular
    # the code is able to call it1 = iter(x) if it needs to iterate again

Not perfect -- but having a file-object argument fail this simplistic
test seems better to me, less confusing, than having it pass.
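
For readers on current Pythons, roughly the same check can be sketched
in Python 3 syntax.  This is an assumption-laden illustration: the
accepts() helper is made up for the demonstration, and io.StringIO
stands in for a file object (in Python 3, file objects are their own
iterators).

```python
import io

def use_iterable(x):
    try:
        it = iter(x)
    except TypeError:
        raise TypeError("Need iterable object, not %s" % type(x).__name__) from None
    if it is x:
        raise TypeError("Need iterable object, not iterator")
    return it

def accepts(x):
    """Return True if use_iterable() accepts x, False if it raises TypeError."""
    try:
        use_iterable(x)
    except TypeError:
        return False
    return True

results = {
    "list": accepts([1, 2, 3]),
    "list iterator": accepts(iter([1, 2, 3])),
    "file-like": accepts(io.StringIO("a\nb\n")),
    "int": accepts(42),
}
```

Here only the list passes; the list iterator and the file-like object
are rejected as iterators, and the int as not iterable at all.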


So, I, personally, am +1.  It might be even nicer (from the point of view 
of teaching, at least) if iterating on f interoperated more smoothly with
other method calls on f, but I do see your point that the right way
to achieve THAT would be a complete rewrite of the I/O system,
and thus a vastly heavier project than the current one.  Still, the current
step seems to be in the right direction.


Alex



From guido@python.org  Tue Jul 16 14:50:22 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 09:50:22 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 16 Jul 2002 15:29:17 +0200."
 <E17USOm-0005RD-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <20020716082503.A20992@hishome.net> <200207161252.g6GCquk14627@pcp02138704pcs.reston01.va.comcast.net>
 <E17USOm-0005RD-00@mail.python.org>
Message-ID: <200207161350.g6GDoM522277@odiug.zope.com>

> > argue that making the file object its own iterator is only confusing;
> > given that I'm also not sure what problem it solves, I'm at best +0 on
> 
> Personally, I think it solves at least a teaching problem -- it
> helps me teach the difference between iterators and iterables.  In
> the Europython tutorial I had to gloss a bit over the fact that the
> difference was rather blurred.  According to the principles I
> mentioned, as easiest for the audience to understand and apply, the
> file object SHOULD have been an iterator, not an iterable -- i.e. it
> SHOULD have been the case that f is iter(f) when f is a file object
> -- but it wasn't.  When it IS, that's one less micro-wart I need to
> mention when teaching or writing about it.

I dunno.  The presence of seek() and write() makes the behavior of
files a rather unique blend of iterator and iterable.

> I don't see any downside to having this micro-wart removed.  In
> particular, I don't see what's confusing.  Things that respond to
> iter(x) fall in two categories:
>     iterators: also have x.next(), and iter(x) is x
>     iterables: iter(x) is not x, so you can presumably get another
>         iterator out of x at some later point in time if needed.
> It's not QUITE as simple as this, but moving file objects from
> the second category to the first seems to _simplify_ things a bit.

I worry that equating a file with its iterator makes it more likely
that people mix next() with readline() or seek(), which doesn't work
(at least not until the I/O system is rewritten).

I'd be more comfortable with teaching people that you should *either*
use a file in a for loop (the common case, probably) *or* use its
native I/O methods (readline() etc.), but not mix both.

> E.g.:
> 
> def useIterable(x):
>     try: 
>         it = iter(x)
>     except TypeError:
>         raise TypeError, "Need iterable object, not %s" % type(x)
>     if it is x:
>         raise TypeError, "Need iterable object, not iterator"
>     # keep happily using it and/or x as needed, and in particular
>     # the code is able to call it1 = iter(x) if it needs to iterate again
> 
> Not perfect -- but having a file-object argument fail this simplistic
> test seems better to me, less confusing, than having it pass.

This actually looks like an example of the "look before you leap"
(LBYL) syndrome, which you disapproved of recently.

> So, I, personally, am +1.  It might be even nicer (from the point of
> view of teaching, at least) if iterating on f interoperated more
> smoothly with other method calls on f, but I do see your point that
> the right way to achieve THAT would be a complete rewrite of the I/O
> system, and thus a vastly heavier project than the current one.
> Still, the current step seems to be in the right direction.

Somehow I'd rather emphasize the relative brokenness of the current
situation.  Anyway, I'm somewhere between -0 and +0 (inclusive) on
this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip@pobox.com  Tue Jul 16 14:55:49 2002
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 16 Jul 2002 08:55:49 -0500
Subject: [Python-Dev] AtExit Functions
In-Reply-To: <3D33D1F5.3030302@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCMEBEAFAB.tim.one@comcast.net>
 <3D33D1F5.3030302@lemburg.com>
Message-ID: <15668.9701.455650.465650@localhost.localdomain>

    mal> The problem is that if there's no standard for managing atexit
    mal> functions available in Python 1.5.2, then it is likely that others
    mal> will have used a similar method and maybe failed to play nice with
    mal> other such hooks.

That was precisely why I wrote the atexit module.

(I agree there's a bug in the init code.)

Skip



From barry@zope.com  Tue Jul 16 15:30:51 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 16 Jul 2002 10:30:51 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B428@UKRUX002.rundc.uk.origin-it.com>
 <E17UBqp-0000Ji-00@mail.python.org>
 <15667.24256.621942.555107@anthem.wooz.org>
 <200207160751.g6G7pIq13900@smtp.zope.com>
Message-ID: <15668.11803.110215.686549@anthem.wooz.org>

>>>>> "AM" == Alex Martelli <aleax@aleax.it> writes:

    AM> Right -- "i inherit the implementation but none of the
    AM> interfaces".  You can express this either by appropriately
    AM> tagging the "I inherit" part, as C++ does (private inheritance
    AM> -- the default, but that's yet _another_ C++ issue... defaults
    AM> that may or may not be appropriate for typical use!-), or with
    AM> a variation of "exclude-interfaces" however you spell that.

    AM> Alternatively, "I inherit" could default to "not the
    AM> interfaces", and, if needed, one might add a clause "oh, and
    AM> all the interfaces too, please" when that is positively
    AM> desired.  Maybe the default would best be chosen on the basis
    AM> of "what is good for them" rather than on "what appears most
    AM> desirable intuitively", as is currently done for module
    AM> imports.

I don't know, it seems like 6-one-way, half-dozen-the-other, but I
tend to agree with Guido on this one.

    AM> Defaulting to "i inherit all" is roughly as convenient as
    AM> defaulting to "from amodule import *" would be felt to be by
    AM> naive users unfamiliar with the issues of namespace pollution.

Interface conformance seems totally different than name importing, so
I don't think the analogy holds.  I just feel that in Python, I rarely
use inheritance for implementation convenience only.

-Barry



From aahz@pythoncraft.com  Tue Jul 16 15:34:36 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 16 Jul 2002 10:34:36 -0400
Subject: [Python-Dev] Termination of two-arg iter()
In-Reply-To: <20020716060826.GA92278@hishome.net>
References: <LNBBLJKPBEHFEDALKOLCMELIAEAB.tim.one@comcast.net> <200207151515.g6FFFwq31597@pcp02138704pcs.reston01.va.comcast.net> <20020716060826.GA92278@hishome.net>
Message-ID: <20020716143436.GB4012@panix.com>

On Tue, Jul 16, 2002, Oren Tirosh wrote:
>
> I consider the StopIteration sink state to be a silent error. It
> makes an exhausted iterator behave just like an iterator of an empty
> sequence.  Because iterators and iterables can be mixed freely it
> results in silent failures when a function that requires a re-iterable
> object gets an iterator.  Iterables can serve as a replacement for
> sequences in most cases.  When they are not I'd like to get an error,
> please.

So the real problem isn't that you can't distinguish between an empty
iterator and an exhausted one, but that you can't distinguish between
re-iterable objects and objects that can't be re-iterated.  If my
understanding of your POV is correct, you can't get there from here.
You're talking about two different concepts and conflating them, which
to my mind breaks, "Simple is better than complex," and, "Beautiful is
better than ugly."

Your sole hope, IMO, is to get behind Alex's bandwagon so that there is a
mechanism available for documenting such behaviors at the code level.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From neal@metaslash.com  Wed Jul 17 00:05:03 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 16 Jul 2002 19:05:03 -0400
Subject: [Python-Dev] atexit test failing
Message-ID: <3D34A69F.91B60F74@metaslash.com>

Tim:

test_atexit is failing for me because the test assumes that
the python being tested is the first one found in the path.
This is not true on my system.  Would it be safer to use the
patch below which replaces:
	"python " + fname
with
	"%s %s" % (sys.executable, fname)

Neal
--

> import sys
25c26
< p = os.popen("python " + fname)
---
> p = os.popen("%s %s" % (sys.executable, fname))
53c54
< p = os.popen("python " + fname)
---
> p = os.popen("%s %s" % (sys.executable, fname))
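
The idea behind the patch, sketched in modern Python 3: launch the child
with sys.executable so that it is the interpreter under test, whatever
"python" happens to resolve to on the path.  This sketch swaps in the
subprocess module for os.popen, and run_with_same_python is a
hypothetical helper name, not part of test_atexit.

```python
import subprocess
import sys

def run_with_same_python(code):
    """Run `code` in a child process using the interpreter running this script."""
    completed = subprocess.run(
        [sys.executable, "-c", code],   # argument list, so no shell quoting issues
        capture_output=True,
        text=True,
        check=True,                     # raise if the child exits non-zero
    )
    return completed.stdout

child_version = run_with_same_python("import sys; print(sys.version_info[:2])")
parent_version = str(sys.version_info[:2])
```

Passing an argument list also sidesteps the case where sys.executable
contains spaces, which the "%s %s" string form would not handle.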



From tim.one@comcast.net  Wed Jul 17 01:33:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 16 Jul 2002 20:33:54 -0400
Subject: [Python-Dev] atexit test failing
In-Reply-To: <3D34A69F.91B60F74@metaslash.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEGAFAB.tim.one@comcast.net>

[Neal Norwitz]
> test_atexit is failing for me because the test assumes that
> the python being tested is the first one found in the path.
> This is not true on my system.

I wondered about that, but figured "it must be safe" since test_popen's
_do_test_commandline also does a popen("python ...").

So what I'm actually assuming is that you don't have a broken Python first
on your path <wink>.

> Would it be safer to use the patch below which replaces:
> 	"python " + fname
> with
> 	"%s %s" % (sys.executable, fname)

I don't know.  For example, does that work for you?  If so, that would be a
good start.  It works for me, so I'll check that in.




From ping@zesty.ca  Wed Jul 17 02:18:05 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Tue, 16 Jul 2002 18:18:05 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207152325.g6FNPD128921@europa.research.att.com>
Message-ID: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy>

On Mon, 15 Jul 2002, Andrew Koenig wrote:
> However the purpose my suggestion of __multiter__ was not to use it to
> test for multiple iteration, but to enable a container to be able to
> yield either a single or a multiple iterator on request.

I see what you want, though i have a hard time imagining a situation
where it's really necessary to have both (as opposed to just the
multiple iterator, which is strictly more capable).  I can certainly
see how you might want to be able to ask for a breadth-first or
depth-first iterator on a tree, though.

> > Or, what if there is no container to begin with, but the iterator is still
> > copyable?  You can't flag that by putting __multiter__ on anything; again
> > it makes more sense to just provide __copy__ on the iterator.
>
> You could flag it by putting __multiter__ on the iterator, just as iterators
> presently have __iter__.

Ugh.  I don't like this, for the reasons i outlined in another message:
an iterator is not the same as a container.  Iterators always mutate;
containers usually do not (at least not as a result of looking at the
elements).

> > All that's really necessary here is to document the convention about what
> > __copy__ is supposed to mean if it's available on an iterator.  If we
> > all agree that __copy__ should preserve an independent copy of the
> > current state of the iterator, we're all set.
>
> Not quite.  We also need an agreement that calling __iter__ on a container
> is not a destructive operation unless you call next() on the iterator that
> you get back.

What i'd like is an agreement that calling __iter__ on a container is
not a destructive operation at all.  If it's destructive, then what you
started with is not really a container, and we should encourage people
to call attention to this irregularity in their documentation.

> > I think a proliferation of iterator-fetching methods would be a
> > messy and unpleasant prospect.  After __iter__, __multiter__,
> > and __ambiter__, what next?  __mutableiter__?
> > __depthfirstiter__?  __breadthfirstiter__?
>
> A data structure that supports several different kinds of iteration
> has to provide that support somehow.

Agreed.  I was unclear: what makes me uncomfortable is the pollution
of the double-underscore namespace.  When you do have a container-like
object that supports various kinds of iteration, naturally you are
going to need some methods for getting iterators.  I just think it's
not appropriate to establish special names for them.

To me, the presence of double-underscores around a method name means
that the method is called automatically.  My expectation is that when
i write a method with a "normal" name, the name itself will appear
after a dot wherever that method is used; and that when there's a
method with a "__special__" name, the method is called implicitly.
The implicit call can occur via an operator (e.g. __add__), or to
implement a protocol defined in the language (e.g. __init__), etc.
If you see the string ".__" it means that something unusual is going on.

If you follow this convention, then "__iter__" deserves a special name,
because it is the specially blessed iterator-getter used by "for".
There may be other iterator-getters, but they must be called explicitly,
so they shouldn't get underscores.
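
A small sketch of that convention, written in present-day Python.  The
Tree class and its method names are made up for illustration: __iter__
is the one getter "for" calls implicitly, while the other traversal
orders are ordinary methods that appear after a dot at the call site.

```python
from collections import deque


class Tree:
    """Hypothetical container offering more than one kind of iteration."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

    def __iter__(self):
        # Called implicitly by "for", so it keeps its double underscores.
        return self.depth_first()

    def depth_first(self):
        # Called explicitly as tree.depth_first(): a plain name.
        yield self.value
        for child in self.children:
            yield from child.depth_first()

    def breadth_first(self):
        # Also called explicitly, so again no underscores.
        queue = deque([self])
        while queue:
            node = queue.popleft()
            yield node.value
            queue.extend(node.children)


tree = Tree(1, [Tree(2, [Tree(4)]), Tree(3)])
implicit_order = list(tree)                  # goes through __iter__
explicit_order = list(tree.breadth_first())  # named getter
```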

                *               *               *

An aside on "next" vs. "__next__":

Note that this convention would also suggest that "next" should be
called "__next__", since "for" calls "next" implicitly.  I forget
why we ended up going with "next" instead of "__next__".  I think
"__next__" would have been better, especially in light of this:

Tim Peters wrote:
> Requiring *some* method with a reserved name is an aid to
> introspection, lest it become impossible to distinguish, say,
> an iterator from an instance of a doubly-linked list node class
> that just happens to supply methods named .prev() and .next()
> for an unrelated purpose.

This is exactly why the iterator protocol should consist of one
method named "__next__" rather than two methods named "__iter__"
(which has nothing to do with the act of iterating!) and "next"
(which is the one we really care about, but can collide with
existing method names).

As far as i know, "next" is the only implicitly-called method of
an internal protocol that has no underscores.  It's a little late
to fix the name of "next" in Python 2, though it might be worth
considering for Python 3.


-- ?!ng




From guido@python.org  Wed Jul 17 02:29:55 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jul 2002 21:29:55 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Tue, 16 Jul 2002 18:18:05 PDT."
 <Pine.LNX.4.44.0207161656360.17524-100000@ziggy>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy>
Message-ID: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>

> An aside on "next" vs. "__next__":
> 
> Note that this convention would also suggest that "next" should be
> called "__next__", since "for" calls "next" implicitly.  I forget
> why we ended up going with "next" instead of "__next__".  I think
> "__next__" would have been better, especially in light of this:
> 
> Tim Peters wrote:
> > Requiring *some* method with a reserved name is an aid to
> > introspection, lest it become impossible to distinguish, say,
> > an iterator from an instance of a doubly-linked list node class
> > that just happens to supply methods named .prev() and .next()
> > for an unrelated purpose.
> 
> This is exactly why the iterator protocol should consist of one
> method named "__next__" rather than two methods named "__iter__"
> (which has nothing to do with the act of iterating!) and "next"
> (which is the one we really care about, but can collide with
> existing method names).
> 
> As far as i know, "next" is the only implicitly-called method of
> an internal protocol that has no underscores.  It's a little late
> to fix the name of "next" in Python 2, though it might be worth
> considering for Python 3.

Yup.  I regret this too.  We should have had a built-in next(x) which
calls x.__next__().  I think that if it had been __next__() we
wouldn't have the mistake that I just discovered -- that all the
iterator types that define a next() method shouldn't have done so,
because you get one automatically which is the tp_iternext slot
wrapped. :-(

But yes, it's too late to change now.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping@zesty.ca  Wed Jul 17 03:16:35 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Tue, 16 Jul 2002 19:16:35 -0700 (PDT)
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.44.0207161909060.17524-100000@ziggy>

On Tue, 16 Jul 2002, Guido van Rossum wrote:
> Yup.  I regret this too.  We should have had a built-in next(x) which
> calls x.__next__().

Ah!  I think one of the hang-ups i (we? the BOF?) got stuck on was
that users of iterators would sometimes call next() directly, and
so it wouldn't do to call it __next__.  But it's clear to me now that
a built-in next() is exactly the right answer, by analogy to the
built-in repr() and method __repr__.

> But yes, it's too late to change now.

Sigh.  Well, in the hope that it makes the change a little easier
to swallow, i'll say now that if the protocol is fixed in some
future version of Python, i'll volunteer to update the standard
library to the new protocol.  I guess when Python 3 comes around,
there's going to be some sort of migration helper tool, and that
tool can check for classes that have __iter__ and next, and
suggest changing the name to __next__.
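
As a rough idea of what such a check might look like, here is a sketch
in present-day Python using the standard ast module.  The function name
find_old_style_iterators is an assumption for illustration, not an
existing tool, and it only looks at methods defined directly in the
class body.

```python
import ast


def find_old_style_iterators(source):
    """Return names of classes that define both __iter__ and next.

    Hypothetical checker: a class with both methods is a likely
    candidate for renaming next to __next__.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            methods = {
                item.name
                for item in node.body
                if isinstance(item, ast.FunctionDef)
            }
            if {"__iter__", "next"} <= methods:
                hits.append(node.name)
    return hits


sample = '''
class Countdown:
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return self
    def next(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n

class LinkedNode:
    # Has next() for an unrelated purpose and no __iter__: not flagged.
    def next(self):
        return None
'''

candidates = find_old_style_iterators(sample)
```

Requiring __iter__ as well as next is what keeps the linked-list node
from Tim's example out of the report.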


-- ?!ng




From mhammond@skippinet.com.au  Wed Jul 17 14:37:06 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 17 Jul 2002 23:37:06 +1000
Subject: [Python-Dev] Review of build system patch requested
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEABGAAA.mhammond@skippinet.com.au>

I would like review of a patch that touches the configure/build system.

The patch is to fix/deprecate the DL_IMPORT macros that pepper the Python
source code.  These macros were originally introduced for the Windows port
many years ago as a way of declaring special linkage for the Python API
functions and data exposed in the Python DLL.  It has since been used in the
cygwin and BeOS ports, and, to put it bluntly, is broken!

The patch touches the configure/build system to provide a consistent
mechanism for declaring special linkage macros regardless of platform.
Specifically:

* configure.in has been changed to #define Py_ENABLE_SHARED in pyconfig.h if
Python has been configured for building as a shared library.

* Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to the compiler
when building Python itself and any builtin modules.  This flag is not
passed to extension modules.

* pyport.h has been changed to define the macros PyAPI_FUNC, PyAPI_DATA and
PyMODINIT_FUNC.  For Windows, cygwin and BeOS, these will resolve to
"__declspec" directives (depending on Py_ENABLE_SHARED and Py_BUILD_CORE).
For all other platforms these will resolve to nothing.

The patch also contains significant changes to PC/pyconfig.h - while reviews
of that code are welcome, I am primarily interested in reviews of the above
three points, and some indication from gurus on Linux and other platforms
that these changes are reasonable (or if I am lucky, desirable <wink>)

www.python.org/sf/566100 - Patch [ 566100 ] Rationalize DL_IMPORT and
DL_EXPORT

Thanks,

Mark.




From cce@clarkevans.com  Wed Jul 17 14:45:04 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Wed, 17 Jul 2002 09:45:04 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Tue, Jul 16, 2002 at 09:29:55PM -0400
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020717094504.A85351@doublegemini.com>

On Tue, Jul 16, 2002 at 09:29:55PM -0400, Guido van Rossum wrote:
| > An aside on "next" vs. "__next__":
| > 
| > As far as i know, "next" is the only implicitly-called method of
| > an internal protocol that has no underscores.  It's a little late
| > to fix the name of "next" in Python 2, though it might be worth
| > considering for Python 3.
| 
| Yup.  I regret this too. 
| But yes, it's too late to change now.

I don't think it is too late.  90% ++ of the python code base out
there doesn't use iterators yet... people are still wrapping their
minds around it to see how they can use it in their applications.
If it was publicly stated that this could be "fixed" in the next
version I don't think that it would hurt.  These things happen,
and sometimes it's best to "roll back".  Programmers understand this.

Best,

Clark

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From barry@zope.com  Wed Jul 17 14:48:09 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Wed, 17 Jul 2002 09:48:09 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy>
 <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>
 <20020717094504.A85351@doublegemini.com>
Message-ID: <15669.30105.128625.569434@anthem.wooz.org>

>>>>> "CC" == Clark C <cce@clarkevans.com> writes:

    CC> I don't think it is too late.  90% ++ of the python code base
    CC> out there doesn't use iterators yet... people are still
    CC> wrapping their minds around it to see how they can use it in
    CC> their applications.  If it was publicly stated that this could
    CC> be "fixed" in the next version I don't think that it would
    CC> hurt.  These things happen, and sometimes it's best to "roll
    CC> back".  Programmers understand this.

And besides (to continue Clark's devil's advocacy), how much of the
code out there that /does/ use iterators, calls .next() explicitly?

-Barry



From david.abrahams@rcn.com  Wed Jul 17 14:48:19 2002
From: david.abrahams@rcn.com (David Abrahams)
Date: Wed, 17 Jul 2002 09:48:19 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy><200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net><20020717094504.A85351@doublegemini.com> <15669.30105.128625.569434@anthem.wooz.org>
Message-ID: <040701c22d98$bdf45320$6501a8c0@boostconsulting.com>

From: "Barry A. Warsaw" <barry@zope.com>

>     CC> I don't think it is too late.  90% ++ of the python code base
>     CC> out there doesn't use iterators yet... people are still
>     CC> wrapping their minds around it to see how they can use it in
>     CC> their applications.  If it was publicly stated that this could
>     CC> be "fixed" in the next version I don't think that it would
>     CC> hurt.  These things happen, and sometimes it's best to "roll
>     CC> back".  Programmers understand this.
>
> And besides (to continue Clark's devil's advocacy), how much of the
> code out there that /does/ use iterators, calls .next() explicitly?

Hmm, I'm getting excited! We rarely get an opportunity to fix mistakes in
language design.

Probably someone will bring me back to reality shortly, though ;-)

Maybe I'll do it: the problem is really the iterators people have written.
However, you could implicitly generate __next__() which calls next() when
the result of __iter__() lacks a __next__() function... with a warning, of
course.
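
A Python-level sketch of that fallback, in present-day Python.  The
real thing would presumably live in the interpreter; the names
LegacyIteratorAdapter and get_iterator are invented for illustration.

```python
import warnings


class LegacyIteratorAdapter:
    """Hypothetical wrapper giving a next()-only object a __next__()."""

    def __init__(self, old_iterator):
        self._old = old_iterator

    def __iter__(self):
        return self

    def __next__(self):
        # Delegate to the old spelling; StopIteration propagates as-is.
        return self._old.next()


def get_iterator(obj):
    """Fetch an iterator, adapting old-style ones with a warning."""
    it = obj.__iter__()
    if hasattr(it, "__next__"):
        return it
    if hasattr(it, "next"):
        warnings.warn(
            "iterator defines next() but not __next__()",
            DeprecationWarning,
            stacklevel=2,
        )
        return LegacyIteratorAdapter(it)
    raise TypeError("__iter__() returned a non-iterator")


class OldCountdown:
    """An iterator written against the next() protocol."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def next(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    adapted_values = list(get_iterator(OldCountdown(3)))
```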

-Dave





From guido@python.org  Wed Jul 17 15:09:13 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 10:09:13 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 09:45:04 EDT."
 <20020717094504.A85351@doublegemini.com>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net>
 <20020717094504.A85351@doublegemini.com>
Message-ID: <200207171409.g6HE9Di00659@odiug.zope.com>

> I don't think it is too late.  90% ++ of the python code base out
> there doesn't use iterators yet... people are still wrapping their
> minds around it to see how they can use it in their applications.
> If it was publicly stated that this could be "fixed" in the next
> version I don't think that it would hurt.  These things happen,
> and sometimes it's best to "roll back".  Programmers understand this.

I find this really hard to believe, given that such a big deal has
been made of iterators.  Care to conduct a survey on c.l.py?

Given that it's really only a very minor problem, I'd rather not
expend the effort to "fix" this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jul 17 15:18:34 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 10:18:34 -0400
Subject: [Python-Dev] Review of build system patch requested
In-Reply-To: Your message of "Wed, 17 Jul 2002 23:37:06 +1000."
 <LCEPIIGDJPKCOIHOBJEPOEABGAAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPOEABGAAA.mhammond@skippinet.com.au>
Message-ID: <200207171418.g6HEIZo00747@odiug.zope.com>

> * Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to the compiler
> when building Python itself and any builtin modules.  This flag is
> not passed to extension modules.

My only concern would be that tools which parse the Makefile (I
believe distutils does this?) should not accidentally pick up the
"-DPy_BUILD_CORE" flag.

Apart from that I trust your judgement and Neal's test drive.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cce@clarkevans.com  Wed Jul 17 15:49:35 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Wed, 17 Jul 2002 10:49:35 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171409.g6HE9Di00659@odiug.zope.com>; from guido@python.org on Wed, Jul 17, 2002 at 10:09:13AM -0400
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com>
Message-ID: <20020717104935.A86293@doublegemini.com>

On Wed, Jul 17, 2002 at 10:09:13AM -0400, Guido van Rossum wrote:
| > I don't think it is too late.  90% ++ of the python code base out
| > there doesn't use iterators yet... people are still wrapping their
| > minds around it to see how they can use it in their applications.
| > If it was publicly stated that this could be "fixed" in the next
| > version I don't think that it would hurt.  These things happen,
| > and sometimes it's best to "roll back".  Programmers understand this.
| 
| I find this really hard to believe, given that such a big deal has
| been made of iterators.

None of my code makes explicit use of iterators, and I was very
aware of them.  My new code that I'm building now does, but it
wouldn't take much effort to fix it.   I myself personally would
rather keep Python "clean" of blemish.   For the most part, 
Python is really free of dragons and that's why I like it.  I'm
willing to put up with short-term pain for long term gain.  Unlike
Java or Visual Basic, I intend to be programming in Python 10+ 
years from now; so from my perspective, it is an investment.

Plus, most features don't get used by the public for at least
a year or so as it takes a while for the code-examples to 
start using them and books to be updated. 

| Care to conduct a survey on c.l.py?

Sure.  I'll run the survey and report back.  What would
be the options?  It'll be a simple CGI form using a radio
or check boxes and a button.  I'll aggregate the results.
To do this I need:

 - A specific description of what would change
 - An example of what would break, plus what it would
   be replaced with.
 - An explanation of what problems occur when the 
   blemish isn't fixed (what can't you do?)

| Given that it's really only a very minor problem, I'd rather not
| expend the effort to "fix" this.

Well, if it is a minor problem, it shouldn't be that hard to fix.

*evil grins*

Clark

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From guido@python.org  Wed Jul 17 16:03:48 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 11:03:48 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 10:49:35 EDT."
 <20020717104935.A86293@doublegemini.com>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com>
 <20020717104935.A86293@doublegemini.com>
Message-ID: <200207171503.g6HF3mW01047@odiug.zope.com>

> None of my code makes explicit use of iterators, and I was very
> aware of them.  My new code that I'm building now does, but it
> wouldn't take much effort to fix it.   I myself personally would
> rather keep Python "clean" of blemish.   For the most part, 
> Python is really free of dragons and that's why I like it.  I'm
> willing to put up with short-term pain for long term gain.  Unlike
> Java or Visual Basic, I intend to be programming in Python 10+ 
> years from now; so from my perspective, it is an investment.

Calling it a dragon sounds way overstated.

Another issue is that we can't really fix this retroactively in Python
2.2.  Python 2.2 has been elected to be the "Python-in-a-Tie" favored
by the Python Business Forum, giving it a very long life expectancy --
18 months from the first official release of Python-in-a-Tie (probably
Python 2.2.2), plus however long it takes people to want to upgrade
after that.

> Plus, most features don't get used by the public for at least
> a year or so as it takes a while for the code-examples to 
> start using them and books to be updated. 
> 
> | Care to conduct a survey on c.l.py?
> 
> Sure.  I'll run the survey and report back.  What would
> be the options?  It'll be a simple CGI form using a radio
> or check boxes and a button.  I'll aggregate the results.
> To do this I need:
> 
>  - A specific description of what would change
>  - An example of what would break, plus what it would
>    be replaced with.
>  - An explanation of what problems occur when the 
>    blemish isn't fixed (what can't you do?)

- The mapping between the next() method and the tp_iternext slot in
  the type object would disappear, and instead the __next__() method
  would be mapped to this slot.  This means that every iterator
  definition written in Python has to be changed from
  "def next(self): ..." to "def __next__(self): ...".

- There would be a new built-in function, next(), which calls the
  __next__() method on its argument.

- Calls to it.next() will have to be changed to call next(it)
  instead.  (it.__next__() would also work but is not recommended.)

- There really isn't anything "broken" about the current situation;
  it's just that "next" is the only method name mapped to a slot in
  the type object that doesn't have leading and trailing double
  underscores.
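
The points above can be sketched as follows.  This is the spelling
that Python 3 later adopted, so the example runs on a present-day
interpreter; the class Squares and the function builtin_next_sketch
are made up for illustration, and the latter is only a rough
Python-level stand-in for the one-argument form of the built-in.

```python
class Squares:
    """Iterator yielding 0, 1, 4, ... for limit items."""

    def __init__(self, limit):
        self.i = 0
        self.limit = limit

    def __iter__(self):
        return self

    def __next__(self):  # under the 2.2 protocol: "def next(self):"
        if self.i >= self.limit:
            raise StopIteration
        value = self.i * self.i
        self.i += 1
        return value


def builtin_next_sketch(it):
    """Approximate what next(it) does: look up __next__ on the type."""
    return type(it).__next__(it)


it = Squares(3)
first = next(it)                   # was: it.next()
second = builtin_next_sketch(it)   # same effect as next(it)
rest = list(it)
```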

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ark@research.att.com  Wed Jul 17 16:27:31 2002
From: ark@research.att.com (Andrew Koenig)
Date: Wed, 17 Jul 2002 11:27:31 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> (message from
 Ka-Ping Yee on Tue, 16 Jul 2002 18:18:05 -0700 (PDT))
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy>
Message-ID: <200207171527.g6HFRV214431@europa.research.att.com>

Ping> On Mon, 15 Jul 2002, Andrew Koenig wrote:

>> However the purpose of my suggestion of __multiter__ was not to use it to
>> test for multiple iteration, but to enable a container to be able to
>> yield either a single or a multiple iterator on request.

Ping> I see what you want, though i have a hard time imagining a situation
Ping> where it's really necessary to have both (as opposed to just the
Ping> multiple iterator, which is strictly more capable).  I can certainly
Ping> see how you might want to be able to ask for a breadth-first or
Ping> depth-first iterator on a tree, though.

How about a class that represents a file?  If you ask it for a single
iterator, that's easy.  If you ask it for a multiple iterator, it checks
whether the file is really an interactive device such as a pipe or a keyboard.
If so, it uses a buffering mechanism to simulate multiple iteration;
otherwise, it lets the multiple iterators access the file directly.

Then when you ask to iterate over the file, you automatically get the
least cumbersome mechanism needed to support the kind of iteration
that you requested.
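
A minimal sketch of the buffering half of that idea, in present-day
Python.  BufferedSource is a hypothetical name, and the "pipe" here is
just a one-shot iterator standing in for an interactive device; the
check for whether buffering is needed at all is left out.

```python
class BufferedSource:
    """Let several independent iterators walk a one-shot source.

    Items already read from the source are kept in a shared buffer, so
    a later iterator can replay them from the start.
    """

    def __init__(self, one_shot):
        self._source = iter(one_shot)
        self._buffer = []

    def __iter__(self):
        index = 0
        while True:
            if index == len(self._buffer):
                # Nothing buffered at this position yet: read one more.
                try:
                    self._buffer.append(next(self._source))
                except StopIteration:
                    return
            yield self._buffer[index]
            index += 1


pipe = iter(["a\n", "b\n", "c\n"])   # can only be read once
src = BufferedSource(pipe)

reader1 = iter(src)
reader2 = iter(src)
head1 = [next(reader1), next(reader1)]
head2 = next(reader2)
tail1 = list(reader1)
tail2 = list(reader2)
```

The cost is that the buffer grows with everything read so far, which
is why a seekable file would rather hand out direct readers instead.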

>> > Or, what if there is no container to begin with, but the iterator is still
>> > copyable?  You can't flag that by putting __multiter__ on anything; again
>> > it makes more sense to just provide __copy__ on the iterator.

>> You could flag it by putting __multiter__ on the iterator, just as iterators
>> presently have __iter__.

Ping> Ugh.  I don't like this, for the reasons i outlined in another message:
Ping> an iterator is not the same as a container.  Iterators always mutate;
Ping> containers usually do not (at least not as a result of looking at the
Ping> elements).

The scenario is this:

    def search(thing):
        iter = thing.__multiter__()
        # now either iter is an iterator that supports __copy__
        # or we will raise an exception (and raise it here, rather
        # than waiting for the first time we try to copy iter).
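
For comparison, a sketch of the __copy__ side of the discussion in
present-day Python: an iterator whose copy carries an independent
snapshot of the current position, plus a caller that checks for
__copy__ up front.  Counter and lookahead are hypothetical names.

```python
import copy


class Counter:
    """Iterator over start, start+1, ..., stop-1 that can be copied."""

    def __init__(self, start, stop):
        self.current = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    def __copy__(self):
        # The copy starts from the current state and advances on its own.
        return Counter(self.current, self.stop)


def lookahead(it):
    """Return the remaining items without consuming `it`."""
    if not hasattr(it, "__copy__"):
        # Fail here, rather than at the first attempt to copy.
        raise TypeError("need an iterator that supports __copy__")
    return list(copy.copy(it))


it = Counter(0, 4)
consumed = next(it)
peeked = lookahead(it)
after_peek = next(it)
```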

>> Not quite.  We also need an agreement that calling __iter__ on a container
>> is not a destructive operation unless you call next() on the iterator that
>> you get back.

Ping> What i'd like is an agreement that calling __iter__ on a container is
Ping> not a destructive operation at all.  If it's destructive, then what you
Ping> started with is not really a container, and we should encourage people
Ping> to call attention to this irregularity in their documentation.

Is a file a container or not?  Isn't making an iterator from a file and
calling next() on it a destructive operation?

>> > I think a proliferation of iterator-fetching methods would be a
>> > messy and unpleasant prospect.  After __iter__, __multiter__,
>> > and __ambiter__, what next?  __mutableiter__?
>> > __depthfirstiter__?  __breadthfirstiter__?

>> A data structure that supports several different kinds of iteration
>> has to provide that support somehow.

Ping> Agreed.  I was unclear: what makes me uncomfortable is the pollution
Ping> of the double-underscore namespace.  When you do have a container-like
Ping> object that supports various kinds of iteration, naturally you are
Ping> going to need some methods for getting iterators.  I just think it's
Ping> not appropriate to establish special names for them.

Fair enough.  But then why is __iter__ special?

Ping> To me, the presence of double-underscores around a method name means
Ping> that the method is called automatically.  My expectation is that when
Ping> i write a method with a "normal" name, the name itself will appear
Ping> after a dot wherever that method is used; and that when there's a
Ping> method with a "__special__" name, the method is called implicitly.
Ping> The implicit call can occur via an operator (e.g. __add__), or to
Ping> implement a protocol defined in the language (e.g. __init__), etc.
Ping> If you see the string ".__" it means that something unusual is going on.

Ping> If you follow this convention, then "__iter__" deserves a special name,
Ping> because it is the specially blessed iterator-getter used by "for".
Ping> There may be other iterator-getters, but they must be called explicitly,
Ping> so they shouldn't get underscores.

Ah, is it only "for" that makes __iter__ special, and not iter() ?

Ping> An aside on "next" vs. "__next__":

Ping> This is exactly why the iterator protocol should consist of one
Ping> method named "__next__" rather than two methods named "__iter__"
Ping> (which has nothing to do with the act of iterating!) and "next"
Ping> (which is the one we really care about, but can collide with
Ping> existing method names).

Ping> As far as i know, "next" is the only implicitly-called method of
Ping> an internal protocol that has no underscores.  It's a little late
Ping> to fix the name of "next" in Python 2, though it might be worth
Ping> considering for Python 3.

One way to clarify a discussion of a protocol is to append an "s" and
think of a plurality of protocols, so as to see which properties are
truly intrinsic and which can vary between protocols.  That's part of
what I'm trying to do in this discussion.

(and I don't presently have a strong opinion about what the right answer
is.  I don't even know for sure what the question is.)




From aleax@aleax.it  Wed Jul 17 17:08:51 2002
From: aleax@aleax.it (Alex Martelli)
Date: Wed, 17 Jul 2002 18:08:51 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207161350.g6GDoM522277@odiug.zope.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17USOm-0005RD-00@mail.python.org> <200207161350.g6GDoM522277@odiug.zope.com>
Message-ID: <E17UrN2-0000Re-00@mail.python.org>

On Tuesday 16 July 2002 03:50 pm, Guido van Rossum wrote:
	...
> I dunno.  The presence of seek() and write() makes the behavior of
> files a rather unique blend of iterator and iterable.

All files have seek and write, but not on all files do they work -- and
the same goes for iteration.  I.e., it IS something of a mess, probably
because the file object's is the only example of a "fat interface" problem
in Python -- an interface that exposes a lot of methods, with many
objects claiming they implement that interface but actually lying
(because they only implement a subset of it -- trying to use methods
they can't in fact provide raises exceptions).

The galaxy of Microsoft interfaces based on COM sadly has many
fat interfaces, and that IS the worst mess in that galaxy.

Anyway, a rewindable-iterator is not an iterable in any case.  You
can't have two nested loops on it -- that's crucial.  Making a file
into an iterable requires wrapping it with a class that caches it.

If and when rewindable iterators are recognized as such by Python,
files whose seek(0) method doesn't raise will make a fine example.

But iterables, they ain't, just like rewindable iterators in general aren't.
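
The nested-loop point is easy to see in a few lines of present-day
Python (the function name pairs is made up for illustration):

```python
def pairs(x):
    """Collect (a, b) for two nested loops over the same object."""
    result = []
    for a in x:
        for b in x:
            result.append((a, b))
    return result


items = ["a", "b", "c"]

# A list is an iterable: each "for" gets a fresh iterator.
from_iterable = pairs(items)

# An iterator is shared by both loops: the inner loop uses it up.
from_iterator = pairs(iter(items))
```

With the list, every combination appears; with the iterator, the outer
loop takes "a", the inner loop drains the rest, and the outer loop
then finds nothing left.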


> > I don't see any downside to having this micro-wart removed.  In
> > particular, I don't see what's confusing.  Things that respond to
> > iter(x) fall in two categories:
> >     iterators: also have x.next(), and iter(x) is x
> >     iterables: iter(x) is not x, so you can presumably get another
> >         iterator out of x at some later point in time if needed.
> > It's not QUITE as simple as this, but moving file objects from
> > the second category to the first seems to _simplify_ things a bit.
>
> I worry that equating a file with its iterable makes it more likely
> that people mix next() with readline() or seek(), which doesn't work
> (at least not until the I/O system is rewritten).

It's exactly to DISTINGUISH a file from "its iterable" (which it does
not have) that I'd like files to be iterators, NOT fake iterables.

f.seek does cooperate with f.next now, doesn't it?  since it
invalidates f's xreadlines object, if any?

> I'd be more comfortable with teaching people that you should *either*
> use a file in a for loop (the common case, probably) *or* use its
> native I/O methods (readline() etc.), but not mix both.

Fine (I think BOTH cases are very common), although it will probably
be handier one day if/when the I/O system is indeed rewritten.  But
having "iter(f) is f" isn't really germane to this issue.


> > E.g.:
> >
> > def useIterable(x):
> >     try:
> >         it = iter(x)
> >     except TypeError:
> >         raise TypeError, "Need iterable object, not %s" % type(x)
> >     if it is x:
> >         raise TypeError, "Need iterable object, not iterator"
> >     # keep happily using it and/or x as needed, and in particular
> >     # the code is able to call it1 = iter(x) if it needs to iterate again
> >
> > Not perfect -- but having a file-object argument fail this simplistic
> > test seems better to me, less confusing, than having it pass.
>
> This actually looks like an example of the "look before you leap"
> (LBYL) syndrome, which you disapproved of recently.

Only if you don't look carefully enough.  It uses try/except when
it can (just to change the exception's contents -- probably might
as well not bother and just do it=iter(x) without a try), it uses
a guarded raise statement when it must, because there's no way
it could get an exception out of the case it can't handle.
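
The same check, sketched in present-day Python syntax (the snake_case
name and the return value are additions for illustration).  In current
Python, file-like objects such as io.StringIO are their own iterators,
so they fail the test, which is the outcome argued for here.

```python
import io


def use_iterable(x):
    """Return a fresh iterator over x, rejecting one-shot iterators."""
    try:
        it = iter(x)
    except TypeError:
        # EAFP: let iter() fail, then only improve the message.
        raise TypeError(
            "Need iterable object, not %s" % type(x).__name__
        ) from None
    if it is x:
        # No exception would ever flag this case, so test explicitly.
        raise TypeError("Need iterable object, not iterator")
    return it


def rejection_message(x):
    """Helper for the demonstration: the error text, or None if accepted."""
    try:
        use_iterable(x)
    except TypeError as exc:
        return str(exc)
    return None


accepted = rejection_message([1, 2, 3])
rejected_iterator = rejection_message(iter([1, 2, 3]))
rejected_file = rejection_message(io.StringIO("line\n"))
rejected_int = rejection_message(5)
```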

Consider, by analogy:

def loopUntilConvergence(f, x, epsilon):
    y = f(x)
    while abs(x-y) > epsilon:
        x = y
        y = f(x)
    return y

Now what happens if you mistakenly pass epsilon<0?  Oops -- an
infinite loop.  So, one may add:
    if epsilon<0: raise ValueError, "Need epsilon>=0, not %s" % epsilon

Is this an example of erroneous use of LBYL rather than EAFP?  No,
because no exception would be raised by the infinite loop, so there is
no alternative to doing the checks.

In exactly the same way, there is no alternative to checking in
useIterable, because there is no exception one could count on --
rather, we'd have a case of an error passing silently.

In other words: that EAFP is preferable to LBYL does NOT mean
that one should NEVER use:
    if whatever: raise something
because certain error conditions do reveal themselves only in
ways testable with an if, NOT by raising exceptions themselves.

And some you can't even test with an if, and then you're in
trouble (e.g., in loopUntilConvergence, nothing assures us that
f and the initial x ARE such as to converge -- so, one would
further have a maximum-iteration-count argument, defaulting to
something suitably big, count iterations, and do something of
a look-AFTER-you've-leaped to raise on non-convergence:-).
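
A rough sketch of both checks together, in modern Python 3 syntax.
The parameter name max_iterations, its default, and the choice of
ArithmeticError are assumptions made for illustration:

```python
def loop_until_convergence(f, x, epsilon, max_iterations=10000):
    # Look before leaping: a negative epsilon would loop forever
    # without ever raising anything by itself.
    if epsilon < 0:
        raise ValueError("Need epsilon>=0, not %s" % epsilon)
    y = f(x)
    iterations = 0
    while abs(x - y) > epsilon:
        iterations += 1
        # Look AFTER leaping: non-convergence only shows up by counting.
        if iterations > max_iterations:
            raise ArithmeticError(
                "No convergence after %d iterations" % max_iterations)
        x = y
        y = f(x)
    return y
```

With f = lambda v: v / 2 the loop converges toward zero; with
f = lambda v: v + 1 it never converges and the iteration cap raises.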


This doesn't have all that much to do with file objects being
or not being iterators, but I love rambling discussions anyway:-).


Alex



From aahz@pythoncraft.com  Wed Jul 17 17:17:55 2002
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 17 Jul 2002 12:17:55 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020717104935.A86293@doublegemini.com>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com>
Message-ID: <20020717161754.GA22297@panix.com>

On Wed, Jul 17, 2002, Clark C . Evans wrote:
>
> Sure.  I'll run the survey and report back.  What would
> be the options?  It'll be a simple CGI form using a radio
> or check boxes and a button.  I'll aggregate the results.

Make sure you ask for e-mail addresses to prevent duplicates.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Wed Jul 17 17:23:46 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 12:23:46 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 18:08:51 +0200."
 <E17UrN3-0000SA-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17USOm-0005RD-00@mail.python.org> <200207161350.g6GDoM522277@odiug.zope.com>
 <E17UrN3-0000SA-00@mail.python.org>
Message-ID: <200207171623.g6HGNki02368@odiug.zope.com>

> All files have seek and write, but not on all files do they work -- and
> the same goes for iteration.  I.e., it IS something of a mess, probably
> because the file object's is the only example of the "fat interface" problem
> in Python -- an interface that exposes a lot of methods, with many
> objects claiming they implement that interface but actually lying
> (because they only implement a subset of it -- trying to use methods
> they can't in fact provide raises exceptions).

Yup.  I inherited this from C stdio. :-(

> But iterables, they ain't, just like rewindable iterators in general aren't.

Can you remind me of your definition of "iterable"?  Mine is
"something for which iter() works", which clearly isn't yours. :-)

> f.seek does cooperate with f.next now, doesn't it?  since it
> invalidates f's xreadlines object, if any?

Not yet.  You may have seen Oren's patch for this.  Unfortunately it
has a problem in that it creates a cycle, and neither type supports
GC...

So I'm not sure if it ever will -- this is an implementation mess as
much as a conceptual mess. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Wed Jul 17 17:34:09 2002
From: aleax@aleax.it (Alex Martelli)
Date: Wed, 17 Jul 2002 18:34:09 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171623.g6HGNki02368@odiug.zope.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17UrN3-0000SA-00@mail.python.org> <200207171623.g6HGNki02368@odiug.zope.com>
Message-ID: <E17UrlO-0005w9-00@mail.python.org>

On Wednesday 17 July 2002 06:23 pm, Guido van Rossum wrote:
	...
> > But iterables, they ain't, just like rewindable iterators in general
> > aren't.
>
> Can you remind me of your definition of "iterable"?  Mine is
> "something for which iter() works", which clearly isn't yours. :-)

Right -- I mean something closer to what I've seen others call
"a container".  By your definition, iterators are indeed iterable.
I would love for all iterables-by-your-definition to divide neatly
into iterators and what-many-call-containers.

The file object, unless you make it into an iterator, is not "a
container" like all others and just sits there -- a bit of a wart.

> > f.seek does cooperate with f.next now, doesn't it?  since it
> > invalidates f's xreadlines object, if any?
>
> Not yet.  You may have seen Oren's patch for this.  Unfortunately it

Right -- that's what I had in mind.  I had also tweaked it so that
readline sort of interoperated with it (delegating to next if the file
object is holding an xreadlines object) and sent the modified patch
to Oren but he disliked it (because it meant readline would not
respect its numeric argument, if any, in that case).

> has a problem in that it creates a cycle, and neither type supports
> GC...
>
> So I'm not sure if it ever will -- this is an implementation mess as
> much as a conceptual mess. :-(

I see your point.  Darn!-(.


Alex



From guido@python.org  Wed Jul 17 17:40:11 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 12:40:11 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 18:34:09 +0200."
 <E17Url3-0005w8-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17UrN3-0000SA-00@mail.python.org> <200207171623.g6HGNki02368@odiug.zope.com>
 <E17Url3-0005w8-00@mail.python.org>
Message-ID: <200207171640.g6HGeB302603@odiug.zope.com>

> > Can you remind me of your definition of "iterable"?  Mine is
> > "something for which iter() works", which clearly isn't yours. :-)
> 
> Right -- I mean something closer to what I've seen others call
> "a container".  By your definition, iterators are indeed iterable.
> I would love for all iterables-by-your-definition to divide neatly
> into iterators and what-many-call-containers.
> 
> The file object, unless you make it into an iterator, is not "a
> container" like all others and just sits there -- a bit of a wart.

I must be misunderstanding.  How does making the file object into an
iterator make it a container???

> > > f.seek does cooperate with f.next now, doesn't it?  since it
> > > invalidates f's xreadlines object, if any?
> >
> > Not yet.  You may have seen Oren's patch for this.  Unfortunately it
> 
> Right -- that's what I had in mind.  I had also tweaked it so that
> readline sort of interoperated with it (delegating to next if the file
> object is holding an xreadlines object) and sent the modified patch
> to Oren but he disliked it (because it meant readline would not
> respect its numeric argument, if any, in that case).

Hm, you should've sent it to me.  The numeric argument was a mistake I
think.  Who ever uses it?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gmcm@hypernet.com  Wed Jul 17 17:49:44 2002
From: gmcm@hypernet.com (Gordon McMillan)
Date: Wed, 17 Jul 2002 12:49:44 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171527.g6HFRV214431@europa.research.att.com>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> (message from	Ka-Ping Yee on Tue, 16 Jul 2002 18:18:05 -0700 (PDT))
Message-ID: <3D3567E8.17472.1EF7E920@localhost>

On 17 Jul 2002 at 11:27, Andrew Koenig wrote:

> How about a class that represents a file?  If you
> ask it for a single iterator, that's easy.  If you
> ask it for a multiple iterator, it checks whether
> the file is really an interactive device such as a
> pipe or a keyboard. If so, it uses a buffering
> mechanism to simulate multiple iteration; otherwise,
> it lets the multiple iterators access the file
> directly. 
> 
> Then when you ask to iterate over the file, you
> automatically get the least cumbersome mechanism
> needed to support the kind of iteration that you
> requested. 

OK, it's a pipe, and one iterator wants to go past
what's been received. Is that iterator at EOF? Not
really, just "temporary EOF". So should it block?
But I'm single threaded and receiving asynchronously.
Oh, and it turns out to be a humongous download,
and what happens if the buffering mechanism runs
out of memory / disk space. Does my process die?

Aargh. Too much magic. Too many corner cases.

-- Gordon
http://www.mcmillan-inc.com/




From ark@research.att.com  Wed Jul 17 17:53:27 2002
From: ark@research.att.com (Andrew Koenig)
Date: Wed, 17 Jul 2002 12:53:27 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <3D3567E8.17472.1EF7E920@localhost> (gmcm@hypernet.com)
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> (message from	Ka-Ping Yee on Tue, 16 Jul 2002 18:18:05 -0700 (PDT)) <3D3567E8.17472.1EF7E920@localhost>
Message-ID: <200207171653.g6HGrRb15897@europa.research.att.com>

Gordon> OK, it's a pipe, and one iterator wants to go past
Gordon> what's been received. Is that iterator at EOF? Not
Gordon> really, just "temporary EOF". So should it block?
Gordon> But I'm single threaded and receiving asynchronously.
Gordon> Oh, and it turns out to be a humongous download,
Gordon> and what happens if the buffering mechanism runs
Gordon> out of memory / disk space. Does my process die?

Gordon> Aargh. Too much magic. Too many corner cases.

The implementation of the file should have a choice: Either refuse to
yield a multiple iterator (which seems to be your preference) or yield
one that works (which might or might not be my preference, depending
on circumstances).

In the latter case, I don't think your questions are hard to
answer, because most of the answers fall out of the single-iterator
case.  So if the iterator is at EOF, it should either block or
not, depending on what a single iterator should do.

The only real question is what happens if the buffering mechanism
runs out of space, but that's always a question for such mechanisms;
I don't see why it's any more irksome in this particular context.
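
For illustration only: later Python versions ship itertools.tee, which
buffers a single-pass source so that several iterators can walk it
independently. It postdates this thread, and lines_from_pipe below is
a hypothetical stand-in for a pipe, so treat this as a sketch of the
buffering idea rather than the mechanism proposed here:

```python
import itertools

def lines_from_pipe():
    """Hypothetical single-pass source, standing in for a pipe."""
    for n in range(5):
        yield "line %d\n" % n

source = lines_from_pipe()
first, second = itertools.tee(source, 2)

# 'first' runs ahead; tee keeps the consumed lines buffered for 'second'.
head = [next(first) for _ in range(3)]
rest_of_first = list(first)
all_of_second = list(second)  # sees every line, served from the buffer
```

The buffer grows with the distance between the fastest and slowest
iterator, which is exactly the out-of-space question raised above.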





From aleax@aleax.it  Wed Jul 17 17:55:48 2002
From: aleax@aleax.it (Alex Martelli)
Date: Wed, 17 Jul 2002 18:55:48 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171640.g6HGeB302603@odiug.zope.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17Url3-0005w8-00@mail.python.org> <200207171640.g6HGeB302603@odiug.zope.com>
Message-ID: <E17Us6L-00060l-00@mail.python.org>

On Wednesday 17 July 2002 06:40 pm, Guido van Rossum wrote:
	...
> > The file object, unless you make it into an iterator, is not "a
> > container" like all others and just sits there -- a bit of a wart.
>
> I must be misunderstanding.  How does making the file object into an
> iterator make it a container???

My fault for unclear expression!  I mean: if it's an iterator, it's an
iterator.  All OTHER iterables (iterables that aren't iterators) are
(what some call) containers.

It's not QUITE that way, but Python would be easier to teach if
it were.

> > to Oren but he disliked it (because it meant readline would not
> > respect its numeric argument, if any, in that case).
>
> Hm, you should've sent it to me.  The numeric argument was a mistake I
> think.  Who ever uses it?

Not me, and I think it's advisory anyway according to the docs.

Still, it doesn't solve the reference-loop-between-two-deuced-things-
that-don't-cooperate-with-gc problem.  And I can't see how either
could be made into a WEAK reference given that xreadlines objects
in other contexts need to hold a strong ref to the file they work on --
we'd have to refactor xreadlines objects too, a core part holding a
weak ref and a shell around it (holding a strong ref to the file) to
support ordinary calls to xreadlines.xreadlines.  Messy:-(.


Alex



From mal@lemburg.com  Wed Jul 17 17:55:36 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jul 2002 18:55:36 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3D35A188.20407@lemburg.com>


jhylton@users.sourceforge.net wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv17711/Objects
> 
> Modified Files:
> 	dictobject.c floatobject.c intobject.c listobject.c 
> 	longobject.c rangeobject.c stringobject.c tupleobject.c 
> 	typeobject.c unicodeobject.c xxobject.c 
> Log Message:
> staticforward bites the dust.
> 
> The staticforward define was needed to support certain broken C
> compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the
> static keyword when it was used with a forward declaration of a static
> initialized structure.  Standard C allows the forward declaration with
> static, and we've decided to stop catering to broken C compilers.  (In
> fact, we expect that the compilers are all fixed eight years later.)

You'd think so :-) From a support file of the mx tools:

/* --- Platform or compiler specific tweaks ------------------------------- */

/* Add some platform specific symbols to enable work-arounds for the
    static forward declaration of type definitions; note that the GNU C
    compiler does not have this problem.

    Many thanks to all who have contributed to this list.

*/
#if (!defined(__GNUC__))
# if (defined(NeXT) || defined(sgi) || defined(_AIX) ||
      (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS))
#  define BAD_STATIC_FORWARD
# endif
#endif

/* Some more tweaks for various platforms. */

/* VMS needs this define. Thanks to Jean-François PIÉRONNE */
#if defined(__VMS)
# define __SC__
#endif

/* xlC on AIX doesn't like the Python work-around for static forwards
    in ANSI mode (default), so we switch on extended mode. Thanks to
    Albert Chin-A-Young */
#if defined(__xlC__)
# pragma langlvl extended
#endif

> I'm leaving staticforward and statichere defined in object.h as
> static.  This is only for backwards compatibility with C extensions
> that might still use it.



-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Wed Jul 17 18:38:35 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 13:38:35 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 18:55:48 +0200."
 <E17Us6L-00060l-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17Url3-0005w8-00@mail.python.org> <200207171640.g6HGeB302603@odiug.zope.com>
 <E17Us6L-00060l-00@mail.python.org>
Message-ID: <200207171738.g6HHcZa09875@odiug.zope.com>

> > > The file object, unless you make it into an iterator, is not "a
> > > container" like all others and just sits there -- a bit of a wart.
> >
> > I must be misunderstanding.  How does making the file object into an
> > iterator make it a container???
> 
> My fault for unclear expression!  I mean: if it's an iterator, it's an
> iterator.  All OTHER iterables (iterables that aren't iterators) are
> (what some call) containers.
> 
> It's not QUITE that way, but Python would be easier to teach if
> it were.

But leaving the file object as an exception to the rule helps as a
reminder that it's just a rule of thumb and cannot be taken as
absolute law.

> > > to Oren but he disliked it (because it meant readline would not
> > > respect its numeric argument, if any, in that case).
> >
> > Hm, you should've sent it to me.  The numeric argument was a mistake I
> > think.  Who ever uses it?
> 
> Not me, and I think it's advisory anyway according to the docs.
> 
> Still, it doesn't solve the reference-loop-between-two-deuced-things-
> that-don't-cooperate-with-gc problem.  And I can't see how either
> could be made into a WEAK reference given that xreadlines objects
> in other contexts need to hold a strong ref to the file they work on --
> we'd have to refactor xreadlines objects too, a core part holding a
> weak ref and a shell around it (holding a strong ref to the file) to
> support ordinary calls to xreadlines.xreadlines.  Messy:-(.

I don't think that a weak ref to the file would be sufficient for
xreadlines -- e.g.

    for line in open(filename):
        print line,

would close the file right away.
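
The effect can be sketched with stand-in classes in modern Python 3.
FakeFile and WeakLineReader are hypothetical names, not the real file
or xreadlines types, and the immediate finalization relies on
CPython's reference counting:

```python
import gc
import weakref

class FakeFile:
    """Hypothetical stand-in for a file; records when it is finalized."""
    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("closed")

class WeakLineReader:
    """Hypothetical helper that holds only a weak reference to its file."""
    def __init__(self, f):
        self.file_ref = weakref.ref(f)

log = []
# The temporary FakeFile has no owner other than the weak reference.
reader = WeakLineReader(FakeFile(log))
gc.collect()  # not needed under CPython; added for other implementations
file_is_gone = reader.file_ref() is None
```

Under CPython the temporary is finalized as soon as the constructor
call returns, so the reader is left pointing at nothing.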

Likewise, the file needs a strong ref to the xreadlines, otherwise the
following would create a new iterator in the second for loop, and lose
data buffered by the first iterator.

    f = open(filename)
    it = iter(f)
    for i in range(10):
        it.next()
    del it
    for line in f:
        print line,

I think I will have to reject Oren's patch because of this, and the
situation with file iterators will remain as it is: once you've asked
for the iterator, all operations on the file are unsafe, and the only
way to get back to using the file is to abandon the file and do an
absolute seek on the file.  (This is sort of like switching between
the raw integer file descriptor and the stream object in C -- or in
Python if you care to use f.fileno() and os.read() etc.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Wed Jul 17 18:46:42 2002
From: aleax@aleax.it (Alex Martelli)
Date: Wed, 17 Jul 2002 19:46:42 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171738.g6HHcZa09875@odiug.zope.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17Us6L-00060l-00@mail.python.org> <200207171738.g6HHcZa09875@odiug.zope.com>
Message-ID: <E17Ustw-0003DY-00@mail.python.org>

On Wednesday 17 July 2002 07:38 pm, Guido van Rossum wrote:
	...
> But leaving the file object as an exception to the rule helps as a
> reminder that it's just a rule of thumb and cannot be taken as
> absolute law.

The sublunar world has enough reminders of its imperfections that
we need not strive to add more.


> > Still, it doesn't solve the reference-loop-between-two-deuced-things-
> > that-don't-cooperate-with-gc problem.  And I can't see how either
> > could be made into a WEAK reference given that xreadlines objects
> > in other contexts need to hold a strong ref to the file they work on --
> > we'd have to refactor xreadlines objects too, a core part holding a
> > weak ref and a shell around it (holding a strong ref to the file) to
> > support ordinary calls to xreadlines.xreadlines.  Messy:-(.
>
> I don't think that a weak ref to the file would be sufficient for
> xreadlines -- e.g.
>
>     for line in open(filename):
>         print line,
>
> would close the file right away.

If the iterator were the file itself, no it wouldn't, whatever kind of
ref the xreadlines object had to the file.

What would break without refactoring would be:

    for line in xreadlines.xreadlines(open(filename)):
        ...

The refactoring would be to have, say, an _xreadlines object, with
the functionality of today's xreadlines object BUT a weak ref to
the file, and an xreadlines object with strong refs to the file and
the _xreadlines object and delegating functionality to the latter.
A bit of a mess.


> Likewise, the file needs a strong ref to the xreadlines, otherwise the

Definitely!  Otherwise nothing keeps the xreadlines (or _xreadlines)
object around _at all_ -- it's even worse than you indicate below, it
seems to me:

> following would create a new iterator in the second for loop, and lose
> data buffered by the first iterator.
>
>     f = open(filename)
>     it = iter(f)

...with the patch it would be "it is f", and so, I don't really get it...

>     for i in range(10):
>         it.next()
>     del it
>     for line in f:
>         print line,
>
> I think I will have to reject Oren's patch because of this, and the
> situation with file iterators will remain as it is: once you've asked
> for the iterator, all operations on the file are unsafe, and the only
> way to get back to using the file is to abandon the file and do an

Abandon the iterator, you mean?  Or am I hopelessly confused?

> absolute seek on the file.  (This is sort of like switching between
> the raw integer file descriptor and the stream object in C -- or in
> Python if you care to use f.fileno() and os.read() etc.)

In these cases you do get some control on the buffering, though,
if you care to exercise it.


Alex



From guido@python.org  Wed Jul 17 19:07:31 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 14:07:31 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 19:46:42 +0200."
 <E17Ustw-0003DY-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17Us6L-00060l-00@mail.python.org> <200207171738.g6HHcZa09875@odiug.zope.com>
 <E17Ustw-0003DY-00@mail.python.org>
Message-ID: <200207171807.g6HI7VS10049@odiug.zope.com>

OK, I'll wait to see if someone submits a working patch.  I still find
it a non-issue myself.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ark@research.att.com  Wed Jul 17 19:08:50 2002
From: ark@research.att.com (Andrew Koenig)
Date: 17 Jul 2002 14:08:50 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171738.g6HHcZa09875@odiug.zope.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]>
 <E17Url3-0005w8-00@mail.python.org>
 <200207171640.g6HGeB302603@odiug.zope.com>
 <E17Us6L-00060l-00@mail.python.org>
 <200207171738.g6HHcZa09875@odiug.zope.com>
Message-ID: <yu99fzyifcr1.fsf@europa.research.att.com>

Guido> Likewise, the file needs a strong ref to the xreadlines,
Guido> otherwise the following would create a new iterator in the
Guido> second for loop, and lose data buffered by the first iterator.

Guido>     f = open(filename)
Guido>     it = iter(f)
Guido>     for i in range(10):
Guido>         it.next()
Guido>     del it
Guido>     for line in f:
Guido>         print line,

Guido> I think I will have to reject Oren's patch because of this, and
Guido> the situation with file iterators will remain as it is: once
Guido> you've asked for the iterator, all operations on the file are
Guido> unsafe, and the only way to get back to using the file is to
Guido> abandon the file and do an absolute seek on the file.

This implies that you don't expect the code above to work correctly, right?

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark



From guido@python.org  Wed Jul 17 19:30:09 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 14:30:09 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 14:08:50 EDT."
 <yu99fzyifcr1.fsf@europa.research.att.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17Url3-0005w8-00@mail.python.org> <200207171640.g6HGeB302603@odiug.zope.com> <E17Us6L-00060l-00@mail.python.org> <200207171738.g6HHcZa09875@odiug.zope.com>
 <yu99fzyifcr1.fsf@europa.research.att.com>
Message-ID: <200207171830.g6HIU9i10168@odiug.zope.com>

> Guido> Likewise, the file needs a strong ref to the xreadlines,
> Guido> otherwise the following would create a new iterator in the
> Guido> second for loop, and lose data buffered by the first iterator.
> 
> Guido>     f = open(filename)
> Guido>     it = iter(f)
> Guido>     for i in range(10):
> Guido>         it.next()
> Guido>     del it
> Guido>     for line in f:
> Guido>         print line,
> 
> Guido> I think I will have to reject Oren's patch because of this, and
> Guido> the situation with file iterators will remain as it is: once
> Guido> you've asked for the iterator, all operations on the file are
> Guido> unsafe, and the only way to get back to using the file is to
> Guido> abandon the file and do an absolute seek on the file.
> 
> This implies that you don't expect the code above to work correctly, right?

I think that Oren's patch would make this work (the iterator requested
by the second for loop would return the same iterator as the first
one, since it's cached in the file object), but at the cost of an
unbreakable cycle between the file and the xreadlines object.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy@alum.mit.edu  Wed Jul 17 19:38:57 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jul 2002 14:38:57 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
In-Reply-To: <3D35A188.20407@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
Message-ID: <15669.47553.15097.651868@slothrop.zope.com>

Sigh :-(.  Both C89 and C99 say that what we're doing is legal.

I'll try this on the SF compile farm's Tru64 and see where I get.
Reports of failures on other platforms would be appreciated.  (Actual
compiler output rather than include files.  I don't want to believe
you <0.3 wink>.) 

Jeremy




From guido@python.org  Wed Jul 17 19:52:15 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 14:52:15 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21
In-Reply-To: Your message of "Wed, 17 Jul 2002 14:38:57 EDT."
 <15669.47553.15097.651868@slothrop.zope.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
Message-ID: <200207171852.g6HIqFY10241@odiug.zope.com>

I also note that there seems to be a typo in Marc-Andre's include
file:

      (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS))

Note the missing 'p' in 'TrueComaq64'.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin@v.loewis.de  Wed Jul 17 19:57:07 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 17 Jul 2002 20:57:07 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21
In-Reply-To: <200207171852.g6HIqFY10241@odiug.zope.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <200207171852.g6HIqFY10241@odiug.zope.com>
Message-ID: <m33cuii3ng.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

>       (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS))
> 
> Note the missing 'p' in 'TrueComaq64'.

I thought it had superfluous 'q' instead...

Regards,
Martin



From tim@zope.com  Wed Jul 17 20:09:04 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 17 Jul 2002 15:09:04 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17Us6L-00060l-00@mail.python.org>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEAJDHAA.tim@zope.com>

Note that it's easy to make objects cooperate with gc.  We've historically
only done so when the need was clear, because the gc header takes about a
dozen extra bytes per gc-tracked object.  There aren't enough files or
xreadlines objects in existence to care about the extra memory burden here,
though; we simply thought that objects of these types could never be in
cycles.  OTOH, if that means lazy code like

    for fname in os.listdir('.'):
        for line in file(fname):
            n += 1

would accumulate an ever-growing number of open file objects until gc
happened to run and break cycles, I expect a lot of CPython programs would
"suddenly break" (they rely on refcount semantics now closing the anonymous
file object the instant it becomes unreachable).
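
A small sketch of that difference, in modern Python 3 and assuming
CPython's refcounting. Resource is a hypothetical stand-in for a file
object, and automatic gc is disabled only to make the timing visible:

```python
import gc

class Resource:
    """Hypothetical stand-in for a file; logs when it is finalized."""
    def __init__(self, closed_log):
        self.closed_log = closed_log
        self.partner = None

    def __del__(self):
        self.closed_log.append("closed")

gc.disable()  # keep the cyclic collector from running on its own
closed = []

plain = Resource(closed)
del plain                       # refcount hits zero: finalized at once
after_plain = list(closed)

a = Resource(closed)
b = Resource(closed)
a.partner = b
b.partner = a                   # a <-> b form a reference cycle
del a, b                        # unreachable, but refcounts stay nonzero
after_cycle_del = list(closed)

gc.collect()                    # only now is the cycle found and broken
after_collect = list(closed)
gc.enable()
```

The objects in the cycle stay alive after del and are finalized only
when the collector runs, which for files would mean descriptors left
open in the meantime.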




From cce@clarkevans.com  Wed Jul 17 20:41:36 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Wed, 17 Jul 2002 15:41:36 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171503.g6HF3mW01047@odiug.zope.com>; from guido@python.org on Wed, Jul 17, 2002 at 11:03:48AM -0400
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> <200207171503.g6HF3mW01047@odiug.zope.com>
Message-ID: <20020717154136.A91218@doublegemini.com>

On Wed, Jul 17, 2002 at 11:03:48AM -0400, Guido van Rossum wrote:
| 
| Calling it a dragon sounds way overstated.

Oh.  I wasn't calling it a dragon... I was stating
that Python is dragon free.  

| > | Care to conduct a survey on c.l.py?
| > 
| > Sure.  I'll run the survey and report back. 

Ok.  Here is the survey form for comment before it is posted.

    http://yaml.org/wk/survey?id=pyiter

I'll summarize the results after the survey has run its
course...

Best,

Clark
http://yaml.org




From guido@python.org  Wed Jul 17 20:46:23 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 15:46:23 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 15:41:36 EDT."
 <20020717154136.A91218@doublegemini.com>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> <200207171503.g6HF3mW01047@odiug.zope.com>
 <20020717154136.A91218@doublegemini.com>
Message-ID: <200207171946.g6HJkN210645@odiug.zope.com>

> Ok.  Here is the survey form for comment before it is posted.
> 
>     http://yaml.org/wk/survey?id=pyiter
> 
> I'll summarize the results after the survey has run its
> course...

Fine with me.  Maybe the 4th para ("There really isn't ...") should be
moved up so it becomes the 2nd.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping@zesty.ca  Wed Jul 17 20:58:55 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 17 Jul 2002 14:58:55 -0500 (CDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171409.g6HE9Di00659@odiug.zope.com>
Message-ID: <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>

On Wed, 17 Jul 2002, Guido van Rossum wrote:
> Given that it's really only a very minor problem, I'd rather not
> expend the effort to "fix" this.

I do agree that there is a policy decision to be made about
when it's appropriate to make a protocol change, and that this
should be left to you, Guido.

But i think this is more than a minor problem.  This is a
namespace collision problem, and that's significant.  Naming
the method "next" means that any object with a "next" method
cannot be adapted to support the iterator protocol.  Unfortunately
"next" is a pretty common word and it's quite possible that such
a method name is already in use.

So it's worth thinking through.
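
To make the collision concrete, here is a sketch in modern Python 3,
where the protocol method was later renamed __next__ so that an
ordinary method called next no longer clashes. Playlist and
PlaylistIterator are hypothetical classes invented for illustration:

```python
class Playlist:
    """Hypothetical class whose own API already uses the name 'next'."""
    def __init__(self, tracks):
        self.tracks = list(tracks)
        self.position = 0

    def next(self):
        """Domain method: skip to the next track, return its position."""
        self.position = (self.position + 1) % len(self.tracks)
        return self.position

    def __iter__(self):
        # Iteration state lives in a separate object, so the domain
        # method above keeps its meaning.
        return PlaylistIterator(self)

class PlaylistIterator:
    def __init__(self, playlist):
        self._tracks = playlist.tracks
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._tracks):
            raise StopIteration
        track = self._tracks[self._index]
        self._index += 1
        return track

p = Playlist(["a", "b", "c"])
tracks = list(p)         # uses __iter__ and __next__
new_position = p.next()  # still the domain method
```

With the 2002 spelling of the protocol, Playlist itself could not also
be its own iterator without giving up its existing next method.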


-- ?!ng




From guido@python.org  Wed Jul 17 21:00:54 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 16:00:54 -0400
Subject: [Python-Dev] Going to OSCON?  Give a lightning talk!
Message-ID: <200207172000.g6HK0sp13487@odiug.zope.com>

If you're going to the O'Reilly Open Source Convention next week,
please consider giving a lightning talk.  We have reserved two
45-minute slots in the Python track on Thursday afternoon for
lightning talks.

A lightning talk is a 5-minute tightly-focused presentation on any
subject you like.  You can discuss your favorite extension, rant, sing
the praises of an under-appreciated developer, plug your product or
company, beg for a job, or even present a Shakespearean-style play
(don't laugh --- we had one of these in 2001).

To submit your idea, fill out this simple web form:

http://conferences.oreillynet.com/cs/os2002/create/e_sess?x-t=os2002_lt.create.form

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cce@clarkevans.com  Wed Jul 17 21:11:44 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Wed, 17 Jul 2002 16:11:44 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>; from ping@zesty.ca on Wed, Jul 17, 2002 at 02:58:55PM -0500
References: <200207171409.g6HE9Di00659@odiug.zope.com> <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>
Message-ID: <20020717161144.A91916@doublegemini.com>

On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote:
| But i think this is more than a minor problem.  This is a
| namespace collision problem, and that's significant.  Naming
| the method "next" means that any object with a "next" method
| cannot be adapted to support the iterator protocol.  Unfortunately
| "next" is a pretty common word and it's quite possible that such
| a method name is already in use.

Right, but such objects wouldn't be misleading because they'd
be missing a __iter__ method, correct?

Best,

Clark



From guido@python.org  Wed Jul 17 21:04:12 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 16:04:12 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 14:58:55 CDT."
 <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>
Message-ID: <200207172004.g6HK4C613511@odiug.zope.com>

> But i think this is more than a minor problem.  This is a
> namespace collision problem, and that's significant.  Naming
> the method "next" means that any object with a "next" method
> cannot be adapted to support the iterator protocol.  Unfortunately
> "next" is a pretty common word and it's quite possible that such
> a method name is already in use.

Can you explain this?  Last time I checked CVS, PEP 246 wasn't
implemented yet, so I don't think you mean "adapted" in that sense.
Generally speaking, iterator implementations aren't created by making
changes to an existing class -- they're created by creating a new
class.  The only change to *existing* classes needed is the addition
of an __iter__ method to the underlying container object.  So I'm not
sure what you mean.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Wed Jul 17 21:19:36 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jul 2002 22:19:36 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com>              <15669.47553.15097.651868@slothrop.zope.com> <200207171852.g6HIqFY10241@odiug.zope.com>
Message-ID: <3D35D158.6010908@lemburg.com>

Guido van Rossum wrote:
> I also note that there seems to be a typo in Marc-Andre's include
> file:
> 
>       (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS))
> 
> Note the missing 'p' in 'TrueComaq64'.

Good catch. Thanks.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Wed Jul 17 21:25:19 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jul 2002 22:25:19 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com>	<15669.47553.15097.651868@slothrop.zope.com>	<200207171852.g6HIqFY10241@odiug.zope.com> <m33cuii3ng.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D35D2AF.9050604@lemburg.com>


Martin v. Loewis wrote:
> Guido van Rossum <guido@python.org> writes:
> 
> 
>>      (defined(__osf__) && defined(__DECC)) || defined(TrueComaq64) || defined(__VMS))
>>
>>Note the missing 'p' in 'TrueComaq64'.
> 
> 
> I thought it had a superfluous 'q' instead...

The whole name is superfluous ... it should be TrueHP64 ;-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Wed Jul 17 21:32:38 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jul 2002 22:32:38 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com>
Message-ID: <3D35D466.5090903@lemburg.com>

Jeremy Hylton wrote:
> Sigh :-(.  Both C89 and C99 say that what we're doing is legal.
> 
> I'll try this on the SF compile farm's True64 and see where I get.
> Reports of failures on other platforms would be appreciated.  (Actual
> compiler output rather than include files.  I don't want to believe
> you <0.3 wink>.) 

Can't provide you with that. I simply collect feedback from
users having compile problems in that file.

Note that most of these problems are related to declaring
arrays as static forward (rather than C functions as Python
normally does):

staticforward PyMethodDef mxODBCursor_Methods[];

...tons of code...

statichere PyMethodDef mxODBCursor_Methods[] =
{
     /* DB API interface */
...

I could eliminate those by cleverly rearranging the code,
but have never had an actual need for it.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Wed Jul 17 21:45:11 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 16:45:11 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156 xxobject.c,2.20,2.21
In-Reply-To: Your message of "Wed, 17 Jul 2002 22:32:38 +0200."
 <3D35D466.5090903@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
Message-ID: <200207172045.g6HKjBg13729@odiug.zope.com>

> Can't provide you with that. I simply collect feedback from
> users having compile problems in that file.

Of course you never hear from users when their compiler is fixed so
that a particular work-around is no longer necessary, so you keep
collecting cruft until it collapses under its own weight.

> Note that most of these problems are related to declaring
> arrays as static forward (rather than C functions as Python
> normally does):

Note that staticforward was *only* intended for data declarations.  It
was never intended (nor needed) for functions.

> staticforward PyMethodDef mxODBCursor_Methods[];
> 
> ...tons of code...
> 
> statichere PyMethodDef mxODBCursor_Methods[] =
> {
>      /* DB API interface */
> ...
> 
> I could eliminate those by cleverly rearranging the code,
> but have never had an actual need for it.

You shouldn't need to.

I suggest that we keep Jeremy's checkins in 2.3.  Hopefully during the
alpha or beta release cycle we will find out if there *really* are
still platforms with broken compilers.  At worst, it will show up
after 2.3 final is released, and then we can fix it in 2.3.1.  You
won't have to target mx for 2.3 for another 18 months (assuming the
PBF ever releases Python-in-a-Tie).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping@zesty.ca  Wed Jul 17 21:36:45 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 17 Jul 2002 15:36:45 -0500 (CDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172004.g6HK4C613511@odiug.zope.com>
Message-ID: <Pine.LNX.4.33.0207171520110.11138-100000@server1.lfw.org>

On Wed, 17 Jul 2002, Guido van Rossum wrote:
> > the method "next" means that any object with a "next" method
> > cannot be adapted to support the iterator protocol.  Unfortunately
> > "next" is a pretty common word and it's quite possible that such
> > a method name is already in use.
>
> Can you explain this?  Last time I checked CVS, PEP 246 wasn't
> implemented yet, so I don't think you mean "adapted" in that sense.

No, i didn't -- i just meant in the more general sense.  To make an
object support one of the other internal protocols, like repr(), you
can just add a special method (in Python) or fill in a slot (in C).

> Generally speaking, iterator implementations aren't created by making
> changes to an existing class

Well, i guess that's part of the protocol philosophy.  There exist
cursor-like objects that would be natural candidates for being used
like iterators (files are one example, database cursors are another).
Choosing __next__ makes it possible to add support to an existing
object when appropriate, instead of requiring an auxiliary object
regardless of whether it's appropriate or inappropriate.

To me, the former feels like the more typical Python thing to do,
because it's consistent with the way all the other protocols work.
So it's from this perspective that "next" without underscores
is a wart to me.

For example, when something is container-like you can implement
__getitem__ on the object itself, and then you can use [] with the
object.  Some objects let you fetch containers and some objects
implement __getitem__ on their own.  But we don't force everybody
to provide a convert-to-container operation in all cases before
allowing them to provide __getitem__.
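
A minimal sketch of the collision argument, written for a later
Python in which the iteration method is spelled __next__ (an
assumption; at the time of this thread it was "next").  ResultCursor
and its result-set semantics are hypothetical, invented only for
illustration: the class already uses "next" for something unrelated,
yet it can still support iteration directly because the protocol
method has its own reserved name.

```python
class ResultCursor:
    """Hypothetical cursor whose next() already means something else."""

    def __init__(self, result_sets):
        self._sets = list(result_sets)
        self._set_index = 0
        self._row_index = 0

    def next(self):
        """Pre-existing API: advance to the next result set."""
        self._set_index += 1
        self._row_index = 0
        return self._set_index < len(self._sets)

    def __iter__(self):
        # The cursor is its own iterator; no auxiliary object is needed.
        return self

    def __next__(self):
        """Iteration protocol: return the next row of the current set."""
        if self._set_index >= len(self._sets):
            raise StopIteration
        rows = self._sets[self._set_index]
        if self._row_index >= len(rows):
            raise StopIteration
        row = rows[self._row_index]
        self._row_index += 1
        return row


cursor = ResultCursor([["a", "b"], ["c"]])
moved = cursor.next()   # old-style call: skip to the second result set
rows = list(cursor)     # iteration uses __next__, not next()
```

With an un-underscored protocol method, the same class would have had
to rename its existing next() or hand out a separate iterator object.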


-- ?!ng




From ping@zesty.ca  Wed Jul 17 21:58:11 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 17 Jul 2002 15:58:11 -0500 (CDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020717161144.A91916@doublegemini.com>
Message-ID: <Pine.LNX.4.33.0207171536550.11138-100000@server1.lfw.org>

On Wed, 17 Jul 2002, Clark C . Evans wrote:
> On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote:
> | Naming
> | the method "next" means that any object with a "next" method
> | cannot be adapted to support the iterator protocol.
>
> Right, but such objects wouldn't be misleading because they'd
> be missing a __iter__ method, correct?

__iter__ is a red herring.  It has nothing to do with the act of
iterating.  It exists only to support the use of "for" directly
on the iterator.  Iterators that currently implement "next" but
not "__iter__" will work in some places and not others.  For
example, given this:

    class Counter:
        def __init__(self, last):
            self.i = 0
            self.last = last

        def next(self):
            self.i += 1
            if self.i > self.last: raise StopIteration
            return self.i

    class Container:
        def __init__(self, size):
            self.size = size

        def __iter__(self):
            return Counter(self.size)

This will work:

    >>> for x in Container(3): print x
    ...
    1
    2
    3

But this will fail:

    >>> for x in Counter(3): print x
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: iteration over non-sequence

It's more accurate to say that there are two distinct protocols here.

1.  An object is "for-able" if it implements __iter__ or __getitem__.
    This is a subset of the sequence protocol.

2.  An object can be iterated if it implements next.

The Container supports only protocol 1, and the Counter supports
only protocol 2, with the above results.

Iterators are currently asked to support both protocols.  The
semantics of iteration come only from protocol 2; protocol 1 is
an effort to make iterators look sorta like sequences.  But the
analogy is very weak -- these are "sequences" that destroy
themselves while you look at them -- not like any typical
sequence i've ever seen!

The short of it is that whenever any Python programmer says
"for x in y", he or she had better be darned sure of whether
this is going to destroy y.  Whatever we can do to make this
clear would be a good idea.
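
The "destructive for" concern can be sketched as follows.  This is a
variant of the Counter and Container classes above, not the original
code: it assumes a later Python where the method is named __next__,
and it gives Counter an __iter__ returning itself so that "for"
accepts it.  Looping over the Container twice gives the same values
both times; looping over a Counter twice does not.

```python
class Counter:
    """Iterator: single-pass, consumed as it is read."""

    def __init__(self, last):
        self.i = 0
        self.last = last

    def __iter__(self):
        return self

    def __next__(self):
        self.i += 1
        if self.i > self.last:
            raise StopIteration
        return self.i


class Container:
    """Iterable: hands out a fresh Counter for every loop."""

    def __init__(self, size):
        self.size = size

    def __iter__(self):
        return Counter(self.size)


box = Container(3)
box_first = [x for x in box]
box_second = [x for x in box]          # same values again

counter = Counter(3)
counter_first = [x for x in counter]
counter_second = [x for x in counter]  # already used up: empty
```

Nothing in the syntax of the two loops tells the reader which of the
two behaviours to expect.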


-- ?!ng




From mal@lemburg.com  Wed Jul 17 21:58:15 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jul 2002 22:58:15 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com>              <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com>
Message-ID: <3D35DA67.8060206@lemburg.com>

Guido van Rossum wrote:
>>Can't provide you with that. I simply collect feedback from
>>users having compile problems in that file.
> 
> 
> Of course you never hear from users when their compiler is fixed so
> that a particular work-around is no longer necessary, so you keep
> collecting cruft until it collapses under its own weight.

True; it doesn't hurt too much, though :-)

>>Note that most of these problems are related to declaring
>>arrays as static forward (rather than C functions as Python
>>normally does):
> 
> 
> Note that staticforward was *only* intended for data declarations.  It
> was never intended (nor needed) for functions.

So what I'm doing is intended and what Jeremy corrected
is not. Glad to hear that :-)

>>staticforward PyMethodDef mxODBCursor_Methods[];
>>
>>...tons of code...
>>
>>statichere PyMethodDef mxODBCursor_Methods[] =
>>{
>>     /* DB API interface */
>>...
>>
>>I could eliminate those by cleverly rearranging the code,
>>but have never had an actual need for it.
> 
> 
> You shouldn't need to.
> 
> I suggest that we keep Jeremy's checkins in 2.3.  Hopefully during the
> alpha or beta release cycle we will find out if there *really* are
> still platforms with broken compilers.  At worst, it will show up
> after 2.3 final is released, and then we can fix it in 2.3.1.  You
> won't have to target mx for 2.3 for another 18 months (assuming the
> PBF ever releases Python-in-a-Tie).

It's easy enough for me to add the #defines to the
support header file if you take it out of the distribution,
so it wouldn't hurt.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Wed Jul 17 22:03:53 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jul 2002 23:03:53 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com>              <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com>
Message-ID: <3D35DBB9.9000103@lemburg.com>

M.-A. Lemburg wrote:
>> I suggest that we keep Jeremy's checkins in 2.3.  Hopefully during the
>> alpha or beta release cycle we will find out if there *really* are
>> still platforms with broken compilers.  At worst, it will show up
>> after 2.3 final is released, and then we can fix it in 2.3.1.  You
>> won't have to target mx for 2.3 for another 18 months (assuming the
>> PBF ever releases Python-in-a-Tie).
> 
> 
> It's easy enough for me to add the #defines to the
> support header file if you take it out of the distribution,
> so it wouldn't hurt.

Just an addition: please leave the configure test in the
distribution. While I could implement that using distutils
as well, I would rather benefit from relying on config.h
doing the right thing in case there are some newly broken
compilers out there, e.g. the xlC one on AIX seems to be
a very picky one...

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From ping@zesty.ca  Wed Jul 17 22:07:07 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 17 Jul 2002 16:07:07 -0500 (CDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020717161144.A91916@doublegemini.com>
Message-ID: <Pine.LNX.4.33.0207171600410.11138-100000@server1.lfw.org>

On Wed, 17 Jul 2002, Clark C . Evans wrote:
>
> Right, but such objects wouldn't be misleading because they'd
> be missing a __iter__ method, correct?

Oh, i guess i didn't properly answer your question.  Oops. :)

My answer would be: you could say that, but wouldn't it suck to
have to check for the existence of __iter__ every time you wanted
to call next?

You can legislate that everyone should implement __iter__ together
with next; you can legislate that everyone should check for __iter__
before calling next.  To some extent you have to do both or neither;
one without the other is inconsistent and would lead to surprises.

In practice no one's going to check.  So in practice __iter__
isn't really part of the protocol.
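
A small sketch of what "no one's going to check" looks like in code,
assuming a later Python with the built-in next() function and the
__next__ spelling.  NextOnly and drain are hypothetical names.  The
consumer below only ever asks for the next value, so an object without
__iter__ serves it perfectly well, while a "for" loop rejects the same
object.

```python
class NextOnly:
    """Implements only the next-value half of the protocol."""

    def __init__(self, last):
        self.i = 0
        self.last = last

    def __next__(self):
        self.i += 1
        if self.i > self.last:
            raise StopIteration
        return self.i


def drain(it):
    """Consume an iterator by calling next() only; never looks for __iter__."""
    values = []
    while True:
        try:
            values.append(next(it))
        except StopIteration:
            return values


drained = drain(NextOnly(3))

try:
    for x in NextOnly(3):
        pass
    for_ok = True
except TypeError:
    # "for" needs __iter__ (or __getitem__), which NextOnly lacks.
    for_ok = False
```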


-- ?!ng




From guido@python.org  Wed Jul 17 22:09:33 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 17:09:33 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 15:36:45 CDT."
 <Pine.LNX.4.33.0207171520110.11138-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0207171520110.11138-100000@server1.lfw.org>
Message-ID: <200207172109.g6HL9XJ13871@odiug.zope.com>

> Well, i guess that's part of the protocol philosophy.  There exist
> cursor-like objects that would be natural candidates for being used
> like iterators (files are one example, database cursors are another).
> Choosing __next__ makes it possible to add support to an existing
> object when appropriate, instead of requiring an auxiliary object
> regardless of whether it's appropriate or inappropriate.

OK, that's clear.

> To me, the former feels like the more typical Python thing to do,
> because it's consistent with the way all the other protocols work.
> So it's from this perspective that "next" without underscores
> is a wart to me.

Yes.

> For example, when something is container-like you can implement
> __getitem__ on the object itself, and then you can use [] with the
> object.  Some objects let you fetch containers and some objects
> implement __getitem__ on their own.  But we don't force everybody
> to provide a convert-to-container operation in all cases before
> allowing them to provide __getitem__.

Correct.

Now, weren't you a co-author of the Iterator PEP?  I wish you'd
brought this up then.  Or maybe you did, and I overruled you.  Sorry
then.

But I don't think we can withdraw this so easily.  It's not the end of
the world.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jul 17 22:21:26 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 17:21:26 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 15:58:11 CDT."
 <Pine.LNX.4.33.0207171536550.11138-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0207171536550.11138-100000@server1.lfw.org>
Message-ID: <200207172121.g6HLLQH13946@odiug.zope.com>

> __iter__ is a red herring.  It has nothing to do with the act of
> iterating.  It exists only to support the use of "for" directly
> on the iterator.  Iterators that currently implement "next" but
> not "__iter__" will work in some places and not others.  For
> example, given this:
> 
>     class Counter:
>         def __init__(self, last):
>             self.i = 0
>             self.last = last
> 
>         def next(self):
>             self.i += 1
>             if self.i > self.last: raise StopIteration
>             return self.i
> 
>     class Container:
>         def __init__(self, size):
>             self.size = size
> 
>         def __iter__(self):
>             return Counter(self.size)
> 
> This will work:
> 
>     >>> for x in Container(3): print x
>     ...
>     1
>     2
>     3
> 
> But this will fail:
> 
>     >>> for x in Counter(3): print x
>     ...
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     TypeError: iteration over non-sequence
> 
> It's more accurate to say that there are two distinct protocols here.
> 
> 1.  An object is "for-able" if it implements __iter__ or __getitem__.
>     This is a subset of the sequence protocol.
> 
> 2.  An object can be iterated if it implements next.
> 
> The Container supports only protocol 1, and the Counter supports
> only protocol 2, with the above results.
> 
> Iterators are currently asked to support both protocols.  The
> semantics of iteration come only from protocol 2; protocol 1 is
> an effort to make iterators look sorta like sequences.  But the
> analogy is very weak -- these are "sequences" that destroy
> themselves while you look at them -- not like any typical
> sequence i've ever seen!
> 
> The short of it is that whenever any Python programmer says
> "for x in y", he or she had better be darned sure of whether
> this is going to destroy y.  Whatever we can do to make this
> clear would be a good idea.

This is a very good summary of the two iterator protocols.  Ping,
would you mind adding this to PEP 234?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Wed Jul 17 23:06:45 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 00:06:45 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171807.g6HI7VS10049@odiug.zope.com>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <E17Ustw-0003DY-00@mail.python.org> <200207171807.g6HI7VS10049@odiug.zope.com>
Message-ID: <E17Uwx3-0002YK-00@mail.python.org>

On Wednesday 17 July 2002 08:07 pm, Guido van Rossum wrote:
> OK, I'll wait to see if someone submits a working patch.  I still find
> it a non-issue myself.

OK, I'm gonna give it a try -- kludging up Oren's patch so that
the xreadlines object is able to hold a non-addref'd pointer to
the file object (when it's for internal use of the file object) and,
as long as I'm at it, also including the little further kludge that
makes f.readline delegate to f.next if f is holding an xreadlines
object.  Oh, and dropping the xreadlines object on a seek, too.

It's just a few lines' changes to two files after all,
Objects/fileobject.c and Modules/xreadlines.c.  A bit kludgey
and tricky, admittedly, which is perhaps not the nicest thing
in the world given that fileobject.c isn't the shortest, simplest,
or least crucial part of Python.  But anyway, I think I'll have it
ready by early tomorrow my time (it's past midnight and I'm
past the age for all-nighters:-).


Alex



From ping@zesty.ca  Wed Jul 17 23:40:27 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 17 Jul 2002 15:40:27 -0700 (PDT)
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207172109.g6HL9XJ13871@odiug.zope.com>
Message-ID: <Pine.LNX.4.44.0207171513230.17524-100000@ziggy>

On Wed, 17 Jul 2002, Guido van Rossum wrote:
[...]
> OK, that's clear.
[...]
> Yes.
[...]
> Correct.

Neat!  So much understanding.

> Now, weren't you a co-author of the Iterator PEP?  I wish you'd
> brought this up then.  Or maybe you did, and I overruled you.  Sorry
> then.

Indeed, i wrote the first draft of the PEP, though it was very
different from what we have today; it's been largely rewritten.
The big design changes happened at the iterator BOF, so unfortunately
there's no e-mail record of the debate.

I recall that __iter__ made me uncomfortable, but i don't recall to
what extent i expressed this.  I don't remember whether there was any
overruling.  But it doesn't really matter; it's today now, and here
we are.  It is true that i failed to understand or express the issue
well enough to have an effect on the design.  I will cheerfully accept
blame if it somehow means we'll end up with a nicer language. :)

> But I don't think we can withdraw this so easily.  It's not the end of
> the world.

I would be pleased to see a migration path (perhaps along the lines
of Dave's suggestion, with warnings for a while), but i won't throw
myself off a bridge if it doesn't happen.

I do think there is some potential for errors caused by misunderstandings
about whether or not "for x in y" is destructive.  That's the thing that
worries me the most.  I think this is the main reason why the old
practice of abusing __getitem__ was bad, and thus helped to motivate
iterators in the first place.  It seems serious enough that migrating to
something that distinguishes destructive-for from non-destructive-for
could indeed be worth the cost.

The destructive-for issue may seem superficially unrelated to the
__next__-naming issue.  As i see it, the __next__-naming issue is
related to the mandatory-__iter__ issue (because some people view
__iter__ as a type flag), which is related to the destructive-for issue.


-- ?!ng




From ping@zesty.ca  Wed Jul 17 23:53:42 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Wed, 17 Jul 2002 15:53:42 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172121.g6HLLQH13946@odiug.zope.com>
Message-ID: <Pine.LNX.4.44.0207171540460.17524-100000@ziggy>

I wrote:
> __iter__ is a red herring.
[...blah blah blah...]
> The short of it is that whenever any Python programmer says
> "for x in y", he or she had better be darned sure of whether
> this is going to destroy y.  Whatever we can do to make this
> clear would be a good idea.

Guido wrote:
> This is a very good summary of the two iterator protocols.  Ping,
> would you mind adding this to PEP 234?

And i thought it was a critique.  Fascinating, Captain. :)

I'm happy to add the text, but i want to be clear, then: is it
acceptable to write an iterator that only provides <next> if you
only care about the "iteration protocol" and not the "for-able
protocol"?

I see that "ought to" is the strongest opinion the PEP is willing to
give on the topic:

    A class is a valid iterator object when it defines a next()
    method that behaves as described above.  A class that wants
    to be an iterator also ought to implement __iter__()
    returning itself.


-- ?!ng




From greg@cosc.canterbury.ac.nz  Thu Jul 18 00:18:28 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 11:18:28 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171503.g6HF3mW01047@odiug.zope.com>
Message-ID: <200207172318.g6HNIRS23784@oma.cosc.canterbury.ac.nz>

Guido:
> - The mapping between the next() method and the tp_iternext slot in
>   the type object would disappear, and instead the __next__() method
>   would be mapped to this slot.

For what it's worth, I took it upon myself to "fix"
this already in Pyrex extension types. So if you make 
this change, you'll be making Python more compatible
with Pyrex. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 00:25:26 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 11:25:26 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171527.g6HFRV214431@europa.research.att.com>
Message-ID: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz>

Andrew Koenig <ark@research.att.com>:

> Is a file a container or not?

I would say no, a file object is not a container in Python terms.
You can't index it with [] or use len() on it or any of
the other things you expect to be able to do on containers.

I think we just have to live with the idea that there are
things other than containers that can supply iterators.
Forcing everything that can supply an iterator to bend
over backwards to try to be a random-access container
as well would be too cumbersome.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 00:32:22 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 11:32:22 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17UrN2-0000Re-00@mail.python.org>
Message-ID: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> All files have seek and write, but not on all files do they work -- and
> the same goes for iteration.  I.e., it IS something of a mess

I've just had a thought. Maybe it would be less of a mess
if what we are calling "iterators" had been called "streams"
instead. Then the term "iterator" could have been reserved
for the special case of an object that provides stream
access to a random-access collection.

Then you could say that a file object is a stream object
that provides line-by-line access to an OS file. Other
stream objects can be constructed that give access to
the OS file in other units. That would all make sense
without seeming to imply any multi-pass ability.
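
As a rough illustration of "the same file in other units", here is a
sketch assuming Python 3; `chunks` is a hypothetical helper, not an
existing API. Both results are single-pass streams over the same
underlying data.

```python
import io
from functools import partial


def chunks(f, size):
    """Return a single-pass stream of size-character chunks read from f."""
    # iter(callable, sentinel) keeps calling f.read(size) until it returns "".
    return iter(partial(f.read, size), "")


by_line = list(io.StringIO("ab\ncd\n"))               # line-by-line access
by_chunk = list(chunks(io.StringIO("ab\ncd\n"), 4))   # 4-character units
```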

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
 



From greg@cosc.canterbury.ac.nz  Thu Jul 18 00:39:05 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 11:39:05 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17UrN2-0000Re-00@mail.python.org>
Message-ID: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> the file object's is the only example of "fat interface" problem
> in Python -- an interface that exposes a lot of methods, with many
> objects claiming they implement that interface but actually lying

Maybe the existing file object should be split up into
some number of other objects with smaller interfaces.

For example, instead of the file object actually accessing an
OS file itself, it could just be a wrapper around an
underlying "bytestream" object, which implements only
read() and write().

Then, instead of implementing your own file-like object,
you would implement a new bytestream object instead, and
wrap it in a standard file object. That would give you
all the flavours of access automatically without having
to implement them yourself and without lying about
anything.
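
For what it's worth, the io module in later Pythons is layered in roughly
this way. A sketch under that assumption (Python 3; `PayloadBytes` is a
made-up class): the raw object supplies only the byte-level read, and the
standard wrappers add buffering, line reading and iteration.

```python
import io


class PayloadBytes(io.RawIOBase):
    """Hypothetical minimal bytestream serving bytes from memory."""

    def __init__(self, payload):
        self._payload = payload
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, buf):
        # Copy as many bytes as fit into buf; returning 0 signals end of stream.
        chunk = self._payload[self._pos:self._pos + len(buf)]
        buf[:len(chunk)] = chunk
        self._pos += len(chunk)
        return len(chunk)


raw = PayloadBytes(b"one\ntwo\n")
wrapped = io.TextIOWrapper(io.BufferedReader(raw), encoding="ascii")
lines = list(wrapped)   # line iteration comes from the wrappers, not from raw
```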

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 00:48:14 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 11:48:14 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17Us6L-00060l-00@mail.python.org>
Message-ID: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> Still, it doesn't solve the reference-loop-between-two-deuced-things-
> that-don't-cooperate-with-gc problem.

Would making them cooperate with GC be a difficult
thing to do? Seems to me we should be moving towards
making everything cooperate with GC, and fixing
things like this whenever they come to light.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 00:55:27 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 11:55:27 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171738.g6HHcZa09875@odiug.zope.com>
Message-ID: <200207172355.g6HNtRG23895@oma.cosc.canterbury.ac.nz>

Guido:

> Likewise, the file needs a strong ref to the xreadlines, otherwise the
> following would create a new iterator in the second for loop, and lose
> data buffered by the first iterator.

To me, these problems are screaming out that the
buffer *shouldn't* be kept in the xreadlines object!

Maybe the xreadlines object's buffer should be kept
in the file object? Then it wouldn't matter if
multiple xreadlines objects were created, as
they'd all share the same buffer, and there would
be no reference loops.

Hmmm... then we're moving towards making the
file object and the xreadlines object be the
same object. What was the reason for not doing
that again? Was it just to avoid changing a lot
of code, or was there some reason it wouldn't
work?
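
A sketch of what "the same object" would mean in practice, assuming
Python 3, where file-like objects do behave this way (io.StringIO stands
in for a real file): iter(f) returns f itself, so a second loop picks up
where the first stopped and nothing buffered is lost.

```python
import io

f = io.StringIO("a\nb\nc\n")
same_object = iter(f) is f      # the file is its own iterator

first = []
for line in f:
    first.append(line)
    break                       # abandon the first loop early

rest = [line for line in f]     # second loop continues from the same state
```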

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 01:01:39 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 12:01:39 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>
Message-ID: <200207180001.g6I01dU23913@oma.cosc.canterbury.ac.nz>

> Unfortunately "next" is a pretty common word and it's quite possible
> that such a method name is already in use.

It is -- all my scanners have a "next" method that
does something different from what an iterator's
"next" is supposed to do. Fortunately I haven't
had an urge to make any of them into an iterator
yet.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From tdelaney@avaya.com  Thu Jul 18 01:05:32 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Thu, 18 Jul 2002 10:05:32 +1000
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A445@natasha.auslabs.avaya.com>

> From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz]
> 
> Would making them cooperate with GC be a difficult
> thing to do? Seems to me we should be moving towards
> making everything cooperate with GC, and fixing
> things like this whenever they come to light.

It would sure annoy those people who insist that

    file(f, 'w').write(s)

is a safe idiom ... :)
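
The idiom depends on the file object being finalized (and so flushed and
closed) as soon as the last reference goes away. A sketch of the explicit
alternative, assuming Python 3 and its `with` statement, which postdates
this thread; the temporary path is only for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# The file is closed when the block exits, whatever the collector does.
with open(path, "w") as out:
    out.write("hello\n")
was_closed = out.closed

with open(path) as src:
    content = src.read()
```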

Tim Delaney



From greg@cosc.canterbury.ac.nz  Thu Jul 18 01:05:39 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 12:05:39 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172004.g6HK4C613511@odiug.zope.com>
Message-ID: <200207180005.g6I05dO23924@oma.cosc.canterbury.ac.nz>

Guido:

> Generally speaking, iterator implementations aren't created by making
> changes to an existing class

Continuing with my scanner example, it's conceivable
that I might want to give it an iterator interface
as an alternative to the existing one -- it's already
an iterator, really, it just doesn't have the
new standard iterator interface.
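
Something along these lines, perhaps. This is only a sketch assuming
Python 3; the `Scanner` class and its read() method are invented for
illustration and are not the real scanner interface. The existing method
keeps its name, and the iterator interface is layered on top of it.

```python
class Scanner:
    """Hypothetical scanner whose own interface predates the iterator protocol."""

    def __init__(self, text):
        self._tokens = text.split()
        self._pos = 0

    def read(self):
        """Return the next token, or None at end of input."""
        if self._pos >= len(self._tokens):
            return None
        tok = self._tokens[self._pos]
        self._pos += 1
        return tok

    def __iter__(self):
        # iter(callable, sentinel) calls read() until it returns None.
        return iter(self.read, None)


tokens = list(Scanner("x = y + 1"))
```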

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 01:25:45 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 12:25:45 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207171540460.17524-100000@ziggy>
Message-ID: <200207180025.g6I0Pjw23965@oma.cosc.canterbury.ac.nz>

Ka-Ping:

> is it
> acceptable to write an iterator that only provides <next> if you
> only care about the "iteration protocol" and not the "for-able
> protocol"?

Probably just as acceptable as writing a file object
that only provides some of the file methods.
Seems quite Pythonic to me.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Thu Jul 18 01:29:55 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 12:29:55 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A445@natasha.auslabs.avaya.com>
Message-ID: <200207180029.g6I0TtE23981@oma.cosc.canterbury.ac.nz>

"Delaney, Timothy" <tdelaney@avaya.com>:
> Me:
> > Would making them cooperate with GC be a difficult
> > thing to do?
> 
> It would sure annoy those people who insist that
> 
>     file(f, 'w').write(s)
> 
> is a safe idiom ... :)

Well, it would serve them right! :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From guido@python.org  Thu Jul 18 01:43:20 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jul 2002 20:43:20 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 15:53:42 PDT."
 <Pine.LNX.4.44.0207171540460.17524-100000@ziggy>
References: <Pine.LNX.4.44.0207171540460.17524-100000@ziggy>
Message-ID: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net>

> > This is a very good summary of the two iterator protocols.  Ping,
> > would you mind adding this to PEP 234?
> 
> And i thought it was a critique.  Fascinating, Captain. :)
> 
> I'm happy to add the text, but i want to be clear, then: is it
> acceptable to write an iterator that only provides <next> if you
> only care about the "iteration protocol" and not the "for-able
> protocol"?

No, an iterator ought to provide both, but it's good to recognize that
there *are* two protocols.

> I see that "ought to" is the most opinion the PEP is willing to
> give on the topic:
> 
>     A class is a valid iterator object when it defines a next()
>     method that behaves as described above.  A class that wants
>     to be an iterator also ought to implement __iter__()
>     returning itself.

I would like to see this strengthened.  I envision "iterator algebra"
code that really needs to be able to do a for loop over an iterator
when it feels like it.
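
A small sketch of the two protocols, assuming Python 3 (where the method
is spelled __next__ rather than next); the class and helper names are
made up. An object with only the "iteration protocol" can be stepped by
hand but cannot be handed to a for loop.

```python
class Countdown:
    """Supports both protocols: __next__ plus __iter__ returning self."""

    def __init__(self, start):
        self.n = start

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

    def __iter__(self):
        return self


class NextOnly:
    """Supports only the iteration protocol (no __iter__)."""

    def __init__(self, start):
        self.n = start

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


def for_able(obj):
    """Return True if a for loop (or iter()) would accept obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


it = Countdown(3)
first = next(it)             # step by hand...
rest = [x for x in it]       # ...then loop over the very same iterator
stepped = next(NextOnly(3))  # stepping by hand works without __iter__
```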

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Thu Jul 18 07:52:08 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 08:52:08 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17Uwx3-0002YK-00@mail.python.org>
References: <r01050300-1015-01CDBF2B94A211D6B669003065D5E7E4@[10.0.0.23]> <200207171807.g6HI7VS10049@odiug.zope.com> <E17Uwx3-0002YK-00@mail.python.org>
Message-ID: <E17V598-0008PN-00@mail.python.org>

On Thursday 18 July 2002 12:06 am, Alex Martelli wrote:
> On Wednesday 17 July 2002 08:07 pm, Guido van Rossum wrote:
> > OK, I'll wait to see if someone submits a working patch.  I still find
> > it a non-issue myself.
>
> OK, I'm gonna give it a try -- kludging up Oren's patch so that

Done, now submitted as patch 583235.


Alex



From jeremy@alum.mit.edu  Thu Jul 18 17:58:05 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 12:58:05 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: <3D35DA67.8060206@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
Message-ID: <15670.62365.517118.775364@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  >>> Note that most of these problems are related to declaring arrays
  >>> as static forward (rather than C functions as Python normally
  >>> does):
  >>
  >>
  >> Note that staticforward was *only* intended for data
  >> declarations.  It was never intended (nor needed) for functions.

  MAL> So what I'm doing is intended and what Jeremy corrected is
  MAL> not. Gald to hear that :-)

staticforward was intended for data declarations, but was widely
misused within the core for function prototypes.  

The intended use for data declarations was to have the initial
declaration be staticforward and the initialization use statichere.
Although this was the intent, most uses did not follow this pattern.
It was common to use staticforward the first time and static the
second; this was pretty harmless as statichere always expanded to
static.  It was also common to use staticforward in both places, which
ended up declaring it as extern rather than static.

BTW, I'm also gald to hear that what I correct is not intended.

Jeremy




From jeremy@alum.mit.edu  Thu Jul 18 18:02:11 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 13:02:11 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
In-Reply-To: <3D35DBB9.9000103@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
Message-ID: <15670.62611.943840.954629@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  MAL> M.-A. Lemburg wrote:
  >>> I suggest that we keep Jeremy's checkins in 2.3.  Hopefully
  >>> during the alpha or beta release cycle we will find out if there
  >>> *really* are still platforms with broken compilers.  At worst,
  >>> it will show up after 2.3 final is released, and then we can fix
  >>> it in 2.3.1.  You won't have to target mx for 2.3 for another 18
  >>> months (assuming the PBF ever releases Python-in-a-Tie).
  >>
  >>
  >> It's easy enough for me to add the #defines to the support header
  >> file if you take it out of the distribution, so it wouldn't hurt.

  MAL> Just an addition: please leave the configure test in the
  MAL> distribution. While I could implement that using distutils as
  MAL> well, I would rather benefit from relying on config.h doing the
  MAL> right thing in case there are some newly broken compilers out
  MAL> there, e.g. the xlC one on AIX seems to be a very picky one...

I don't understand what your goal is.  Why do you want the configure
test if your header file has a bunch of platform-specific ifdefs?  If
these platforms actually had a problem, the configure test would have
caught it and you wouldn't need the ifdefs.  The only way the ifdefs
would have an effect is if the configure test did not detect a
problem; but if the configure test didn't detect a problem, then you
don't need the ifdefs.

Jeremy




From jmiller@stsci.edu  Thu Jul 18 16:59:16 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Thu, 18 Jul 2002 11:59:16 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
References: <20020715175256.5971.qmail@web40112.mail.yahoo.com>
Message-ID: <3D36E5D4.80308@stsci.edu>

Scott Gilbert wrote:

> --- Todd Miller <jmiller@stsci.edu> wrote:
> > > I don't understand what you say, but I believe you.
> >
> > I meant we call  PyBuffer_FromReadWriteObject and the resulting buffer
> > lives longer than the extension function call that created it.   I have
> > heard that it is possible for the original object to "move" leaving the
> > buffer object pointer to it dangling.
>
> Yes.  The PyBufferObject grabs the pointer from the PyBufferProcs
> supporting object when the PyBufferObject is created.  If the PyBufferProcs
> supporting object reallocates the memory (possibly from a resize) the

Thanks for the example.

> PyBufferObject can be left with a bad pointer.  This is easily possible if
> you try to use the array module arrays as a buffer.

This is good to know.

> I've submitted a patch to fix this particular problem (among others), but
> there are still enough things that the buffer object can't do that
> something new is needed.

I understand.  I saw your patches and they sounded good to me.

> > > > > Maybe instead of the buffer() function/type, there should be a way to
> > > > > allocate raw memory?
> > > >
> > > > Yes.    It would also be nice to be able to:
> > > >
> > > > 1.  Know (at the python level) that a type supports the buffer C-API.
> > >
> > > Good idea.  (I guess right now you can see if calling buffer() with an
> > > instance as argument works. :-)
> > >
> > > > 2.  Copy bytes from one buffer to another (writeable buffer).
>
> And the copy operations shouldn't create any large temporaries:

I agree with this completely.  I could summarize my opinion by saying that
while I regard the current buffering system as pretty complete, the buffer
object places emphasis on the wrong behavior.  In terms of modelling memory
regions, strings are the wrong way to go.

>   buf1 = memory(50000)
>   buf2 = memory(50000)
>   # no 10K temporary should be created in the next line
>   buf1[10000:20000] = buf2[30000:40000]
>
> The current buffer object could be used like this, but it would create a
> temporary string.

Looking at buffering most of this week, the fact that mmap slicing also
returns strings is one justification I've found for having a buffer object,
i.e., mmap slicing is not a substitute for the buffer object.  The buffer
object makes it possible to partition a mmap or any bufferable object into
pseudo-independent, possibly writable, pieces.

One justification to have a new buffer object is pickling (one of Scott's
posts alerted me to this).  I think the behavior we want for numarray is
to be able to pickle a view of a bufferable object more or less like a
string containing the buffer image, and to unpickle it as a memory object.
The prospect of adding pickling support makes me wonder if separating the
allocator and view aspects of the buffer object is a good idea;  I thought
it was, but now I wonder.

> So getting an efficient copy operation seems to require that slices just
> create new "views" to the same memory.

Other justifications for a new buffer object might be:

1. The ability to partition any bufferable object into regions which can
be passed around.  These regions would themselves be buffers.

2. The ability to efficiently pickle a view of any bufferable object.

> > > Maybe you would like to work on a requirements gathering for a memory
> > > object
> >
> > Sure.  I'd be willing to poll comp.lang.python (python-list?) and
> > collate the results of any discussion that ensues.  Is that what you had
> > in mind?
>
> In the PEP that I'm drafting, I've been calling the new object "bytes"
> (since it is just a simple array of bytes).  Now that you guys are
> referring to it as the "memory object", should I change the name?  Doesn't
> really matter, but it might avoid confusion to know we're all talking about
> the same thing.

Calling this a memory type sounds the best to me.  The question I have not
resolved for myself is whether there should be one type which "does it all"
or two types, a memory allocator and a bufferable object manipulator.




From lalo@laranja.org  Thu Jul 18 16:03:40 2002
From: lalo@laranja.org (Lalo Martins)
Date: Thu, 18 Jul 2002 12:03:40 -0300
Subject: [Python-Dev] PEP 292-related: why string substitution is not the same operation as data formatting
In-Reply-To: <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
References: <20020623181630.GN25927@laranja.org> <200207121447.g6CElY808029@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020718150340.GB1209@laranja.org>

On Fri, Jul 12, 2002 at 10:47:34AM -0400, Guido van Rossum wrote:
> > Guido, can you please, for our enlightenment, tell us what are the
> > reasons you feel %(foo)s was a mistake?
> 
> Because of the trailing 's'.  It's very easy to leave it out by
> mistake, and because the definition of printf formats skips over
> spaces (don't ask me why), the first character of the following word
> is used as the type indicator.

In case that wasn't clear, I agree with that - I asked because I wanted this
in writing for the record.

BTW: IIRC, it skips over spaces because spaces are a valid format modifier
(meaning "pad with spaces").
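
A short demonstration of the pitfall, assuming Python 3 and made-up
dictionary keys. In current Python the space is accepted as a flag (it
leaves a blank before a positive number), so with the 's' left out the
next letter is taken as the conversion type.

```python
values = {"count": 3}

intended = "%(count)s items" % values   # 's' present: plain substitution
mistyped = "%(count) items" % values    # 's' missing: parsed as "% i" + "tems"

# With a string value the stray 'i' conversion fails outright.
raised = False
try:
    "%(name) is here" % {"name": "Lalo"}
except TypeError:
    raised = True
```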

[]s,
                                               |alo
                                               +----
--
            Those who trade freedom for security
               lose both and deserve neither.
--
http://www.laranja.org/                mailto:lalo@laranja.org
         pgp key: http://www.laranja.org/pessoal/pgp

Eu jogo RPG! (I play RPG)         http://www.eujogorpg.com.br/
Python Foundry Guide http://www.sf.net/foundry/python-foundry/



From guido@python.org  Thu Jul 18 17:27:25 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 12:27:25 -0400
Subject: [Python-Dev] test_socket failure on FreeBSD
In-Reply-To: Your message of "Mon, 08 Jul 2002 23:53:01 +1100."
 <Pine.OS2.4.32.0207082338440.28004-400000@tenring.andymac.org>
References: <Pine.OS2.4.32.0207082338440.28004-400000@tenring.andymac.org>
Message-ID: <200207181627.g6IGRPE21459@odiug.zope.com>

> > There are probably some differences in the socket semantics.  I'd
> > appreciate it if you could provide a patch or at least a clue!
> 
> I've not read enough Stevens to grok sockets code (yet) :-(
> 
> However, I hope that the instrumented verbose output of test_socket might
> give you a clue....
> 
> I've attached the diff from the version of test_socket (vs recent CVS)
> that I used, as well as output from test_socket on FreeBSD 4.4 and
> OS/2+EMX.  Getting the FreeBSD issues sorted is a higher priority for me
> than getting OS/2+EMX working (though that would be nice too).
> 
> Please let me know if there's more testing/debugging I can do.

I've got some time for this now.  Ignoring your OS/2+EMX output and
focusing on the FreeBSD logs, I notice:

[...]
> Testing recvfrom() in chunks over TCP. ... 
> seg1='Michael Gilfix was he', addr='None'
> seg2='re
> ', addr='None'
> ERROR

Hm.  This looks like recvfrom() on a TCP stream doesn't return an
address; not entirely unreasonable.  I wonder if
self.cli_conn.getpeername() returns the expected address; can you
check this?  Add this after each recvfrom() call.

        if addr is None:
            addr = self.cli_conn.getpeername()

[...]
> Testing large recvfrom() over TCP. ... 
> msg='Michael Gilfix was here
> ', addr='None'
> ERROR

Ditto.

> Testing non-blocking accept. ... 
> conn=<socket object, fd=8, family=2, type=1, protocol=0>
> addr=('127.0.0.1', 3144)
> FAIL

This is different.  It seems that the accept() call doesn't time out.
But this could be because the client thread connects too fast.  Can
you add a sleep (e.g. time.sleep(5)) to _testAccept() before the
connect() call?
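
For reference, a sketch of the behaviour the test relies on, assuming
Python 3 (where the error surfaces as BlockingIOError) and a working
loopback interface: with no client connected yet, accept() on a
non-blocking socket fails at once. If the client gets in first, accept()
succeeds instead.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server.setblocking(False)

try:
    server.accept()                # nobody has connected yet
    outcome = "accepted"
except BlockingIOError:
    outcome = "would block"
finally:
    server.close()
```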

[...]
> Testing non-blocking recv. ... 
> conn=<socket object, fd=8, family=2, type=1, protocol=0>
> addr=('127.0.0.1', 3146)
> FAIL

Similar.  Try putting a sleep in _testRecv() between the connect() and
the send().

[...]

Let me know if you want me to provide specific patches...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 18 16:49:44 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 11:49:44 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: Your message of "Mon, 08 Jul 2002 21:20:56 EDT."
 <20020709012056.GA2526@cthulhu.gerg.ca>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com> <3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com>
 <20020709012056.GA2526@cthulhu.gerg.ca>
Message-ID: <200207181549.g6IFniw21368@odiug.zope.com>

> > Perhaps we could have some kind of category for distutils
> > packages which marks them as system add-ons vs. site add-ons.
> 
> +1 -- this should definitely be up to the package author/packager, not
> the local admin.  I once tried to convince Guido that the ability to
> occasionally upgrade standard library modules/packages would be a good
> thing, but he wasn't having it.  Any change of heart, O Mighty BDFL?

Before I answer that, here's a question.  Why do we think it's a good
idea to distribute upgrades as separate add-ons while we don't think
it's okay to distribute such upgrades with bugfix releases?  Doesn't
this just increase the variability of site configurations, and hence
version interaction hell?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 18 15:22:11 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 10:22:11 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Wed, 17 Jul 2002 15:40:27 PDT."
 <Pine.LNX.4.44.0207171513230.17524-100000@ziggy>
References: <Pine.LNX.4.44.0207171513230.17524-100000@ziggy>
Message-ID: <200207181422.g6IEMBr14526@odiug.zope.com>

> I do think there is some potential for errors caused by
> misunderstandings about whether or not "for x in y" is destructive.
> That's the thing that worries me the most.  I think this is the main
> reason why the old practice of abusing __getitem__ was bad, and thus
> helped to motivate iterators in the first place.  It seems serious
> enough that migrating to something that distinguishes
> destructive-for from non-destructive-for could indeed be worth the
> cost.

I'm not sure I understand this (this seems to be my week for not
understanding what people write :-( ).

First of all, I'm not sure what exactly the issue is with destructive
for-loops.  If I have a function that contains a for-loop over its
argument, and I pass iter(x) as the argument, then the iterator is
destroyed in the process, but x may or may not be, depending on what
it is.  Maybe the for-loop is a red herring?  Calling next() on an
iterator may or may not be destructive on the underlying "sequence" --
if it is a generator, for example, I would call it destructive.
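
A sketch of those cases, assuming Python 3; the function and variable
names are made up for illustration.

```python
def total(items):
    """Consume items with a for loop and return their sum."""
    result = 0
    for x in items:
        result += x
    return result


data = [1, 2, 3]

it = iter(data)
first_pass = total(it)      # uses up the iterator...
second_pass = total(it)     # ...so a second pass sees nothing
still_there = total(data)   # the list itself is untouched

gen = (n * n for n in data)
gen_first = total(gen)      # the generator is the underlying "sequence"
gen_second = total(gen)     # and it is gone after one pass
```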

Perhaps you're trying to assign properties to the iterator abstraction
that aren't really there?

Next, I'm not sure how renaming next() to __next__() would affect the
situation w.r.t. the destructivity of for-loops.  Or were you talking
about some other migration?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pinard@iro.umontreal.ca  Thu Jul 18 12:23:16 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 18 Jul 2002 07:23:16 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0207171540460.17524-100000@ziggy>
 <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqele1th3v.fsf@carouge.sram.qc.ca>

> > I'm happy to add the text, but i want to be clear, then: is it
> > acceptable to write an iterator that only provides <next> if you
> > only care about the "iteration protocol" and not the "for-able
> > protocol"?

> No, an iterator ought to provide both, but it's good to recognize that
> there *are* two protocols.

> >     A class is a valid iterator object when it defines a next()
> >     method that behaves as described above.  A class that wants
> >     to be an iterator also ought to implement __iter__()
> >     returning itself.

> I would like to see this strengthened.  I envision "iterator algebra"
> code that really needs to be able to do a for loop over an iterator
> when it feels like it.

Maybe the reasons behind having __iter__() returning itself should be
clearly expressed in the PEP, too.  On this list, Tim gave one recently,
Guido gives another here, but unless I missed it, the PEP gives none.
Usually, PEPs explain the reasons behind the choices.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




From cce@clarkevans.com  Thu Jul 18 15:06:31 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Thu, 18 Jul 2002 10:06:31 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>; from ping@zesty.ca on Wed, Jul 17, 2002 at 02:58:55PM -0500
References: <200207171409.g6HE9Di00659@odiug.zope.com> <Pine.LNX.4.33.0207171453350.11138-100000@server1.lfw.org>
Message-ID: <20020718100631.A3468@doublegemini.com>

On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote:
| But i think this is more than a minor problem.  This is a
| namespace collision problem, and that's significant.  Naming
| the method "next" means that any object with a "next" method
| cannot be adapted to support the iterator protocol.  Unfortunately
| "next" is a pretty common word and it's quite possible that such
| a method name is already in use.

Ping,

Do you have any suggestions for re-wording the Iterator questionnaire
at http://yaml.org/wk/survey?id=pyiter to reflect this paragraph above?

Best,

Clark

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From xscottg@yahoo.com  Mon Jul 15 18:52:56 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 15 Jul 2002 10:52:56 -0700 (PDT)
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <3D32FA0D.6020200@stsci.edu>
Message-ID: <20020715175256.5971.qmail@web40112.mail.yahoo.com>

--- Todd Miller <jmiller@stsci.edu> wrote:
> >
> >I don't understand what you say, but I believe you.
> >
> I meant we call  PyBuffer_FromReadWriteObject and the resulting buffer 
> lives longer than the extension function call that created it.   I have 
> heard that it is possible for the original object to "move" leaving the 
> buffer object pointer to it dangling.

Yes.  The PyBufferObject grabs the pointer from the PyBufferProcs
supporting object when the PyBufferObject is created.  If the PyBufferProcs
supporting object reallocates the memory (possibly from a resize) the
PyBufferObject can be left with a bad pointer.  This is easily possible if
you try to use the array module arrays as a buffer.
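
For illustration only: later Python versions guard against this hazard by
refusing to resize an array while a view of its memory is exported.  The
sketch below assumes a Python that has the memoryview type, which postdates
this thread; it is not the 2002 buffer object.

```python
import array

a = array.array('b', [1, 2, 3])
view = memoryview(a)          # exports a's underlying memory

try:
    a.append(4)               # would reallocate, so it is refused
    resized = True
except BufferError:
    resized = False

view.release()                # drop the export
a.append(4)                   # resizing is allowed again
```

The old buffer object had no such export bookkeeping, which is why the
cached pointer could go stale.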

I've submitted a patch to fix this particular problem (among others), but
there are still enough things that the buffer object can't do that
something new is needed.

> 
> >
> >
> >>>Maybe instead of the buffer() function/type, there should be a way to
> >>>allocate raw memory?
> >>>
> >
> >>Yes.    It would also be nice to be able to:
> >>
> >>1.  Know (at the python level) that a type supports the buffer C-API.
> >>
> >
> >Good idea.  (I guess right now you can see if calling buffer() with an
> >instance as argument works. :-)
> >
> >>2.  Copy bytes from one buffer to another (writeable buffer).  
> >>

And the copy operations shouldn't create any large temporaries:

  buf1 = memory(50000)
  buf2 = memory(50000)
  # no 10K temporary should be created in the next line
  buf1[10000:20000] = buf2[30000:40000] 

The current buffer object could be used like this, but it would create a
temporary string.  

So getting an efficient copy operation seems to require that slices just
create new "views" to the same memory.
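
A sketch of those view semantics, written against the memoryview type of
later Python versions rather than the hypothetical memory() constructor
above; bytearray stands in for the raw allocation, and the fill value is
made up for the demonstration.

```python
buf1 = bytearray(50000)
buf2 = bytearray(50000)
buf2[30000:40000] = b"\x07" * 10000   # something recognisable to copy

dst = memoryview(buf1)
src = memoryview(buf2)

# Slicing a memoryview gives another view onto the same memory, so the
# right-hand side is not materialised as a separate string first.
dst[10000:20000] = src[30000:40000]
```

Only the 10000 bytes in the target range change; the rest of buf1 stays
zero-filled.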

> >
> >Maybe you would like to work on a requirements gathering for a memory
> >object
> >
> Sure.  I'd be willing to poll comp.lang.python (python-list?) and 
> collate the results of any discussion that ensues.  Is that what you had 
> in mind?
> 


In the PEP that I'm drafting, I've been calling the new object "bytes"
(since it is just a simple array of bytes).  Now that you guys are
referring to it as the "memory object", should I change the name?  Doesn't
really matter, but it might avoid confusion to know we're all talking about
the same thing.




__________________________________________________
Do You Yahoo!?
Yahoo! Autos - Get free new car price quotes
http://autos.yahoo.com



From aleax@aleax.it  Thu Jul 18 07:02:23 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 08:02:23 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz>
References: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz>
Message-ID: <E17V4Mz-0000Hj-00@mail.python.org>

On Thursday 18 July 2002 01:48 am, Greg Ewing wrote:
> Alex Martelli <aleax@aleax.it>:
> > Still, it doesn't solve the reference-loop-between-two-deuced-things-
> > that-don't-cooperate-with-gc problem.
>
> Would making them cooperate with GC be a difficult
> thing to do? Seems to me we should be moving towards
> making everything cooperate with GC, and fixing
> things like this whenever they come to light.

Tim Peters says it wouldn't be, but I have not explored that.


Alex



From aleax@aleax.it  Thu Jul 18 06:52:34 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 07:52:34 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz>
References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz>
Message-ID: <E17V4Dr-0005F8-00@mail.python.org>

On Thursday 18 July 2002 01:25 am, Greg Ewing wrote:
> Andrew Koenig <ark@research.att.com>:
> > Is a file a container or not?
>
> I would say no, a file object is not a container in Python terms.
> You can't index it with [] or use len() on it or any of
> the other things you expect to be able to do on containers.
>
> I think we just have to live with the idea that there are
> things other than containers that can supply iterators.

Yes, there are such things, and there may be cases in
which no other alternative makes sense.  But I don't think
files are necessarily in such a bind.

> Forcing everything that can supply an iterator to bend
> over backwards to try to be a random-access container
> as well would be too cumbersome.

Absolutely.  But what Oren's patch does, and my mods of
it preserve, is definitely NOT "forcing" files "to be random-
access containers": on the contrary, it accepts the fact
that files aren't containers and conceptually simplifies
things by making them iterators instead.

I'm not sure about "random access" being needed to be
a container.  Consider sets, e.g. as per Greg Wilson's
soapbox implementation (as modified by my patch to
allow immutable-sets, maybe, but that's secondary).

They're doubtlessly containers, able to produce on
request as many iterators as you wish, each iterator
not affecting the set's state in any way -- the ideal.

But what sense would it make to force sets to expose
a __getitem__?  Right now they inherit from dict and
thus do happen to expose it, but that's really an
implementation artefact showing through (and a good
example of why one might like to inherit without needing
to expose all of the superclass's interface, to tie this in
to another recent thread -- inheritance for implementation).

Ideally, sets would expose __contains__, __iter__, __len__,
ways to add and remove elements, and perhaps (it's so in
Greg's implementation, and I didn't touch that) set ops such
as union, intersection &c.  someset[anindex] is really a weird
thing to have... yet sets _are_ containers!


Alex



From guido@python.org  Thu Jul 18 19:42:30 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 14:42:30 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: Your message of "Mon, 15 Jul 2002 10:52:56 PDT."
 <20020715175256.5971.qmail@web40112.mail.yahoo.com>
References: <20020715175256.5971.qmail@web40112.mail.yahoo.com>
Message-ID: <200207181842.g6IIgUo22271@odiug.zope.com>

> Yes.  The PyBufferObject grabs the pointer from the PyBufferProcs
> supporting object when the PyBufferObject is created.  If the PyBufferProcs
> supporting object reallocates the memory (possibly from a resize) the
> PyBufferObject can be left with a bad pointer.  This is easily possible if
> you try to use the array module arrays as a buffer.
> 
> I've submitted a patch to fix this particular problem (among others), but
> there are still enough things that the buffer object can't do that
> something new is needed.

Can you remind me of the patch#?  (I'm curious how you plan to fix
this...)

> In the PEP that I'm drafting, I've been calling the new object "bytes"
> (since it is just a simple array of bytes).  Now that you guys are
> referring to it as the "memory object", should I change the name?  Doesn't
> really matter, but it might avoid confusion to know we're all talking about
> the same thing.

I like bytes just fine.

PS, Todd, if you can, please don't send HTML-only mail to
python-dev...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 18 19:49:19 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 14:49:19 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 18 Jul 2002 08:02:23 +0200."
 <E17V4Mz-0000Hj-00@mail.python.org>
References: <200207172348.g6HNmEB23863@oma.cosc.canterbury.ac.nz>
 <E17V4Mz-0000Hj-00@mail.python.org>
Message-ID: <200207181849.g6IInJa22327@odiug.zope.com>

> > Would making them cooperate with GC be a difficult
> > thing to do? Seems to me we should be moving towards
> > making everything cooperate with GC, and fixing
> > things like this whenever they come to light.
> 
> Tim Peters says it wouldn't be, but I have not explored that.

But he also warned that it introduces new surprises.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 18 19:45:41 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 14:45:41 -0400
Subject: [Python-Dev] Re: Sets
In-Reply-To: Your message of "Thu, 18 Jul 2002 07:52:34 +0200."
 <E17V4Dr-0005F8-00@mail.python.org>
References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz>
 <E17V4Dr-0005F8-00@mail.python.org>
Message-ID: <200207181845.g6IIjfw22307@odiug.zope.com>

> But what sense would it make to force sets to expose
> a __getitem__?  Right now they inherit from dict and
> thus do happen to expose it, but that's really an
> implementation artefact showing through (and a good
> example of why one might like to inherit without needing
> to expose all of the superclass's interface, to tie this in
> to another recent thread -- inheritance for implementation).
> 
> Ideally, sets would expose __contains__, __iter__, __len__,
> ways to add and remove elements, and perhaps (it's so in
> Greg's implementation, and I didn't touch that) set ops such
> as union, intersection &c.  someset[anindex] is really a weird
> thing to have... yet sets _are_ containers!

I believe I recommended to Greg to make sets "have" a dict instead of
"being" dicts, and I think he agreed.  But I guess he never got to
implementing that change.
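
A minimal sketch of the "has a dict" arrangement, not Greg's actual code;
the class name DelegatingSet and its method set are assumptions chosen to
match the interface Alex lists.

```python
class DelegatingSet:
    """Set that *has* a dict instead of *being* one (illustrative only)."""

    def __init__(self, items=()):
        self._data = {}
        for item in items:
            self.add(item)

    def add(self, item):
        self._data[item] = True

    def remove(self, item):
        del self._data[item]      # raises KeyError if the item is absent

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)   # a fresh, independent iterator each time

    def __len__(self):
        return len(self._data)


s = DelegatingSet("abca")
```

Because nothing is inherited from dict, someset[anindex] simply does not
exist on such an object.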

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Thu Jul 18 06:57:37 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 07:57:37 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz>
References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz>
Message-ID: <E17V4Ij-00073d-00@mail.python.org>

On Thursday 18 July 2002 01:32 am, Greg Ewing wrote:
> Alex Martelli <aleax@aleax.it>:
> > All files have seek and write, but not on all files do they work -- and
> > the same goes for iteration.  I.e., it IS something of a mess
>
> I've just had a thought. Maybe it would be less of a mess
> if what we are calling "iterators" had been called "streams"

Possibly -- I did use the "streams" name often in the tutorial
on iterators and generators, it's a very natural term.

> instead. Then the term "iterator" could have been reserved
> for the special case of an object that provides stream
> access to a random-access collection.

Nice touch, except that I keep quibbling on the "random
access" need -- see my previous msg about sets.

> Then you could say that a file object is a stream object

That's what I'd love to do -- and requires the file object to
expose a next method and have iter(f) is f.  That's what
Oren's patch does, and the reason I'm trying to save it
from the need for a reference loop.

> that provides line-by-line access to an OS file. Other
> stream objects can be constructed that give access to
> the OS file in other units. That would all make sense
> without seeming to imply any multi-pass ability.

Seekable files can be multi-pass, but in the strict sense
that you can rewind them -- it's still impractical to have
them produce multiple *independent* iterators (needing
some sort of in-memory caching).
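
A small sketch of the distinction, assuming a later Python in which file-like
objects are their own iterators; io.StringIO stands in for an open file and
the sample lines are made up.

```python
import io

f = io.StringIO("one\ntwo\nthree\n")    # stands in for an open file
it1 = iter(f)
it2 = iter(f)
same_object = it1 is f and it2 is f     # the file is its own iterator

first = next(it1)      # reads the first line
second = next(it2)     # reads the *second* line: the position is shared

f.seek(0)              # "multi-pass" only in the sense of rewinding
again = next(f)

items = ["one", "two"]
a, b = iter(items), iter(items)
independent = (next(a), next(b))        # each list iterator has its own position
```

A container such as the list hands out independent iterators; the stream
only ever has the one position.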


Alex



From jeremy@alum.mit.edu  Thu Jul 18 20:08:16 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 15:08:16 -0400
Subject: [Python-Dev] configure problems porting to Tru64
Message-ID: <15671.4640.361811.434411@slothrop.zope.com>

I've been trying to build with the current CVS on Tru64 today.  This
is Tru64 Unix 5.1a with Compaq C++ 6.5.  I've run into a bunch of
problems with posixmodule.c (no surprise there), but I don't know
what the right strategy for fixing them is.

Here is a conflicting set of problems:

fchdir() is only defined if _XOPEN_SOURCE_EXTENDED is defined.

setpgrp() takes no arguments if _XOPEN_SOURCE_EXTENDED is defined, but
two arguments if it is not.

I found the fchdir() problem first and thought the solution would be to
change this bit of code in Python.h:

    /* Forcing SUSv2 compatibility still produces problems on some
       platforms, True64 and SGI IRIX being two of them, so for now the
       define is switched off. */
    #if 0
    #ifndef _XOPEN_SOURCE
    # define _XOPEN_SOURCE	500
    #endif
    #endif

And change "#if 0" to "#if __digital__", but that causes the setpgrp()
problem to appear.  It seems that configure has a test for whether
setpgrp() takes arguments, but configure runs its test without
defining _XOPEN_SOURCE.

(I'll also note that configure.in has a rather complex test for this,
when it appears that autoconf has a builtin AC_FUNC_SETPGRP.  Anyone
know why we don't use this?)

How should we actually fix this problem?  It seems to me that the
right solution is to define _XOPEN_SOURCE in Tru64 and somehow
guarantee that configure runs its tests with that defined, too.  How
would we achieve that?

Jeremy




From mal@lemburg.com  Thu Jul 18 20:13:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jul 2002 21:13:37 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com>	<15669.47553.15097.651868@slothrop.zope.com>	<3D35D466.5090903@lemburg.com>	<200207172045.g6HKjBg13729@odiug.zope.com>	<3D35DA67.8060206@lemburg.com>	<3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com>
Message-ID: <3D371361.7050908@lemburg.com>

Jeremy Hylton wrote:
>>>>>>"MAL" == mal  <M.-A.> writes:
>>>>>
> 
>   MAL> M.-A. Lemburg wrote:
>   >>> I suggest that we keep Jeremy's checkins in 2.3.  Hopefully
>   >>> during the alpha or beta release cycle we will find out if there
>   >>> *really* are still platforms with broken compilers.  At worst,
>   >>> it will show up after 2.3 final is released, and then we can fix
>   >>> it in 2.3.1.  You won't have to target mx for 2.3 for another 18
>   >>> months (assuming the PBF ever releases Python-in-a-Tie).
>   >>
>   >>
>   >> It's easy enough for me to add the #defines to the support header
>   >> file if you take it out of the distribution, so it wouldn't hurt.
> 
>   MAL> Just an addition: please leave the configure test in the
>   MAL> distribution. While I could implement that using distutils as
>   MAL> well, I would rather benefit from relying on config.h doing the
>   MAL> right thing in case there are some newly broken compilers out
>   MAL> there, e.g. the xlC one on AIX seems to be a very picky one...
> 
> I don't understand what your goal is.  Why do you want the configure
> test if your header file has a bunch of platform-specific ifdefs?  If
> these platforms actually had a problem, the configure test would have
> caught it and you wouldn't need the ifdefs.  The only way the ifdefs
> would have an effect is if the configure test did not detect a
> problem; but if the configure test didn't detect a problem, then you
> don't need the ifdefs.

Correct, but I don't want to add more cruft to the file:

The configure script tests whether static forwards work
or not. If you'd rip out the test as well, then I'd have
to add those platforms which still have problems manually.

The problem is: I don't know which platforms these are
(because configure found these itself).

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From greg@cosc.canterbury.ac.nz  Thu Jul 18 11:03:47 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jul 2002 22:03:47 +1200 (NZST)
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
Message-ID: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>

Someone told me that Pyrex should be generating
__declspec(dllexport) for the module init func.
But someone else says this is only needed if
you're importing a dll as a library, and that
it's not needed for Python extensions.

Can anyone who knows what they're doing on
Windows give me a definitive answer about
whether it's really needed or not?

Thanks,

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From aleax@aleax.it  Thu Jul 18 07:01:54 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 08:01:54 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz>
References: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz>
Message-ID: <E17V4MO-0008SY-00@mail.python.org>

On Thursday 18 July 2002 01:39 am, Greg Ewing wrote:
> Alex Martelli <aleax@aleax.it>:
> > the file object's is the only example of "fat interface" problem
> > in Python -- an interface that exposes a lot of methods, with many
> > objects claiming they implement that interface but actually lying
>
> Maybe the existing file object should be split up into
> some number of other objects with smaller interfaces.

In an ideal world, yes.  In practice, I strongly doubt it's feasible
to break backwards compatibility THAT heavily.


> For example, instead of the file object actually accessing an
> OS file itself, it could just be a wrapper around an
> underlying "bytestream" object, which implements only
> read() and write().

I suspect read and write would best be kept on separate
interfaces.  Ability to read, write, seek-and-tell, being three
atoms of which it makes sense to have about 6 combos
(R, W, R+W, each with or without S&T).  Rewind might
make sense separately from S&T if streaming tapes were still in
fashion and OS's gave natural access to them.

But I do think it's all pretty academic.


Alex



From mal@lemburg.com  Thu Jul 18 20:19:21 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jul 2002 21:19:21 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com> <3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com>              <20020709012056.GA2526@cthulhu.gerg.ca> <200207181549.g6IFniw21368@odiug.zope.com>
Message-ID: <3D3714B9.1060807@lemburg.com>

Guido van Rossum wrote:
>>>Perhaps we could have some kind of category for distutils
>>>packages which marks them as system add-ons vs. site add-ons.
>>
>>+1 -- this should definitely be up to the package author/packager, not
>>the local admin.  I once tried to convince Guido that the ability to
>>occasionally upgrade standard library modules/packages would be a good
>>thing, but he wasn't having it.  Any change of heart, O Mighty BDFL?
> 
> 
> Before I answer that, here's a question.  Why do we think it's a good
> idea to distribute upgrades as separate add-ons while we don't think
> it's okay to distribute such upgrades with bugfix releases? 

The idea is to provide bugfixes for Python versions which are
no longer being maintained. Of course, the effect would only
show a few years ahead.

> Doesn't
> this just increase the variability of site configurations, and hence
> version interaction hell?

I don't think that core packages are any different than
other third party packages: they are usually independent
enough from the rest of the code that upgrades don't affect
the workings of the other code using it. The internals are
free to change, though, e.g. to accommodate bug fixes, etc.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From xscottg@yahoo.com  Thu Jul 18 20:24:50 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 18 Jul 2002 12:24:50 -0700 (PDT)
Subject: [Python-Dev] Fw: Behavior of buffer()
In-Reply-To: <200207181842.g6IIgUo22271@odiug.zope.com>
Message-ID: <20020718192450.15024.qmail@web40105.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
> > Yes.  The PyBufferObject grabs the pointer from the PyBufferProcs
> > supporting object when the PyBufferObject is created.  If the
> > PyBufferProcs supporting object reallocates the memory (possibly 
> > from a resize) the PyBufferObject can be left with a bad pointer.
> > This is easily possible if you try to use the array module arrays 
> > as a buffer.
> > 
> > I've submitted a patch to fix this particular problem (among others),
> > but there are still enough things that the buffer object can't do that
> > something new is needed.
> 
> Can you remind me of the patch#?  (I'm curious how you plan to fix
> this...)
> 

Patch number 552438.

Instead of caching the pointer, it grabs it from the other object every
time it is needed.  Might be a little slower, but I think it's correct.
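
A toy model of the two strategies in pure Python, not the patch itself; the
names Resizable, CachingView and LiveView are made up for illustration.  In
C the stale case is a dangling pointer; here it only shows up as stale data.

```python
class Resizable:
    """Toy stand-in for an object whose storage moves when it is resized."""

    def __init__(self, data):
        self.storage = bytearray(data)

    def resize(self, size):
        new = bytearray(size)                 # fresh allocation
        n = min(size, len(self.storage))
        new[:n] = self.storage[:n]
        self.storage = new                    # the old storage is abandoned


class CachingView:
    """Grabs the storage once, at creation (the problematic behaviour)."""

    def __init__(self, obj):
        self._storage = obj.storage

    def __getitem__(self, i):
        return self._storage[i]


class LiveView:
    """Asks the object for its storage on every access (the patch's idea)."""

    def __init__(self, obj):
        self._obj = obj

    def __getitem__(self, i):
        return self._obj.storage[i]


obj = Resizable(b"abc")
cached, live = CachingView(obj), LiveView(obj)
obj.resize(5)                 # storage is reallocated
obj.storage[0] = ord("z")     # a write that only the live view can see
```

The extra lookup on every access is the "little slower" part; the payoff is
that the view can never outlive the memory it points at.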


> Barry (the PEP czar) forwarded me your PEP.  I'll try to do some
> triage on it so I can tell Barry whether to check it in (that doesn't
> mean it's accepted :-).

<chuckle> I'm bad at patience, but I'm not terribly naive.  I fully expect
everyone and their dog will find something to dislike before it gets
approved/rejected.


Cheers,
    -Scott





From haering_python@gmx.de  Thu Jul 18 20:28:51 2002
From: haering_python@gmx.de (Gerhard =?iso-8859-1?Q?H=E4ring?=)
Date: Thu, 18 Jul 2002 21:28:51 +0200
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>
References: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>
Message-ID: <20020718192851.GA2759@lilith.my-fqdn.de>

* Greg Ewing <greg@cosc.canterbury.ac.nz> [2002-07-18 22:03 +1200]:
> Someone told me that Pyrex should be generating
> __declspec(dllexport) for the module init func.

That's wrong. You should be using DL_EXPORT instead, which will do the
right thing no matter which platform you're on: on Windows, it will
expand to __declspec(dllexport), iff you're compiling an extension
module (in contrast to compiling the Python core). I believe that on
Unix, it will expand to an empty string :-)

You also don't need any #ifdefs for win32 for setting ob_type, just set
them _only_ in your init function and leave them as NULL in the
declarations.

Gerhard
-- 
This sig powered by Python!
Outside temperature in München: 14.3 °C      Wind: 1.9 m/s



From guido@python.org  Thu Jul 18 20:30:41 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 15:30:41 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 18 Jul 2002 08:01:54 +0200."
 <E17V4MO-0008SY-00@mail.python.org>
References: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz>
 <E17V4MO-0008SY-00@mail.python.org>
Message-ID: <200207181930.g6IJUfX22643@odiug.zope.com>

> I suspect read and write would best be kept on separate
> interfaces.  Ability to read, write, seek-and-tell, being three
> atoms of which it makes sense to have about 6 combos
> (R, W, R+W, each with or without S&T).  Rewind might
> make sense separately from S&T if streaming tapes were still in
> fashion and OS's gave natural access to them.

5, because R+W without S&T makes little sense.
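
The counting, spelled out as a quick sketch (the tuple encoding is just one
way to write the combinations down):

```python
from itertools import product

access = ["R", "W", "R+W"]
seekable = [False, True]                          # without / with S&T

combos = list(product(access, seekable))          # 3 * 2 = 6 combinations
# Drop the one combination that makes little sense: R+W without S&T.
sensible = [c for c in combos if c != ("R+W", False)]
```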

> But I do think it's all pretty academic.

C++ has tried very hard to do this with its istream, ostream and
iostream classes; I believe I heard C++ people say once that it's not
considered a success.  I believe Java has tried to address this too.
What do you think of Java's solution?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping@zesty.ca  Thu Jul 18 20:31:45 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Thu, 18 Jul 2002 12:31:45 -0700 (PDT)
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <20020718100631.A3468@doublegemini.com>
Message-ID: <Pine.LNX.4.44.0207181215120.17524-100000@ziggy>

On Thu, 18 Jul 2002, Clark C . Evans wrote:
> On Wed, Jul 17, 2002 at 02:58:55PM -0500, Ka-Ping Yee wrote:
> | But i think this is more than a minor problem.  This is a
> | namespace collision problem, and that's significant.  Naming
> | the method "next" means that any object with a "next" method
> | cannot be adapted to support the iterator protocol.  Unfortunately
> | "next" is a pretty common word and it's quite possible that such
> | a method name is already in use.
>
> Ping,
>
> Do you have any suggestions for re-wording the Iterator questionnaire
> at http://yaml.org/wk/survey?id=pyiter to reflect this paragraph above?

I might add something like:

    One motivation for this change is that the name "next()" might
    collide with the name of an existing "next()" method.  This could
    cause a problem if someone wants to implement the iterator protocol
    for an object that already happens to have a method called "next()".
    So far no one has reported encountering this situation.  It seems
    plausible that there will be some objects where it would be nice to
    support the iterator protocol, and we have heard of some objects
    with methods named "next()", but we don't know how likely or
    unlikely it is that there's an object where both are true.

Does that seem fair?
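
A hypothetical instance of the collision, for concreteness.  Pager and
PagerIterator are invented names; the iterator method is spelled __next__
as in later Python versions, where under the protocol discussed here it
would have to be called next and so clash with Pager.next().

```python
class Pager:
    """Hypothetical class whose existing next() is not an iterator step."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.index = 0

    def next(self):
        # Pre-existing meaning: advance the cursor, return nothing.
        self.index += 1


class PagerIterator:
    """Separate adapter object that carries the iterator protocol."""

    def __init__(self, pager):
        self._pages = iter(pager.pages)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._pages)


p = Pager(["a", "b"])
p.next()                          # the old meaning still works
pages = list(PagerIterator(p))    # iteration goes through the adapter
```

With a plain "next" protocol method, Pager itself could not become an
iterator without changing what its own next() means; a wrapper like this is
the workaround.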


-- ?!ng




From jeremy@alum.mit.edu  Thu Jul 18 20:32:14 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 15:32:14 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
In-Reply-To: <3D371361.7050908@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
Message-ID: <15671.6078.577033.943393@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  MAL> The configure script tests whether static forwards work or
  MAL> not. If you'd rip out the test as well, then I'd have to add
  MAL> those platforms which still have problems manually.

  MAL> The problem is: I don't know which platforms these are (because
  MAL> configure found these itself).

If you think the configure test works, why do you have platform
specific ifdefs in your header file?

Jeremy




From mal@lemburg.com  Thu Jul 18 20:35:01 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jul 2002 21:35:01 +0200
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
References: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>
Message-ID: <3D371865.5070908@lemburg.com>

Greg Ewing wrote:
> Someone told me that Pyrex should be generating
> __declspec(dllexport) for the module init func.
> But someone else says this is only needed if
> you're importing a dll as a library, and that
> it's not needed for Python extensions.
> 
> Can anyone who knows what they're doing on
> Windows give me a definitive answer about
> whether it's really needed or not?

You need to export at least the init<modulename>()
API and that is usually done using the dllexport
flag.

Note that this is only needed for shared modules
(DLLs), not modules which are linked statically.

This is what I use for this:

/* Macro to "mark" a symbol for DLL export */

#if (defined(_MSC_VER) && _MSC_VER > 850		\
      || defined(__MINGW32__) || defined(__CYGWIN) || defined(__BEOS__))
# ifdef __cplusplus
#   define MX_EXPORT(type) extern "C" type __declspec(dllexport)
# else
#   define MX_EXPORT(type) extern type __declspec(dllexport)
# endif
#elif defined(__WATCOMC__)
#   define MX_EXPORT(type) extern type __export
#elif defined(__IBMC__)
#   define MX_EXPORT(type) extern type _Export
#else
#   define MX_EXPORT(type) extern type
#endif

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim@zope.com  Thu Jul 18 20:34:58 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 18 Jul 2002 15:34:58 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>
Message-ID: <BIEJKCLHCIOIHAGOKOLHMEDMDHAA.tim@zope.com>

[Greg Ewing]
> Someone told me that Pyrex should be generating
> __declspec(dllexport) for the module init func.
> But someone else says this is only needed if
> you're importing a dll as a library,

1. What else could one do with a DLL?  That is, in your view is
   the "importing ... as a library" part not redundant?

2. Does Pyrex compile to DLLs (or PYDs) on Windows?  I simply don't
   know.

> and that it's not needed for Python extensions.

If an extension is compiled into a DLL/PYD, it must tell the linker which
symbols are to be exported.  __declspec(dllexport) in the source is one way
to do that.  Other possibilities include creating a .def file, or specifying
exported names on the linker's command line (like "/export:init_sre").

The best thing to do for Windows is ask that Windows users supply patches.
Or you could upgrade to Windows yourself <wink>.




From fredrik@pythonware.com  Thu Jul 18 20:37:09 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 18 Jul 2002 21:37:09 +0200
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
References: <200207181003.g6IA3l127038@oma.cosc.canterbury.ac.nz>
Message-ID: <034701c22e92$9473dfc0$ced241d5@hagrid>

greg wrote:

> Someone told me that Pyrex should be generating
> __declspec(dllexport) for the module init func.

almost; for portability, it's better to use the DL_EXPORT
provided by Python.h:

DL_EXPORT(void)
init_module(void)
{
    ...
}

> But someone else says this is only needed if
> you're importing a dll as a library, and that
> it's not needed for Python extensions.

that someone is confused; the dllexport declaration makes
sure that the init function is exported from the DLL.  if not,
Python's PYD loader won't find the init function.

</F>




From aleax@aleax.it  Thu Jul 18 20:38:15 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 21:38:15 +0200
Subject: [Python-Dev] Re: Sets
In-Reply-To: <200207181845.g6IIjfw22307@odiug.zope.com>
References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> <E17V4Dr-0005F8-00@mail.python.org> <200207181845.g6IIjfw22307@odiug.zope.com>
Message-ID: <02071821381500.04480@arthur>

On Thursday 18 July 2002 20:45, Guido van Rossum wrote:
	...
> I believe I recommended to Greg to make sets "have" a dict instead of
> "being" dicts, and I think he agreed.  But I guess he never got to
> implementing that change.

Right.  OK, guess I'll make a new patch using delegation instead
of inheritance, then.


Alex



From guido@python.org  Thu Jul 18 20:50:39 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 15:50:39 -0400
Subject: [Python-Dev] Re: Sets
In-Reply-To: Your message of "Thu, 18 Jul 2002 21:38:15 +0200."
 <02071821381500.04480@arthur>
References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> <E17V4Dr-0005F8-00@mail.python.org> <200207181845.g6IIjfw22307@odiug.zope.com>
 <02071821381500.04480@arthur>
Message-ID: <200207181950.g6IJodg22778@odiug.zope.com>

> > I believe I recommended to Greg to make sets "have" a dict instead of
> > "being" dicts, and I think he agreed.  But I guess he never got to
> > implementing that change.
> 
> Right.  OK, guess I'll make a new patch using delegation instead
> of inheritance, then.

Maybe benchmark the performance too.  If the "has" version is much
slower, perhaps we could remove unwanted interfaces from the public
API by overriding them with something that raises an exception (and
rename the internal versions to some internal name if they are
needed).
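
The two designs under discussion can be sketched roughly as follows. This is only an illustration, not the code in patch 580995: the class names HasSet and IsSet, the _blocked helper, and the choice of which dict methods to block are all assumptions, and the code is written for a current Python 3 rather than 2.2.

```python
class HasSet:
    """Set that *has* a dict (delegation): only set methods are public."""

    def __init__(self, items=()):
        self._data = dict.fromkeys(items, True)

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __or__(self, other):
        return HasSet(list(self) + list(other))


class IsSet(dict):
    """Set that *is* a dict (inheritance), with unwanted methods blocked."""

    def __init__(self, items=()):
        dict.__init__(self, dict.fromkeys(items, True))

    def __or__(self, other):
        return IsSet(list(self) + list(other))

    def _blocked(self, *args, **kwargs):
        raise TypeError("not supported by sets")

    # Override dict-only interfaces so they raise instead of leaking through.
    __getitem__ = __setitem__ = values = items = setdefault = _blocked
```

With delegation, dict methods such as values() simply do not exist on the set; with inheritance, each unwanted method has to be found and overridden by hand, which is the cost of the second approach.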

--Guido van Rossum (home page: http://www.python.org/~guido/)




From ping@zesty.ca  Thu Jul 18 20:59:01 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Thu, 18 Jul 2002 12:59:01 -0700 (PDT)
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207172121.g6HLLQH13946@odiug.zope.com>
Message-ID: <Pine.LNX.4.44.0207181245160.17524-100000@ziggy>

I wrote:
> __iter__ is a red herring.
[...blah blah blah...]
> Iterators are currently asked to support both protocols.  The
> semantics of iteration come only from protocol 2; protocol 1 is
> an effort to make iterators look sorta like sequences.  But the
> analogy is very weak -- these are "sequences" that destroy
> themselves while you look at them -- not like any typical
> sequence i've ever seen!
>
> The short of it is that whenever any Python programmer says
> "for x in y", he or she had better be darned sure of whether
> this is going to destroy y.  Whatever we can do to make this
> clear would be a good idea.

On Wed, 17 Jul 2002, Guido van Rossum wrote:
> This is a very good summary of the two iterator protocols.  Ping,
> would you mind adding this to PEP 234?

I have now done so.

I didn't add the whole thing verbatim, because the tone doesn't fit:
it was written with the intent of motivating a change to the
protocol, rather than describing what the protocol is.  Presumably
we don't want the PEP to say "__iter__ is a red herring".

There's a bunch of issues flying around here, which i'll try to
explain better in a separate posting.  But i wanted to take care
of Guido's request first.  I have toned down and abridged my text
somewhat, and strengthened the requirement for __iter__().  Here
is what the "API specification" section now says:

    Classes can define how they are iterated over by defining an
    __iter__() method; this should take no additional arguments and
    return a valid iterator object.  A class that wants to be an
    iterator should implement two methods: a next() method that behaves
    as described above, and an __iter__() method that returns self.

    The two methods correspond to two distinct protocols:

    1. An object can be iterated over with "for" if it implements
       __iter__() or __getitem__().

    2. An object can function as an iterator if it implements next().

    Container-like objects usually support protocol 1.  Iterators are
    currently required to support both protocols.  The semantics of
    iteration come only from protocol 2; protocol 1 is present to make
    iterators behave like sequences.  But the analogy is weak -- unlike
    ordinary sequences, iterators are "sequences" that are destroyed
    by the act of looking at their elements.

    Consequently, whenever any Python programmer says "for x in y",
    he or she must be sure of whether this is going to destroy y.
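
The distinction drawn in the specification text above can be sketched as follows. This is a hedged illustration and not part of the PEP: Countdown and CountdownIterator are hypothetical names, and the code uses the current Python 3 spelling __next__() where the text above says next().

```python
class Countdown:
    """Container-like: protocol 1 only; each __iter__ call starts afresh."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: protocol 2, plus an __iter__ that returns self."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):  # spelled next() in the Python 2.2 text above
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


container = Countdown(3)
first_pass = [x for x in container]
second_pass = [x for x in container]  # the container yields its items again

iterator = iter(container)
consumed = [x for x in iterator]
leftover = [x for x in iterator]      # the iterator is already exhausted
```

Looping over the container twice gives the same items both times, while the second loop over the iterator finds nothing left. That is the sense in which "for x in y" can destroy y.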


-- ?!ng




From guido@python.org  Thu Jul 18 20:58:50 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 15:58:50 -0400
Subject: [Python-Dev] Re: Patch level versions and new features (Was: Some dull gc stats)
In-Reply-To: Your message of "Thu, 18 Jul 2002 21:19:21 +0200."
 <3D3714B9.1060807@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com> <3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <20020709012056.GA2526@cthulhu.gerg.ca> <200207181549.g6IFniw21368@odiug.zope.com>
 <3D3714B9.1060807@lemburg.com>
Message-ID: <200207181958.g6IJwoY22816@odiug.zope.com>

> Guido van Rossum wrote:
> >>>Perhaps we could have some kind of category for distutils
> >>>packages which marks them as system add-ons vs. site add-ons.
> >>
> >>+1 -- this should definitely be up to the package author/packager, not
> >>the local admin.  I once tried to convince Guido that the ability to
> >>occasionally upgrade standard library modules/packages would be a good
> >>thing, but he wasn't having it.  Any change of heart, O Mighty BDFL?
> > 
> > 
> > Before I answer that, here's a question.  Why do we think it's a good
> > idea to distribute upgrades as separate add-ons while we don't think
> > it's okay to distribute such upgrades with bugfix releases? 

[MAL]
> The idea is to provide bugfixes for Python versions which are
> no longer being maintained. Of course, the effect would only
> show a few years ahead.

Hm, if you really are fixing bugs in old versions, why not patch the
Python installation in-place rather than trying to play nice?

> > Doesn't
> > this just increase the variability of site configurations, and hence
> > version interaction hell?
> 
> I don't think that core packages are any different than
> other third party packages: they are usually independent
> enough from the rest of the code that upgrades don't affect
> the workings of the other code using it. The internals are
> free to change, though, e.g. to accommodate bug fixes, etc.

Well, I don't expect that we'll do independent upgrades for core
packages, so I propose to end this thread.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 18 21:08:54 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 16:08:54 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 18 Jul 2002 12:59:01 PDT."
 <Pine.LNX.4.44.0207181245160.17524-100000@ziggy>
References: <Pine.LNX.4.44.0207181245160.17524-100000@ziggy>
Message-ID: <200207182008.g6IK8tb22853@odiug.zope.com>

> I didn't add the whole thing verbatim, because the tone doesn't fit:
> it was written with the intent of motivating a change to the
> protocol, rather than describing what the protocol is.  Presumably
> we don't want the PEP to say "__iter__ is a red herring".
> 
> There's a bunch of issues flying around here, which i'll try to
> explain better in a separate posting.  But i wanted to take care
> of Guido's request first.  I have toned down and abridged my text
> somewhat, and strengthened the requirement for __iter__().  Here
> is what the "API specification" section now says:
> 
>     Classes can define how they are iterated over by defining an
>     __iter__() method; this should take no additional arguments and
>     return a valid iterator object.  A class that wants to be an
>     iterator should implement two methods: a next() method that behaves
>     as described above, and an __iter__() method that returns self.
> 
>     The two methods correspond to two distinct protocols:
> 
>     1. An object can be iterated over with "for" if it implements
>        __iter__() or __getitem__().
> 
>     2. An object can function as an iterator if it implements next().
> 
>     Container-like objects usually support protocol 1.  Iterators are
>     currently required to support both protocols.  The semantics of
>     iteration come only from protocol 2; protocol 1 is present to make
>     iterators behave like sequences.  But the analogy is weak -- unlike
>     ordinary sequences, iterators are "sequences" that are destroyed
>     by the act of looking at their elements.

Fine up to here.

>     Consequently, whenever any Python programmer says "for x in y",
>     he or she must be sure of whether this is going to destroy y.

I don't understand why this is here.  *Why* is it important to know
whether this is going to destroy y?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jul 18 21:42:02 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 16:42:02 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 18 Jul 2002 07:23:16 EDT."
 <oqele1th3v.fsf@carouge.sram.qc.ca>
References: <Pine.LNX.4.44.0207171540460.17524-100000@ziggy> <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net>
 <oqele1th3v.fsf@carouge.sram.qc.ca>
Message-ID: <200207182042.g6IKg2n22947@odiug.zope.com>

> Maybe the reasons behind having __iter__() returning itself should be
> clearly expressed in the PEP, too.  On this list, Tim gave one recently,
> Guido gives another here, but unless I missed it, the PEP gives none.
> Usually, PEPs explain the reasons behind the choices.

Ping added this to the PEP:

    The two methods correspond to two distinct protocols:

    1. An object can be iterated over with "for" if it implements
       __iter__() or __getitem__().

    2. An object can function as an iterator if it implements next().

    Container-like objects usually support protocol 1.  Iterators are
    currently required to support both protocols.  The semantics of
    iteration come only from protocol 2; protocol 1 is present to make
    iterators behave like sequences.  But the analogy is weak -- unlike
    ordinary sequences, iterators are "sequences" that are destroyed
    by the act of looking at their elements.

(I could do without the last sentence, since this expresses a value
judgement rather than fact -- not a good thing to have in a PEP's
"specification" section.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Thu Jul 18 21:50:31 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jul 2002 22:50:31 +0200
Subject: [Python-Dev] Re: Patch level versions and new features (Was:
 Some dull gc stats)
References: <BIEJKCLHCIOIHAGOKOLHKENMDFAA.tim@zope.com> <3D220A86.5070003@lemburg.com> <m37kkdjppq.fsf_-_@mira.informatik.hu-berlin.de> <3D22ADD9.1030901@lemburg.com> <15650.64375.162977.160780@anthem.wooz.org> <3D2433B9.9080102@lemburg.com> <m3znx79rk3.fsf@mira.informatik.hu-berlin.de> <15657.39558.325764.651122@anthem.wooz.org> <3D299E42.70200@lemburg.com> <20020709012056.GA2526@cthulhu.gerg.ca> <200207181549.g6IFniw21368@odiug.zope.com>              <3D3714B9.1060807@lemburg.com> <200207181958.g6IJwoY22816@odiug.zope.com>
Message-ID: <3D372A17.50509@lemburg.com>

Guido van Rossum wrote:
>>Guido van Rossum wrote:
>>
>>>>>Perhaps we could have some kind of category for distutils
>>>>>packages which marks them as system add-ons vs. site add-ons.
>>>>
>>>>+1 -- this should definitely be up to the package author/packager, not
>>>>the local admin.  I once tried to convince Guido that the ability to
>>>>occasionally upgrade standard library modules/packages would be a good
>>>>thing, but he wasn't having it.  Any change of heart, O Mighty BDFL?
>>>
>>>
>>>Before I answer that, here's a question.  Why do we think it's a good
>>>idea to distribute upgrades as separate add-ons while we don't think
>>>it's okay to distribute such upgrades with bugfix releases? 
>>
> 
> [MAL]
> 
>>The idea is to provide bugfixes for Python versions which are
>>no longer being maintained. Of course, the effect would only
>>show a few years ahead.
> 
> 
> Hm, if you really are fixing bugs in old versions, why not patch the
> Python installation in-place rather than trying to play nice?

We don't have an easy way of doing this, unless of course
we trick python setup.py install to install directly into
.../lib/pythonX.X rather than a sub directory on the path.

>>>Doesn't
>>>this just increase the variability of site configurations, and hence
>>>version interaction hell?
>>
>>I don't think that core packages are any different than
>>other third party packages: they are usually independent
>>enough from the rest of the code that upgrades don't affect
>>the workings of the other code using it. The internals are
>>free to change, though, e.g. to accommodate bug fixes, etc.
> 
> Well, I don't expect that we'll do independent upgrades for core
> packages, so I propose to end this thread.

Barry is already doing this with the email package and
I would expect more such packages to make their way into
the core. The PyXML package also has a life of its own
outside the core distribution and could benefit from this.

I think it's too early to end the thread.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Thu Jul 18 20:21:59 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 15:21:59 -0400
Subject: [Python-Dev] configure problems porting to Tru64
In-Reply-To: Your message of "Thu, 18 Jul 2002 15:08:16 EDT."
 <15671.4640.361811.434411@slothrop.zope.com>
References: <15671.4640.361811.434411@slothrop.zope.com>
Message-ID: <200207181922.g6IJM0O22574@odiug.zope.com>

> (I'll also note that configure.in has a rather complex test for this,
> when it appears that autoconf has a builtin AC_FUNC_SETPGRP.  Anyone
> know why we don't use this?)

I'll bet AC_FUNC_SETPGRP didn't exist in the autoconf version we were
using when we wrote that test.  Feel free to fix it.

BTW, the snake farm build for AIX-2-000000042E00-hal now fails like this:

../python/dist/src/Modules/posixmodule.c: In function `posix_fdatasync':
../python/dist/src/Modules/posixmodule.c:902: `fdatasync' undeclared (first use this function)
../python/dist/src/Modules/posixmodule.c:902: (Each undeclared identifier is reported only once
../python/dist/src/Modules/posixmodule.c:902: for each function it appears in.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Thu Jul 18 21:52:31 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 22:52:31 +0200
Subject: [Python-Dev] Re: Sets
In-Reply-To: <200207181950.g6IJodg22778@odiug.zope.com>
References: <200207172325.g6HNPQM23808@oma.cosc.canterbury.ac.nz> <02071821381500.04480@arthur> <200207181950.g6IJodg22778@odiug.zope.com>
Message-ID: <E17VIGI-0004za-00@mail.python.org>

On Thursday 18 July 2002 09:50 pm, Guido van Rossum wrote:
> > > I believe I recommended to Greg to make sets "have" a dict instead of
> > > "being" dicts, and I think he agreed.  But I guess he never got to
> > > implementing that change.
> >
> > Right.  OK, guess I'll make a new patch using delegation instead
> > of inheritance, then.
>
> Maybe benchmark the performance too.  If the "has" version is much
> slower, perhaps we could remove unwanted interfaces from the public
> API by overriding them with something that raises an exception (and
> rename the internal versions to some internal name if they are
> needed).

I've just updated patch 580995 with the has-A rather than is-A version.
OK, I'll now run some simple benchmarks...

Looks good, offhand.  Here's the simple benchmark script:

import time
import set
import sys

clock = time.clock

raw = range(10000)
times = [None]*20

print "Timing Set %s (Python %s)" % (set.__version__, sys.version)

print "Make 20 10k-items sets (no reps)...",
start = clock()
for i in times:
    s10k = set.Set(raw)
stend = clock()
print stend-start

witre = range(1000)*10
print "Make 20 1k-items sets (x10 reps)...",
for i in times:
    s1k1 = set.Set(witre)
stend = clock()
print stend-start

raw1 = range(500, 1500)
print "Make 20 more 1k-items sets (no reps)...",
for i in times:
    s1k2 = set.Set(raw1)
stend = clock()
print stend-start

print "20 unions of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 | s1k2
stend = clock()
print stend-start

print "20 inters of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 & s1k2
stend = clock()
print stend-start

print "20 diffes of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 - s1k2
stend = clock()
print stend-start

print "20 simdif of 1k-items sets 50% overlap...",
for i in times:
    result = s1k1 ^ s1k2
stend = clock()
print stend-start
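
Note that the script above sets start only once, so every figure it prints is a running total since the first phase rather than the time for that phase alone. A minimal sketch of per-phase timing follows. It is an assumption-laden stand-in rather than the script used for the runs in this message: it targets a current Python 3, uses the built-in set in place of the set.Set class from the patch, uses time.perf_counter in place of time.clock, and the timed() helper is a hypothetical name.

```python
import time


def timed(label, func, repeat=20):
    """Run func() `repeat` times; return (label, seconds) for this phase only."""
    start = time.perf_counter()  # restarted for every phase
    for _ in range(repeat):
        func()
    return label, time.perf_counter() - start


raw = list(range(10000))
s1k1 = set(list(range(1000)) * 10)   # 1k distinct items, each repeated 10 times
s1k2 = set(range(500, 1500))         # 50% overlap with s1k1

results = [
    timed("make 10k-item sets", lambda: set(raw)),
    timed("unions", lambda: s1k1 | s1k2),
    timed("intersections", lambda: s1k1 & s1k2),
    timed("differences", lambda: s1k1 - s1k2),
    timed("symmetric differences", lambda: s1k1 ^ s1k2),
]
for label, seconds in results:
    print("%-24s %.6f" % (label, seconds))
```

Subtracting each printed figure from the next in the cumulative output gives roughly the same per-phase numbers.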


And here's a few runs (with -O of course) on my PC:


[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.38
20 unions of 1k-items sets 50% overlap... 0.43
20 inters of 1k-items sets 50% overlap... 0.92
20 diffes of 1k-items sets 50% overlap... 1.41
20 simdif of 1k-items sets 50% overlap... 2.38
[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.22
Make 20 1k-items sets (x10 reps)... 0.37
Make 20 more 1k-items sets (no reps)... 0.39
20 unions of 1k-items sets 50% overlap... 0.44
20 inters of 1k-items sets 50% overlap... 0.93
20 diffes of 1k-items sets 50% overlap... 1.42
20 simdif of 1k-items sets 50% overlap... 2.39
[alex@lancelot has]$ cd ../is
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.37
Make 20 more 1k-items sets (no reps)... 0.39
20 unions of 1k-items sets 50% overlap... 0.44
20 inters of 1k-items sets 50% overlap... 0.93
20 diffes of 1k-items sets 50% overlap... 1.42
20 simdif of 1k-items sets 50% overlap... 2.38
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.2.1 (#2, Jul 15 2002, 17:32:26)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)])
Make 20 10k-items sets (no reps)... 0.22
Make 20 1k-items sets (x10 reps)... 0.38
Make 20 more 1k-items sets (no reps)... 0.4
20 unions of 1k-items sets 50% overlap... 0.44
20 inters of 1k-items sets 50% overlap... 0.93
20 diffes of 1k-items sets 50% overlap... 1.42
20 simdif of 1k-items sets 50% overlap... 2.41
[alex@lancelot is]$

They look much of a muchness to me.
Sorry about the version stuck at 1.5 -- forgot to update that, but
you can tell the difference by the directory name, 'is' and 'has' resp.:-).

Python 2.3 (built from CVS 22 hours ago) is substantially faster at some
tasks (intersections and differences):
[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.37
20 unions of 1k-items sets 50% overlap... 0.42
20 inters of 1k-items sets 50% overlap... 0.75
20 diffes of 1k-items sets 50% overlap... 1.08
20 simdif of 1k-items sets 50% overlap... 1.73
[alex@lancelot has]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.37
20 unions of 1k-items sets 50% overlap... 0.42
20 inters of 1k-items sets 50% overlap... 0.75
20 diffes of 1k-items sets 50% overlap... 1.08
20 simdif of 1k-items sets 50% overlap... 1.74
[alex@lancelot has]$
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.35
Make 20 more 1k-items sets (no reps)... 0.37
20 unions of 1k-items sets 50% overlap... 0.41
20 inters of 1k-items sets 50% overlap... 0.74
20 diffes of 1k-items sets 50% overlap... 1.07
20 simdif of 1k-items sets 50% overlap... 1.72
[alex@lancelot is]$ python -O ../bench_set.py
Timing Set $Revision: 1.5 $ (Python 2.3a0 (#44, Jul 18 2002, 00:03:05)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)])
Make 20 10k-items sets (no reps)... 0.21
Make 20 1k-items sets (x10 reps)... 0.36
Make 20 more 1k-items sets (no reps)... 0.38
20 unions of 1k-items sets 50% overlap... 0.42
20 inters of 1k-items sets 50% overlap... 0.75
20 diffes of 1k-items sets 50% overlap... 1.08
20 simdif of 1k-items sets 50% overlap... 1.73
[alex@lancelot is]$

but as you can see, again it's uniformly faster on both 'is' and 'has'
versions of sets.

The 'has' version thus seems preferable here.


Alex



From jeremy@alum.mit.edu  Thu Jul 18 20:10:22 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 15:10:22 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: <15670.62365.517118.775364@slothrop.zope.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <15670.62365.517118.775364@slothrop.zope.com>
Message-ID: <15671.4766.961501.277589@slothrop.zope.com>

FWIW I confirmed today that staticforward is not needed on Tru64 5.1.

Jeremy




From guido@python.org  Thu Jul 18 20:18:38 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 15:18:38 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 18 Jul 2002 07:57:37 +0200."
 <E17V4Ij-00073d-00@mail.python.org>
References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz>
 <E17V4Ij-00073d-00@mail.python.org>
Message-ID: <200207181918.g6IJIcW22539@odiug.zope.com>

> > I've just had a thought. Maybe it would be less of a mess
> > if what we are calling "iterators" had been called "streams"
> 
> Possibly -- I did use the "streams" name often in the tutorial
> on iterators and generators, it's a very natural term.

OTOH in C++ and Java, "stream" refers to an open file object (to
emphasize the iteratorish feeling of a file opened for sequential
reading or writing, as opposed to the concept of a file as a
random-access array of bytes on disk).

> Seekable files can be multi-pass, but in the strict sense
> that you can rewind them -- it's still impractical to have
> them produce multiple *independent* iterators (needing
> some sort of in-memory caching).

It would be trivial if you had an object representing the notion of a
file on disk rather than an open file.  Each iterator would be
implemented as a separate open file referring to the same filename.
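
A minimal sketch of that idea, assuming a hypothetical FileOnDisk class (no such class exists in the standard library) and a current Python 3:

```python
import os
import tempfile


class FileOnDisk:
    """Hypothetical container for a file *name*, not an open file."""

    def __init__(self, filename):
        self.filename = filename

    def __iter__(self):
        # Each iterator gets its own open file, hence its own position.
        with open(self.filename) as stream:
            for line in stream:
                yield line


# Demonstration on a temporary file.
handle, path = tempfile.mkstemp(text=True)
with os.fdopen(handle, "w") as out:
    out.write("one\ntwo\nthree\n")

lines = FileOnDisk(path)
first = iter(lines)
second = iter(lines)
a = next(first)    # advances only `first`
b = next(first)
c = next(second)   # `second` still starts at the top
first.close()      # closing the generators closes their files
second.close()
os.remove(path)
```

Because every call to __iter__() opens the file again, the two iterators advance independently, which is what makes the container multi-pass.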

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy@alum.mit.edu  Thu Jul 18 22:00:05 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 17:00:05 -0400
Subject: [Python-Dev] configure problems porting to Tru64
In-Reply-To: <200207181922.g6IJM0O22574@odiug.zope.com>
References: <15671.4640.361811.434411@slothrop.zope.com>
 <200207181922.g6IJM0O22574@odiug.zope.com>
Message-ID: <15671.11349.924113.246257@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  GvR> BTW, the snake farm build for AIX-2-000000042E00-hal now fails
  GvR> like this:

  GvR> ../python/dist/src/Modules/posixmodule.c: In function
  GvR> `posix_fdatasync':
  GvR> ../python/dist/src/Modules/posixmodule.c:902: `fdatasync'
  GvR> undeclared (first use this function)
  GvR> ../python/dist/src/Modules/posixmodule.c:902: (Each undeclared
  GvR> identifier is reported only once
  GvR> ../python/dist/src/Modules/posixmodule.c:902: for each function
  GvR> it appears in.)

(I already mentioned this to Guido, but) This problem has been
occurring on AIX for a while.  It's unrelated to staticforward.

So we've now confirmed that staticforward is unneeded on AIX and
Tru64.  Perhaps MAL would like to find an SCO ODT compiler to try
it out with.

Jeremy




From mal@lemburg.com  Thu Jul 18 22:07:41 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jul 2002 23:07:41 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com>	<15669.47553.15097.651868@slothrop.zope.com>	<3D35D466.5090903@lemburg.com>	<200207172045.g6HKjBg13729@odiug.zope.com>	<3D35DA67.8060206@lemburg.com>	<3D35DBB9.9000103@lemburg.com>	<15670.62611.943840.954629@slothrop.zope.com>	<3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com>
Message-ID: <3D372E1D.50009@lemburg.com>

Jeremy Hylton wrote:
>>>>>>"MAL" == mal  <M.-A.> writes:
>>>>>
> 
>   MAL> The configure script tests whether static forwards work or
>   MAL> not. If you'd rip out the test as well, then I'd have to add
>   MAL> those platforms which still have problems manually.
> 
>   MAL> The problem is: I don't know which platforms these are (because
>   MAL> configure found these itself).
> 
> If you think the configure test works, why do you have platform
> specific ifdefs in your header file?

Because it doesn't always work :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From martin@v.loewis.de  Thu Jul 18 22:09:34 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 18 Jul 2002 23:09:34 +0200
Subject: [Python-Dev] Re: configure problems porting to Tru64
In-Reply-To: <15671.4640.361811.434411@slothrop.zope.com>
References: <15671.4640.361811.434411@slothrop.zope.com>
Message-ID: <m3znwo920h.fsf@mira.informatik.hu-berlin.de>

Jeremy Hylton <jeremy@alum.mit.edu> writes:

> (I'll also note that configure.in has a rather complex test for this,
> when it appears that autoconf has a builtin AC_FUNC_SETPGRP.  Anyone
> know why we don't use this?)

That test was introduced in configure.in 1.9, on 1994/11/03. It might
well be that autoconf did not support that test at that time.

> How should we actually fix this problem?  It seems to me that the
> right solution is to define _XOPEN_SOURCE in Tru64 and somehow
> guarantee that configure runs its tests with that defined, too.  How
> would we achieve that?

I think it is generally the right thing to define _XOPEN_SOURCE on
Unix, providing a negative list of systems that cannot support this
setting (or preferably solving whatever problems remain).

I'd put an (unconditional) AC_DEFINE into configure.in early on; it
*should* go into confdefs.h as configure proceeds, and thus be active
when other tests are performed.

Regards,
Martin



From aleax@aleax.it  Thu Jul 18 22:12:11 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 23:12:11 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207181930.g6IJUfX22643@odiug.zope.com>
References: <200207172339.g6HNd5j23845@oma.cosc.canterbury.ac.nz> <E17V4MO-0008SY-00@mail.python.org> <200207181930.g6IJUfX22643@odiug.zope.com>
Message-ID: <E17VIZe-0004FU-00@mail.python.org>

On Thursday 18 July 2002 09:30 pm, Guido van Rossum wrote:
> > I suspect read and write would best be kept on separate
> > interfaces.  Ability to read, write, seek-and-tell, being three
> > atoms of which it makes sense to have about 6 combos
> > (R, W, R+W, each with or without S&T).  Rewind might
> > make sense separately from S&T if streaming tapes were still in
> > fashion and OS's gave natural access to them.
>
> 5, because R+W without S&T makes little sense.

Reasonably little, yes -- hard to make up a non-contrived example
('preserve data up to the first occurrence of "bzz" and then overwrite
the rest of the file with "spam"'...?-).


> > But I do think it's all pretty academic.
>
> C++ has tried very hard to do this with its istream, ostream and
> iostream classes; I believe I heard C++ people say once that it's not
> considered a success.  

As a C++ person I agree.  It's better by far than C, mind you -- for
text I/O, at least -- but it's complex and intricate.

> I believe Java has tried to address this too.
> What do you think of Java's solution?

In the only time in my life when I was using Java in earnest (in code
intended for production purposes, though think3 later dropped the
idea), Java hit me with a deprecation to the solar plexus exactly in
this area, forcing me to do much unproductive rewriting -- so I find
it hard to be unbiased.  But even striving to be fair, I don't see the
advantage compared e.g. to C++'s streams.


Alex



From jeremy@alum.mit.edu  Thu Jul 18 22:16:09 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 17:16:09 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: <3D372E1D.50009@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
Message-ID: <15671.12313.725886.680036@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  MAL> The configure script tests whether static forwards work or
  MAL> not. If you'd rip out the test as well, then I'd have to add
  MAL> those platforms which still have problems manually.

  MAL> The problem is: I don't know which platforms these are (because
  MAL> configure found these itself).
  >>
  >> If you think the configure test works, why do you have platform
  >> specific ifdefs in your header file?

  MAL> Because it doesn't always work :-)

Let's make sure I've got this straight:

You believe there are platforms on which staticforward is necessary,
because you can not have a tentative definition of a static followed
by a definition with an initializer.  Yet the configure test of
exactly this behavior succeeds.  Further, you don't believe the
configure test works but you want us to leave it in anyway?

Jeremy




From aleax@aleax.it  Thu Jul 18 22:23:50 2002
From: aleax@aleax.it (Alex Martelli)
Date: Thu, 18 Jul 2002 23:23:50 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207181918.g6IJIcW22539@odiug.zope.com>
References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> <E17V4Ij-00073d-00@mail.python.org> <200207181918.g6IJIcW22539@odiug.zope.com>
Message-ID: <E17VIl3-0007EI-00@mail.python.org>

On Thursday 18 July 2002 09:18 pm, Guido van Rossum wrote:
> > > I've just had a thought. Maybe it would be less of a mess
> > > if what we are calling "iterators" had been called "streams"
> >
> > Possibly -- I did use the "streams" name often in the tutorial
> > on iterators and generators, it's a very natural term.
>
> OTOH in C++ and Java, "stream" refers to an open file object (to
> emphasize the iteratorish feeling of a file opened for sequential
> reading or writing, as opposed to the concept of a file as a
> random-access array of bytes on disk).

...and in Unix Sys/V, if I recall correctly, it referred to an allegedly
superior way to do things BSD did with sockets (and more).  Any
nice-looking term will be complicatedly overloaded by now.  I
think "seborrea" is still free, though (according to some old Dilbert
strips, at least).


> > Seekable files can be multi-pass, but in the strict sense
> > that you can rewind them -- it's still impractical to have
> > them produce multiple *independent* iterators (needing
> > some sort of in-memory caching).
>
> It would be trivial if you had an object representing the notion of a
> file on disk rather than an open file.  Each iterator would be
> implemented as a separate open file referring to the same filename.

For a *read-only* disk file, yes -- at least on Unix-ish systems, you
could also get the same effect with dup2 without even needing any
filename around (e.g. on an already-unlinked file).   Hmmm, I do
think win32 has something like dup2 -- my copy of Richter remained
with think3 (it was actually theirs:-), and I do little Windows these days
so I haven't bought another, but I'm pretty sure half an hour on
MSDN would let me find it.

Maybe something can be built around this -- the underlying disk file
as the container, dup2 or equivalent to make independent iterators/
streams (as long as nobody's writing the file... but that's not too
different from iterating on e.g. a list, where an insert or del would
mess things up...).  But surely not by sticking with stdio.
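
A minimal sketch of the "disk file as the container" idea, using the
reopen-by-name approach from the quoted suggestion rather than dup2.
The class name LinesOnDisk and the temp file are made up for
illustration, and the code assumes a modern Python with generators:

```python
import os
import tempfile


class LinesOnDisk:
    """Hypothetical container standing for a file on disk, not an open file."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Every iterator gets its own open file, hence its own position.
        with open(self.path) as f:
            for line in f:
                yield line.rstrip("\n")


fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("a\nb\nc\n")

lines = LinesOnDisk(path)
i1 = iter(lines)
i2 = iter(lines)
independent = [next(i1), next(i1), next(i2)]

# Contrast: two "iterators" over one open file share a single position.
with open(path) as shared:
    s1 = iter(shared)
    s2 = iter(shared)
    interleaved = [next(s1).rstrip("\n"), next(s2).rstrip("\n")]

# Close the suspended generators so their files are released.
i1.close()
i2.close()
os.remove(path)
```

Here independent comes out as ['a', 'b', 'a'] while interleaved is
['a', 'b'], since iterating an open file object just hands back the
file itself.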

Which leads us back to my "this is rather academic" statement:
don't we need to stick with stdio to support existing extensions
which use FILE*'s, anyway?


Alex



From guido@python.org  Thu Jul 18 22:28:03 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 18 Jul 2002 17:28:03 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Thu, 18 Jul 2002 23:23:50 +0200."
 <E17VIki-0007EH-00@mail.python.org>
References: <200207172332.g6HNWMp23835@oma.cosc.canterbury.ac.nz> <E17V4Ij-00073d-00@mail.python.org> <200207181918.g6IJIcW22539@odiug.zope.com>
 <E17VIki-0007EH-00@mail.python.org>
Message-ID: <200207182128.g6ILS3u04720@odiug.zope.com>

> > > > I've just had a thought. Maybe it would be less of a mess
> > > > if what we are calling "iterators" had been called "streams"
> > >
> > > Possibly -- I did use the "streams" name often in the tutorial
> > > on iterators and generators, it's a very natural term.
> >
> > OTOH in C++ and Java, "stream" refers to an open file object (to
> > emphasize the iteratorish feeling of a file opened for sequential
> > reading or writing, as opposed to the concept of a file as a
> > random-access array of bytes on disk).
> 
> ...and in Unix Sys/V, if I recall correctly, it referred to an allegedly
> superior way to do things BSD did with sockets (and more).  Any
> nice-looking term will be complicatedly overloaded by now.  I
> think "seborrea" is still free, though (according to some old Dilbert
> strips, at least).

Bah.  I rather like the idea of using "stream" to denote the future
rewritten I/O object, so I don't want to use it for iterators.

> Which leads us back to my "this is rather academic" statement:
> don't we need to stick with stdio to support existing extensions
> which use FILE*'s, anyway?

We'll need to support the old style files for a long time.  But that
doesn't mean we can't invent something new that doesn't use stdio (or
perhaps it uses stdio, just doesn't rely on stdio for various
features).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Thu Jul 18 22:38:59 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jul 2002 23:38:59 +0200
Subject: [Python-Dev] staticforward
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com>	<15669.47553.15097.651868@slothrop.zope.com>	<3D35D466.5090903@lemburg.com>	<200207172045.g6HKjBg13729@odiug.zope.com>	<3D35DA67.8060206@lemburg.com>	<3D35DBB9.9000103@lemburg.com>	<15670.62611.943840.954629@slothrop.zope.com>	<3D371361.7050908@lemburg.com>	<15671.6078.577033.943393@slothrop.zope.com>	<3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com>
Message-ID: <3D373573.8070001@lemburg.com>

Jeremy Hylton wrote:
>>>>>>"MAL" == mal  <M.-A.> writes:
>>>>>
> 
>   MAL> The configure script tests whether static forwards work or
>   MAL> not. If you'd rip out the test as well, then I'd have to add
>   MAL> those platforms which still have problems manually.
> 
>   MAL> The problem is: I don't know which platforms these are (because
>   MAL> configure found these itself).
>   >>
>   >> If you think the configure test works, why do you have platform
>   >> specific ifdefs in your header file?
> 
>   MAL> Because it doesn't always work :-)
> 
> Let's make sure I've got this straight:
> 
> You believe there are platforms on which staticforward is necessary,
> because you can not have a tentative definition of a static followed
> by a definition with an initializer.  Yet the configure test of
> exactly this behavior succeeds. 

Yes. The test doesn't seem to catch the case of having
arrays being declared as static forward. If you look in
configure.in you'll find that the test code only checks
whether structs behave well.

> Further, you don't believe the
> configure test works but you want us to leave it in anyway?

I believe that it works in most cases, but not all of them.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From jeremy@alum.mit.edu  Thu Jul 18 23:02:41 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 18:02:41 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: <3D373573.8070001@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
 <15671.12313.725886.680036@slothrop.zope.com>
 <3D373573.8070001@lemburg.com>
Message-ID: <15671.15105.563068.700997@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  MAL> Yes. The test doesn't seem to catch the case of having arrays
  MAL> being declared as static forward. If you look in configure.in
  MAL> you'll find that the test code only checks whether structs
  MAL> behave well.

Then you'll be no better off if we leave the test in.  I expect you
don't actually have a problem.  On the off chance that you do, you've
already got all the ifdef trickery you need in your own .h file.

Jeremy




From barry@zope.com  Thu Jul 18 23:05:31 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Thu, 18 Jul 2002 18:05:31 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207171540460.17524-100000@ziggy>
 <200207180043.g6I0hKB25427@pcp02138704pcs.reston01.va.comcast.net>
 <oqele1th3v.fsf@carouge.sram.qc.ca>
 <200207182042.g6IKg2n22947@odiug.zope.com>
Message-ID: <15671.15275.429784.303580@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> Container-like objects usually support protocol 1.  Iterators are
    >> currently required to support both protocols.  The semantics of
    >> iteration come only from protocol 2; protocol 1 is present to make
    >> iterators behave like sequences.  But the analogy is weak -- unlike
    >> ordinary sequences, iterators are "sequences" that are destroyed by
    >> the act of looking at their elements.

    GvR> (I could do without the last sentence, since this expresses a
    GvR> value judgement rather than fact -- not a good thing to have
    GvR> in a PEP's "specification" section.)

What about:

    "...sequences.  Note that the act of looking at an iterator's
    elements mutates the iterator."
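
To make the two protocols concrete, here is a minimal sketch.  It
assumes protocol 1 is __iter__() and protocol 2 is next() (spelled
__next__ in later Pythons; the alias below covers both).  Countdown
and CountdownIterator are made-up names for illustration only.

```python
class Countdown:
    """Container-like: supports only protocol 1 (__iter__)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A fresh, independent iterator on every call.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: supports protocol 2 (next) and also protocol 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Protocol 1 on an iterator just returns the iterator itself.
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

    next = __next__  # the Python 2.2 spelling


c = Countdown(3)
first_pass = list(c)
second_pass = list(c)   # the container can be walked again

it = iter(c)
once = list(it)
again = list(it)        # this particular iterator is now exhausted
```

Both passes over the container give [3, 2, 1]; the second pass over
the same iterator gives [].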

-Barry



From tim@zope.com  Thu Jul 18 23:26:47 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 18 Jul 2002 18:26:47 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <15671.15275.429784.303580@anthem.wooz.org>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEEMDHAA.tim@zope.com>

> What about:
>
>     "...sequences.  Note that the act of looking at an iterator's
>     elements mutates the iterator."

That doesn't belong in the spec either -- nothing requires an iterator to
have mutable state, let alone to mutate it when next() is called.




From mal@lemburg.com  Thu Jul 18 23:31:48 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 00:31:48 +0200
Subject: [Python-Dev] staticforward
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com>	<15669.47553.15097.651868@slothrop.zope.com>	<3D35D466.5090903@lemburg.com>	<200207172045.g6HKjBg13729@odiug.zope.com>	<3D35DA67.8060206@lemburg.com>	<3D35DBB9.9000103@lemburg.com>	<15670.62611.943840.954629@slothrop.zope.com>	<3D371361.7050908@lemburg.com>	<15671.6078.577033.943393@slothrop.zope.com>	<3D372E1D.50009@lemburg.com>	<15671.12313.725886.680036@slothrop.zope.com>	<3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com>
Message-ID: <3D3741D4.8020408@lemburg.com>

Jeremy Hylton wrote:
>>>>>>"MAL" == mal  <M.-A.> writes:
>>>>>
> 
>   MAL> Yes. The test doesn't seem to catch the case of having arrays
>   MAL> being declared as static forward. If you look in configure.in
>   MAL> you'll find that the test code only checks whether structs
>   MAL> behave well.
> 
> Then you'll be no better off if we leave the test in.  I expect you
> don't actually have a problem.  On the off chance that you do, you've
> already got all the ifdef trickery you need in your own .h file.

Except that I don't know on which other platforms I'd have to
enable it... and no, I don't want to go through another
two years of user feedback to find out !

What are you after here ? Remove the configure.in test as well ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From jeremy@alum.mit.edu  Thu Jul 18 23:32:30 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 18:32:30 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: <3D3741D4.8020408@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
 <15671.12313.725886.680036@slothrop.zope.com>
 <3D373573.8070001@lemburg.com>
 <15671.15105.563068.700997@slothrop.zope.com>
 <3D3741D4.8020408@lemburg.com>
Message-ID: <15671.16894.185299.672286@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  MAL> What are you after here ? Remove the configure.in test as well
  MAL> ?

It is already gone.  And earlier in this thread, we established that
it did you no good, right?  You only care about compilers that choke
on static array decls with later initialization, and the test doesn't
catch that.

Jeremy




From jeremy@alum.mit.edu  Thu Jul 18 23:36:46 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 18:36:46 -0400
Subject: [Python-Dev] Re: configure problems porting to Tru64
In-Reply-To: <m3znwo920h.fsf@mira.informatik.hu-berlin.de>
References: <15671.4640.361811.434411@slothrop.zope.com>
 <m3znwo920h.fsf@mira.informatik.hu-berlin.de>
Message-ID: <15671.17150.922349.270282@slothrop.zope.com>

Thanks.  This suggestion gets the compile to succeed on Tru64 and
does no harm on Linux.  I'll check it in and see what happens on the
snake farm tonight.

There's one more problem with Tru64: 

 cc   -o python  Modules/python.o  libpython2.3.a -lrt  -lpthread   -lm  -threads
ld:
Unresolved:
makedev

It looks like Tru64 doesn't have a makedev().  You added the patch
that included this a while back.  Do you have any idea what we should
do on Tru64?

Jeremy




From skip@pobox.com  Thu Jul 18 23:51:09 2002
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 18 Jul 2002 17:51:09 -0500
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects dictobject.c,2.127,2.128
 floatobject.c,2.113,2.114 intobject.c,2.84,2.85 listobject.c,2.120,2.121
 longobject.c,1.119,1.120 rangeobject.c,2.42,2.43 stringobject.c,2.169,2.170
 tupleobject.c,2.69,2.70 typeobject.c,2.160,2.161 unicodeobject.c,2.155,2.156
 xxobject.c,2.20,2.21
In-Reply-To: <3D372E1D.50009@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
Message-ID: <15671.18013.841675.41967@localhost.localdomain>

    >> If you think the configure test works, why do you have platform
    >> specific ifdefs in your header file?

    mal> Because it doesn't always work :-)

Why not just add the necessary goo to configure so it does work for the
various reported cases?

Skip



From mhammond@skippinet.com.au  Fri Jul 19 00:03:38 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Fri, 19 Jul 2002 09:03:38 +1000
Subject: [Python-Dev] Review of build system patch requested
In-Reply-To: <200207171418.g6HEIZo00747@odiug.zope.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOECPGAAA.mhammond@skippinet.com.au>

> > * Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to
> the compiler
> > when building Python itself and any builtin modules.  This flag is
> > not passed to extension modules.
>
> My only concern would be that tools which parse the Makefile (I
> believe distutils does this?) should not accidentally pick up the
> "-DPy_BUILD_CORE" flag.
>
> Apart from that I trust your judgement and Neal's test drive.

Thanks Guido.  I mailed the distutils sig, and Andrew Kuchling replied that
my change should be safe.

Now I need some help checking this baby in!  My change touches
Makefile.pre.in and configure.in, and requires that both "autoheader" and
"autoconf" be run to correctly regenerate output files.

How should I do this checkin?  Is it necessary for me to perform any
additional steps, or is there some magic that allows me to simply check
these 2 files in and have everything else work?

Thanks,

Mark.




From jeremy@alum.mit.edu  Fri Jul 19 00:05:27 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jul 2002 19:05:27 -0400
Subject: [Python-Dev] Re: staticforward
In-Reply-To: <15671.18013.841675.41967@localhost.localdomain>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
 <15671.18013.841675.41967@localhost.localdomain>
Message-ID: <15671.18871.846980.217653@slothrop.zope.com>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

  SM> Why not just add the necessary goo to configure so it does work
  SM> for the various reported cases?

Because there are no first-hand reported cases.  The only case
that MAL has mentioned is an unnecessary use of staticforward with an
array declaration and later initialization in a third-party extension
module.  There's nothing in the core that needs help from configure.

Jeremy




From mhammond@skippinet.com.au  Fri Jul 19 00:15:46 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Fri, 19 Jul 2002 09:15:46 +1000
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <034701c22e92$9473dfc0$ced241d5@hagrid>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEDAGAAA.mhammond@skippinet.com.au>

Fredrik:
> greg wrote:
>
> > Someone told me that Pyrex should be generating
> > __declspec(dllexport) for the module init func.
>
> almost; for portability, it's better to use the DL_EXPORT
> provided by Python.h:
>
> DL_EXPORT(void)
> init_module(void)
> {
>     ...
> }
>
> > But someone else says this is only needed if
> > you're importing a dll as a library, and that
> > it's not needed for Python extensions.

FWIW, www.python.org/sf/566100 deprecates DL_IMPORT/DL_EXPORT as it is
broken!  Once this patch is checked in, the new blessed way to declare your
function will be:

PyMODINIT_FUNC init_module(void)
{
...
}

This macro will do the right thing in all situations and for all platforms.
It even provides the 'extern "C"' if your extension is in a C++ module.

The-patch-even-updates-the-doc ly,

Mark.




From neal@metaslash.com  Fri Jul 19 01:49:38 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 18 Jul 2002 20:49:38 -0400
Subject: [Python-Dev] Re: configure problems porting to Tru64
References: <15671.4640.361811.434411@slothrop.zope.com>
 <m3znwo920h.fsf@mira.informatik.hu-berlin.de> <15671.17150.922349.270282@slothrop.zope.com>
Message-ID: <3D376222.B0ED0D63@metaslash.com>

Jeremy Hylton wrote:
> 
> There's one more problem with Tru64:
> 
> cc -o python Modules/python.o libpython2.3.a -lrt -lpthread -lm -threads
> ld:
> Unresolved:
> makedev
> 
> It looks like Tru64 doesn't have a makedev().  You added the patch
> that included this a while back.  Do you have any idea what we should
> do on Tru64?

As I distantly recall, makedev is a macro (or may be, depending on
#defines) and needs the proper header file.  I hope my memory is
correct, but I don't even trust it.

...maybe I should, there is a makedev macro in sys/types.h on a 
Compaq Tru64 UNIX V5.1 (Rev. 732) (192.233.54.155) (compaq testdrive box).

It looks like _OSF_SOURCE must be defined, possibly other macros.

Neal



From greg@cosc.canterbury.ac.nz  Fri Jul 19 01:48:47 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 19 Jul 2002 12:48:47 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17V4Ij-00073d-00@mail.python.org>
Message-ID: <200207190048.g6J0ml904071@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> Me:
>
> > Then the term "iterator" could have been reserved
> > for the special case of an object that provides stream
> > access to a random-access collection.
>
> Nice touch, except that I keep quibbling on the "random
> access" need -- see my previous msg about sets.

Well, substitute the term "non-destructively readable"
or "multi-pass capable" or something like that if
you prefer.

> Seekable files can be multi-pass, but in the strict sense
> that you can rewind them -- it's still impractical to have
> them produce multiple *independent* iterators (needing
> some sort of in-memory caching).

Yes, that's the key idea I had in mind. So make it
"independent multi-pass capable". :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Fri Jul 19 01:52:20 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 19 Jul 2002 12:52:20 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <E17V4MO-0008SY-00@mail.python.org>
Message-ID: <200207190052.g6J0qKS04080@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> I suspect read and write would best be kept on separate
> interfaces.

Yes, obviously you would be allowed to have streams that
implemented one or the other or both.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Fri Jul 19 01:55:22 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 19 Jul 2002 12:55:22 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207181930.g6IJUfX22643@odiug.zope.com>
Message-ID: <200207190055.g6J0tLk04092@oma.cosc.canterbury.ac.nz>

> C++ has tried very hard to do this with its istream, ostream and
> iostream classes; I believe I heard C++ people say once that it's not
> considered a success.

Well, everything in C++ seems to end up being way more
complicated than it ought to. The Python version would
be much simpler, since you wouldn't have to formally
spell out all the interface conventions.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From neal@metaslash.com  Fri Jul 19 02:04:13 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 18 Jul 2002 21:04:13 -0400
Subject: [Python-Dev] Review of build system patch requested
References: <LCEPIIGDJPKCOIHOBJEPOECPGAAA.mhammond@skippinet.com.au>
Message-ID: <3D37658D.E41060C4@metaslash.com>

Mark Hammond wrote:
> 
> > > * Makefile.pre.in has been changed to pass "-DPy_BUILD_CORE" to
> > the compiler
> > > when building Python itself and any builtin modules.  This flag is
> > > not passed to extension modules.
> >
> > My only concern would be that tools which parse the Makefile (I
> > believe distutils does this?) should not accidentally pick up the
> > "-DPy_BUILD_CORE" flag.
> 
> Thanks Guido.  I mailed the distutils sig, and Andrew Kuchling replied that
> my change should be safe.
> 
> Now I need some help checking this baby in!  My change touches
> Makefile.pre.in and configure.in, and requires that both "autoheader" and
> "autoconf" be run to correctly regenerate output files.
> 
> How should I do this checkin?  Is it necessary for me to perform any
> additional steps, or is there some magic that allows me to simply check
> these 2 files in and have everything else work?

I regenerated configure and Makefile.pre.in and attached it 
to the patch.  While regenerating I got a warning:

	autoheader: missing template: _XOPEN_SOURCE

It would be good to have someone look over/test the new configure, etc.

Neal



From greg@cosc.canterbury.ac.nz  Fri Jul 19 02:25:53 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 19 Jul 2002 13:25:53 +1200 (NZST)
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHMEDMDHAA.tim@zope.com>
Message-ID: <200207190125.g6J1PrG04203@oma.cosc.canterbury.ac.nz>

Tim Peters <tim@zope.com>:

> The best thing to do for Windows is ask that Windows users supply
> patches.

It was using a patch supplied by a Windows user that got
me into this mess. He said that the DL_EXPORT macro
didn't work for him.

But it sounds like using DL_EXPORT is the officially
correct thing to do, so I'll do that.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Fri Jul 19 02:40:06 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 19 Jul 2002 13:40:06 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <01KK9VLD2I56A296UI@it.canterbury.ac.nz>
Message-ID: <200207190140.g6J1e6U04243@oma.cosc.canterbury.ac.nz>

> at least on Unix-ish systems, you
> could also get the same effect with dup2 without even needing any
> filename around

No, you couldn't. dup() or dup2() will give you another
file descriptor sharing the same file-position pointer.
To get a completely independent access path I think
you have to open the file again starting from the
pathname.
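
A quick way to check this from Python, as a sketch only.  It assumes
a POSIX-ish system; the temp file and the variable names are just for
illustration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"abcdef")
os.lseek(fd, 0, os.SEEK_SET)

# dup() gives a second descriptor for the same open file description,
# so the two descriptors share one file position.
dup_fd = os.dup(fd)
from_fd = os.read(fd, 3)
from_dup = os.read(dup_fd, 3)    # continues where fd left off

# Opening by pathname creates a new open file description with its
# own position, starting at the beginning.
reopened = os.open(path, os.O_RDONLY)
from_reopened = os.read(reopened, 3)

for d in (fd, dup_fd, reopened):
    os.close(d)
os.remove(path)
```

The dup'ed descriptor reads b"def" after the original has read
b"abc", while the reopened one reads b"abc" again.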

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one@comcast.net  Fri Jul 19 04:52:24 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 18 Jul 2002 23:52:24 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <200207190125.g6J1PrG04203@oma.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEOAAFAB.tim.one@comcast.net>

[Tim]
> The best thing to do for Windows is ask that Windows users supply
> patches.

[Greg Ewing]
> It was using a patch supplied by a Windows user that got
> me into this mess. He said that the DL_EXPORT macro
> didn't work for him.

Sucker <wink>.

> But it sounds like using DL_EXPORT is the officially
> correct thing to do, so I'll do that.

Until Mark's patch, yes (see his post in this thread).



From tim.one@comcast.net  Fri Jul 19 04:54:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 18 Jul 2002 23:54:16 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPOEDAGAAA.mhammond@skippinet.com.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOBAFAB.tim.one@comcast.net>

[Mark Hammond]
> FWIW, www.python.org/sf/566100 deprecates DL_IMPORT/DL_EXPORT as it is
> broken!  Once this patch is checked in, the new blessed way to
> declare your function will be:
>
> PyMODINIT_FUNC init_module(void)
> {
> ...
> }
>
> This macro will do the right thing in all situations and for all
> platforms.
> It even provides the 'extern "C"' if your extension is in a C++ module.
>
> The-patch-even-updates-the-doc ly,

This patch is a Good Thing, and I demand that everyone show you more
appreciation for it.

for-my-next-act-i'll-command-the-tide-to-retreat-ly y'rs  - tim




From guido@python.org  Fri Jul 19 05:24:13 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 00:24:13 -0400
Subject: [Python-Dev] Review of build system patch requested
In-Reply-To: Your message of "Fri, 19 Jul 2002 09:03:38 +1000."
 <LCEPIIGDJPKCOIHOBJEPOECPGAAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPOECPGAAA.mhammond@skippinet.com.au>
Message-ID: <200207190424.g6J4ODA08239@pcp02138704pcs.reston01.va.comcast.net>

> Now I need some help checking this baby in!  My change touches
> Makefile.pre.in and configure.in, and requires that both "autoheader" and
> "autoconf" be run to correctly regenerate output files.
> 
> How should I do this checkin?  Is it necessary for me to perform any
> additional steps, or is there some magic that allows me to simply check
> these 2 files in and have everything else work?

You need to check in the files that result from running these two; I
believe that's configure and pyconfig.h.in.

Note that we require just about the latest and greatest autoconf.  If
you screw up MvL will correct you. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg@cosc.canterbury.ac.nz  Fri Jul 19 05:50:03 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 19 Jul 2002 16:50:03 +1200 (NZST)
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEOAAFAB.tim.one@comcast.net>
Message-ID: <200207190450.g6J4o3w05817@oma.cosc.canterbury.ac.nz>

> > But it sounds like using DL_EXPORT is the officially
> > correct thing to do, so I'll do that.
> 
> Until Mark's patch, yes (see his post in this thread).

Yeah, but I'm not going to worry about that until
it becomes part of a regular release.

Thanks,

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From aleax@aleax.it  Fri Jul 19 07:16:34 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 08:16:34 +0200
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCEEMDHAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHCEEMDHAA.tim@zope.com>
Message-ID: <E17VR4d-0003pX-00@mail.python.org>

On Friday 19 July 2002 12:26 am, Tim Peters wrote:
> > What about:
> >
> >     "...sequences.  Note that the act of looking at an iterator's
> >     elements mutates the iterator."
>
> That doesn't belong in the spec either -- nothing requires an iterator to
> have mutable state, let alone to mutate it when next() is called.

Right, for unbounded iterators returning constant values, such as:

class Ones:
    def __iter__(self): return self
    def next(self): return 1

However, such "exceptions that prove the rule" are rare enough that I
wouldn't consider their existence as forbidding to say _anything_ about
state mutation.  I _would_ similarly say that x[y]=z normally mutates x,
even though "def __setitem__(self, key, value): pass" is quite legal.
Inserting an adverb such as "generally" or "usually" should suffice to
make even the most grizzled sea lawyer happy while keeping the
information in.
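
A small sketch of both the exception and the usual case.  Ones is the
class above with a __next__ alias added so it also runs on later
Pythons; Inert is a made-up name for the do-nothing __setitem__ case.

```python
class Ones:
    def __iter__(self):
        return self

    def __next__(self):
        return 1

    next = __next__  # the Python 2.2 spelling


class Inert:
    def __setitem__(self, key, value):
        pass


# The exception: next() changes no state at all.
ones = Ones()
before = dict(vars(ones))
drawn = [next(ones) for _ in range(3)]
after = dict(vars(ones))

# The usual case: looking at an element advances the iterator.
numbers = iter([10, 20, 30])
first = next(numbers)
rest = list(numbers)

# Likewise, item assignment is legal here but mutates nothing.
x = Inert()
x["y"] = "z"
untouched = vars(x)
```

before and after are both empty dicts, whereas rest is [20, 30]
because the list iterator has already moved past 10.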


Alex



From mal@lemburg.com  Fri Jul 19 09:31:50 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 10:31:50 +0200
Subject: [Python-Dev] staticforward
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>	<3D35A188.20407@lemburg.com>	<15669.47553.15097.651868@slothrop.zope.com>	<3D35D466.5090903@lemburg.com>	<200207172045.g6HKjBg13729@odiug.zope.com>	<3D35DA67.8060206@lemburg.com>	<3D35DBB9.9000103@lemburg.com>	<15670.62611.943840.954629@slothrop.zope.com>	<3D371361.7050908@lemburg.com>	<15671.6078.577033.943393@slothrop.zope.com>	<3D372E1D.50009@lemburg.com>	<15671.12313.725886.680036@slothrop.zope.com>	<3D373573.8070001@lemburg.com>	<15671.15105.563068.700997@slothrop.zope.com>	<3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com>
Message-ID: <3D37CE76.4020803@lemburg.com>

Jeremy Hylton wrote:
>>>>>>"MAL" == mal  <M.-A.> writes:
>>>>>
> 
>   MAL> What are you after here ? Remove the configure.in test as well
>   MAL> ?
> 
> It is already gone. And earlier in this thread, we established that
> it did you no good, right? 

No and I think I was clear about the fact that I don't want this
to be removed.

> You only care about compilers that choke
> on static array decls with later initialization, and the test doesn't
> catch that.

The test tries to catch a general problem in some compilers: that
static forward declarations cause compile time errors. However,
it only tests this for structs, not arrays and functions.
So not all problems related to static forward declarations are
caught. That's why I had to add support for this to the
header file I'm using.

As a result, the test should be extended to also check for the
array case and the function case, so that all relevant static
forward declaration bugs in the compiler trigger the
#define of BAD_STATIC_FORWARD since that's what the symbol
is all about.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From mal@lemburg.com  Fri Jul 19 09:44:17 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 10:44:17 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com>
Message-ID: <3D37D161.5@lemburg.com>

> Any news on this one ?

If no one objects, I'd like to restore the old interface.

>> I noticed yesterday that the xmlrpclib.py version in CVS
>> is incompatible with the version in Python 2.2: all the
>> .dump_XXX() interfaces changed and now include a third
>> argument.
>>
>> Since the Marshaller can be subclassed, this breaks all
>> existing application space subclasses extending or changing
>> the default xmlrpclib behaviour.
>>
>> I'd opt for moving back to the previous style of calling the
>> write method via self.write.





From martin@v.loewis.de  Fri Jul 19 08:40:22 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jul 2002 09:40:22 +0200
Subject: [Python-Dev] Re: configure problems porting to Tru64
In-Reply-To: <15671.17150.922349.270282@slothrop.zope.com>
References: <15671.4640.361811.434411@slothrop.zope.com>
 <m3znwo920h.fsf@mira.informatik.hu-berlin.de>
 <15671.17150.922349.270282@slothrop.zope.com>
Message-ID: <m3n0so2mjd.fsf@mira.informatik.hu-berlin.de>

jeremy@alum.mit.edu (Jeremy Hylton) writes:

> It looks like Tru64 doesn't have a makedev().  You added the patch
> that included this a while back.  Do you have any idea what we should
> do on Tru64?

Neal says you need to define _OSF_SOURCE, but it would be better if we
could do without. If not, we should both define _OSF_SOURCE (perhaps
only on OSF), and add an autoconf test for makedev.

Regards,
Martin




From fredrik@pythonware.com  Fri Jul 19 10:31:56 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 19 Jul 2002 11:31:56 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com> <3D37D161.5@lemburg.com>
Message-ID: <003701c22f07$21945140$0900a8c0@spiff>

mal wrote:

> > Any news on this one ?
>
> If no one objects, I'd like to restore the old interface.

the dump methods are internal implementation details, and are
only accessed through an internal dispatcher table.  even if you
override them, the marshaller won't use your new methods.

so what exactly is your use case?

(and whatever you did to make that use case work, how do I stop
you from doing the same thing with some other internal part of the
standard library? ;-)

</F>




From mal@lemburg.com  Fri Jul 19 10:46:18 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 11:46:18 +0200
Subject: [Python-Dev] PEP: Support for System Upgrades
Message-ID: <3D37DFEA.9070506@lemburg.com>

PEP: 0???
Title: Support for System Upgrades
Version: $Revision: 0.0 $
Author: mal@lemburg.com (Marc-André Lemburg)
Status: Draft
Type: Standards Track
Python-Version: 2.3
Created: 19-Jul-2001
Post-History:

Abstract

     This PEP proposes strategies to allow the Python standard library
     to be upgraded in parts without having to reinstall the complete
     distribution or having to wait for a new patch level release.

Problem

     Python currently does not allow overriding modules or packages in
     the standard library by default. Even though this is possible by
     defining a PYTHONPATH environment variable (the paths defined in
     this variable are prepended to the Python standard library path),
     there is no standard way of achieving this without changing the
     configuration.

     Since Python's standard library is starting to host packages which
     are also available separately, e.g. the distutils, email and PyXML
     packages, which can also be installed independently of the Python
     distribution, it is desirable to have an option to upgrade these
     packages without having to wait for a new patch level release of
     the Python interpreter to bring along the changes.

Proposed Solutions

     This PEP proposes two different but not necessarily conflicting
     solutions:

     1. Adding a new standard search path to sys.path:
        $stdlibpath/system-packages just before the $stdlibpath
        entry. This complements the already existing entry for site
        add-ons $stdlibpath/site-packages which is appended to the
        sys.path at interpreter startup time.

        To make use of this new standard location, distutils will need
        to grow support for installing certain packages in
        $stdlibpath/system-packages rather than the standard location
        for third-party packages $stdlibpath/site-packages.

     2. Tweaking distutils to install directly into $stdlibpath for the
        system upgrades rather than into $stdlibpath/site-packages.
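
A minimal sketch of the search order that solution 1 describes,
assuming the only change is one extra sys.path entry placed just before
the stdlib entry.  The function name with_system_packages and the
example directory are hypothetical and used only for illustration; the
real work would happen inside the interpreter at startup.

```python
import os


def with_system_packages(search_path, stdlib_path):
    """Return a copy of search_path with <stdlib>/system-packages placed
    immediately before the stdlib entry (solution 1 above).

    Both arguments are illustrative; this is not an existing API.
    """
    upgrades = os.path.join(stdlib_path, "system-packages")
    result = list(search_path)
    if upgrades in result:
        return result              # already present, nothing to do
    try:
        position = result.index(stdlib_path)
    except ValueError:
        position = 0               # stdlib entry not found: fall back to the front
    result.insert(position, upgrades)
    return result


stdlib = os.path.join("usr", "lib", "python2.3")   # hypothetical location
path = ["", stdlib, os.path.join(stdlib, "site-packages")]
new_path = with_system_packages(path, stdlib)
```

Because the new entry comes before the stdlib entry, a package found
there shadows the bundled copy, while site-packages keeps its position
after the stdlib.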

     The first solution has a few advantages over the second:

     * upgrades can be easily identified (just look in
       $stdlibpath/system-packages)

     * upgrades can be deinstalled without affecting the rest
       of the interpreter installation

     * modules can be virtually removed from packages; this is
       due to the way Python imports packages: once it finds the
       top-level package directory it stays in this directory for
       all subsequent package submodule imports

     * the approach has an overall much cleaner design than the
       hackish install on top of an existing installation approach

     The only advantages of the second approach are that the Python
     interpreter does not have to be changed and that it works with
     older Python versions.

     Both solutions require changes to distutils. These changes can
     also be implemented by package authors, but it would be better to
     define a standard way of switching on the proposed behaviour.

Scope

     Solution 1: Python 2.3 and up
     Solution 2: all Python versions supported by distutils

Credits

     None

References

     None

Copyright

     This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:





From mal@lemburg.com  Fri Jul 19 11:00:42 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 12:00:42 +0200
Subject: [Python-Dev] Incompatible changes to xmlrpclib
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com> <3D37D161.5@lemburg.com> <003701c22f07$21945140$0900a8c0@spiff>
Message-ID: <3D37E34A.9050207@lemburg.com>

Fredrik Lundh wrote:
> mal wrote:
> 
> 
>>>Any news on this one ?
>>
>>If no one objects, I'd like to restore the old interface.
> 
> the dump methods are internal implementation details, and are
> only accessed through an internal dispatcher table.  even if you
> override them, the marshaller won't use your new methods.

If I subclass the Marshaller and Unmarshaller and
then use the subclasses, it would :-)

> so what exactly is your use case?

I needed to adapt the type mapping in xmlrpclib a
bit to better fit our needs. This is done by
adding a few more methods to the Marshaller
and Unmarshaller (it's a hack, but the module doesn't
allow any other method, AFAIK):

def install_xmlrpclib_addons(xmlrpclib):
     m = xmlrpclib.Marshaller
     m.dump_datetime = _dump_datetime
     m.dispatch[DateTime.DateTimeType] = m.dump_datetime
     m.dump_buffer = _dump_buffer
     m.dispatch[types.BufferType] = m.dump_buffer
     m.dump_int = _dump_int
     m.dispatch[types.IntType] = m.dump_int
     u = xmlrpclib.Unmarshaller
     u.end_dateTime = _load_datetime
     u.dispatch['dateTime.iso8601'] = u.end_dateTime
     u.end_base64 = _load_buffer
     u.dispatch['base64'] = u.end_base64
     u.end_boolean = _load_boolean
     u.dispatch['boolean'] = u.end_boolean
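
The disagreement above turns on how a class-level dispatch table
behaves.  The following is a toy sketch, not the real xmlrpclib code:
the class names and the XML strings are made up for illustration, and it
assumes only that the table maps types to the function objects that
existed when the table was filled.

```python
class Marshaller:
    """Toy stand-in for a marshaller; not the real xmlrpclib class."""

    dispatch = {}

    def dump(self, value):
        # Look up the handler by exact type and call it as a plain function.
        handler = self.dispatch[type(value)]
        return handler(self, value)

    def dump_int(self, value):
        return "<int>%d</int>" % value

    # The table stores the function object that exists right now.
    dispatch[int] = dump_int


class Overriding(Marshaller):
    # Overriding the method alone does not touch the inherited table.
    def dump_int(self, value):
        return "<i4>%d</i4>" % value


class Patched(Overriding):
    # Give the subclass its own table and register the override in it.
    dispatch = dict(Marshaller.dispatch)
    dispatch[int] = Overriding.dump_int
```

Under these assumptions Overriding().dump(7) still goes through the
base-class handler, while Patched().dump(7) uses the new one, which is
why the hack above has to write into the dispatch table directly.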

> (and whatever you did to make that use case work, how do I stop
> you from doing the same thing with some other internal part of the
> standard library? ;-)

It would be nice to open up the module a little
more so that hacks like the one above are not necessary,
e.g. by making the used classes parameters to the
loads/dumps functions.

Then you'd run into the same problem, though, since now
subclasses would need to access the dump/load methods.

PS: Standard support for None would be nice to have
in xmlrpclib... at least for the Marshalling side, since
this is a very common problem with xmlrpc.





From jmiller@stsci.edu  Fri Jul 19 12:29:37 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Fri, 19 Jul 2002 07:29:37 -0400
Subject: [Python-Dev] Fw: Behavior of buffer()
Message-ID: <3D37F821.8010908@stsci.edu>

This is a re-post in plain text of a message I sent yesterday in HTML. 
 Anyone not "consumed with interest" in the buffer object should 
probably skip it.  

Scott Gilbert wrote:

>--- Todd Miller <jmiller@stsci.edu> wrote:
>
>>>I don't understand what you say, but I believe you.
>>>
>>I meant we call  PyBuffer_FromReadWriteObject and the resulting buffer 
>>lives longer than the extension function call that created it.   I have 
>>heard that it is possible for the original object to "move" leaving the 
>>buffer object pointer to it dangling.
>>
>
>Yes.  The PyBufferObject grabs the pointer from the PyBufferProcs
>supporting object when the PyBufferObject is created.  If the PyBufferProcs
>supporting object reallocates the memory (possibly from a resize) the
>
Thanks for the example.

>
>PyBufferObject can be left with a bad pointer.  This is easily possible if
>you try to use the array module arrays as a buffer.
>
This is good to know.
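
For comparison, a sketch of one way the dangling-pointer hazard can be
closed off.  It uses bytearray and memoryview from later Python
versions, neither of which existed when this thread was written, so it
illustrates the idea rather than anything the buffer object did.

```python
data = bytearray(b"abcdef")
view = memoryview(data)      # exports the underlying buffer

try:
    data.extend(b"gh")       # would need to reallocate the storage
    resized = True
except BufferError:
    resized = False          # refused while the view is alive

view.release()               # drop the export
data.extend(b"gh")           # now resizing is allowed
```

The owner of the memory refuses to resize while a view is outstanding,
so the view can never be left pointing at freed storage.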

>
>
>I've submitted a patch to fix this particular problem (among others), but
>there are still enough things that the buffer object can't do that
>something new is needed.
>
I understand.  I saw your patches and they sounded good to me.

>
>>>
>>>>>Maybe instead of the buffer() function/type, there should be a way to
>>>>>allocate raw memory?
>>>>>
>>>>Yes.    It would also be nice to be able to:
>>>>
>>>>1.  Know (at the python level) that a type supports the buffer C-API.
>>>>
>>>Good idea.  (I guess right now you can see if calling buffer() with an
>>>instance as argument works. :-)
>>>
>>>>2.  Copy bytes from one buffer to another (writeable buffer).  
>>>>
>
>And the copy operations shouldn't create any large temporaries:
>
I agree with this completely.  I could summarize my opinion by saying
that while I regard the current buffering system as pretty complete,
the buffer object places emphasis on the wrong behavior.  In terms of
modelling memory regions, strings are the wrong way to go.

>
>
>  buf1 = memory(50000)
>  buf2 = memory(50000)
>  # no 10K temporary should be created in the next line
>  buf1[10000:20000] = buf2[30000:40000] 
>
>The current buffer object could be used like this, but it would create a
>temporary string.  
>
Looking at buffering most of this week, the fact that mmap slicing also 
returns strings is one justification I've found for having a buffer 
object,  i.e.,  mmap slicing is not a substitute for the buffer object. 
 The buffer object makes it possible to partition a mmap or any 
bufferable object into pseudo-independent, possibly writable, pieces.  

One justification to have a new buffer object is pickling (one of 
Scott's posts alerted me to this).   I think the behavior we want for 
numarray is to be able to pickle a view of a bufferable object more or 
less like a string containing the buffer image, and to unpickle it as a 
memory object.   The prospect of adding pickling support makes me wonder 
if separating the allocator and view aspects of the buffer object is a 
good idea;  I thought it was, but now I wonder.

>
>So getting an efficient copy operation seems to require that slices just
>create new "views" to the same memory.
>
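
A sketch of the "slices are views" behaviour asked for above, written
with bytearray and memoryview from later Python versions (an
assumption; the thread predates both).  The names buf1 and buf2 follow
the quoted example; view1, view2 and region are illustrative.

```python
buf1 = bytearray(50000)
buf2 = bytearray(b"\x01" * 50000)

view1 = memoryview(buf1)
view2 = memoryview(buf2)

# Slicing a memoryview gives another view onto the same memory,
# so this assignment copies 10000 bytes straight from buf2 into buf1.
view1[10000:20000] = view2[30000:40000]

# A slice is itself a buffer and writes through to the original object.
region = view1[0:4]
region[0] = 255
```

As far as I know no intermediate string is built for the slice
assignment, and region shows the partitioning idea: a piece of a larger
buffer that can be passed around and written to independently.
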
Other justifications for a new buffer object might be:

1. The ability to partition any bufferable object into regions which can 
be passed around.  These regions would themselves be buffers.

2. The ability to efficiently pickle a view of any bufferable object.

>
>>>Maybe you would like to work on a requirements gathering for a memory
>>>object
>>>
>>Sure.  I'd be willing to poll comp.lang.python (python-list?) and 
>>collate the results of any discussion that ensues.  Is that what you had 
>>in mind?
>>
>
>
>In the PEP that I'm drafting, I've been calling the new object "bytes"
>(since it is just a simple array of bytes).  Now that you guys are
>referring to it as the "memory object", should I change the name?  Doesn't
>really matter, but it might avoid confusion to know we're all talking about
>the same thing.
>
Calling this a memory type sounds the best to me.  The question I have
not resolved for myself is whether there should be one type which "does
it all" or two types, a memory allocator and a bufferable object
manipulator.

>
>
>
>






From ping@zesty.ca  Fri Jul 19 12:44:09 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Fri, 19 Jul 2002 04:44:09 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207181422.g6IEMBr14526@odiug.zope.com>
Message-ID: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>

On Thu, 18 Jul 2002, Guido van Rossum wrote:
> First of all, I'm not sure what exactly the issue is with destructive
> for-loops.

It's just not the way i expect for-loops to work.  Perhaps we would
need to survey people for objective data, but i feel that most people
would be surprised if

    for x in y: print x
    for x in y: print x

did not print the same thing twice, or if

    if x in y: print 'got it'
    if x in y: print 'got it'

did not do the same thing twice.  I realize this is my own opinion,
but it's a fairly strong impression i have.
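
The surprise described above can be reproduced in a few lines.  This is
a sketch in current Python syntax; the helper run_twice and the variable
names are illustrative only.

```python
def run_twice(y):
    """Loop over y twice and return what each pass saw."""
    first = [x for x in y]
    second = [x for x in y]
    return first, second


container = [1, 2, 3]
list_result = run_twice(container)          # a list can be traversed again
iter_result = run_twice(iter(container))    # an iterator is used up by the first pass

# The same effect with "in": the membership test consumes the iterator.
it = iter(container)
found_first = 2 in it
found_second = 2 in it
```

With the list both passes see the same elements; with the iterator the
second pass sees nothing, and the second membership test fails even
though the first succeeded.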

Even if it's okay for for-loops to destroy their arguments, i still
think it sets up a bad situation: we may end up with functions
manipulating sequence-like things all over, but it becomes unclear
whether they destroy their arguments or not.  It becomes possible
to write a function which sometimes destroys its argument and sometimes
doesn't.  Bugs get deeper and harder to find.

I believe this is where the biggest debate lies: whether "for" should be
non-destructive.  I realize we are currently on the other side of the
fence, but i foresee enough potential pain that i would like you to
consider the value of keeping "for" loops non-destructive.

> Maybe the for-loop is a red herring?  Calling next() on an
> iterator may or may not be destructive on the underlying "sequence" --
> if it is a generator, for example, I would call it destructive.

Well, for a generator, there is no underlying sequence.

    while 1: print next(gen)

makes it clear that there is no sequence, but

    for x in gen: print x

seems to give me the impression that there is.

> Perhaps you're trying to assign properties to the iterator abstraction
> that aren't really there?

I'm assigning properties to "for" that you aren't.  I think they
are useful properties, though, and worth considering.

I don't think i'm assigning properties to the iterator abstraction;
i expect iterators to destroy themselves.  But the introduction of
iterators, in the way they are now, breaks this property of "for"
loops that i think used to hold almost all the time in Python, and
that i think holds all the time in almost all other languages.

> Next, I'm not sure how renaming next() to __next__() would affect the
> situation w.r.t. the destructivity of for-loops.  Or were you talking
> about some other migration?

The connection is indirect.  The renaming is related to: (a) making
__next__() a real, honest-to-goodness protocol independent of __iter__;
and (b) getting rid of __iter__ on iterators.  It's the presence of
__iter__ on iterators that breaks the non-destructive-for property.

I think the renaming of next() to __next__() is a good idea in any
case.  It is distant enough from the other issues that it can be done
independently of any decisions about __iter__.


-- ?!ng




From ping@zesty.ca  Fri Jul 19 12:28:32 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Fri, 19 Jul 2002 04:28:32 -0700 (PDT)
Subject: [Python-Dev] The iterator story
Message-ID: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>

Here is a summary of the whole iterator picture as i currently see it.
This is necessarily subjective, but i will try to be precise so that
it's clear where i'm making a value judgement and where i'm trying to
state fact, and so we can pinpoint areas where we agree and disagree.

In the subjective sections, i have marked with [@] the places where
i solicit agreement or disagreement.

I would like to know your opinions on the issues listed below,
and on the places marked [@].


Definitions (objective)
-----------------------

Container: a thing that provides non-destructive access to a varying
number of other things.

    Why "non-destructive"?  Because i don't expect that merely looking
    at the contents will cause a container to be altered.  For example,
    i expect to be able to look inside a container, see that there are
    five elements; leave it alone for a while, come back to it later
    and observe once again that there are five elements.

    Consequently, a file object is not a container in general.  Given
    a file object, you cannot look at it to see if it contains an "A",
    and then later look at it once again to see if it contains an "A"
    and get the same result.  If you could seek, then you could do
    this, but not all files support seeking.  Even if you could seek,
    the act of reading the file would still alter the file object.

    The file object provides no way of getting at the contents without
    mutating itself.  According to my definition, it's fine for a
    container to have ways of mutating itself; but there has to be
    *some* way of getting the contents without mutating the container,
    or it just ain't a container to me.

    A file object is better described as a stream.  Hypothetically
    one could create an interface to seekable files that offered some
    non-mutating read operations; this would cause the file to look
    more like an array of bytes, and i would find it appropriate to
    call that interface a container.

Iterator: a thing that you can poke (i.e. send a no-argument message),
where each time you poke it, it either yields something or announces
that it is exhausted.

    For an iterator to mutate itself every time you poke it is not
    part of my definition.  But the only non-mutating iterator would
    be an iterator that returns the same thing forever, or an iterator
    that is always exhausted.  So most iterators usually mutate.

    Some iterators are associated with a container, but not all.

    There can be many kinds of iterators associated with a container.
    The most natural kind is one that yields the elements of the
    container, one by one, mutating itself each time it is poked,
    until it has yielded all of the elements of the container and
    announces exhaustion.

A Container's Natural Iterator: an iterator that yields the elements
of the container, one by one, in the order that makes the most sense
for the container.  If the container has a finite size n, then the
iterator can be poked exactly n times, and thereafter it is exhausted.


Issues (objective)
------------------

I alluded to a set of issues in an earlier message, and i'll begin
there, by defining what i meant more precisely.

The Destructive-For Issue:

    In most languages i can think of, and in Python for the most
    part, a statement such as "for x in y: print x" is a
    non-destructive operation on y.  Repeating "for x in y: print x"
    will produce exactly the same results once more.

    For pre-iterator versions of Python, this fails to be true only
    if y's __getitem__ method mutates y.  The introduction of
    iterators has caused this to now be untrue when y is any iterator.

    The issue is, should "for" be non-destructive?

The Destructive-In Issue:

    Notice that the iteration that takes place for the "in" operator
    is implemented in the same way as "for".  So if "for" destroys
    its second operand, so will "in".

    The issue is, should "in" be non-destructive?

    (Similar issues exist for built-ins that iterate, like list().)

The __iter__-On-Iterators Issue:

    Some people have mentioned that the presence of an __iter__()
    method is a way of signifying that an object supports the
    iterator protocol.  It has been said that this is necessary
    because the presence of a "next()" method is not sufficiently
    distinguishing.

    Some have said that __iter__() is a completely distinct protocol
    from the iterator protocol.

    The issue is, what is __iter__() really for?

    And secondarily, if it is not part of the iterator protocol,
    then should we require __iter__() on iterators, and why?

The __next__-Naming Issue:

    The iteration method is currently called "next()".

    Previous candidates for the name of this method were "next",
    "__next__", and "__call__".  After some previous debate,
    it was pronounced to be "next()".

    There are concerns that "next()" might collide with existing
    methods named "next()".  There is also a concern that "next()"
    is inconsistent because it is the only type-slot-method that
    does not have a __special__ name.

    The issue is, should it be called "next" or "__next__"?


My Positions (subjective)
-------------------------

I believe that "for" and "in" and list() should be non-destructive.
I believe that __iter__() should not be required on iterators.
I believe that __next__() is a better name than next().

Destructive-For, Destructive-In:

    I think "for" should be non-destructive because that's the way
    it has almost always behaved, and that's the way it behaves in
    any other language [@] i can think of.

    For a container's __getitem__ method to mutate the container is,
    in my opinion, bad behaviour.  In pre-iterator Python, we needed
    some way to allow the convenience of "for" on user-implemented
    containers.  So "for" supported a special protocol where it would
    call __getitem__ with increasing integers starting from 0 until
    it hit an IndexError.  This protocol works great for sequence-like
    containers that were indexable by integers.

    But other containers had to be hacked somewhat to make them fit.
    For example, there was no good way to do "for" over a dictionary-like
    container.  If you attempted "for" over a user-implemented dictionary,
    you got a really weird "KeyError: 0", which only made sense if you
    understood that the "for" loop was attempting __getitem__(0).
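
    A sketch of both cases, in current Python syntax.  The class names
    Squares and Registry are made up for illustration; the fallback to
    __getitem__ with 0, 1, 2, ... is the protocol described above.

```python
class Squares:
    """Sequence-like: indexable by 0, 1, 2, ... until IndexError."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index


class Registry:
    """Dictionary-like, with __getitem__ but no __iter__."""

    def __init__(self, **items):
        self.items = items

    def __getitem__(self, key):
        return self.items[key]


squares = list(Squares(4))           # "for" probes indices 0, 1, 2, 3, 4

try:
    for key in Registry(a=1):
        pass
    error = None
except KeyError as exc:
    error = exc.args[0]              # the loop tried __getitem__(0)
```

    The sequence-like class iterates cleanly, while the dictionary-like
    one fails on the very first probe with key 0.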

    (Hey!  I just noticed that

        from UserDict import UserDict
        for k in UserDict(): print k

    still produces "KeyError: 0"!  This oughta be fixed...)

    If you wanted to support "for" on something else, sometimes you
    would have to make __getitem__ mutate the object, like it does
    in the fileinput module.  But then the user has to know that
    this object is a special case: "for" only works the first time.

    When iterators were introduced, i believed they were supposed
    to solve this problem.  Currently, they don't.

    Currently, "in" can even be destructive.  This is more serious.
    While one could argue that it's not so strange for

        for x in y: ...

    to alter y (even though i do think it is strange), i believe
    just about anyone would find it very counterintuitive for

        if x in y:

    to alter y.  [@]

__iter__-On-Iterators:

    I believe __iter__ is not a type flag.  As i argued previously,
    i think that looking for the presence of methods that don't actually
    implement a protocol is a poor way to check for protocol support.
    And as things stand, the presence of __iter__ doesn't even work [@]
    as a type flag.

    There are objects with __iter__ that are not iterators (like most
    containers).  And there are objects without __iter__ that work as
    iterators.  I know you can legislate the latter away, but i think
    such legislation would amount to fighting the programmers -- and
    it is infeasible [@] to enforce the presence of __iter__ in practice.

    Based on Guido's positive response, in which he asked me to make
    an addition to the PEP, i believe Guido agrees with me that
    __iter__ is distinct from the protocol of an iterator.  This
    surprised me because it runs counter to the philosophy previously
    expressed in the PEP.

    Now suppose we agree that __iter__ and next are distinct protocols.
    Then why require iterators to support both?  The only reason we
    would want __iter__ on iterators is so that we can use "for" [@]
    with an iterator as the second operand.

    I have just argued, above, that it's *not* a good idea for "for"
    and "in" to be destructive.  Since most iterators self-mutate,
    it follows that it's not advisable to use an iterator directly
    as the second operand of a "for" or "in".

    I realize this seems radical!  This may be the most controversial
    point i have made.  But if you accept that "in" should not
    destroy its second argument, the conclusion is unavoidable.

__next__-Naming:

    I think the potential for collision, though small, is significant,
    and this makes "__next__" a better choice than "next".  A built-in
    function next() should be introduced; this function would call the
    tp_iternext slot, and for instance objects tp_iternext would call
    the __next__ method implemented in Python.

    The connection between this issue and the __iter__ issue is that,
    if next() were renamed to __next__(), the argument that __iter__
    is needed as a flag would also go away.


The Current PEP (objective)
---------------------------

The current PEP takes the position that "for" and "in" can be
destructive; that __iter__() and next() represent two distinct
protocols, yet iterators are required to support both; and that
the name of the method on iterators is called "next()".


My Ideal Protocol (subjective)
------------------------------

So by now the biggest question/objection you probably have is
"if i can't use an iterator with 'for', then how can i use it?"

The answer is that "for" is a great way to iterate over things;
it's just that it iterates over containers and i want to preserve
that.  We need a different way to iterate over iterators.

In my ideal world, we would allow a new form of "for", such as

    for line from file:
        print line

The use of "from" instead of "in" would imply that we were
(destructively) pulling things out of the iterator, and would
remove any possible parallel to the test "x in y", which should
rightly remain non-destructive.

Here's the whole deal:

    - Iterators provide just one method, __next__().

    - The built-in next() calls tp_iternext.  For instances,
      tp_iternext calls __next__.

    - Objects wanting to be iterated over provide just one method,
      __iter__().  Some of these are containers, but not all.

    - The built-in iter(foo) calls tp_iter.  For instances,
      tp_iter calls __iter__.

    - "for x in y" gets iter(y) and uses it as an iterator.

    - "for x from y" just uses y as the iterator.

That's it.
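
The division above can be sketched with two small classes.  The "for x
from y" form is hypothetical syntax and cannot be run, so the sketch
uses direct next() calls in its place; the class names are illustrative.
It relies on "for" accepting whatever iter() returns as long as that
object has __next__.

```python
class CountdownIterator:
    """Iterator in the proposed sense: __next__ only, no __iter__."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


class Countdown:
    """Container-like: __iter__ only, hands out a fresh iterator each time."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


c = Countdown(3)
first = [x for x in c]
second = [x for x in c]          # same result: the container is untouched

it = CountdownIterator(3)
pulled = [next(it), next(it)]    # poking the iterator directly

try:
    for x in it:                 # no __iter__, so "for ... in" refuses it
        pass
    accepted = True
except TypeError:
    accepted = False
```

Looping over the container twice gives the same elements both times,
and the bare iterator is rejected by "for ... in", which is the
separation the list above is after.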

Benefits:

    - We have a nice clean division between containers and iterators.

    - When you see "for x in y" you know that y is a container.

    - When you see "for x from y" you know that y is an iterator.

    - "for x in y" never destroys y.

    - "if x in y" never destroys y.

    - If you have an object that is container-like, you can add
      an __iter__ method that gives its natural iterator.  If
      you want, you can supply more iterators that do different
      things; no problem.  No one using your object is confused
      about whether it mutates.

    - If you have an object that is cursor-like or stream-like,
      you can safely make it into an iterator by adding __next__.
      No one using your object is confused about whether it mutates.

Other notes:

    - Iterator algebra still works fine, and is still easy to write:

        def alternate(it):
            while 1:
                yield next(it)
                next(it)

    - The file problem has a consistent solution.  Instead of writing
      "for line in file" you write

        for line from file:
            print line

      Being forced to write "from" signals to you that the file is
      eaten up.  There is no expectation that "for line from file"
      will work again.

      The best would be a convenience function "readlines", to
      make this even clearer:

        for line in readlines("foo.txt"):
            print line

      Now you can do this as many times as you want, and there is
      no possibility of confusion; there is no file object on which
      to call methods that might mess up the reading of lines.
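
      One possible shape for such a convenience function, as a sketch
      only.  The name readlines comes from the text above; implementing
      it as a class whose __iter__ reopens the file is my assumption,
      and the temporary file exists only for the demonstration.

```python
import os
import tempfile


class readlines:
    """Re-iterable view of a file's lines.

    Each loop opens the file afresh, so iterating never uses anything up.
    """

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        with open(self.path) as stream:
            for line in stream:
                yield line


# Demonstration with a throwaway file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "foo.txt")
    with open(path, "w") as out:
        out.write("one\ntwo\n")
    lines = readlines(path)
    first = list(lines)
    second = list(lines)
```

      The object holds a file name rather than an open file, so there is
      no stream position for a second loop to trip over.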


My Not-So-Ideal Protocol
------------------------

All right.  So new syntax may be hard to swallow.  An alternative
is to introduce an adapter that turns an iterator into something
that "for" will accept -- that is, the opposite of iter().

    - The built-in seq(it) returns x such that iter(x) yields it.

Then instead of writing

    for x from it:

you would write

    for x in seq(it):

and the rest would be the same.  The use of "seq" here is what
would flag the fact that "it" will be destroyed.
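
A minimal sketch of such an adapter.  The name seq is the one proposed
above; writing it as a small class is an assumption, and numbers() is a
throwaway generator for the demonstration.

```python
class seq:
    """Wrap an iterator so that iter(seq(it)) gives back that same iterator."""

    def __init__(self, iterator):
        self.iterator = iterator

    def __iter__(self):
        return self.iterator


def numbers():
    yield 1
    yield 2
    yield 3


it = numbers()
wrapped = seq(it)
same = iter(wrapped) is it
first = [x for x in wrapped]
second = [x for x in wrapped]    # "it" was destroyed by the first loop
```

The wrapper adds no behaviour of its own; its only job is to make the
destructive use visible at the point where the loop is written.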


-- ?!ng




From jeremy@alum.mit.edu  Fri Jul 19 13:20:20 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jul 2002 08:20:20 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: <3D37CE76.4020803@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com>
 <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
 <15671.12313.725886.680036@slothrop.zope.com>
 <3D373573.8070001@lemburg.com>
 <15671.15105.563068.700997@slothrop.zope.com>
 <3D3741D4.8020408@lemburg.com>
 <15671.16894.185299.672286@slothrop.zope.com>
 <3D37CE76.4020803@lemburg.com>
Message-ID: <15672.1028.161004.894848@slothrop.zope.com>

>>>>> "MAL" == mal  <M.-A.> writes:

  MAL> What are you after here ? Remove the configure.in test as well
  MAL> ?
  >>
  >> It is already gone. And earlier in this thread, we established
  >> that it did you no good, right?

  MAL> No and I think I was clear about the fact that I don't want
  MAL> this to be removed.

It's clear you don't want it to be removed, but not entirely clear
why.  We've got a whole alpha and beta cycle to see if anyone finds an
actual compiler problem with the Python core.  During that time, you
can see if the problem occurs for the header file you mentioned.  (The
one where you use it for an array even though you could rearrange the
code to eliminate it.)

  >> You only care about compilers that choke on static array decls
  >> with later initialization, and the test doesn't catch that.

  MAL> The test tries to catch a general problem in some compilers:

No one has produced any evidence that there are still compilers that
have this problem.

  MAL> that static forward declarations cause compile time
  MAL> errors. However, it only tests this for structs, not arrays and
  MAL> functions.  So not all problems related to static forward
  MAL> declarations are caught. That's why I had to add support for
  MAL> this to the header file I'm using.

The Python core has no need for tests on arrays or functions.
(Indeed, staticforward was not intended for function prototypes.)  

Jeremy




From neal@metaslash.com  Fri Jul 19 13:42:58 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 19 Jul 2002 08:42:58 -0400
Subject: [Python-Dev] Re: configure problems porting to Tru64
References: <15671.4640.361811.434411@slothrop.zope.com>
 <m3znwo920h.fsf@mira.informatik.hu-berlin.de>
 <15671.17150.922349.270282@slothrop.zope.com> <m3n0so2mjd.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D380952.CF927B10@metaslash.com>

"Martin v. Loewis" wrote:
> 
> jeremy@alum.mit.edu (Jeremy Hylton) writes:
> 
> > It looks like Tru64 doesn't have a makedev().  You added the patch
> > that included this a while back.  Do you have any idea what we should
> > do on Tru64?
> 
> Neal says you need to define _OSF_SOURCE, but it would be better if we
> could do without. If not, we should both define _OSF_SOURCE (perhaps
> only on OSF), and add an autoconf test for makedev.

I agree with Martin.  It would be best to only define _OSF_SOURCE
if absolutely necessary and use autoconf.

Neal



From guido@python.org  Fri Jul 19 13:59:15 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 08:59:15 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: Your message of "Fri, 19 Jul 2002 10:31:50 +0200."
 <3D37CE76.4020803@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com>
 <3D37CE76.4020803@lemburg.com>
Message-ID: <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net>

> The test tries to catch a general problem in some compilers: that
> static forward declarations cause compile time errors. However,
> it only tests this for structs, not arrays and functions.
> So not all problems related to static forward declarations are
> caught. That's why I had to add support for this to the
> header file I'm using.
> 
> As a result, the test should be extended to also check for the
> array case and the function case, so that all relevant static
> forward declaration bugs in the compiler trigger the
> #define of BAD_STATIC_FORWARD since that's what the symbol
> is all about.

Sorry, Marc-Andre, this has lasted long enough.

Compilers that don't support this are clearly broken according to the
ANSI C std.  When Python was first released, such broken compilers
perhaps had the excuse that it was a tricky issue in the std and that
K&R didn't do it that way.  That was many years ago.  Platforms whose
compiler is still broken in this way ought to be extinct, and I have
every reason to believe that they are.

It's just not worth our while to try to cater for every possible way
that compilers used to be broken in the distant past.  When we spot a
real live broken compiler, and there's no better work-around (like
rewriting the code), and we care about that platform, and there's no
alternative compiler available, we may add some cruft to the code.
But there's no point in gathering cruft forever without every once in
a while cleaning some things up.

I'll gladly put this back in as soon as you have a paying customer who
wants to run Python 2.3 on a platform where the compiler is still
broken in this way.  Until then, it's a non-issue.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 19 13:59:37 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 08:59:37 -0400
Subject: [Python-Dev] Incompatible changes to xmlrpclib
In-Reply-To: Your message of "Fri, 19 Jul 2002 10:44:17 +0200."
 <3D37D161.5@lemburg.com>
References: <3D240FF2.3060708@lemburg.com> <3D2F3F06.1060800@lemburg.com>
 <3D37D161.5@lemburg.com>
Message-ID: <200207191259.g6JCxbW24819@pcp02138704pcs.reston01.va.comcast.net>

> If no one objects, I'd like to restore the old interface.

That's between you & Fredrik Lundh.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From oren-py-d@hishome.net  Fri Jul 19 14:23:51 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 19 Jul 2002 09:23:51 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
Message-ID: <20020719132351.GA40829@hishome.net>

> The Destructive-For Issue:
> 
>     In most languages i can think of, and in Python for the most
>     part, a statement such as "for x in y: print x" is a
>     non-destructive operation on y.  Repeating "for x in y: print x"
>     will produce exactly the same results once more.
> 
>     For pre-iterator versions of Python, this fails to be true only
>     if y's __getitem__ method mutates y.  The introduction of
>     iterators has caused this to now be untrue when y is any iterator.

The most significant example of an object that mutates on __getitem__ in
pre-iterator Python is the xreadlines object.  Its __getitem__ method 
increments an internal counter and raises an exception if accessed out of 
order.  This hack may be the 'original sin' - the first widely used 
destructive for.
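
A rough illustration of that behaviour in Python 3 syntax.  This is not
the real xreadlines code; the class name OneShotLines and its details
are made up for the sketch, which only mimics "increments a counter and
raises if accessed out of order".

```python
import io


class OneShotLines:
    """Illustrative stand-in for an xreadlines-style object (hypothetical)."""

    def __init__(self, f):
        self._f = f
        self._next_index = 0

    def __getitem__(self, index):
        # Only strictly sequential access is allowed.
        if index != self._next_index:
            raise RuntimeError("lines must be accessed in order")
        line = self._f.readline()
        if not line:
            raise IndexError(index)  # end of file ends the for loop
        self._next_index += 1
        return line


lines = OneShotLines(io.StringIO("a\nb\n"))
first_pass = [line for line in lines]  # works: indexes 0, 1, 2 in order
try:
    second_pass = [line for line in lines]  # asks for index 0 again
except RuntimeError as exc:
    second_pass = exc
print(first_pass, second_pass)
```

The first loop succeeds; repeating it fails because the object's state
was changed by the act of indexing it.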

I just wish the time machine could have picked up your posting when the
iteration protocols were designed. Good work.

Your questions will require some serious meditation on the relative 
importance of semantic purity and backward compatibility. 

	Oren



From mal@lemburg.com  Fri Jul 19 14:41:40 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jul 2002 15:41:40 +0200
Subject: [Python-Dev] staticforward
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com>              <3D37CE76.4020803@lemburg.com> <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D381714.7040606@lemburg.com>

Guido van Rossum wrote:
>>The test tries to catch a general problem in some compilers: that
>>static forward declarations cause compile time errors. However,
>>it only tests this for structs, not arrays and functions.
>>So not all problems related to static forward declarations are
>>caught. That's why I had to add support for this to the
>>header file I'm using.
>>
>>As a result, the test should be extended to also check for the
>>array case and the function case, so that all relevant static
>>forward declaration bugs in the compiler trigger the
>>#define of BAD_STATIC_FORWARD since that's what the symbol
>>is all about.
> 
> 
> Sorry, Marc-Andre, this has lasted long enough.
> 
> Compilers that don't support this are clearly broken according to the
> ANSI C std.  When Python was first released, such broken compilers
> perhaps had the excuse that it was a tricky issue in the std and that
> K&R didn't do it that way.  That was many years ago.  Platforms whose
> compiler is still broken in this way ought to be extinct, and I have
> every reason to believe that they are.

"""
Albert Chin-A-Young wrote on 2002-05-04:
 > >
 > > The AIX xlc ANSI compiler does not allow forward declaration of
 > > variables. This leads to a lot of problems with .c files that use
 > > staticforward (e.g. mxDateTime.c, mxProxy.c, etc.). Any chance of
 > > fixing these?
"""

I'm not making this up.

> It's just not worth our while to try to cater for every possible way
> that compilers used to be broken in the distant past.  When we spot a
> real live broken compiler, and there's no better work-around (like
> rewriting the code), and we care about that platform, and there's no
> alternative compiler available, we may add some cruft to the code.

This sounds too much like "we == PythonLabs". Is that
intended ?

> But there's no point in gathering cruft forever without every once in
> a while cleaning some things up.
> 
> I'll gladly put this back in as soon as you have a paying customer who
> wants to run Python 2.3 on a platform where the compiler is still
> broken in this way.  Until then, it's a non-issue.

Hmm, a few messages ago you confirmed that my usage of
staticforward and statichere was correct, later on, you say
that it's not necessary anymore in the core so it's OK
to rip it out. I am telling you that there are compilers
around which don't get it right for arrays and propose
to add a check for those as well -- if only to help extension
writers like myself.

Nevermind, I'll add code to my stuff to emulate the
configure.in check using distutils. Still, I find
it frustrating that PythonLabs is giving me such a
hard time because of 15 lines of code in configure.in.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From guido@python.org  Fri Jul 19 15:10:19 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 10:10:19 -0400
Subject: [Python-Dev] staticforward
In-Reply-To: Your message of "Fri, 19 Jul 2002 15:41:40 +0200."
 <3D381714.7040606@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net> <3D35A188.20407@lemburg.com> <15669.47553.15097.651868@slothrop.zope.com> <3D35D466.5090903@lemburg.com> <200207172045.g6HKjBg13729@odiug.zope.com> <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com> <15670.62611.943840.954629@slothrop.zope.com> <3D371361.7050908@lemburg.com> <15671.6078.577033.943393@slothrop.zope.com> <3D372E1D.50009@lemburg.com> <15671.12313.725886.680036@slothrop.zope.com> <3D373573.8070001@lemburg.com> <15671.15105.563068.700997@slothrop.zope.com> <3D3741D4.8020408@lemburg.com> <15671.16894.185299.672286@slothrop.zope.com> <3D37CE76.4020803@lemburg.com> <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net>
 <3D381714.7040606@lemburg.com>
Message-ID: <200207191410.g6JEAKf25935@pcp02138704pcs.reston01.va.comcast.net>

> """
> Albert Chin-A-Young wrote on 2002-05-04:
>  > >
>  > > The AIX xlc ANSI compiler does not allow forward declaration of
>  > > variables. This leads to a lot of problems with .c files that use
>  > > staticforward (e.g. mxDateTime.c, mxProxy.c, etc.). Any chance of
>  > > fixing these?
> """
> 
> I'm not making this up.

He doesn't complain about the core.

> > It's just not worth our while to try to cater for every possible way
> > that compilers used to be broken in the distant past.  When we spot a
> > real live broken compiler, and there's no better work-around (like
> > rewriting the code), and we care about that platform, and there's no
> > alternative compiler available, we may add some cruft to the code.
> 
> This sounds too much like "we == PythonLabs". Is that
> intended ?

I hope this is in general the attitude of most core Python
developers.  Adding cruft should be frowned upon!  Else the code will
become unmaintainable over time, and everybody loses.

> Hmm, a few messages ago you confirmed that my usage of
> staticforward and statichere was correct, later on, you say
> that it's not necessary anymore in the core so it's OK
> to rip it out. I am telling you that there are compilers
> around which don't get it right for arrays and propose
> to add a check for those as well -- if only to help extension
> writers like myself.

You're the only person who seems to be suffering from this.

> Nevermind, I'll add code to my stuff to emulate the
> configure.in check using distutils. Still, I find
> it frustrating that PythonLabs is giving me such a
> hard time because of 15 lines of code in configure.in.

I find it frustrating that you're not seeing our side.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From David Abrahams" <david.abrahams@rcn.com  Fri Jul 19 15:15:46 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 19 Jul 2002 10:15:46 -0400
Subject: [Python-Dev] The iterator story
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719132351.GA40829@hishome.net>
Message-ID: <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com>

From: "Oren Tirosh" <oren-py-d@hishome.net>

> > The Destructive-For Issue:
> >
> >     In most languages i can think of, and in Python for the most
> >     part, a statement such as "for x in y: print x" is a
> >     non-destructive operation on y.  Repeating "for x in y: print x"
> >     will produce exactly the same results once more.
> >
> >     For pre-iterator versions of Python, this fails to be true only
> >     if y's __getitem__ method mutates y.  The introduction of
> >     iterators has caused this to now be untrue when y is any iterator.
>
> The most significant example of an object that mutates on __getitem__ in
> pre-iterator Python is the xreadlines object.  Its __getitem__ method
> increments an internal counter and raises an exception if accessed out of
> order.  This hack may be the 'original sin' - the first widely used
> destructive for.
>
> I just wish the time machine could have picked up your posting when the
> iteration protocols were designed. Good work.

Yeah, Ping's article sure went "thunk" when I read it.

At the risk of boring everyone, I think I should explain why I started the
multipass iterator thread. One of the most important jobs of Boost.Python
is the conversion between C++ and Python types (and if you don't give a fig
for C++, hang on, because I hope this will be relevant to pure Python
also). In order to support wrapping of overloaded C++ functions and member
functions, it's important to be able to do this in two steps:

1. Discover whether a Python object is convertible to a given C++ type
2. Perform the conversion

The overload resolution mechanism is currently pretty simple-minded: it
looks through the overloaded function objects until it can find one for
which all the arguments are convertible to the corresponding C++ type, then
it converts them and calls the wrapped C++ function.

My users really want to be able to define converters which, given any
Python iterable/sequence type, can extract a particular C++ container type.
In order to do that, we might commonly need to inspect each element of the
source object to see that it's convertible to the C++ container's value
type. It's pretty easy to see that if step 1 destroys the state of an
argument, it can foul the whole scheme: even if we store the result
somewhere so that step 2 can re-use it, overload resolution might fail for
arguments later in the function signature. Then the other overloads will be
looking at a different argument object.

What we were looking for was a way to quickly reject an overload if the
source object was not re-iterable, without modifying it.
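
A pure-Python sketch of the two-step scheme and of how step 1 can foul
it.  All function names here are hypothetical (this is not Boost.Python
code), and the "iter(obj) is obj" test is only a convention-based
heuristic for spotting one-shot iterators, not a guarantee.

```python
def convertible_to_int_list(obj):
    """Step 1: inspect every element.  Consumes obj if it is an iterator."""
    return all(isinstance(x, int) for x in obj)


def to_int_list(obj):
    """Step 2: perform the conversion."""
    return list(obj)


def convert(obj):
    """Run step 1, then step 2, on the same argument object."""
    if not convertible_to_int_list(obj):
        return None
    return to_int_list(obj)


def looks_one_shot(obj):
    """Heuristic: iterators conventionally return themselves from iter()."""
    return iter(obj) is obj


from_list = convert([1, 2, 3])          # the list survives step 1
from_iter = convert(iter([1, 2, 3]))    # step 1 used the iterator up
print(from_list, from_iter)
print(looks_one_shot([1, 2, 3]), looks_one_shot(iter([1, 2, 3])))
```

The iterator passes the check but converts to an empty list, because
the check itself emptied it.  A cheap test like looks_one_shot() is the
kind of early rejection being asked for, with the caveat that an object
is free to ignore the convention it relies on.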

It sure seems to me that we'd benefit from being able to do the same sort
of thing in Pure Python. It's not clear to me that anyone else cares about
this, but I hope one day we'll get built-in overloading or multimethod
dispatch in Python beyond what's currently offered by the numeric
operators.

Incidentally, I'm not sure whether PEP 246 provides much help here. If the
adaptation protocol only gives us a way to say "is this, or can this be
adapted to be a re-iterable sequence", something could easily answer:

    [ x for x in y ]

Which would produce a re-iterable sequence, but might also destroy the
source. Of course, I'll say up front I've only skimmed the PEP and might've
missed something crucial.

-Dave





From aahz@pythoncraft.com  Fri Jul 19 15:16:58 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 10:16:58 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEOBAFAB.tim.one@comcast.net>
References: <LCEPIIGDJPKCOIHOBJEPOEDAGAAA.mhammond@skippinet.com.au> <LNBBLJKPBEHFEDALKOLCCEOBAFAB.tim.one@comcast.net>
Message-ID: <20020719141658.GA7919@panix.com>

[Mark Hammond's patch -- with docs!]

On Thu, Jul 18, 2002, Tim Peters wrote:
>
> This patch is a Good Thing, and I demand that everyone show you more
> appreciation for it.

If I still used Windoze for anything, I would.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aleax@aleax.it  Fri Jul 19 15:30:41 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 16:30:41 +0200
Subject: [Python-Dev] The iterator story
In-Reply-To: <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719132351.GA40829@hishome.net> <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com>
Message-ID: <E17VYmn-0001GJ-00@mail.python.org>

On Friday 19 July 2002 04:15 pm, David Abrahams wrote:
	...
> Incidentally, I'm not sure whether PEP 246 provides much help here. If the
> adaptation protocol only gives us a way to say "is this, or can this be
> adapted to be a re-iterable sequence", something could easily answer:

Yes: that's all PEP 246 provides -- a unified way to express a request for
adaptation of an object to a protocol, with the ability for the object's type,
the protocol, AND a registry of installable adapters, to have a say about it
(the registry is not well explained in the PEP as it stands, it's part of what
I have to clarify when I rewrite it -- but my rewrite won't change what's
being discussed in your quoted paragraph and the start of this one).

>     [ x for x in y ]

or more concisely and speedily list(y).

> Which would produce a re-iterable sequence, but might also destroy the
> source. Of course, I'll say up front I've only skimmed the PEP and might've
> missed something crucial.

PEP 246 cannot in any way impede "something" (or more likely "somebody") from
writing inappropriate or totally incorrect code, nor will it even try.  Maybe 
I'm missing your point...?


Alex



From aahz@pythoncraft.com  Fri Jul 19 15:23:49 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 10:23:49 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
References: <200207181422.g6IEMBr14526@odiug.zope.com> <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
Message-ID: <20020719142349.GA9051@panix.com>

On Fri, Jul 19, 2002, Ka-Ping Yee wrote:
>
> I believe this is where the biggest debate lies: whether "for" should be
> non-destructive.  I realize we are currently on the other side of the
> fence, but i foresee enough potential pain that i would like you to
> consider the value of keeping "for" loops non-destructive.

Consider

    for line in f.readlines():

in any version of Python.  Adding iterators made this more convenient
and efficient, but I just can't see your POV in the general case.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aleax@aleax.it  Fri Jul 19 15:39:11 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 16:39:11 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020719142349.GA9051@panix.com>
References: <200207181422.g6IEMBr14526@odiug.zope.com> <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <20020719142349.GA9051@panix.com>
Message-ID: <E17VYuX-0004PB-00@mail.python.org>

On Friday 19 July 2002 04:23 pm, Aahz wrote:
> On Fri, Jul 19, 2002, Ka-Ping Yee wrote:
> > I believe this is where the biggest debate lies: whether "for" should be
> > non-destructive.  I realize we are currently on the other side of the
> > fence, but i foresee enough potential pain that i would like you to
> > consider the value of keeping "for" loops non-destructive.
>
> Consider
>
>     for line in f.readlines():
>
> in any version of Python.  Adding iterators made this more convenient
> and efficient, but I just can't see your POV in the general case.

The 'for', per se, is destroying nothing here -- the object returned by
f.readlines() is destroyed by its reference count falling to 0 after the
for, just as, say:

    for c in raw_input():

or

    x = raw_input()+raw_input()

and so forth.  I.e., any object gets destroyed if there are no more
references to it -- that's a completely different issue.  In all of these
cases, you can, if you want, just bind a name to the object as you
call the function, then use that object over and over again at will.

_Method calls_ mutating the object on which they're called is indeed
quite common, of course.  f.readlines() does mutate f's state.  But
the object it returns, as long as there are references to it, remains.
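
A small sketch of the distinction, in Python 3 syntax, using an
in-memory io.StringIO in place of a real file (an assumption made so
that the example is self-contained):

```python
import io

f = io.StringIO("one\ntwo\n")

lines = f.readlines()   # the method call mutates f: it is now at EOF
again = f.readlines()   # so a second call finds nothing left to read

# The list that was returned is an ordinary object; looping over it
# any number of times changes nothing.
first = [line for line in lines]
second = [line for line in lines]
print(again, first == second)
```

Binding a name to the result of the call is what makes it reusable;
the loop itself takes nothing away from the list.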


Alex



From fredrik@pythonware.com  Fri Jul 19 15:42:34 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 19 Jul 2002 16:42:34 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <200207181422.g6IEMBr14526@odiug.zope.com> <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <20020719142349.GA9051@panix.com>
Message-ID: <017a01c22f32$865123d0$0900a8c0@spiff>

aahz wrote:

> > I believe this is where the biggest debate lies: whether "for" should be
> > non-destructive.  I realize we are currently on the other side of the
> > fence, but i foresee enough potential pain that i would like you to
> > consider the value of keeping "for" loops non-destructive.
>
> Consider
>
>     for line in f.readlines():
>
> in any version of Python.

and?  for-in doesn't modify the object returned
by f.readlines(), and never has.

</F>




From David Abrahams" <david.abrahams@rcn.com  Fri Jul 19 15:40:17 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 19 Jul 2002 10:40:17 -0400
Subject: [Python-Dev] The iterator story
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719132351.GA40829@hishome.net> <0d3001c22f2f$5e2d2320$6501a8c0@boostconsulting.com> <E17VYmN-0002U6-00@mx07.mrf.mail.rcn.net>
Message-ID: <0d6701c22f32$a135c0c0$6501a8c0@boostconsulting.com>

From: "Alex Martelli" <aleax@aleax.it>

> PEP 246 cannot in any way impede "something" (or more likely "somebody")
> from writing inappropriate or totally incorrect code, nor will it even try.
> Maybe I'm missing your point...?

Maybe, or maybe not. I guess if the reiterable sequence adapter says
"list(x)", nobody should be using it to find out whether a thing is
reiterable. Or maybe the reiterable sequence adapter shouldn't say
"list(x)" because that's destructive -- though that begs the question of
finding out whether x is reiterable.

Maybe the PEP is just a red herring as far as the iterator problem is
concerned. As long as the language has built-in facilities like 'for' and
'in' which use iteration protocols at the core of the language,
re-iterability ought to be expressible likewise, in core language terms,
regardless of the more-extensible mechanisms of PEP 246.

whole-pile-of-maybes-ly y'rs,
dave





From barry@zope.com  Fri Jul 19 15:59:33 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 10:59:33 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <200207181422.g6IEMBr14526@odiug.zope.com>
 <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
Message-ID: <15672.10581.693016.553036@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping@zesty.ca> writes:

    KY> It's just not the way i expect for-loops to work.  Perhaps we
    KY> would need to survey people for objective data, but i feel
    KY> that most people would be surprised if

    |     for x in y: print x
    |     for x in y: print x

    KY> did not print the same thing twice, or if

As with many things Pythonic, it all depends.  Specifically, I think
it depends on the type of y.  Certainly in a pre-iterator world there
was little preventing (or encouraging?) you to write y's __getitem__()
non-destructively, so I don't see much difference if y is an iterator.

    KY> Even if it's okay for for-loops to destroy their arguments, i
    KY> still think it sets up a bad situation: we may end up with
    KY> functions manipulating sequence-like things all over, but it
    KY> becomes unclear whether they destroy their arguments or not.
    KY> It becomes possible to write a function which sometimes
    KY> destroys its argument and sometimes doesn't.  Bugs get deeper
    KY> and harder to find.

How is that different than pre-iterators with __getitem__()?

    KY> I'm assigning properties to "for" that you aren't.  I think
    KY> they are useful properties, though, and worth considering.

These aren't properties of for-loops, they are properties of the
things you're iterating (little-i) over.

-Barry



From aahz@pythoncraft.com  Fri Jul 19 16:20:29 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 11:20:29 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <017a01c22f32$865123d0$0900a8c0@spiff>
References: <200207181422.g6IEMBr14526@odiug.zope.com> <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff>
Message-ID: <20020719152029.GA18810@panix.com>

On Fri, Jul 19, 2002, Fredrik Lundh wrote:
> aahz wrote:
>>Ping: 
>>>
>>> I believe this is where the biggest debate lies: whether "for" should be
>>> non-destructive.  I realize we are currently on the other side of the
>>> fence, but i foresee enough potential pain that i would like you to
>>> consider the value of keeping "for" loops non-destructive.
>>
>> Consider
>> 
>>     for line in f.readlines():
>> 
>> in any version of Python.
> 
> and?  for-in doesn't modify the object returned
> by f.readlines(), and never has.

While technically true, that seems to be sidestepping the point from my
POV.  I think that few people see for loops as inherently
non-destructive due to the use case I presented above.  Beyond that, the
for loop is itself inherently mutating in Python older than 2.2, which I
see as functionally equivalent to "destructive"; the primary intention
of iterators (from my recollections of the tenor of the discussions) was
to package that mutating state in a way that could capture the
iterability of objects other than sequences.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From Paul.Moore@atosorigin.com  Fri Jul 19 16:28:11 2002
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Fri, 19 Jul 2002 16:28:11 +0100
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>

Ka-Ping Yee <ping@zesty.ca> writes:

> It's just not the way i expect for-loops to work.  Perhaps we
> would need to survey people for objective data, but i feel
> that most people would be surprised if
>
>     for x in y: print x
>     for x in y: print x
>
> did not print the same thing twice, or if

Overall, I think I would say "it depends". Barry pointed out that it depends
on the type of y. That's what I mean, although my intuition isn't quite that
specific by itself.

By the way, not all languages that I am aware of even have "for ... in"
constructs. Perl does, and Visual Basic does. C and C++ don't. In Perl, "for
$x (<>)" or whatever magic line noise Perl uses, does the same as Python's
"for line in f", so the same non-repeatable for issue exists there (at least
for files, and I *bet* you can do nasty things with tied variables to have
it happen elsewhere, too). Even in Visual Basic, "for each x in obj" can in
theory do anything (depending on the type of obj), much like Python.

So I think that existing practice goes against your expectation.

There *is* an issue of some sort with being able to find out whether a given
object offers reproducible for behaviour in the way you describe above. The
problem is determining real-world cases where knowing is useful. There are a
lot of theoretical issues here, but few simple, comprehensible, practical
use cases.

FWIW,

- I'm +1 for renaming next() to __next__().
- I'm +0 on dropping the requirements that iterators *must*
  implement __iter__() (as per your description of the 2
  orthogonal proposals). I'd like to see iterators strongly
  advised to implement __iter__() as returning self (and
  all built in ones doing so), but not have it mandated.
- I'm -1 on your for...from syntax.
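
For reference, a minimal sketch of an iterator that follows the
recommendation in the second point, with __iter__() returning self.  It
is written in Python 3 syntax, where the method is in fact spelled
__next__(); the Countdown class is invented for the example.

```python
class Countdown:
    """Hypothetical iterator counting down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The iterator is its own iterable, so "for" accepts it directly.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


c = Countdown(3)
first = list(c)    # [3, 2, 1]
second = list(c)   # []: the same object is already exhausted
print(first, second)
```

Returning self is what lets an iterator be used in a for loop, and it
is also why a second loop over the same object sees nothing.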

Hope this helps,
Paul.



From barry@zope.com  Fri Jul 19 16:36:45 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 11:36:45 -0400
Subject: [Python-Dev] The iterator story
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
Message-ID: <15672.12813.512623.968270@anthem.wooz.org>

Nice write-up Ka-Ping.  Maybe you need to transform this into a PEP
called Iterators.next()

1/2 :)

-Barry



From jafo-python-dev@tummy.com  Fri Jul 19 16:43:03 2002
From: jafo-python-dev@tummy.com (Sean Reifschneider)
Date: Fri, 19 Jul 2002 09:43:03 -0600
Subject: [Python-Dev] Judy for replacing internal dictionaries?
Message-ID: <20020719094303.B24220@tummy.com>

Recently at a Hacking Society meeting someone was working on packaging Judy
for Debian.  Apparently, Judy is a data-structure designed by some
researchers at Hewlett-Packard.  Its goal is to be a very fast
implementation of an associative array or (possibly sparse) integer indexed
array.

Judy has recently been released under the LGPL.

After reading the FAQ and 10 minute introduction, I started wondering about
whether it could improve the overall performance of Python by replacing
dictionaries used for namespaces, classes, etc...  Since then, I've
realized that I probably won't have time to do the implementation any time
soon, and Evelyn urged me to bring it up here.

I realize that Python's dictionaries are fairly well optimized.  It sounds
like Judy may be even faster though.  It apparently works fairly hard at
reducing L2 cache misses, for example.

Some URLs:
   Judy FAQ:
      http://atwnt909.external.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,1949,00.html

   Judy 10 minute introduction:
      http://atwnt909.external.hp.com/dspp/ddl/ddl_Download_File_TRX/1,1249,702,00.pdf

   SourceForge Project Page:
      http://sourceforge.net/projects/judy/

Sean
-- 
 YOU ARE WITNESSING A FRONT THREE-QUARTER VIEW OF TWO ADULTS SHARING A
 TENDER MOMENT.  -- Gordon Cole, _Twin_Peaks_
Sean Reifschneider, Inimitably Superfluous <jafo@tummy.com>
tummy.com - Linux Consulting since 1995. Qmail, KRUD, Firewalls, Python



From fredrik@pythonware.com  Fri Jul 19 17:07:21 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 19 Jul 2002 18:07:21 +0200
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <200207181422.g6IEMBr14526@odiug.zope.com> <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff> <20020719152029.GA18810@panix.com>
Message-ID: <001b01c22f3e$5e25ab40$0900a8c0@spiff>

aahz wrote:

> While technically true, that seems to be sidestepping the point from my
> POV.

really?  are you arguing that when Ping says that for-in shouldn't
destroy the target, he's really saying that python shouldn't allow
methods to have side effects if they can be called from an
expression used in a for-in statement?  why would he say that?

> I think that few people see for loops as inherently non-destructive
> due to the use case I presented above.

I think most people can tell the difference between an object and
a method with side-effects.  I doubt they would be able to get much
done in Python if they couldn't.

> Beyond that, the for loop is itself inherently mutating in Python
> older than 2.2

in what sense?  it calls the object's __getitem__ method with an
integer index value, until it gets an IndexError.  in what way is that
"inherently mutating"?
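
A minimal sketch of that pre-2.2 protocol, written for a current Python
for illustration only.  The names Squares and legacy_for are hypothetical
and not from the thread; the point is just that the index lives in the
loop, and the object is only asked for items:

```python
class Squares:
    """A sequence-like class with only __getitem__ (no __iter__)."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index


def legacy_for(obj):
    """Hypothetical helper emulating the old for-loop protocol."""
    collected = []
    index = 0  # the loop's own counter; obj itself is never modified
    while True:
        try:
            item = obj[index]
        except IndexError:
            break  # IndexError signals the end of the sequence
        collected.append(item)
        index += 1
    return collected


squares = Squares(4)
first_pass = legacy_for(squares)
second_pass = legacy_for(squares)    # nothing was consumed by the first pass
builtin_pass = [x for x in squares]  # the real loop falls back to __getitem__
```

Both passes see the same items, because the only thing that changes is the
loop's private index.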

</F>




From martin@v.loewis.de  Fri Jul 19 17:09:40 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 19 Jul 2002 18:09:40 +0200
Subject: [Python-Dev] staticforward
In-Reply-To: <3D381714.7040606@lemburg.com>
References: <E17UrhI-0004e8-00@usw-pr-cvs1.sourceforge.net>
 <3D35A188.20407@lemburg.com>
 <15669.47553.15097.651868@slothrop.zope.com>
 <3D35D466.5090903@lemburg.com>
 <200207172045.g6HKjBg13729@odiug.zope.com>
 <3D35DA67.8060206@lemburg.com> <3D35DBB9.9000103@lemburg.com>
 <15670.62611.943840.954629@slothrop.zope.com>
 <3D371361.7050908@lemburg.com>
 <15671.6078.577033.943393@slothrop.zope.com>
 <3D372E1D.50009@lemburg.com>
 <15671.12313.725886.680036@slothrop.zope.com>
 <3D373573.8070001@lemburg.com>
 <15671.15105.563068.700997@slothrop.zope.com>
 <3D3741D4.8020408@lemburg.com>
 <15671.16894.185299.672286@slothrop.zope.com>
 <3D37CE76.4020803@lemburg.com>
 <200207191259.g6JCxGp24808@pcp02138704pcs.reston01.va.comcast.net>
 <3D381714.7040606@lemburg.com>
Message-ID: <m3vg7bem2j.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> """
> Albert Chin-A-Young wrote on 2002-05-04:
>  > >
>  > > The AIX xlc ANSI compiler does not allow forward declaration of
>  > > variables. This leads to a lot of problems with .c files that use
>  > > staticforward (e.g. mxDateTime.c, mxProxy.c, etc.). Any chance of
>  > > fixing these?
> """
> 
> I'm not making this up.

Yes, but the user might be. I don't believe this statement is
factually correct - the compiler most certainly does allow forward
declaration of variables.

Also, such a statement is of little value unless associated with an
operating system release number (or better a compiler release number).

This conversation snippet indicates that the problem has not been
fully understood (at least by Albert Chin-A-Young); solving an
incompletely-understood problem is a recipe for disasters, when it
comes to portability.

Regards,
Martin



From pinard@iro.umontreal.ca  Fri Jul 19 17:02:10 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 19 Jul 2002 12:02:10 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
Message-ID: <oqu1mv7lkt.fsf@carouge.sram.qc.ca>

[Moore, Paul]

> - I'm +0 on dropping the requirement that iterators *must*
>   implement __iter__() (as per your description of the 2
>   orthogonal proposals).

In Ka-Ping's letter, I did not read that the proposals were orthogonal.
__iter__ would not be required anymore to identify an iterator as such,
because __next__ would be sufficient, alone, for this purpose.  That would
have the effect of cleaning up the iterator protocol from the double
constraint it currently has, and probably makes things clearer as well.

> I'd like to see iterators strongly advised to implement __iter__() as
> returning self

Strong advice should not be merely given "ex cathedra", there should be
some kind of (convincing) justification behind it.  It makes sense for
generators at least, so they could be used in a few places where Python
expects containers to provide their iterator.

The justification is fuzzier outside generators, especially when
programmers do not see the need to obtain an iterator from itself;
the usual and only case I see right now is resuming an iterator which has
not been fully consumed.

Ka-Ping also stresses, indirectly, that `element in iterator' (resuming
an iterator instead of obtaining a new one from a container) could have
a strange meaning, and might even represent a user error.  I even wonder
if it would not be wise to have iterators _not_ defining an __iter__ method!
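
Both cases can be sketched briefly, assuming a list iterator as the
example (the variable names are illustrative only).  Resuming works
because iter(it) hands back the iterator itself, and that same property
is what makes `element in iterator' consume items as a side effect:

```python
items = [1, 2, 3, 4, 5]

# Resuming: the second loop calls iter(it), gets `it` back, and continues.
it = iter(items)
head = []
for x in it:
    head.append(x)
    if x == 2:
        break
tail = [x for x in it]  # picks up where the first loop stopped

# Membership on an iterator scans forward and consumes what it scans.
it2 = iter(items)
found = 3 in it2        # advances it2 past 1, 2 and 3
leftover = list(it2)    # only the items after the match remain

# The same test on the container is repeatable and leaves it untouched.
found_in_list = 3 in items
```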

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




From guido@python.org  Fri Jul 19 17:30:43 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 12:30:43 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 12:02:10 EDT."
 <oqu1mv7lkt.fsf@carouge.sram.qc.ca>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>
 <oqu1mv7lkt.fsf@carouge.sram.qc.ca>
Message-ID: <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net>

> In Ka-Ping's letter, I did not read that the proposals were orthogonal.
> __iter__ would not be required anymore to identify an iterator as such,
> because __next__ would be sufficient, alone, for this purpose.  That would
> have the effect of cleaning up the iterator protocol from the double
> constraint it currently has, and probably makes things clearer as well.

I think there's been some confusion.  I never intended the test for
"is this an iterator" to be "does it have a next() and an __iter__()
method".  I *do* strongly advise iterators to define __iter__(), but
only because I expect that "for x in iterator:" is useful in  iterator
algebra functions and the like.

In fact, I don't really think that Python currently has foolproof ways
to test for *any* kind of abstract protocol.  Questions like "Is x a
mapping" or "is x a sequence" are equally impossible to answer.

The recommended approach is simply to go ahead and use something; if
it doesn't obey the protocol, it will fail.  Of course, you should
*document* the requirements (e.g., "argument x should be a sequence"),
but I've always considered it a case of LBYL syndrome if code wants to
check first.  Note that you can't write code that does something
different for a sequence than for a mapping; for example, the
following class could be either:

  class C:
      def __getitem__(self, i): return i

I realize that this won't make David Abrahams and his Boost users
happy, but that's how Python has approached this issue since its
inception.
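
To make the ambiguity of class C concrete, here is a small sketch run
against a current Python.  The use of itertools.islice is an addition
for illustration: since C never raises IndexError, iterating over it
would otherwise not terminate.

```python
from itertools import islice


class C:
    def __getitem__(self, i):
        return i


c = C()
as_sequence = c[0]      # behaves like a sequence indexed by an integer
as_mapping = c["spam"]  # behaves equally well like a mapping keyed by a string
as_slice = c[1:3]       # slicing simply passes a slice object through

# Iteration falls back to __getitem__ with 0, 1, 2, ...; C never raises
# IndexError, so only the first three items are taken here.
first_three = list(islice(c, 3))
```

Nothing about C itself says which of the two it is; only the caller's
usage decides.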

I'm fine with suggestions that we should really fix this; I expect
that some way to assert interfaces or protocols will eventually find
its way into the language.

But I *don't* think that the current inability to test for
iterator-ness (or iterable-ness, or multi-iteratable-ness, etc.)
should be used as an argument that there's anything wrong with the
iterator protocol.

(And I've *still* not read Ping's original message...)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Fri Jul 19 17:50:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 19 Jul 2002 12:50:47 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <20020719141658.GA7919@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAMAGAB.tim.one@comcast.net>

[Tim]
> This patch is a Good Thing, and I demand that everyone show [MarkH] more
> appreciation for it.

[Aahz]
> If I still used Windoze for anything, I would.

Then you missed the point of the patch.  My demand stands unabated.

relentlessly y'rs  - tim



From aahz@pythoncraft.com  Fri Jul 19 17:58:37 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 12:58:37 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <001b01c22f3e$5e25ab40$0900a8c0@spiff>
References: <200207181422.g6IEMBr14526@odiug.zope.com> <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <20020719142349.GA9051@panix.com> <017a01c22f32$865123d0$0900a8c0@spiff> <20020719152029.GA18810@panix.com> <001b01c22f3e$5e25ab40$0900a8c0@spiff>
Message-ID: <20020719165836.GA14402@panix.com>

On Fri, Jul 19, 2002, Fredrik Lundh wrote:
> aahz wrote:
>> 
>> While technically true, that seems to be sidestepping the point from my
>> POV.
> 
> really?  are you arguing that when Ping says that for-in shouldn't
> destroy the target, he's really saying that python shouldn't allow
> methods to have side effects if they can be called from an
> expression used in a for-in statement?  why would he say that?

I'm saying that I think Ping is overstating the case in terms of the way
people look at things.  Whatever the technicalities of an implicit
method versus an explicit method, people have long used for loops in
destructive ways.

>> I think that few people see for loops as inherently non-destructive
>> due to the use case I presented above.
> 
> I think most people can tell the difference between an object and
> a method with side-effects.  I doubt they would be able to get much
> done in Python if they couldn't.

To be sure.  But I don't think there's much difference in the way for
loops are actually used.  Continuing my point above, I see the current
usage of for loops as calling an implicit method with side-effects as
opposed to an explicit method with side-effects.  Lo and behold!  That's
actually the case.

>> Beyond that, the for loop is itself inherently mutating in Python
>> older than 2.2
> 
> in what sense?  it calls the object's __getitem__ method with an
> integer index value, until it gets an IndexError.  in what way is that
> "inherently mutating"?

And how does that integer index change?  The for loop in Python <2.2 has
an internal state object.  Iterators are the external manifestation of
that state object, generalized to objects other than sequences.  I'm
surprised that anyone is surprised that the state object gets
mutated/destroyed.  I'm also surprised that people are surprised about
what happens when that state object is coupled to an inherently mutating
object such as file objects.
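
A short sketch of that coupling, using io.StringIO from a current Python
as a stand-in for a real file (an assumption made so the example needs no
file on disk).  The iteration state is the stream position itself:

```python
import io

log = io.StringIO("first\nsecond\nthird\n")

same_object = iter(log) is log  # the stream is its own iterator

header = None
for line in log:
    header = line               # read one line, then stop early
    break

remaining = list(log)           # continues after the line already read
exhausted = list(log)           # nothing left once the end is reached

log.seek(0)                     # only an explicit rewind resets the state
rewound = list(log)
```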
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From barry@zope.com  Fri Jul 19 18:07:29 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 13:07:29 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <200207181422.g6IEMBr14526@odiug.zope.com>
 <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
 <20020719142349.GA9051@panix.com>
 <017a01c22f32$865123d0$0900a8c0@spiff>
 <20020719152029.GA18810@panix.com>
 <001b01c22f3e$5e25ab40$0900a8c0@spiff>
 <20020719165836.GA14402@panix.com>
Message-ID: <15672.18257.829735.736033@anthem.wooz.org>

>>>>> "A" == Aahz  <aahz@pythoncraft.com> writes:

    A> The for loop in Python <2.2 has an internal state object.
    A> Iterators are the external manifestation of that state object,
    A> generalized to objects other than sequences.  I'm surprised
    A> that anyone is surprised that the state object gets
    A> mutated/destroyed.  I'm also surprised that people are
    A> surprised about what happens when that state object is coupled
    A> to an inherently mutating object such as file objects.

Well said.
-Barry



From aahz@pythoncraft.com  Fri Jul 19 18:02:20 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 13:02:20 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEAMAGAB.tim.one@comcast.net>
References: <20020719141658.GA7919@panix.com> <LNBBLJKPBEHFEDALKOLCIEAMAGAB.tim.one@comcast.net>
Message-ID: <20020719170220.GB14402@panix.com>

On Fri, Jul 19, 2002, Tim Peters wrote:
>
> [Tim]
> > This patch is a Good Thing, and I demand that everyone show [MarkH] more
> > appreciation for it.
> 
> [Aahz]
> > If I still used Windoze for anything, I would.
> 
> Then you missed the point of the patch.  My demand stands unabated.

All right, then, I hereby show MarkH ill understood appreciation.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From David Abrahams" <david.abrahams@rcn.com  Fri Jul 19 18:16:27 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 19 Jul 2002 13:16:27 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com>              <oqu1mv7lkt.fsf@carouge.sram.qc.ca>  <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>


> > In Ka-Ping's letter, I did not read that the proposals were orthogonal.
> > __iter__ would not be required anymore to identify an iterator as such,
> > because __next__ would be sufficient, alone, for this purpose.  That would
> > have the effect of cleaning up the iterator protocol from the double
> > constraint it currently has, and probably makes things clearer as well.
>
> I think there's been some confusion.  I never intended the test for
> "is this an iterator" to be "does it have a next() and an __iter__()
> method".

Do you intend to have a test for "is this an iterator" at all?

> I *do* strongly advise iterators to define __iter__(), but
> only because I expect that "for x in iterator:" is useful in  iterator
> algebra functions and the like.

Makes sense.

> In fact, I don't really think that Python currently has foolproof ways
> to test for *any* kind of abstract protocol.  Questions like "Is x a
> mapping" or "is x a sequence" are equally impossible to answer.

True.

> The recommended approach is simply to go ahead and use something; if
> it doesn't obey the protocol, it will fail.  Of course, you should
> *document* the requirements (e.g., "argument x should be a sequence"),
> but I've always considered it a case of LBYL syndrome if code wants to
> check first.

If LBYL is bad, what is introspection good for?

>  Note that you can't write code that does something
> different for a sequence than for a mapping; for example, the
> following class could be either:
>
>   class C:
>       def __getitem__(self, i): return i
>
> I realize that this won't make David Abrahams and his Boost users
> happy, but that's how Python has approached this issue since its
> inception.

I understand that that's always been "the Python way". However, isn't there
also some implication that some of the special functions are more than just
a way to provide implementations of Python's syntax?  Notes in the docs
like those on __getitem__ tend to argue for that, at least by convention.
Unless I'm misinterpreting things, "the Python way" isn't quite so
one-sided where protocols are concerned.

> I'm fine with suggestions that we should really fix this; I expect
> that some way to assert interfaces or protocols will eventually find
> its way into the language.
>
> But I *don't* think that the current inability to test for
> iterator-ness (or iterable-ness, or multi-iteratable-ness, etc.)
> should be used as an argument that there's anything wrong with the
> iterator protocol.

Just for the record, I never meant to imply that it was broken, only that
I'd like to get a little more from it than I currently can.

-Dave





From paul-python@svensson.org  Fri Jul 19 18:21:26 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Fri, 19 Jul 2002 13:21:26 -0400 (EDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <20020719165836.GA14402@panix.com>
Message-ID: <Pine.LNX.4.44.0207191307440.25834-100000@familjen.svensson.org>

On Fri, 19 Jul 2002, Aahz wrote:

>And how does that integer index change?  The for loop in Python <2.2 has
>an internal state object.  Iterators are the external manifestation of
>that state object, generalized to objects other than sequences.  I'm
>surprised that anyone is surprised that the state object gets
>mutated/destroyed.  I'm also surprised that people are surprised about
>what happens when that state object is coupled to an inherently mutating
>object such as file objects.

All the surprises I see stem from confusion between what is the object
being iterated over, and what is the object holding the state of the
iteration.  Iterators returning self for __iter__() is the major cause
of this confusion.  I agree that in the general case, the boundary may
not always be clear, but Ping's proposal cleans up what's seen 99.9%
of the time.

Pending the pain of the yet unseen migration plan, I'm
+1 on removing __iter__ from all core iterators
+1 on renaming next() to __next__()
+1 on presenting file objects as iterators rather than iterables
+0 on the new 'for x from y' syntax

	/Paul




From guido@python.org  Fri Jul 19 18:23:19 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 13:23:19 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 13:16:27 EDT."
 <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <oqu1mv7lkt.fsf@carouge.sram.qc.ca> <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net>
 <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com>
Message-ID: <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>

> Do you intend to have a test for "is this an iterator" at all?

Not right now, see the rest of my email.  The best you can do is check
for a next method and hope for the best.
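
As a rough sketch of the two styles, written for a current Python where
the method is spelled __next__ rather than next().  The helper names
looks_like_iterator, Impostor and first_item are hypothetical:

```python
def looks_like_iterator(obj):
    """LBYL: check for the method and hope it means what we think."""
    return callable(getattr(obj, "__next__", None))


class Impostor:
    """Passes the check, but does not really obey the protocol."""

    def __next__(self):
        return 42  # never signals exhaustion


def first_item(obj, default=None):
    """Alternative: just use the object and deal with failure."""
    try:
        return next(obj)
    except TypeError:      # obj is not an iterator at all
        return default
    except StopIteration:  # obj is an iterator, but it is exhausted
        return default
```

The check accepts Impostor even though looping over its values would
never end, which is why it is only "the best you can do".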

> If LBYL is bad, what is introspection good for?

Ask Alex.

> I understand that that's always been "the Python way". However,
> isn't there also some implication that some of the special functions
> are more than just a way to provide implementations of Python's
> syntax?

Like what?

> Notes in the docs like those on __getitem__ tend to argue
> for that, at least by convention.  Unless I'm misinterpreting
> things, "the Python way" isn't quite so one-sided where protocols
> are concerned.

Can you quote specific places in the docs you read this way?  I don't
see it, but I've only scanned chapter 3 of the Language Reference
Manual.

> Just for the record, I never meant to imply that it was broken, only
> that I'd like to get a little more from it than I currently can.

Maybe I should read Ping's email.  From the discussion I figured he
was arguing this way.  I think you have to settle with what I proposed
at the top.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 19 18:32:19 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 13:32:19 -0400
Subject: [Python-Dev] Where's time.daylight???
In-Reply-To: Your message of "Fri, 19 Jul 2002 13:13:40 EDT."
 <15672.18628.831787.897474@anthem.wooz.org>
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net>
 <15672.18628.831787.897474@anthem.wooz.org>
Message-ID: <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>

[Barry, in python-checkins]
> I've noticed one breakage already I believe.  On my systems (RH6.1 and
> RH7.3) time.daylight has disappeared.
> 
> I don't think test_time.py actually tests this parameter, but
> test_email.py which is what's failing for me:
[...]

Yup, time.daylight has disappeared.  But the bizarre thing is that if
I roll back to rev. 1.129, it's *still* gone!  Even rev 1.128 still
doesn't fix this.  I wonder if something in configure changed???

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aleax@aleax.it  Fri Jul 19 18:35:53 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 19:35:53 +0200
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E17Vbfa-0001NS-00@mail.python.org>

On Friday 19 July 2002 07:23 pm, Guido van Rossum wrote:
> > Do you intend to have a test for "is this an iterator" at all?
>
> Not right now, see the rest of my email.  The best you can do is check
> for a next method and hope for the best.
>
> > If LBYL is bad, what is introspection good for?
>
> Ask Alex.

Introspection is good when you need to dispatch in a way that is
not supported by the language you're using.  In Python (and most
other languages), this mostly means multiple dispatch -- you don't
get it from the language, therefore, on the non-frequent occasions
when you NEED it, you have to kludge it up.  Very similar to
multiple inheritance in languages that don't support THAT, really.

(Particularly in how people who've never used multiple X don't
really understand that it buys you anything -- try interesting a
dyed-in-the-wool Smalltalker in multiple inheritance, or anybody
*but* a CLOS-head or Dylan-head in multiple dispatch...:-).
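
A minimal sketch of such a kludge, assuming a registry keyed on pairs of
types and a lookup that walks both arguments' class hierarchies.  Every
name here (register, collide, Ship, Asteroid, Freighter) is made up for
illustration:

```python
_registry = {}


def register(type_a, type_b):
    """Decorator: record an implementation for one pair of argument types."""
    def decorator(func):
        _registry[(type_a, type_b)] = func
        return func
    return decorator


def collide(a, b):
    """Dispatch on the types of BOTH arguments by inspecting them."""
    for cls_a in type(a).__mro__:
        for cls_b in type(b).__mro__:
            func = _registry.get((cls_a, cls_b))
            if func is not None:
                return func(a, b)
    raise TypeError("no collision rule for %s and %s"
                    % (type(a).__name__, type(b).__name__))


class Ship:
    pass


class Asteroid:
    pass


class Freighter(Ship):
    pass


@register(Ship, Asteroid)
def ship_hits_asteroid(a, b):
    return "ship hits asteroid"


@register(Asteroid, Ship)
def asteroid_hits_ship(a, b):
    return "asteroid hits ship"
```

Here collide(Freighter(), Asteroid()) finds the Ship rule through the
class hierarchy, which the language's single dispatch on the first
argument would not do for the second one.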

Other aspects of introspection help you implement other primitives
lacking in the language.  E.g. "make another like myself but not
initialized" can be self.__class__.__new__(self.__class__) -- not
the most elegant expression, but, hey, I've seen worse (such as
NOT being able to express it at all, in languages lacking the
needed ability to introspect:-).
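
For instance, a small sketch with hypothetical classes Point and Point3D
(not from the thread), where the method works unchanged in a subclass
because it asks the instance for its own class:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def blank_like_me(self):
        # Same class as self, but __init__ is deliberately not run.
        return self.__class__.__new__(self.__class__)


class Point3D(Point):
    def __init__(self, x, y, z):
        Point.__init__(self, x, y)
        self.z = z


p = Point3D(1, 2, 3)
q = p.blank_like_me()  # a Point3D with no attributes set yet
```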

Looking at *ANOTHER* object this way isn't really INTROspection,
btw -- it's EXTRAspection, by the Latin roots of these words:-).



Alex



From tim.one@comcast.net  Fri Jul 19 18:36:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 19 Jul 2002 13:36:47 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <20020719170220.GB14402@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEBDAGAB.tim.one@comcast.net>

[Aahz]
> All right, then, I hereby show MarkH ill understood appreciation.

Excellent!  One down, about two hundred thousand to go.



From David Abrahams" <david.abrahams@rcn.com  Fri Jul 19 18:41:24 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 19 Jul 2002 13:41:24 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <oqu1mv7lkt.fsf@carouge.sram.qc.ca> <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net>              <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com>  <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <0e4201c22f4b$840d44f0$6501a8c0@boostconsulting.com>

From: "Guido van Rossum" <guido@python.org>

> > Do you intend to have a test for "is this an iterator" at all?
>
> Not right now, see the rest of my email.  The best you can do is check
> for a next method and hope for the best.

I only asked because the rest of your email seemed to imply that you didn't
believe in such checks at this time, while the sentence above my question
seemed to imply there is/should be such a test. Thanks for clarifying.

> > If LBYL is bad, what is introspection good for?
>
> Ask Alex.

OK. Alex, what's introspection good for?

> > I understand that that's always been "the Python way". However,
> > isn't there also some implication that some of the special functions
> > are more than just a way to provide implementations of Python's
> > syntax?
>
> Like what?
>
> > Notes in the docs like those on __getitem__ tend to argue
> > for that, at least by convention.  Unless I'm misinterpreting
> > things, "the Python way" isn't quite so one-sided where protocols
> > are concerned.
>
> Can you quote specific places in the docs you read this way?

Just for example:

__getitem__:

"For sequence types, the accepted keys should be integers and slice
objects.  .... If key is of an inappropriate type, TypeError may be raised;
if of a value outside the set of indexes for the sequence (after any
special interpretation of negative values), IndexError should be raised.
Note: for loops expect that an IndexError will be raised for illegal
indexes to allow proper detection of the end of the sequence. "


__delitem__:

"Same note as for __getitem__(). This should only be implemented for
mappings if the objects support removal of keys, or for sequences if
elements can be removed from the sequence. The same exceptions should be
raised for improper key values as for the __getitem__() method."

__iter__:

"This method should return a new iterator object that can iterate over all
the objects in the container. For mappings, it should iterate over the keys
of the container, and should also be made available as the method
iterkeys()."

The way I read these, the behavior of an implementation of these functions
isn't really open-ended. It ought to follow certain conventions, if you
want your type to behave sensibly. And that's about as strong as any
legislation I've seen anywhere in the Python docs.

> I don't
> see it, but I've only scanned chapter 3 of the Language Reference
> Manual.
>
> > Just for the record, I never meant to imply that it was broken, only
> > that I'd like to get a little more from it than I currently can.
>
> Maybe I should read Ping's email.  From the discussion I figured he
> was arguing this way.  I think you have to settle with what I proposed
> at the top.

Of course I do; I never expected otherwise. Like most of my other
suggestions, this is a case of "OK, whatever you say Guido... but as long
as people are interested in discussing the issues I'd like them to
understand my reasons for bringing it up".

-Dave






From David Abrahams" <david.abrahams@rcn.com  Fri Jul 19 18:45:11 2002
From: David Abrahams" <david.abrahams@rcn.com (David Abrahams)
Date: Fri, 19 Jul 2002 13:45:11 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net> <E17VbfX-0002Zk-00@mx07.mrf.mail.rcn.net>
Message-ID: <0e5201c22f4c$37c62e30$6501a8c0@boostconsulting.com>

From: "Alex Martelli" <aleax@aleax.it>

> Introspection is good when you need to dispatch in a way that is
> not supported by the language you're using.  In Python (and most
> other languages), this mostly means multiple dispatch -- you don't
> get it from the language, therefore, on the non-frequent occasions
> when you NEED it, you have to kludge it up.  Very similar to
> multiple inheritance in languages that don't support THAT, really.
>
> (Particularly in how people who've never used multiple X don't
> really understand that it buys you anything -- try interesting a
> dyed-in-the-wool Smalltalker in multiple inheritance, or anybody
> *but* a CLOS-head or Dylan-head in multiple dispatch...:-).

Ahem. *I'm* interested in multiple-dispatch (never used CLOS or Dylan). You
might not have noticed that I mentioned multimethods in my post about
supporting overloading in Boost.Python.

> Other aspects of introspection help you implement other primitives
> lacking in the language.  E.g. "make another like myself but not
> initialized" can be self.__class__.__new__(self.__class__) -- not
> the most elegant expression, but, hey, I've seen worse (such as
> NOT being able to express it at all, in languages lacking the
> needed ability to introspect:-).

Is that really introspection? It doesn't seem to ask a question.

> Looking at *ANOTHER* object this way isn't really INTROspection,
> btw -- it's EXTRAspection, by the Latin roots of these words:-).

Okay. I hope you won't be offended if I continue to use the wrong term so
that everyone else can understand me ;-)

-Dave





From tim.one@comcast.net  Fri Jul 19 18:50:19 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 19 Jul 2002 13:50:19 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBFAGAB.tim.one@comcast.net>

[Ping]
> ...
> I believe this is where the biggest debate lies: whether "for" should be
> non-destructive.  I realize we are currently on the other side of the
> fence, but i foresee enough potential pain that i would like you to
> consider the value of keeping "for" loops non-destructive.

I'm having a hard time getting excited about this.  If you had made this
argument before the iterator protocol was implemented, it may have been more
or less intriguing.  But it was implemented and released some time ago, and
I just haven't seen any evidence of such problems on c.l.py, the Help list,
or the Tutor list (all of which I still pay significant attention to).

"for" did and does work in accord with a simple protocol, and whether that's
"destructive" depends on how the specific objects involved implement their
pieces of the protocol, not on the protocol itself.  The same is true of all
of Python's hookable protocols.  What's so special about "for" that it
should pretend to deliver purely functional behavior in a highly
non-functional language?  State mutates.  That's its purpose <wink>.
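
A small sketch of that point: the same two loops run over three targets,
and only the targets differ.  The names loop_twice and Draining are
hypothetical, introduced here for illustration:

```python
def loop_twice(target):
    """Run two plain for loops over the same target."""
    first, second = [], []
    for x in target:
        first.append(x)
    for x in target:
        second.append(x)
    return first, second


class Draining:
    """A container whose __iter__ removes items as it yields them."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        while self.items:
            yield self.items.pop(0)


from_list = loop_twice([1, 2, 3])            # a list hands out a fresh iterator each time
from_iterator = loop_twice(iter([1, 2, 3]))  # an iterator is used up by the first loop
drain = Draining([1, 2, 3])
from_draining = loop_twice(drain)            # the container itself is emptied
```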




From aahz@pythoncraft.com  Fri Jul 19 18:54:56 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 13:54:56 -0400 (EDT)
Subject: [Python-Dev] CANCEL: OSCON Community dinner Weds 7/24 6pm
References: <agv3gq$9ao$1@panix1.panix.com>
Message-ID: <200207191754.g6JHsuV00747@panix1.panix.com>

Given the lack of response, I'm hereby canceling any official Python
community dinner.  I hope to see many of you at the conference, though.
I'm including the original message below in case someone else wants to
run with the ball.

In article <agv3gq$9ao$1@panix1.panix.com>, Aahz <aahz@pythoncraft.com> wrote:
>[posted to c.l.py with cc to c.l.py.announce and python-dev]
>
>I'm proposing a Python community dinner at OSCON next week, for Weds
>7/24 at 6pm.  Is there anyone familiar with the San Diego area who wants
>to suggest a location near the Sheraton?  If I don't get any
>recommendations, we'll probably just have the dinner at the Sheraton.
>
>If you're interested, please send me an e-mail so I have some idea of
>the number of people.  Also, please include a way of getting in touch
>with you at OSCON in case plans change (phone numbers accepted, but
>e-mail addresses preferred).
>
>(There's a meeting for PSF members at 8pm, so some of us will likely
>have to skip out early.)


-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Fri Jul 19 19:08:57 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 14:08:57 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 13:50:19 EDT."
 <LNBBLJKPBEHFEDALKOLCEEBFAGAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEBFAGAB.tim.one@comcast.net>
Message-ID: <200207191808.g6JI8wE28214@pcp02138704pcs.reston01.va.comcast.net>

> I'm having a hard time getting excited about this.  If you had made
> this argument before the iterator protocol was implemented, it may
> have been more or less intriguing.  But it was implemented and
> released some time ago, and I just haven't seen any evidence of such
> problems on c.l.py, the Help list, or the Tutor list (all of which I
> still pay significant attention to).

This is an important argument IMO that the theorists here seem to be
missing somewhat.  Releasing a feature and monitoring feedback is a
good way of user testing, something that has been ignored too often by
language designers.  Elegant or minimal abstractions have their place;
but in the end, users are more important.

Quoting Steven Pemberton's home page (http://www.cwi.nl/~steven/):

    ABC: A Simple but Powerful Interactive Programming Language and
    Environment. We did requirements and task analysis,
    iterative design, and user testing. You'd almost think programming
    languages were an interface between people and computers. Now
    famous because Python was strongly influenced by it.

I still favor this approach to language design.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 19 19:15:45 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 14:15:45 -0400
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 13:41:24 EDT."
 <0e4201c22f4b$840d44f0$6501a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <oqu1mv7lkt.fsf@carouge.sram.qc.ca> <200207191630.g6JGUh626683@pcp02138704pcs.reston01.va.comcast.net> <0e2201c22f48$05ff1870$6501a8c0@boostconsulting.com> <200207191723.g6JHNJf27635@pcp02138704pcs.reston01.va.comcast.net>
 <0e4201c22f4b$840d44f0$6501a8c0@boostconsulting.com>
Message-ID: <200207191815.g6JIFja28258@pcp02138704pcs.reston01.va.comcast.net>

> The way I read these, the behavior of an implementation of these
> functions isn't really open-ended. It ought to follow certain
> conventions, if you want your type to behave sensibly. And that's
> about as strong as any legislation I've seen anywhere in the Python
> docs.

Note the qualification: "if you want your type to behave sensibly".
You can interpret the paragraphs you quoted as explaining what makes a
good sequence or mapping.  IOW they hint at some of the invariants of
those protocols.  But I wouldn't call this legislation.

> Of course I do; I never expected otherwise. Like most of my other
> suggestions, this is a case of "OK, whatever you say Guido... but as
> long as people are interested in discussing the issues I'd like them
> to understand my reasons for bringing it up".

Maybe I should just tune out of this discussion if it's only of
theoretical importance?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From trentm@ActiveState.com  Fri Jul 19 19:26:02 2002
From: trentm@ActiveState.com (Trent Mick)
Date: Fri, 19 Jul 2002 11:26:02 -0700
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEBDAGAB.tim.one@comcast.net>; from tim.one@comcast.net on Fri, Jul 19, 2002 at 01:36:47PM -0400
References: <20020719170220.GB14402@panix.com> <LNBBLJKPBEHFEDALKOLCGEBDAGAB.tim.one@comcast.net>
Message-ID: <20020719112602.A17763@ActiveState.com>

[Tim Peters wrote]
> Excellent!  One down, about two hundred thousand to go.

Mark rocks!

1,999,999-ly,
Trent


-- 
Trent Mick
TrentM@ActiveState.com



From aahz@pythoncraft.com  Fri Jul 19 19:29:22 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 14:29:22 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207191307440.25834-100000@familjen.svensson.org>
References: <20020719165836.GA14402@panix.com> <Pine.LNX.4.44.0207191307440.25834-100000@familjen.svensson.org>
Message-ID: <20020719182922.GA9585@panix.com>

On Fri, Jul 19, 2002, Paul Svensson wrote:
>
> Pending the pain of the yet unseen migration plan, I'm
> +1 on removing __iter__ from all core iterators
> +1 on renaming next() to __next__()
> +1 on presenting file objects as iterators rather than iterables
> +0 on the new 'for x from y' syntax

I'd vote this way:

-0 on removing __iter__
+1 on renaming next() to __next__()
+0 on presenting file objects as iterators
+1 on finishing up the patch that fixes the xreadlines() mess
-1 on for x from y
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From aahz@pythoncraft.com  Fri Jul 19 19:30:30 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 14:30:30 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <20020719112602.A17763@ActiveState.com>
References: <20020719170220.GB14402@panix.com> <LNBBLJKPBEHFEDALKOLCGEBDAGAB.tim.one@comcast.net> <20020719112602.A17763@ActiveState.com>
Message-ID: <20020719183029.GB9585@panix.com>

On Fri, Jul 19, 2002, Trent Mick wrote:
> [Tim Peters wrote]
>>
>> Excellent!  One down, about two hundred thousand to go.
> 
> Mark rocks!
> 
> 1,999,999-ly,

Next up: MarkH writes a patch to fix Trent's arithmetic.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tim.one@comcast.net  Fri Jul 19 19:39:38 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 19 Jul 2002 14:39:38 -0400
Subject: [Python-Dev] Judy for replacing internal dictionaries?
In-Reply-To: <20020719094303.B24220@tummy.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEBLAGAB.tim.one@comcast.net>

[Sean Reifschneider]
> Recently at a Hacking Society meeting someone was working on
> packaging Judy for Debian.  Apparently, Judy is a data-structure
> designed by some researchers at Hewlett-Packard.  Its goal is to
> be a very fast implementation of an associative array or
> (possibly sparse) integer indexed array.
>
> Judy has recently been released under the LGPL.
>
> After reading the FAQ and 10 minute introduction, I started wondering
> about whether it could improve the overall performance of Python by
> replacing dictionaries used for namespaces, classes, etc...

Sorry, almost certainly not.  In a typical Python namespace lookup, the pure
overheads of calling and returning from the lookup function cost more than
doing the lookup.  Python dicts are more optimized for this use than you
realize.  Judy looks like it would be faster than Python dicts for large
mappings, though (and given the boggling complexity of Judy's data
structures, it damn well better be <wink>).

As a general replacement for Python dicts, it wouldn't fly because it
requires a total ordering on keys, and an ordering explicitly given by
bitstrings, not implicitly via calls to an opaque ordering function.

Looks like it may be an excellent alternative to in-memory B-Trees keyed by
manifest bitstrings (like ints and character strings or even addresses).




From nas@python.ca  Fri Jul 19 20:00:43 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 19 Jul 2002 12:00:43 -0700
Subject: [Python-Dev] The iterator story
In-Reply-To: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>; from ping@zesty.ca on Fri, Jul 19, 2002 at 04:28:32AM -0700
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
Message-ID: <20020719120043.A21503@glacier.arctrix.com>

Ka-Ping Yee wrote:
>     I think "for" should be non-destructive because that's the way
>     it has almost always behaved, and that's the way it behaves in
>     any other language [@] i can think of.

I agree that it can be surprising to have "for" destroy the object it's
looping over.  I myself was bitten once by it.  I'm not yet sure if this
is something that will repeatedly bite.  I suspect it might. :-(

>     And as things stand, the presence of __iter__ doesn't even work [@]
>     as a type flag.

__iter__ is not a flag.  When you want to loop over an object you call
__iter__ to get an iterator.  Since you should be able to loop over all
iterators they should provide a __iter__ that returns self.
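
A rough sketch of that convention, written in present-day Python 3
spelling (the method is spelled __next__ there, where this thread's
Python uses next()).  Lines and LinesIterator are hypothetical names
invented for the example.

```python
class Lines:
    """Hypothetical container: __iter__ hands out a new iterator each time."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return LinesIterator(self._items)


class LinesIterator:
    """Hypothetical iterator: __iter__ returns self so "for" accepts it too."""

    def __init__(self, items):
        self._items = items
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        value = self._items[self._pos]
        self._pos += 1
        return value


lines = Lines(["a", "b"])
it = iter(lines)
same = iter(it) is it            # asking an iterator for its iterator
from_iterator = [x for x in it]  # "for" works directly on the iterator
from_container = [x for x in lines]
```

Because the iterator returns itself from __iter__, code that receives
either the container or one of its iterators can loop over it the same
way.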

>     Now suppose we agree that __iter__ and next are distinct protocols.

I suppose you can call them distinct but they both pertain to iteration.
One gets the iterator, the other uses it.

>     Then why require iterators to support both?  The only reason we
>     would want __iter__ on iterators is so that we can use "for" [@]
>     with an iterator as the second operand.

Isn't that a good reason?  It's not just "for" though.  Anytime you have
an object that you want to loop over you should call iter() to get an
iterator and then call .next() on that iterator.
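
As an illustration only, here is how a function other than "for" might
follow that recipe.  The function name largest() is made up for this
sketch, and it uses the Python 3 builtin next() in place of the .next()
method discussed in this thread.

```python
def largest(iterable):
    """Return the biggest item, driving the iteration protocol by hand."""
    it = iter(iterable)          # works for containers and iterators alike
    try:
        best = next(it)
    except StopIteration:
        raise ValueError("largest() needs at least one item") from None
    while True:
        try:
            item = next(it)
        except StopIteration:
            return best          # the iterator is exhausted
        if item > best:
            best = item
```

Since iterators return themselves from __iter__, largest([3, 9, 2]) and
largest(iter([3, 9, 2])) take the same code path.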

>     I think the potential for collision, though small, is significant,
>     and this makes "__next__" a better choice than "next".

When this issue originally came up, my position was that double
underscores should be used only if there is a risk of namespace
collision.  The fact that the method was stored on a type slot is
irrelevant.  If objects implement iterators as a separate, specialized
object there wouldn't be any namespace collisions.  Now it looks like
people want to have iterators that also do other things.  In that case,
__next__ would have been a better choice.

>     The connection between this issue and the __iter__ issue is that,
>     if next() were renamed to __next__(), the argument that __iter__
>     is needed as a flag would also go away.

Sorry, I don't see the connection.  __iter__ is not a flag.  How does
renaming next() help?

> In my ideal world, we would allow a new form of "for", such as
> 
>     for line from file:
>         print line

Nice syntax but I think it creates other problems.  Basically, you are
saying that iterators should not implement __iter__ and we should have
some other way of looping over them (in order to make it clear that they
are being mutated).  

First, people could implement __iter__ such that it returns an iterator
that mutates the original object (e.g. a file object __iter__ that
returns xreadlines).

Second, it will be confusing to have two different ways of looping over
things.  Imagine a library with this bit of code:

    for item in sequence:
        do something

Now I want to use this library but I have an iterator, not something
that implements __iter__.  I would need to create a little wrapper with
a __iter__ method that returns my object.  Should people prefer to
write:

    for item from iterator:
        do something

when they only need to loop over something once?  Doing so makes the
code most generally useful.  What about functions like map() and max()?
Should they accept iterators or sequences as arguments?

It would be confusing if some functions accepted iterators as arguments
but not "container" objects (i.e. things that implement __iter__) and
vice versa.  People will wonder if they should call iter() before
passing their sequence as an argument.
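
The "little wrapper" mentioned above could look something like the
following sketch.  It assumes the hypothetical world under discussion,
in which an iterator has no __iter__ of its own; BareIterator and
IterWrapper are invented names, and the spelling is present-day
Python 3 (__next__ rather than next()).

```python
class BareIterator:
    """Hypothetical iterator that deliberately has no __iter__ method."""

    def __init__(self, items):
        self._items = list(items)

    def __next__(self):
        if not self._items:
            raise StopIteration
        return self._items.pop(0)


class IterWrapper:
    """Adapter whose __iter__ simply returns the wrapped iterator."""

    def __init__(self, iterator):
        self._iterator = iterator

    def __iter__(self):
        return self._iterator


def total(sequence):
    """Stand-in for library code written as "for item in sequence"."""
    result = 0
    for item in sequence:
        result += item
    return result


wrapped_total = total(IterWrapper(BareIterator([1, 2, 3])))

try:
    total(BareIterator([1, 2, 3]))   # no __iter__, so "for" refuses it
    bare_failed = False
except TypeError:
    bare_failed = True
```

Every caller holding a bare iterator would have to remember this extra
step before handing it to such a library.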

To summarize, I agree that "for" mutating the object can be surprising.
I don't think that removing the __iter__ from iterators is the right
solution.  Unfortunately I don't have any alternative suggestions.

  Neil



From aleax@aleax.it  Fri Jul 19 19:55:06 2002
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 19 Jul 2002 20:55:06 +0200
Subject: [Python-Dev] Re: Single- vs. Multi-pass iterability
In-Reply-To: <0e5201c22f4c$37c62e30$6501a8c0@boostconsulting.com>
References: <714DFA46B9BBD0119CD000805FC1F53B01B5B462@UKRUX002.rundc.uk.origin-it.com> <E17VbfX-0002Zk-00@mx07.mrf.mail.rcn.net> <0e5201c22f4c$37c62e30$6501a8c0@boostconsulting.com>
Message-ID: <E17Vcui-0005X2-00@mail.python.org>

On Friday 19 July 2002 07:45 pm, David Abrahams wrote:
	...
> > dyed-in-the-wool Smalltalker in multiple inheritance, or anybody
> > *but* a CLOS-head or Dylan-head in multiple dispatch...:-).
>
> Ahem. *I'm* interested in multiple-dispatch (never used CLOS or Dylan). You
> might not have noticed that I mentioned multimethods in my post about
> supporting overloading in Boost.Python.

Sorry, I hadn't noticed.  I never did production work in CLOS or Dylan,
either, so I guess that enough C++ and templates warp one's brain
enough to increase one's perceptivity (only way to account for both of us:-).


> > Other aspects of introspection help you implement other primitives
> > lacking in the language.  E.g. "make another like myself but not
> > initialized" can be self.__class__.__new__(self.__class__) -- not
> > the most elegant expression, but, hey, I've seen worse (such as
> > NOT being able to express it at all, in languages lacking the
> > needed ability to introspect:-).
>
> Is that really introspection? It doesn't seem to ask a question.

"What is this concrete object's actual runtime class?" is a question,
even though it may not look like one since the answer is in a
special attribute rather than being obtained from a method call.

Feel free to code type(self) instead of self.__class__ if this feels
more question-ish, of course.  Six of one, half a dozen of the other.

The object is "looking inside itself" -> introspection.  Specifically,
looking at its own metadata.
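
A small sketch of the "make another like myself but not initialized"
idiom quoted above.  Point, Point3D and blank_like_me() are hypothetical
names used only for illustration.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def blank_like_me(self):
        """Allocate an instance of my own runtime class, skipping __init__."""
        cls = type(self)         # gives the same answer as self.__class__ here
        return cls.__new__(cls)


class Point3D(Point):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z


p = Point3D(1, 2, 3)
q = p.blank_like_me()
```

The method is written once on the base class, yet q is a Point3D with no
attributes set, because the object looked up its own class at runtime
and __init__ was never run on the new instance.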


> > Looking at *ANOTHER* object this way isn't really INTROspection,
> > btw -- it's EXTRAspection, by the Latin roots of these words:-).
>
> Okay. I hope you won't be offended if I continue to use the wrong term so
> that everyone else can understand me ;-)

How depressingly pragmatic.


Alex



From guido@python.org  Fri Jul 19 20:10:30 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 15:10:30 -0400
Subject: [Python-Dev] Where's time.daylight???
In-Reply-To: Your message of "Fri, 19 Jul 2002 13:32:19 EDT."
 <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net> <15672.18628.831787.897474@anthem.wooz.org>
 <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>

> [Barry, in python-checkins]
> > I've noticed one breakage already I believe.  On my systems (RH6.1 and
> > RH7.3) time.daylight has disappeared.
> > 
> > I don't think test_time.py actually tests this parameter, but
> > test_email.py which is what's failing for me:
> [...]
> 
> Yup, time.daylight has disappeared.  But the bizarre thing is that if
> I roll back to rev. 1.129, it's *still* gone!  Even rev 1.128 still
> doesn't fix this.  I wonder if something in configure changed???

Alas, this is the effect of defining _XOPEN_SOURCE in configure.in.
This somehow has the effect of not defining these symbols in
pyconfig.h:

HAVE_STRUCT_TM_TM_ZONE
HAVE_TM_ZONE
HAVE_TZNAME

I'm going to remove the _XOPEN_SOURCE define; Jeremy and Martin can
try to figure out what the right thing is for Tru64.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From andymac@bullseye.apana.org.au  Fri Jul 19 14:32:18 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sat, 20 Jul 2002 00:32:18 +1100 (edt)
Subject: [Python-Dev] test_socket failure on FreeBSD
In-Reply-To: <200207181627.g6IGRPE21459@odiug.zope.com>
Message-ID: <Pine.OS2.4.32.0207200020450.42796-400000@tenring.andymac.org>

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.
  Send mail to mime@docserver.cac.washington.edu for more info.

---888574994-29658-1027085538=:42796
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Thu, 18 Jul 2002, Guido van Rossum wrote:

{...}

> > Testing recvfrom() in chunks over TCP. ...
> > seg1='Michael Gilfix was he', addr='None'
> > seg2='re
> > ', addr='None'
> > ERROR
>
> Hm.  This looks like recvfrom() on a TCP stream doesn't return an
> address; not entirely unreasonable.  I wonder if
> self.cli_conn.getpeername() returns the expected address; can you
> check this?  Add this after each recvfrom() call.
>
>         if addr is None:
>             addr = self.cli_conn.getpeername()

This appears to have the effect you desired.  See the attached log.

{...}

> > Testing non-blocking accept. ...
> > conn=<socket object, fd=8, family=2, type=1, protocol=0>
> > addr=('127.0.0.1', 3144)
> > FAIL
>
> This is different.  It seems that the accept() call doesn't time out.
> But this could be because the client thread connects too fast.  Can
> you add a sleep (e.g. time.sleep(5)) to _testAccept() before the
> connect() call?

Likewise.  I took the sleep down to 1ms without failure, though that
system has HZ=100 so std resolution I expect would be 10ms.

I have also attached for info the log of the same modifications on EMX -
situation improved, but still a hiccup there.

Also attached is the diff I applied to test_socket.py (as of about 1900
UTC 020719).

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia

---888574994-29658-1027085538=:42796
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.log.fbsd44"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0207200032180.42796@tenring.andymac.org>
Content-Description: test_socket.log.fbsd44
Content-Disposition: attachment; filename="test_socket.log.fbsd44"

dGVzdF9zb2NrZXQNClRlc3RpbmcgZm9yIG1pc3Npb24gY3JpdGljYWwgY29u
c3RhbnRzLiAuLi4gb2sNClRlc3RpbmcgZGVmYXVsdCB0aW1lb3V0LiAuLi4g
b2sNClRlc3RpbmcgZ2V0c2VydmJ5bmFtZSgpLiAuLi4gb2sNClRlc3Rpbmcg
Z2V0c29ja29wdCgpLiAuLi4gb2sNClRlc3RpbmcgaG9zdG5hbWUgcmVzb2x1
dGlvbiBtZWNoYW5pc21zLiAuLi4gb2sNCk1ha2luZyBzdXJlIGdldG5hbWVp
bmZvIGRvZXNuJ3QgY3Jhc2ggdGhlIGludGVycHJldGVyLiAuLi4gb2sNClRl
c3RpbmcgZm9yIGV4aXN0YW5jZSBvZiBub24tY3J1Y2lhbCBjb25zdGFudHMu
IC4uLiBvaw0KVGVzdGluZyByZWZlcmVuY2UgY291bnQgZm9yIGdldG5hbWVp
bmZvLiAuLi4gb2sNClRlc3Rpbmcgc2V0c29ja29wdCgpLiAuLi4gb2sNClRl
c3RpbmcgZ2V0c29ja25hbWUoKS4gLi4uIG9rDQpUZXN0aW5nIHRoYXQgc29j
a2V0IG1vZHVsZSBleGNlcHRpb25zLiAuLi4gb2sNClRlc3RpbmcgZnJvbWZk
KCkuIC4uLiBvaw0KVGVzdGluZyByZWNlaXZlIGluIGNodW5rcyBvdmVyIFRD
UC4gLi4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgaW4gY2h1bmtzIG92ZXIg
VENQLiAuLi4gDQpzZWcxPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGUnLCBhZGRy
PScoJzEyNy4wLjAuMScsIDM4OTgpJw0Kc2VnMj0ncmUNCicsIGFkZHI9Jygn
MTI3LjAuMC4xJywgMzg5OCknDQpvaw0KVGVzdGluZyBsYXJnZSByZWNlaXZl
IG92ZXIgVENQLiAuLi4gb2sNClRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBv
dmVyIFRDUC4gLi4uIA0KbXNnPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGVyZQ0K
JywgYWRkcj0nKCcxMjcuMC4wLjEnLCAzOTAwKScNCm9rDQpUZXN0aW5nIHNl
bmRhbGwoKSB3aXRoIGEgMjA0OCBieXRlIHN0cmluZyBvdmVyIFRDUC4gLi4u
IG9rDQpUZXN0aW5nIHNodXRkb3duKCkuIC4uLiBvaw0KVGVzdGluZyByZWN2
ZnJvbSgpIG92ZXIgVURQLiAuLi4gb2sNClRlc3Rpbmcgc2VuZHRvKCkgYW5k
IFJlY3YoKSBvdmVyIFVEUC4gLi4uIG9rDQpUZXN0aW5nIG5vbi1ibG9ja2lu
ZyBhY2NlcHQuIC4uLiBvaw0KVGVzdGluZyBub24tYmxvY2tpbmcgY29ubmVj
dC4gLi4uIG9rDQpUZXN0aW5nIG5vbi1ibG9ja2luZyByZWN2LiAuLi4gb2sN
ClRlc3Rpbmcgd2hldGhlciBzZXQgYmxvY2tpbmcgd29ya3MuIC4uLiBvaw0K
UGVyZm9ybWluZyBmaWxlIHJlYWRsaW5lIHRlc3QuIC4uLiBvaw0KUGVyZm9y
bWluZyBzbWFsbCBmaWxlIHJlYWQgdGVzdC4gLi4uIG9rDQpQZXJmb3JtaW5n
IHVuYnVmZmVyZWQgZmlsZSByZWFkIHRlc3QuIC4uLiBvaw0KDQotLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tDQpSYW4gMjcgdGVzdHMgaW4gMTAuMzEycw0K
DQpPSw0KMSB0ZXN0IE9LLg0KQ0FVVElPTjogIHN0ZG91dCBpc24ndCBjb21w
YXJlZCBpbiB2ZXJib3NlIG1vZGU6ICBhIHRlc3QNCnRoYXQgcGFzc2VzIGlu
IHZlcmJvc2UgbW9kZSBtYXkgZmFpbCB3aXRob3V0IGl0Lg0KGg==
---888574994-29658-1027085538=:42796
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.log.os2emx"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0207200032181.42796@tenring.andymac.org>
Content-Description: test_socket.log.os2emx
Content-Disposition: attachment; filename="test_socket.log.os2emx"

dGVzdF9zb2NrZXQNClRlc3RpbmcgZm9yIG1pc3Npb24gY3JpdGljYWwgY29u
c3RhbnRzLiAuLi4gb2sNClRlc3RpbmcgZGVmYXVsdCB0aW1lb3V0LiAuLi4g
b2sNClRlc3RpbmcgZ2V0c2VydmJ5bmFtZSgpLiAuLi4gb2sNClRlc3Rpbmcg
Z2V0c29ja29wdCgpLiAuLi4gb2sNClRlc3RpbmcgaG9zdG5hbWUgcmVzb2x1
dGlvbiBtZWNoYW5pc21zLiAuLi4gb2sNCk1ha2luZyBzdXJlIGdldG5hbWVp
bmZvIGRvZXNuJ3QgY3Jhc2ggdGhlIGludGVycHJldGVyLiAuLi4gb2sNClRl
c3RpbmcgZm9yIGV4aXN0YW5jZSBvZiBub24tY3J1Y2lhbCBjb25zdGFudHMu
IC4uLiBvaw0KVGVzdGluZyByZWZlcmVuY2UgY291bnQgZm9yIGdldG5hbWVp
bmZvLiAuLi4gb2sNClRlc3Rpbmcgc2V0c29ja29wdCgpLiAuLi4gb2sNClRl
c3RpbmcgZ2V0c29ja25hbWUoKS4gLi4uIG9rDQpUZXN0aW5nIHRoYXQgc29j
a2V0IG1vZHVsZSBleGNlcHRpb25zLiAuLi4gb2sNClRlc3RpbmcgZnJvbWZk
KCkuIC4uLiBvaw0KVGVzdGluZyByZWNlaXZlIGluIGNodW5rcyBvdmVyIFRD
UC4gLi4uIG9rDQpUZXN0aW5nIHJlY3Zmcm9tKCkgaW4gY2h1bmtzIG92ZXIg
VENQLiAuLi4gDQpzZWcxPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGUnLCBhZGRy
PScoJzEyNy4wLjAuMScsIDQyNzQpJw0Kc2VnMj0ncmUNCicsIGFkZHI9Jygn
MTI3LjAuMC4xJywgNDI3NCknDQpvaw0KVGVzdGluZyBsYXJnZSByZWNlaXZl
IG92ZXIgVENQLiAuLi4gb2sNClRlc3RpbmcgbGFyZ2UgcmVjdmZyb20oKSBv
dmVyIFRDUC4gLi4uIA0KbXNnPSdNaWNoYWVsIEdpbGZpeCB3YXMgaGVyZQ0K
JywgYWRkcj0nKCcxMjcuMC4wLjEnLCA0Mjc2KScNCm9rDQpUZXN0aW5nIHNl
bmRhbGwoKSB3aXRoIGEgMjA0OCBieXRlIHN0cmluZyBvdmVyIFRDUC4gLi4u
IEZBSUwNClRlc3Rpbmcgc2h1dGRvd24oKS4gLi4uIG9rDQpUZXN0aW5nIHJl
Y3Zmcm9tKCkgb3ZlciBVRFAuIC4uLiBvaw0KVGVzdGluZyBzZW5kdG8oKSBh
bmQgUmVjdigpIG92ZXIgVURQLiAuLi4gb2sNClRlc3Rpbmcgbm9uLWJsb2Nr
aW5nIGFjY2VwdC4gLi4uIG9rDQpUZXN0aW5nIG5vbi1ibG9ja2luZyBjb25u
ZWN0LiAuLi4gRVJST1INClRlc3Rpbmcgbm9uLWJsb2NraW5nIHJlY3YuIC4u
LiBvaw0KVGVzdGluZyB3aGV0aGVyIHNldCBibG9ja2luZyB3b3Jrcy4gLi4u
IG9rDQpQZXJmb3JtaW5nIGZpbGUgcmVhZGxpbmUgdGVzdC4gLi4uIG9rDQpQ
ZXJmb3JtaW5nIHNtYWxsIGZpbGUgcmVhZCB0ZXN0LiAuLi4gb2sNClBlcmZv
cm1pbmcgdW5idWZmZXJlZCBmaWxlIHJlYWQgdGVzdC4gLi4uIG9rDQoNCj09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT0NCkVSUk9SOiBUZXN0aW5nIG5vbi1i
bG9ja2luZyBjb25uZWN0Lg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLQ0K
VHJhY2ViYWNrIChtb3N0IHJlY2VudCBjYWxsIGxhc3QpOg0KICBGaWxlICIu
Li8uLi9MaWIvdGVzdC90ZXN0X3NvY2tldC5weSIsIGxpbmUgMTE3LCBpbiBf
dGVhckRvd24NCiAgICBzZWxmLmZhaWwobXNnKQ0KICBGaWxlICJGOi9ERVYv
Q1ZTX1RFU1QvUFlUSE9OLUNWUy9MaWIvdW5pdHRlc3QucHkiLCBsaW5lIDI1
NCwgaW4gZmFpbA0KICAgIHJhaXNlIHNlbGYuZmFpbHVyZUV4Y2VwdGlvbiwg
bXNnDQpBc3NlcnRpb25FcnJvcjogKDU2LCAnU29ja2V0IGlzIGFscmVhZHkg
Y29ubmVjdGVkJykNCg0KPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQ0KRkFJ
TDogVGVzdGluZyBzZW5kYWxsKCkgd2l0aCBhIDIwNDggYnl0ZSBzdHJpbmcg
b3ZlciBUQ1AuDQotLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQpUcmFjZWJh
Y2sgKG1vc3QgcmVjZW50IGNhbGwgbGFzdCk6DQogIEZpbGUgIi4uLy4uL0xp
Yi90ZXN0L3Rlc3Rfc29ja2V0LnB5IiwgbGluZSA0MTIsIGluIHRlc3RTZW5k
QWxsDQogICAgc2VsZi5hc3NlcnRfKGxlbihyZWFkKSA9PSAxMDI0LCAiRXJy
b3IgcGVyZm9ybWluZyBzZW5kYWxsLiIpDQogIEZpbGUgIkY6L0RFVi9DVlNf
VEVTVC9QWVRIT04tQ1ZTL0xpYi91bml0dGVzdC5weSIsIGxpbmUgMjYyLCBp
biBmYWlsVW5sZXNzDQogICAgaWYgbm90IGV4cHI6IHJhaXNlIHNlbGYuZmFp
bHVyZUV4Y2VwdGlvbiwgbXNnDQpBc3NlcnRpb25FcnJvcjogRXJyb3IgcGVy
Zm9ybWluZyBzZW5kYWxsLg0KDQotLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
DQpSYW4gMjcgdGVzdHMgaW4gMTAuMDkwcw0KDQpGQUlMRUQgKGZhaWx1cmVz
PTEsIGVycm9ycz0xKQ0KdGVzdCB0ZXN0X3NvY2tldCBmYWlsZWQgLS0gZXJy
b3JzIG9jY3VycmVkOyBydW4gaW4gdmVyYm9zZSBtb2RlIGZvciBkZXRhaWxz
DQoxIHRlc3QgZmFpbGVkOg0KdGVzdF9zb2NrZXQNCg==
---888574994-29658-1027085538=:42796
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="test_socket.py.diff"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.OS2.4.32.0207200032182.42796@tenring.andymac.org>
Content-Description: test_socket.py.diff
Content-Disposition: attachment; filename="test_socket.py.diff"

KioqIHRlc3Rfc29ja2V0LnB5Lm9yaWcJRnJpIEp1bCAxOSAyMzoxOTowMCAy
MDAyDQotLS0gdGVzdF9zb2NrZXQucHkJRnJpIEp1bCAxOSAyMzozMjozNiAy
MDAyDQoqKioqKioqKioqKioqKioNCioqKiA4LDEzICoqKioNCi0tLSA4LDE0
IC0tLS0NCiAgaW1wb3J0IHRpbWUNCiAgaW1wb3J0IHRocmVhZCwgdGhyZWFk
aW5nDQogIGltcG9ydCBRdWV1ZQ0KKyBpbXBvcnQgdHJhY2ViYWNrDQogIA0K
ICBQT1JUID0gNTAwMDcNCiAgSE9TVCA9ICdsb2NhbGhvc3QnDQoqKioqKioq
KioqKioqKioNCioqKiAzNzQsMzc5ICoqKioNCi0tLSAzNzUsMzgzIC0tLS0N
CiAgICAgIGRlZiB0ZXN0UmVjdkZyb20oc2VsZik6DQogICAgICAgICAgIiIi
VGVzdGluZyBsYXJnZSByZWN2ZnJvbSgpIG92ZXIgVENQLiIiIg0KICAgICAg
ICAgIG1zZywgYWRkciA9IHNlbGYuY2xpX2Nvbm4ucmVjdmZyb20oMTAyNCkN
CisgICAgICAgICBpZiBhZGRyIGlzIE5vbmU6DQorICAgICAgICAgICAgIGFk
ZHIgPSBzZWxmLmNsaV9jb25uLmdldHBlZXJuYW1lKCkNCisgICAgICAgICBw
cmludCAiXG5tc2c9JyVzJywgYWRkcj0nJXMnIiAlIChtc2csIHJlcHIoYWRk
cikpDQogICAgICAgICAgaG9zdG5hbWUsIHBvcnQgPSBhZGRyDQogICAgICAg
ICAgIyNzZWxmLmFzc2VydEVxdWFsKGhvc3RuYW1lLCBzb2NrZXQuZ2V0aG9z
dGJ5bmFtZSgnbG9jYWxob3N0JykpDQogICAgICAgICAgc2VsZi5hc3NlcnRF
cXVhbChtc2csIE1TRykNCioqKioqKioqKioqKioqKg0KKioqIDM4NCwzOTEg
KioqKg0KLS0tIDM4OCw0MDEgLS0tLQ0KICAgICAgZGVmIHRlc3RPdmVyRmxv
d1JlY3ZGcm9tKHNlbGYpOg0KICAgICAgICAgICIiIlRlc3RpbmcgcmVjdmZy
b20oKSBpbiBjaHVua3Mgb3ZlciBUQ1AuIiIiDQogICAgICAgICAgc2VnMSwg
YWRkciA9IHNlbGYuY2xpX2Nvbm4ucmVjdmZyb20obGVuKE1TRyktMykNCisg
ICAgICAgICBpZiBhZGRyIGlzIE5vbmU6DQorICAgICAgICAgICAgIGFkZHIg
PSBzZWxmLmNsaV9jb25uLmdldHBlZXJuYW1lKCkNCisgICAgICAgICBwcmlu
dCAiXG5zZWcxPSclcycsIGFkZHI9JyVzJyIgJSAoc2VnMSwgcmVwcihhZGRy
KSkNCiAgICAgICAgICBzZWcyLCBhZGRyID0gc2VsZi5jbGlfY29ubi5yZWN2
ZnJvbSgxMDI0KQ0KKyAgICAgICAgIGlmIGFkZHIgaXMgTm9uZToNCisgICAg
ICAgICAgICAgYWRkciA9IHNlbGYuY2xpX2Nvbm4uZ2V0cGVlcm5hbWUoKQ0K
ICAgICAgICAgIG1zZyA9IHNlZzEgKyBzZWcyDQorICAgICAgICAgcHJpbnQg
InNlZzI9JyVzJywgYWRkcj0nJXMnIiAlIChzZWcyLCByZXByKGFkZHIpKQ0K
ICAgICAgICAgIGhvc3RuYW1lLCBwb3J0ID0gYWRkcg0KICAgICAgICAgICMj
c2VsZi5hc3NlcnRFcXVhbChob3N0bmFtZSwgc29ja2V0LmdldGhvc3RieW5h
bWUoJ2xvY2FsaG9zdCcpKQ0KICAgICAgICAgIHNlbGYuYXNzZXJ0RXF1YWwo
bXNnLCBNU0cpDQoqKioqKioqKioqKioqKioNCioqKiA0NDQsNDQ5ICoqKioN
Ci0tLSA0NTQsNDYxIC0tLS0NCiAgICAgIGRlZiB0ZXN0UmVjdkZyb20oc2Vs
Zik6DQogICAgICAgICAgIiIiVGVzdGluZyByZWN2ZnJvbSgpIG92ZXIgVURQ
LiIiIg0KICAgICAgICAgIG1zZywgYWRkciA9IHNlbGYuc2Vydi5yZWN2ZnJv
bShsZW4oTVNHKSkNCisgICAgICAgICBpZiBhZGRyIGlzIE5vbmU6DQorICAg
ICAgICAgICAgIGFkZHIgPSBzZWxmLmNsaV9jb25uLmdldHBlZXJuYW1lKCkN
CiAgICAgICAgICBob3N0bmFtZSwgcG9ydCA9IGFkZHINCiAgICAgICAgICAj
I3NlbGYuYXNzZXJ0RXF1YWwoaG9zdG5hbWUsIHNvY2tldC5nZXRob3N0Ynlu
YW1lKCdsb2NhbGhvc3QnKSkNCiAgICAgICAgICBzZWxmLmFzc2VydEVxdWFs
KG1zZywgTVNHKQ0KKioqKioqKioqKioqKioqDQoqKiogNDc4LDQ4MyAqKioq
DQotLS0gNDkwLDQ5NiAtLS0tDQogICAgICAgICAgZXhjZXB0IHNvY2tldC5l
cnJvcjoNCiAgICAgICAgICAgICAgcGFzcw0KICAgICAgICAgIGVsc2U6DQor
ICAgICAgICAgICAgIHByaW50ICJcbmNvbm49IiArIHJlcHIoY29ubikgKyAi
XG5hZGRyPSIgKyByZXByKGFkZHIpDQogICAgICAgICAgICAgIHNlbGYuZmFp
bCgiRXJyb3IgdHJ5aW5nIHRvIGRvIG5vbi1ibG9ja2luZyBhY2NlcHQuIikN
CiAgICAgICAgICByZWFkLCB3cml0ZSwgZXJyID0gc2VsZWN0LnNlbGVjdChb
c2VsZi5zZXJ2XSwgW10sIFtdKQ0KICAgICAgICAgIGlmIHNlbGYuc2VydiBp
biByZWFkOg0KKioqKioqKioqKioqKioqDQoqKiogNDg2LDQ5MSAqKioqDQot
LS0gNDk5LDUwNSAtLS0tDQogICAgICAgICAgICAgIHNlbGYuZmFpbCgiRXJy
b3IgdHJ5aW5nIHRvIGRvIGFjY2VwdCBhZnRlciBzZWxlY3QuIikNCiAgDQog
ICAgICBkZWYgX3Rlc3RBY2NlcHQoc2VsZik6DQorICAgICAgICAgdGltZS5z
bGVlcCg1KQ0KICAgICAgICAgIHNlbGYuY2xpLmNvbm5lY3QoKEhPU1QsIFBP
UlQpKQ0KICANCiAgICAgIGRlZiB0ZXN0Q29ubmVjdChzZWxmKToNCioqKioq
KioqKioqKioqKg0KKioqIDUwNSw1MTAgKioqKg0KLS0tIDUxOSw1MjUgLS0t
LQ0KICAgICAgICAgIGV4Y2VwdCBzb2NrZXQuZXJyb3I6DQogICAgICAgICAg
ICAgIHBhc3MNCiAgICAgICAgICBlbHNlOg0KKyAgICAgICAgICAgICBwcmlu
dCAiXG5jb25uPSIgKyByZXByKGNvbm4pICsgIlxuYWRkcj0iICsgcmVwcihh
ZGRyKQ0KICAgICAgICAgICAgICBzZWxmLmZhaWwoIkVycm9yIHRyeWluZyB0
byBkbyBub24tYmxvY2tpbmcgcmVjdi4iKQ0KICAgICAgICAgIHJlYWQsIHdy
aXRlLCBlcnIgPSBzZWxlY3Quc2VsZWN0KFtjb25uXSwgW10sIFtdKQ0KICAg
ICAgICAgIGlmIGNvbm4gaW4gcmVhZDoNCioqKioqKioqKioqKioqKg0KKioq
IDUxNSw1MjAgKioqKg0KLS0tIDUzMCw1MzYgLS0tLQ0KICANCiAgICAgIGRl
ZiBfdGVzdFJlY3Yoc2VsZik6DQogICAgICAgICAgc2VsZi5jbGkuY29ubmVj
dCgoSE9TVCwgUE9SVCkpDQorICAgICAgICAgdGltZS5zbGVlcCg1KQ0KICAg
ICAgICAgIHNlbGYuY2xpLnNlbmQoTVNHKQ0KICANCiAgY2xhc3MgRmlsZU9i
amVjdENsYXNzVGVzdENhc2UoU29ja2V0Q29ubmVjdGVkVGVzdCk6DQoqKioq
KioqKioqKioqKioNCioqKiA1NzQsNTgwICoqKioNCiAgICAgICAgICBzZWxm
LmNsaV9maWxlLndyaXRlKE1TRykNCiAgICAgICAgICBzZWxmLmNsaV9maWxl
LmZsdXNoKCkNCiAgDQohIGRlZiBtYWluKCk6DQogICAgICBzdWl0ZSA9IHVu
aXR0ZXN0LlRlc3RTdWl0ZSgpDQogICAgICBzdWl0ZS5hZGRUZXN0KHVuaXR0
ZXN0Lm1ha2VTdWl0ZShHZW5lcmFsTW9kdWxlVGVzdHMpKQ0KICAgICAgc3Vp
dGUuYWRkVGVzdCh1bml0dGVzdC5tYWtlU3VpdGUoQmFzaWNUQ1BUZXN0KSkN
Ci0tLSA1OTAsNTk2IC0tLS0NCiAgICAgICAgICBzZWxmLmNsaV9maWxlLndy
aXRlKE1TRykNCiAgICAgICAgICBzZWxmLmNsaV9maWxlLmZsdXNoKCkNCiAg
DQohIGRlZiB0ZXN0X21haW4oKToNCiAgICAgIHN1aXRlID0gdW5pdHRlc3Qu
VGVzdFN1aXRlKCkNCiAgICAgIHN1aXRlLmFkZFRlc3QodW5pdHRlc3QubWFr
ZVN1aXRlKEdlbmVyYWxNb2R1bGVUZXN0cykpDQogICAgICBzdWl0ZS5hZGRU
ZXN0KHVuaXR0ZXN0Lm1ha2VTdWl0ZShCYXNpY1RDUFRlc3QpKQ0KKioqKioq
KioqKioqKioqDQoqKiogNTg0LDU4NyAqKioqDQogICAgICB0ZXN0X3N1cHBv
cnQucnVuX3N1aXRlKHN1aXRlKQ0KICANCiAgaWYgX19uYW1lX18gPT0gIl9f
bWFpbl9fIjoNCiEgICAgIG1haW4oKQ0KLS0tIDYwMCw2MDMgLS0tLQ0KICAg
ICAgdGVzdF9zdXBwb3J0LnJ1bl9zdWl0ZShzdWl0ZSkNCiAgDQogIGlmIF9f
bmFtZV9fID09ICJfX21haW5fXyI6DQohICAgICB0ZXN0X21haW4oKQ0K
---888574994-29658-1027085538=:42796--



From andymac@bullseye.apana.org.au  Fri Jul 19 14:37:12 2002
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sat, 20 Jul 2002 00:37:12 +1100 (edt)
Subject: [Python-Dev] test_socket failure on FreeBSD
In-Reply-To: <Pine.OS2.4.32.0207200020450.42796-400000@tenring.andymac.org>
Message-ID: <Pine.OS2.4.32.0207200034080.42796-100000@tenring.andymac.org>

On Sat, 20 Jul 2002, Andrew MacIntyre wrote:

{...}

> Also attached is the diff I applied to test_socket.py (as of about 1900
> UTC 020719).

Oops, that timestamp is still a couple of hours in the future.  Should
have been 1900 UTC 020718.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia




From gsw@agere.com  Fri Jul 19 20:41:09 2002
From: gsw@agere.com (Gerald S. Williams)
Date: Fri, 19 Jul 2002 15:41:09 -0400
Subject: [Python-Dev] The iterator story (Single- vs. Multi-pass iterability?)
In-Reply-To: <20020719185602.21423.41415.Mailman@mail.python.org>
Message-ID: <GBEGLOMMCLDACBPKDIHFCECHCKAA.gsw@agere.com>

I started to type this before looking back at the other
threads, so feel free to ignore it if it's entirely
superfluous. I'm sorry that I didn't have time to follow
the "Single- vs. Multi-pass iterability" thread. Code
freeze is today. :-)

 I'm a little confused about this destructive-for/iterator
 issue.

 Sure, an iterator that destroys the original object might
 be unexpected, but wouldn't you expect a non-destructive
 iterator to be the default for any object unless there's
 a pretty good reason to use a destructive one? If there's
 a chance that the object may be destroyed/altered (such
 as a file stream or an iterator), shouldn't you already
 have some reason to suspect that?

-Jerry

Strong typing is for weak minds. Weak typing is for the
real troublemakers. ;-)

P.S. Leaving off the original subject line can be mildly
     annoying to those of us subscribing to the digest
     version of the list. Probably more so to those who
     read our responses. :-)




From guido@python.org  Fri Jul 19 21:24:04 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 16:24:04 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src configure,1.322,1.323 configure.in,1.333,1.334 pyconfig.h.in,1.43,1.44
In-Reply-To: Your message of "Fri, 19 Jul 2002 16:06:24 EDT."
 <LNBBLJKPBEHFEDALKOLCIECBAGAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIECBAGAB.tim.one@comcast.net>
Message-ID: <200207192024.g6JKO4c14964@pcp02138704pcs.reston01.va.comcast.net>

[Tim, in python-checkins]

> I don't understand why this helps.  Are you sure it does?  Python.h still
> contains:
> 
> #ifndef _XOPEN_SOURCE
> # define _XOPEN_SOURCE 500
> #endif
> 
> The configure changes were consequences of that change, IIRC.  We surely
> shouldn't be defining this one way in Python.h and a different way in
> config, right?

I'm certain that it helps: test_time failed since Jeremy made the
change to configure, now it succeeds again.

It may not be the right fix, sure, but I recommend that we don't check
in a fix that breaks other things.  The search is on, and I trust that
Jeremy and Martin will figure something out (and that Jeremy will run
autoconf, autoheader, configure, *and* the test suite before checking
in more changes).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 19 21:29:29 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 16:29:29 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 04:44:09 PDT."
 <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
References: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
Message-ID: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net>

> It's just not the way i expect for-loops to work.  Perhaps we would
> need to survey people for objective data, but i feel that most people
> would be surprised if
> 
>     for x in y: print x
>     for x in y: print x
> 
> did not print the same thing twice, or if
> 
>     if x in y: print 'got it'
>     if x in y: print 'got it'
> 
> did not do the same thing twice.  I realize this is my own opinion,
> but it's a fairly strong impression i have.

I think it's a naive persuasion that doesn't hold under scrutiny.

For a long time people have faked iterators by providing
pseudo-sequences that did unspeakable things.

In general, I'm pretty sure that if I asked an uninitiated user what
"for line in file" would do, if it did anything, they would understand
that if you tried that a second time you'd hit EOF right away.
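
A minimal sketch of that expectation, written for present-day Python 3
rather than the 2.2 under discussion.  io.StringIO stands in for a real
file here (an assumption made so the example is self-contained):

```python
import io

# io.StringIO is a stand-in for an open text file (assumption for the sketch).
f = io.StringIO("one\ntwo\n")

first_pass = [line for line in f]    # reads every line, leaving f at EOF
second_pass = [line for line in f]   # nothing left to read

f.seek(0)                            # an explicit rewind resets the stream
third_pass = [line for line in f]

print(first_pass, second_pass, third_pass)
```

The second loop yields nothing until the stream is explicitly rewound.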

> Even if it's okay for for-loops to destroy their arguments, i still
> think it sets up a bad situation: we may end up with functions
> manipulating sequence-like things all over, but it becomes unclear
> whether they destroy their arguments or not.  It becomes possible
> to write a function which sometimes destroys its argument and sometimes
> doesn't.  Bugs get deeper and harder to find.

This sounds awfully similar to the old argument "functions (as opposed
to procedures) should never have side effects".  ABC implemented that
literally (the environment was saved and restored around function
calls, with an exception for the seed for the built-in random
generator), with the hope that it would provide fewer surprises.  It
did the opposite: it drove people crazy because the language was
trying to be smarter than them.

> I believe this is where the biggest debate lies: whether "for" should be
> non-destructive.  I realize we are currently on the other side of the
> fence, but i foresee enough potential pain that i would like you to
> consider the value of keeping "for" loops non-destructive.

I don't see any real debate.  I only see you chasing windmills.
Sorry.  For-loops have had the possibility to destroy their arguments
since the day __getitem__ was introduced.
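
To illustrate, here is a sketch of the kind of pseudo-sequence meant
here.  The class name Drain is hypothetical; the for-loop fallback to
__getitem__ with indices 0, 1, 2, ... still exists in Python 3, which is
what this assumes:

```python
class Drain:
    """Fakes iteration through __getitem__ and ignores the index it is given."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        # Each call hands out (and removes) the next pending item,
        # no matter which index the for-loop asked for.
        if not self._items:
            raise IndexError(index)
        return self._items.pop(0)


d = Drain("abc")
first = [x for x in d]    # drains the object
second = [x for x in d]   # nothing left the second time
print(first, second)
```

No iterator object is involved at all, yet the loop still destroys its
argument.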

> > Maybe the for-loop is a red herring?  Calling next() on an
> > iterator may or may not be destructive on the underlying "sequence" --
> > if it is a generator, for example, I would call it destructive.
> 
> Well, for a generator, there is no underlying sequence.
> 
>     while 1: print next(gen)
> 
> makes it clear that there is no sequence, but
> 
>     for x in gen: print x
> 
> seems to give me the impression that there is.

This seems to be a misrepresentation.  The idiom for using any
iterator (not just generators) *without* using a for-loop would have
to be something like:

    while 1:
        try:
            item = it.next() # or it.__next__() or next(it)
        except StopIteration:
            break
        ...do something with item...

(Similar to the traditional idiom for looping over the lines of a
file.)  The for-loop over an iterator was invented so you could write
this as:

    for item in it:
        ...do something with item...

I'm not giving that up so easily!
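
The two spellings can be put side by side as a runnable sketch.  It uses
the Python 3 built-in next(it) instead of it.next(); the helper names
drain_with_while and drain_with_for are made up for illustration:

```python
def drain_with_while(it):
    """Explicit idiom: call next() until StopIteration is raised."""
    out = []
    while True:
        try:
            item = next(it)
        except StopIteration:
            break
        out.append(item)
    return out


def drain_with_for(it):
    """The same loop, written with for; requires iter(it) to succeed."""
    out = []
    for item in it:
        out.append(item)
    return out


print(drain_with_while(iter("abc")))
print(drain_with_for(iter("abc")))
```

Both helpers collect the same items; the for version only works because
the iterator's own __iter__ returns the iterator itself.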

> > Perhaps you're trying to assign properties to the iterator abstraction
> > that aren't really there?
> 
> I'm assigning properties to "for" that you aren't.  I think they
> are useful properties, though, and worth considering.

I'm trying to be open-minded, but I just don't see it.  The for loop
is more flexible than you seem to want it to be.  Alas, it's been like
this for years, and I don't think the for-loop needs a face lift.

> I don't think i'm assigning properties to the iterator abstraction;
> i expect iterators to destroy themselves.  But the introduction of
> iterators, in the way they are now, breaks this property of "for"
> loops that i think used to hold almost all the time in Python, and
> that i think holds all the time in almost all other languages.

Again, the widespread faking of iterators using destructive
__getitem__ methods that were designed to be only used in a for-loop
defeats your assertion.

> > Next, I'm not sure how renaming next() to __next__() would affect the
> > situation w.r.t. the destructivity of for-loops.  Or were you talking
> > about some other migration?
> 
> The connection is indirect.  The renaming is related to: (a) making
> __next__() a real, honest-to-goodness protocol independent of __iter__;

next() is a real, honest-to-goodness protocol now, and it is
independent of __iter__() now.

> and (b) getting rid of __iter__ on iterators.  It's the presence of
> __iter__ on iterators that breaks the non-destructive-for property.

So you prefer the while-loop version above over the for-loop version?
Gotta be kidding.

> I think the renaming of next() to __next__() is a good idea in any
> case.  It is distant enough from the other issues that it can be done
> independently of any decisions about __iter__.

Yeah, it's just a pain that it's been deployed in Python 2.2 since
last December, and by the time 2.3 is out it will probably have been
at least a full year.  Worse, 2.2 is voted to be Python-in-a-Tie,
giving that particular idiom a very long lifetime.  I simply don't
think we can break compatibility that easily.  Remember the endless
threads we've had about the pace of change and stability.  We have to
live with warts, alas.  And this is a pretty minor one if you ask me.

(I realize that you're proposing another way out in a separate
message.  I'll reply to that next.  Since you changed the subject, I
can't very well reply to it here.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas@python.ca  Fri Jul 19 21:57:09 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 19 Jul 2002 13:57:09 -0700
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207171503.g6HF3mW01047@odiug.zope.com>; from guido@python.org on Wed, Jul 17, 2002 at 11:03:48AM -0400
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> <200207171503.g6HF3mW01047@odiug.zope.com>
Message-ID: <20020719135709.A22330@glacier.arctrix.com>

Guido van Rossum wrote:
> - There really isn't anything "broken" about the current situation;
>   it's just that "next" is the only method name mapped to a slot in
>   the type object that doesn't have leading and trailing double
>   underscores.

Are you saying the _only_ reason to rename it is for consistency with
the other type slot method names?  That's really weak, IMHO, and not
worth any kind of backwards incompatibility (which seems unavoidable).

  Neil



From paul@svensson.org  Fri Jul 19 21:49:54 2002
From: paul@svensson.org (Paul Svensson)
Date: Fri, 19 Jul 2002 16:49:54 -0400 (EDT)
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020719120043.A21503@glacier.arctrix.com>
Message-ID: <Pine.LNX.4.44.0207191621540.25834-100000@familjen.svensson.org>

On Fri, 19 Jul 2002, Neil Schemenauer wrote:

>__iter__ is not a flag.  When you want to loop over an object you call
>__iter__ to get an iterator.  Since you should be able to loop over all
>iterators they should provide a __iter__ that returns self.

But you don't really loop _over_ the iterator, you loop _thru_ it.

To me there's a fundamental difference between providing a new object
and providing a reference to an existing object.  This difference
is mostly noticeable for objects containing state.  The raison d'etre
for iterators is to contain state.  If it's sensible to sometimes
return an old object and sometimes a new, then we could have
'list(x) is x' being true when x is already a list.

What I'm trying to get to is, __iter__(x) returning an existing
object (self in this case) is really something very much different
from __iter__() creating new state, and returning that.

The problem is that we do want a way to loop _thru_ an iterator,
and having __iter__() return self gives us that, at the cost
of the above mentioned confusing conflation.

Ping's suggested seq() function solves that quite nicely:

class seq:
    def __init__(self, i):
        self._iter = i
    def __iter__(self):
        return self._iter
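
A usage sketch for the wrapper above, assuming Python 3 naming
(__next__ rather than next).  Countdown is a hypothetical iterator that
deliberately has no __iter__ of its own:

```python
class seq:
    def __init__(self, i):
        self._iter = i

    def __iter__(self):
        return self._iter


class Countdown:
    """Hypothetical iterator: has __next__ but no __iter__."""

    def __init__(self, n):
        self.n = n

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


wrapped = seq(Countdown(3))
first = [x for x in wrapped]    # the for-loop gets the iterator via seq.__iter__
second = [x for x in wrapped]   # same underlying iterator, already exhausted
print(first, second)
```

The wrapper makes the iterator loopable, and the explicit seq() call is
the visible marker that the loop consumes it.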


		/Paul




From paul-python@svensson.org  Fri Jul 19 21:52:42 2002
From: paul-python@svensson.org (Paul Svensson)
Date: Fri, 19 Jul 2002 16:52:42 -0400 (EDT)
Subject: [Python-Dev] The iterator story
Message-ID: <Pine.LNX.4.44.0207191650260.25834-100000@familjen.svensson.org>

On Fri, 19 Jul 2002, Neil Schemenauer wrote:

>__iter__ is not a flag.  When you want to loop over an object you call
>__iter__ to get an iterator.  Since you should be able to loop over all
>iterators they should provide a __iter__ that returns self.

But you don't really loop _over_ the iterator, you loop _thru_ it.

To me there's a fundamental difference between providing a new object
and providing a reference to an existing object.  This difference
is mostly noticeable for objects containing state.  The raison d'etre
for iterators is to contain state.  If it's sensible to sometimes
return an old object and sometimes a new, then we could as well have
'list(x) is x' being true when x is already a list.

What I'm trying to get to is, __iter__(x) returning an existing
object (self in this case) is really something very much different
from __iter__() creating new state, and returning that.

The problem is that we do want a way to loop _thru_ an iterator,
and having __iter__() return self gives us that, at the cost
of the above mentioned confusing conflation.

Ping's suggested seq() function solves that quite nicely:

class seq:
    def __init__(self, i):
        self._iter = i
    def __iter__(self):
        return self._iter


		/Paul




From Jack.Jansen@oratrix.com  Fri Jul 19 21:58:57 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 19 Jul 2002 22:58:57 +0200
Subject: [Python-Dev] Added platform-specific directories to sys.path
Message-ID: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>

I've a question that I'd like some feedback on. On MacOSX 
there's a set of directories that are meant especially for 
storing extensions to applications, and there are requests on the
pythonmac-sig that I add these directories to the Python search 
path. This could easily be done optionally, with a .pth file in 
site-python.

MacOSX has rationalized where preferences, libraries, licenses, 
extensions, etc are stored, and for all of these there's a 
hierarchy of folders. In the case of Python extension modules 
the logical places would be ~/Library/Application Support/Python 
(for user-installed extension modules), /Library/Application 
Support/Python (for machine-wide installed extension modules) 
and /Network/Library/Application Support/Python (for 
workgroup-wide installed modules). The final location, in 
/System, is for factory-installed stuff from Apple, not needed 
just yet for this example:-).
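
As a rough sketch of what such an optional hook could look like, the
three directories named above can be filtered down to the ones that
exist before being appended to sys.path.  The function name extra_paths
and its exists parameter are hypothetical, and nothing here is part of
site.py:

```python
import os
import sys

# The three locations named in the message, most specific first.
CANDIDATES = [
    os.path.expanduser("~/Library/Application Support/Python"),
    "/Library/Application Support/Python",
    "/Network/Library/Application Support/Python",
]


def extra_paths(candidates, exists=os.path.isdir):
    """Return the candidate directories that are actually present."""
    return [path for path in candidates if exists(path)]


# Append only directories that exist and are not already on the path.
for path in extra_paths(CANDIDATES):
    if path not in sys.path:
        sys.path.append(path)
```

On a machine without these folders the loop does nothing, which keeps
the behaviour optional.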

I sympathize with the idea of making things conform more closely to the
platform standard; on the other hand I'm a bit reluctant to do
things differently again from what other Pythons do. But, one of 
the things that is sorely missing from Python is a standard 
place to install per-user extension modules, so this might well 
be the thing that triggers inclusion of such functionality into 
the grand scheme of things (including distutils support, etc).
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -




From guido@python.org  Fri Jul 19 22:10:45 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 17:10:45 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: Your message of "Fri, 19 Jul 2002 04:28:32 PDT."
 <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
Message-ID: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>

> Here is a summary of the whole iterator picture as i currently see it.
> This is necessarily subjective, but i will try to be precise so that
> it's clear where i'm making a value judgement and where i'm trying to
> state fact, and so we can pinpoint areas where we agree and disagree.
> 
> In the subjective sections, i have marked with [@] the places where
> i solicit agreement or disagreement.
> 
> I would like to know your opinions on the issues listed below,
> and on the places marked [@].
> 
> 
> Definitions (objective)
> -----------------------
> 
> Container: a thing that provides non-destructive access to a varying
> number of other things.
> 
>     Why "non-destructive"?  Because i don't expect that merely looking
>     at the contents will cause a container to be altered.  For example,
>     i expect to be able to look inside a container, see that there are
>     five elements; leave it alone for a while, come back to it later
>     and observe once again that there are five elements.
> 
>     Consequently, a file object is not a container in general.  Given
>     a file object, you cannot look at it to see if it contains an "A",
>     and then later look at it once again to see if it contains an "A"
>     and get the same result.  If you could seek, then you could do
>     this, but not all files support seeking.  Even if you could seek,
>     the act of reading the file would still alter the file object.
> 
>     The file object provides no way of getting at the contents without
>     mutating itself.  According to my definition, it's fine for a
>     container to have ways of mutating itself; but there has to be
>     *some* way of getting the contents without mutating the container,
>     or it just ain't a container to me.
> 
>     A file object is better described as a stream.  Hypothetically
>     one could create an interface to seekable files that offered some
>     non-mutating read operations; this would cause the file to look
>     more like an array of bytes, and i would find it appropriate to
>     call that interface a container.
> 
> Iterator: a thing that you can poke (i.e. send a no-argument message),
> where each time you poke it, it either yields something or announces
> that it is exhausted.
> 
>     For an iterator to mutate itself every time you poke it is not
>     part of my definition.  But the only non-mutating iterator would
>     be an iterator that returns the same thing forever, or an iterator
>     that is always exhausted.  So most iterators usually mutate.
> 
>     Some iterators are associated with a container, but not all.
> 
>     There can be many kinds of iterators associated with a container.
>     The most natural kind is one that yields the elements of the
>     container, one by one, mutating itself each time it is poked,
>     until it has yielded all of the elements of the container and
>     announces exhaustion.
> 
> A Container's Natural Iterator: an iterator that yields the elements
> of the container, one by one, in the order that makes the most sense
> for the container.  If the container has a finite size n, then the
> iterator can be poked exactly n times, and thereafter it is exhausted.

Sure.

But I note that there are hybrids, and I think files (at least
seekable files) fall in the hybrid category.  Other examples of
hybrids:

- Some dbm variants (e.g. dbhash and gdbm) provide first() and next()
  or firstkey() and nextkey() methods that combine iterator state with
  the container object.  These objects simply provide two different
  interfaces, a containerish interface (__getitem__ in fact), and an
  iteratorish interface.

- Before we invented the concept of iterators, I believe it was
  common for tree data structures to provide iterators that didn't put
  the iteration state in a separate object, but simply kept a pointer
  to the current node of the iteration pass somewhere in the root of
  the tree.

The idea that a container also has some iterator state, and that you
have to do something simple (like calling firstkey() or seek(0)) to
reset the iterator, is quite common.

You may argue that this is poor design that should be fixed, and in
general I would agree (the firstkey()/nextkey() protocol in particular
is clumsy to use), but it is common nevertheless, and sometimes common
usage patterns as well as the relative cost of random access make it a
good compromise.  For example, while a tape file is a
container in the sense that reading the data doesn't destroy it, it's
very heavily geared towards sequential access, and you can't
realistically have two iterators going over the same tape at once.  If
you're too young to remember, think of files on CD media -- there,
random access, while possible, is several orders of magnitude slower
than sequential access (better than tape, but a lot worse than regular
magnetic hard drives).
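
A sketch of such a hybrid, to make the shared-cursor behaviour concrete.
KeyStore is a hypothetical class written for this example; it imitates
the firstkey()/nextkey() style but is not the API of any real dbm
module:

```python
class KeyStore:
    """Hypothetical container that keeps one iteration cursor inside itself."""

    def __init__(self, data):
        self._data = dict(data)
        self._keys = sorted(self._data)
        self._pos = 0

    def __getitem__(self, key):
        # Containerish interface: random access by key.
        return self._data[key]

    def firstkey(self):
        # Iteratorish interface: reset the cursor, then advance once.
        self._pos = 0
        return self.nextkey()

    def nextkey(self):
        if self._pos >= len(self._keys):
            return None
        key = self._keys[self._pos]
        self._pos += 1
        return key


db = KeyStore({"a": 1, "b": 2, "c": 3})

seen = []
key = db.firstkey()
while key is not None:
    seen.append(key)
    key = db.nextkey()

# Two traversals cannot be interleaved: the second one resets the cursor.
outer = [db.firstkey(), db.nextkey()]
db.firstkey()                 # another traversal starts over
outer.append(db.nextkey())    # the first traversal now repeats a key
print(seen, outer)
```

Reading never destroys the data, but only one pass can be in progress
at a time, much like the tape.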

> Issues (objective)
> ------------------
> 
> I alluded to a set of issues in an earlier message, and i'll begin
> there, by defining what i meant more precisely.
> 
> The Destructive-For Issue:
> 
>     In most languages i can think of, and in Python for the most
>     part, a statement such as "for x in y: print x" is a
>     non-destructive operation on y.  Repeating "for x in y: print x"
>     will produce exactly the same results once more.
> 
>     For pre-iterator versions of Python, this fails to be true only
>     if y's __getitem__ method mutates y.  The introduction of
>     iterators has caused this to now be untrue when y is any iterator.
> 
>     The issue is, should "for" be non-destructive?

I don't see the benefit.  We've done this for years and the only
conceptual problem was the abuse of __getitem__, not the
destructiveness of the for-loop.

> The Destructive-In Issue:
> 
>     Notice that the iteration that takes place for the "in" operator
>     is implemented in the same way as "for".  So if "for" destroys
>     its second operand, so will "in".
> 
>     The issue is, should "in" be non-destructive?

If it can't be helped otherwise, sure, why not?

>     (Similar issues exist for built-ins that iterate, like list().)

At least list() keeps a copy of all the items, so you can then iterate
over them as often as you want. :-)
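
A small sketch of that point, assuming nothing beyond a plain generator
(the name numbers is made up for the example):

```python
def numbers():
    yield 1
    yield 2
    yield 3


gen = numbers()
items = list(gen)      # consumes the generator, keeps a copy of every item
leftover = list(gen)   # the generator itself is now exhausted

# The copy can be walked as many times as needed.
totals = [sum(items), sum(items)]
print(items, leftover, totals)
```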

> The __iter__-On-Iterators Issue:
> 
>     Some people have mentioned that the presence of an __iter__()
>     method is a way of signifying that an object supports the
>     iterator protocol.  It has been said that this is necessary
>     because the presence of a "next()" method is not sufficiently
>     distinguishing.

Not me.

>     Some have said that __iter__() is a completely distinct protocol
>     from the iterator protocol.
> 
>     The issue is, what is __iter__() really for?

To support iter() and for-loops.

>     And secondarily, if it is not part of the iterator protocol,
>     then should we require __iter__() on iterators, and why?

So that you can use an iterator in a for-loop.

> The __next__-Naming Issue:
> 
>     The iteration method is currently called "next()".
> 
>     Previous candidates for the name of this method were "next",
>     "__next__", and "__call__".  After some previous debate,
>     it was pronounced to be "next()".
> 
>     There are concerns that "next()" might collide with existing
>     methods named "next()".  There is also a concern that "next()"
>     is inconsistent because it is the only type-slot-method that
>     does not have a __special__ name.
> 
>     The issue is, should it be called "next" or "__next__"?

That's a separate issue, and cleans up only a small wart that in
practice hasn't hurt anybody AFAIK.


> My Positions (subjective)
> -------------------------
> 
> I believe that "for" and "in" and list() should be non-destructive.
> I believe that __iter__() should not be required on iterators.
> I believe that __next__() is a better name than next().
> 
> Destructive-For, Destructive-In:
> 
>     I think "for" should be non-destructive because that's the way
>     it has almost always behaved, and that's the way it behaves in
>     any other language [@] i can think of.
> 
>     For a container's __getitem__ method to mutate the container is,
>     in my opinion, bad behaviour.  In pre-iterator Python, we needed
>     some way to allow the convenience of "for" on user-implemented
>     containers.  So "for" supported a special protocol where it would
>     call __getitem__ with increasing integers starting from 0 until
>     it hit an IndexError.  This protocol works great for sequence-like
>     containers that were indexable by integers.
> 
>     But other containers had to be hacked somewhat to make them fit.
>     For example, there was no good way to do "for" over a dictionary-like
>     container.  If you attempted "for" over a user-implemented dictionary,
>     you got a really weird "KeyError: 0", which only made sense if you
>     understood that the "for" loop was attempting __getitem__(0).
> 
>     (Hey!  I just noticed that
> 
>         from UserDict import UserDict
>         for k in UserDict(): print k
> 
>     still produces "KeyError: 0"!  This oughta be fixed...)

Check the CVS logs.  At one point before 2.2 was released, UserDict
had an __iter__ method.  But then SF bug 448153 was filed, presenting
evidence that this broke previously working code.  So a separate
class, IterableUserDict, was added that has the __iter__ method.

I agree that this is less than ideal, but that's life.

>     If you wanted to support "for" on something else, sometimes you
>     would have to make __getitem__ mutate the object, like it does
>     in the fileinput module.  But then the user has to know that
>     this object is a special case: "for" only works the first time.

This was and still is widespread.  There are a lot of objects that
have a way to return an iterator (old style using fake __getitem__,
and new ones using __iter__ and next) that are intended to be looped
over, once.  I have no desire to deprecate this behavior, since (a) it
would be a major upheaval for the user community (a lot worse than
integer division), and (b) I don't see that "fixing" this prevents a
particular category of programming errors.

>     When iterators were introduced, i believed they were supposed
>     to solve this problem.  Currently, they don't.

No, they solve the conceptual ugliness of providing a __getitem__ that
can only be called once.  The new rule is, if you provide __getitem__,
it must support random access; otherwise, you should provide __iter__.
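
The rule can be sketched with two hypothetical classes, using Python 3
method names (__next__ where 2.2 has next).  Squares offers real random
access; Lines is a one-shot stream and so exposes __iter__ instead of a
fake __getitem__:

```python
class Squares:
    """Random access is meaningful, so __getitem__ is the right interface."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i


class Lines:
    """One-shot stream: offers __iter__/__next__, no __getitem__."""

    def __init__(self, text):
        self._pending = text.splitlines()

    def __iter__(self):
        return self

    def __next__(self):
        if not self._pending:
            raise StopIteration
        return self._pending.pop(0)


sq = Squares(4)
stream = Lines("x\ny")
print(sq[3], list(sq), list(sq))
print(list(stream), list(stream))
```

Squares can be looped over repeatedly and indexed in any order; Lines
makes no such promise and does not pretend to be a sequence.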

>     Currently, "in" can even be destructive.  This is more serious.
>     While one could argue that it's not so strange for
> 
>         for x in y: ...
> 
>     to alter y (even though i do think it is strange), i believe
>     just about anyone would find it very counterintuitive for
> 
>         if x in y:
> 
>     to alter y.  [@]

That falls in the category of "then don't do that".
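
For reference, a sketch of what "that" looks like when the right-hand
operand is an iterator without a __contains__ method (Python 3 syntax,
behaviour of the plain list iterator):

```python
it = iter([1, 2, 3, 4])

found_first = 2 in it   # scans 1, 2 and stops: True
found_again = 2 in it   # scans the remaining 3, 4: False
rest = list(it)         # nothing is left afterwards

print(found_first, found_again, rest)
```

Testing membership on the underlying list instead leaves it untouched.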

> __iter__-On-Iterators:
> 
>     I believe __iter__ is not a type flag.  As i argued previously,
>     i think that looking for the presence of methods that don't actually
>     implement a protocol is a poor way to check for protocol support.
>     And as things stand, the presence of __iter__ doesn't even work [@]
>     as a type flag.

And I never said it was a type flag.  I'm tired of repeating myself,
but you keep repeating this broken argument, so I have to keep
correcting you.

>     There are objects with __iter__ that are not iterators (like most
>     containers).  And there are objects without __iter__ that work as
>     iterators.  I know you can legislate the latter away, but i think
>     such legislation would amount to fighting the programmers -- and
>     it is infeasible [@] to enforce the presence of __iter__ in practice.

I think having next without having __iter__ is like having __getitem__
without having __len__.  There are corner cases where you might get
away with this because you know it won't be called, but (as I've
repeated umpteen times now), a for-loop over an iterator is a common
idiom.
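
A sketch of that corner case in Python 3 terms, where the method is
spelled __next__.  Bare and Full are hypothetical classes for the
example:

```python
class Bare:
    """Has __next__ only; can be poked by hand but not looped over."""

    def __init__(self, items):
        self._items = list(items)

    def __next__(self):
        if not self._items:
            raise StopIteration
        return self._items.pop(0)


class Full(Bare):
    """Adds the __iter__ that returns self, as iterators are asked to do."""

    def __iter__(self):
        return self


poked = next(Bare("ab"))          # calling next() directly works

try:
    for x in Bare("ab"):
        pass
    bare_loops = True
except TypeError:                 # the for-loop needs iter() to succeed
    bare_loops = False

full_items = list(Full("ab"))
print(poked, bare_loops, full_items)
```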

>     Based on Guido's positive response, in which he asked me to make
>     an addition to the PEP, i believe Guido agrees with me that
>     __iter__ is distinct from the protocol of an iterator.  This
>     surprised me because it runs counter to the philosophy previously
>     expressed in the PEP.

I recognize that they are separate protocols.  But because I like the
for-loop as a convenient way to get all of the elements of an
iterator, I want iterators to support __iter__.

The alternative would be for iter() to see if the object implements
next (after finding that it has neither __iter__ nor __getitem__), and
return the object itself unchanged.  If we had picked __next__ instead
of 'next', that would perhaps have been my choice (though I might *still*
have recommended implementing __iter__ returning self, to avoid two
failing getattr calls).
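
That alternative can be sketched in pure Python.  lenient_iter is a
hypothetical helper, not the behaviour of the real built-in iter(), and
it assumes the Python 3 spelling __next__:

```python
def lenient_iter(obj):
    """Hypothetical iter() variant: fall back to obj itself if it has __next__."""
    try:
        # Normal path: __iter__ or the __getitem__ fallback.
        return iter(obj)
    except TypeError:
        # Neither was found; accept a bare iterator unchanged.
        if hasattr(type(obj), "__next__"):
            return obj
        raise


class OnlyNext:
    """Hypothetical iterator without __iter__."""

    def __init__(self, items):
        self._items = list(items)

    def __next__(self):
        if not self._items:
            raise StopIteration
        return self._items.pop(0)


bare = OnlyNext("xy")
same = lenient_iter(bare) is bare
from_list = list(lenient_iter([1, 2]))
print(same, from_list)
```

The two failed lookups before the fallback are the cost mentioned above.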

>     Now suppose we agree that __iter__ and next are distinct protocols.
>     Then why require iterators to support both?  The only reason we
>     would want __iter__ on iterators is so that we can use "for" [@]
>     with an iterator as the second operand.

Right.  Finally you got it.

>     I have just argued, above, that it's *not* a good idea for "for"
>     and "in" to be destructive.  Since most iterators self-mutate,
>     it follows that it's not advisable to use an iterator directly
>     as the second operand of a "for" or "in".
> 
>     I realize this seems radical!  This may be the most controversial
>     point i have made.  But if you accept that "in" should not
>     destroy its second argument, the conclusion is unavoidable.

Since I have little sympathy for your premise, this conclusion is far
from unavoidable for me. :-)

> __next__-Naming:
> 
>     I think the potential for collision, though small, is significant,
>     and this makes "__next__" a better choice than "next".  A built-in
>     function next() should be introduced; this function would call the
>     tp_iternext slot, and for instance objects tp_iternext would call
>     the __next__ method implemented in Python.
> 
>     The connection between this issue and the __iter__ issue is that,
>     if next() were renamed to __next__(), the argument that __iter__
>     is needed as a flag would also go away.

I really wish we had had this insight 18 months ago.  Right now, it's
too late.  Dragging all the other stuff in doesn't strengthen the
argument for fixing it now.

> The Current PEP (objective)
> ---------------------------
> 
> The current PEP takes the position that "for" and "in" can be
> destructive; that __iter__() and next() represent two distinct
> protocols, yet iterators are required to support both; and that
> the name of the method on iterators is called "next()".
> 
> 
> My Ideal Protocol (subjective)
> ------------------------------
> 
> So by now the biggest question/objection you probably have is
> "if i can't use an iterator with 'for', then how can i use it?"
> 
> The answer is that "for" is a great way to iterate over things;
> it's just that it iterates over containers and i want to preserve
> that.  We need a different way to iterate over iterators.
> 
> In my ideal world, we would allow a new form of "for", such as
> 
>     for line from file:
>         print line
> 
> The use of "from" instead of "in" would imply that we were
> (destructively) pulling things out of the iterator, and would
> remove any possible parallel to the test "x in y", which should
> rightly remain non-destructive.

Alternative syntaxes for for-loops have been proposed as solutions to
all sorts of things (e.g. what's called enumerate() in 2.3, and a
simplified syntax for range(), and probably other things).

I'm not keen on this.  I don't want to user-test it, but I expect that
it's too subtle a difference, and that we would see Aha! experiences
of the kind "Oh, it's a for-*from* loop!  I never noticed that, I
always read it as a for-*in* loop!  That explains the broken behavior."

> Here's the whole deal:
> 
>     - Iterators provide just one method, __next__().
> 
>     - The built-in next() calls tp_iternext.  For instances,
>       tp_iternext calls __next__.
> 
>     - Objects wanting to be iterated over provide just one method,
>       __iter__().  Some of these are containers, but not all.
> 
>     - The built-in iter(foo) calls tp_iter.  For instances,
>       tp_iter calls __iter__.
> 
>     - "for x in y" gets iter(y) and uses it as an iterator.
> 
>     - "for x from y" just uses y as the iterator.
> 
> That's it.
> 
> Benefits:
> 
>     - We have a nice clean division between containers and iterators.
> 
>     - When you see "for x in y" you know that y is a container.
> 
>     - When you see "for x from y" you know that y is an iterator.
> 
>     - "for x in y" never destroys y.
> 
>     - "if x in y" never destroys y.
> 
>     - If you have an object that is container-like, you can add
>       an __iter__ method that gives its natural iterator.  If
>       you want, you can supply more iterators that do different
>       things; no problem.  No one using your object is confused
>       about whether it mutates.
> 
>     - If you have an object that is cursor-like or stream-like,
>       you can safely make it into an iterator by adding __next__.
>       No one using your object is confused about whether it mutates.
> 
> Other notes:
> 
>     - Iterator algebra still works fine, and is still easy to write:
> 
>         def alternate(it):
>             while 1:
>                 yield next(it)
>                 next(it)
> 
>     - The file problem has a consistent solution.  Instead of writing
>       "for line in file" you write
> 
>         for line from file:
>             print line
> 
>       Being forced to write "from" signals to you that the file is
>       eaten up.  There is no expectation that "for line from file"
>       will work again.
> 
>       The best would be a convenience function "readlines", to
>       make this even clearer:
> 
>         for line in readlines("foo.txt"):
>             print line
> 
>       Now you can do this as many times as you want, and there is
>       no possibility of confusion; there is no file object on which
>       to call methods that might mess up the reading of lines.
> 
> 
> My Not-So-Ideal Protocol
> ------------------------
> 
> All right.  So new syntax may be hard to swallow.  An alternative
> is to introduce an adapter that turns an iterator into something
> that "for" will accept -- that is, the opposite of iter().
> 
>     - The built-in seq(it) returns x such that iter(x) yields it.
> 
> Then instead of writing
> 
>     for x from it:
> 
> you would write
> 
>     for x in seq(it):
> 
> and the rest would be the same.  The use of "seq" here is what
> would flag the fact that "it" will be destroyed.
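
A minimal sketch of the adapter described above, assuming nothing beyond
the quoted one-line specification ("seq(it) returns x such that iter(x)
yields it").  The class name follows the proposal; no such built-in
exists, so this is purely illustrative:

```python
class seq:
    """Hypothetical adapter: wraps an iterator so that "for" will accept it."""

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        # Hand back the very same iterator, so looping over seq(it) consumes it.
        return self._it


it = iter(["a", "b", "c"])
first_pass = [x for x in seq(it)]
second_pass = [x for x in seq(it)]   # the underlying iterator is already used up
```

The second loop comes up empty, which is the "destroyed" behaviour that
the explicit seq() call is meant to advertise.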

I don't feel I have to drive it home any further, so I'll leave these
last few paragraphs without comments.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 19 22:20:35 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 17:20:35 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 13:57:09 PDT."
 <20020719135709.A22330@glacier.arctrix.com>
References: <Pine.LNX.4.44.0207161656360.17524-100000@ziggy> <200207170129.g6H1Tt116117@pcp02138704pcs.reston01.va.comcast.net> <20020717094504.A85351@doublegemini.com> <200207171409.g6HE9Di00659@odiug.zope.com> <20020717104935.A86293@doublegemini.com> <200207171503.g6HF3mW01047@odiug.zope.com>
 <20020719135709.A22330@glacier.arctrix.com>
Message-ID: <200207192120.g6JLKZw15241@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> > - There really isn't anything "broken" about the current situation;
> >   it's just that "next" is the only method name mapped to a slot in
> >   the type object that doesn't have leading and trailing double
> >   underscores.
> 
> Are you saying the _only_ reason to rename it is for consistency with
> the other type slot method names?  That's really weak, IMHO, and not
> worth any kind of backwards incompatibility (which seems unavoidable).
> 
>   Neil

Almost.  This means that we're retroactively saying that all objects
with a next method are iterators, thereby slightly stomping on the
user's namespace.  But as long as you don't use such an object as an
iterator, it's harmless.

And if my position wasn't clear already, I agree it's not worth
"fixing". :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jul 19 22:23:07 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 17:23:07 -0400
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: Your message of "Fri, 19 Jul 2002 22:58:57 +0200."
 <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
Message-ID: <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net>

> I've a question that I'd like some feedback on. On MacOSX 
> there's a set of directories that are meant especially for 
> storing extensions to applications, and there's requests on the 
> pythonmac-sig that I add these directories to the Python search 
> path. This could easily be done optionally, with a .pth file in 
> site-python.
> 
> MacOSX has rationalized where preferences, libraries, licenses, 
> extensions, etc are stored, and for all of these there's a 
> hierarchy of folders. In the case of Python extension modules 
> the logical places would be ~/Library/Application Support/Python 
> (for user-installed extension modules), /Library/Application 
> Support/Python (for machine-wide installed extension modules) 
> and /Network/Library/Application Support/Python (for 
> workgroup-wide installed modules). The final location, in 
> /System, is for factory-installed stuff from Apple, not needed 
> just yet for this example:-).
> 
> I sympathize with the idea of making things more conform to the 
> platform standard, on the other hand I'm a bit reluctant to do 
> things differently again from what other Pythons do. But, one of 
> the things that is sorely missing from Python is a standard 
> place to install per-user extension modules, so this might well 
> be the thing that triggers inclusion of such functionality into 
> the grand scheme of things (including distutils support, etc).

Traditionally, on Unix per-user extensions are done by pointing
PYTHONPATH to your per-user directory (-ies) in your .profile.

On Windows you can do this too, but I bet most people just have a
per-user computer. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Jack.Jansen@oratrix.com  Fri Jul 19 22:34:40 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 19 Jul 2002 23:34:40 +0200
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <20020719112602.A17763@ActiveState.com>
Message-ID: <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com>

On vrijdag, juli 19, 2002, at 08:26 , Trent Mick wrote:

> [Tim Peters wrote]
>> Excellent!  One down, about two hundred thousand to go.
>
> Mark rocks!

Oh, it's MarkH appreciation that's wanted! In that case I'll 
gladly chime in, I was afraid it was __declspec(dllexport) 
appreciation. Mark is one cool dude who knows where his towel is!

199998 to go. Should we start taking a poll who'll be the next 
python-devver we start appreciating when the counter hits zero?
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- 
Emma Goldman -




From barry@zope.com  Fri Jul 19 22:46:30 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 17:46:30 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
References: <20020719112602.A17763@ActiveState.com>
 <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com>
Message-ID: <15672.34998.636509.747342@anthem.wooz.org>

>>>>> "JJ" == Jack Jansen <Jack.Jansen@oratrix.com> writes:

    JJ> Oh, it's MarkH appreciation that's wanted! In that case I'll
    JJ> gladly chime in, I was afraid it was __declspec(dllexport)
    JJ> appreciation. Mark is one cool dude who knows where his towel
    JJ> is!

    JJ> 199998 to go. Should we start taking a poll who'll be the next
    JJ> python-devver we start appreciating when the counter hits
    JJ> zero?

My everlasting appreciation of MarkH was cemented the night, many IPCs
ago, that he drank me under the table and called us "purple".

199997-to-go-ly y'rs,
-Barry



From barry@zope.com  Fri Jul 19 22:48:53 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 17:48:53 -0400
Subject: [Python-Dev] Added platform-specific directories to sys.path
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
 <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15672.35141.803094.488541@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> Traditionally, on Unix per-user extensions are done by
    GvR> pointing PYTHONPATH to your per-user directory (-ies) in your
    GvR> .profile.

Or adding them to sys.path via your $PYTHONSTARTUP file.

OTOH, it might be nice if the distutils `install' command had some
switches to make installing in some of these common alternative
locations a little easier.  That might dovetail nicely if/when we
decide to add a site-updates directory to sys.path.
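
A rough sketch of the sys.path approach, assuming the directory names
from Jack's message; the helper name add_existing_dirs is made up for
illustration and is not part of any existing module:

```python
import os
import sys

# Candidate directories taken from Jack's message; whether they exist
# depends entirely on the machine.
CANDIDATES = [
    os.path.expanduser("~/Library/Application Support/Python"),
    "/Library/Application Support/Python",
    "/Network/Library/Application Support/Python",
]


def add_existing_dirs(candidates, path=None):
    """Append each existing directory to path (default: sys.path), once."""
    if path is None:
        path = sys.path
    added = []
    for directory in candidates:
        if os.path.isdir(directory) and directory not in path:
            path.append(directory)
            added.append(directory)
    return added
```

Calling add_existing_dirs(CANDIDATES) from a startup file would extend
sys.path only with the directories that are actually present.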

-Barry



From tommy@ilm.com  Fri Jul 19 23:11:07 2002
From: tommy@ilm.com (Hambozo)
Date: Fri, 19 Jul 2002 15:11:07 -0700 (PDT)
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <15672.34998.636509.747342@anthem.wooz.org>
References: <20020719112602.A17763@ActiveState.com>
 <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com>
 <15672.34998.636509.747342@anthem.wooz.org>
Message-ID: <15672.36408.362000.540999@mace.lucasdigital.com>

Barry A. Warsaw writes:
| 
| My everlasting appreciation of MarkH was cemented the night, many IPCs
| ago, that he drank me under the table and called us "purple".

When anyone asks my opinion of Mark I always say:

"F**kin' Ripper!"

:)

199996 and counting...  -Tommy



From barry@zope.com  Fri Jul 19 23:10:59 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Fri, 19 Jul 2002 18:10:59 -0400
Subject: [Python-Dev] Do we still need Lib/test/data?
Message-ID: <15672.36467.645262.622848@anthem.wooz.org>

I'm about to check in some changes to the email package, which will
include a re-organization of its test suite.  Part of this will be so
that I can add some huge torture tests to the standalone mimelib
project without committing megs of email samples to the Python
project.  It will also make it easier for me to create the mimelib
distro because I'll then be able to put the setup.py file in the email
directory instead of having to maintain a fake hierarchy elsewhere
just to make distutils happy.

Specifically, I'm going to move the bulk of Lib/test_email.py and
Lib/test_email_codes.py to Lib/email/test and make email.test a
full-fledged subpackage of the email package.

I'm also going to move the Lib/test/data directory to Lib/email/test.
I'll do this by creating a new directory and cvs adding a copy of the
files to the new location (the cvs revision history isn't important
enough to preserve).

I believe this should be entirely transparent to most of you.  My
question is whether I should cvsrm the files that are currently in
Lib/test/data or not?  On the one hand, I don't want to maintain
duplicates, but OTOH, I'm not sure if any other code or tests depends
on those files (I did some attempts at grepping for this and didn't
/see/ anything but I'm trying to be conservative).

Needless to say I won't be actually removing the Lib/test/data
directory, but a "cvs up -P" would hide it from you.

Any opinions?
-Barry



From neal@metaslash.com  Fri Jul 19 23:32:09 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 19 Jul 2002 18:32:09 -0400
Subject: [Python-Dev] The iterator story
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com>
Message-ID: <3D389369.948547E0@metaslash.com>

Neil Schemenauer wrote:
> 
> Ka-Ping Yee wrote:
> >     I think "for" should be non-destructive because that's the way
> >     it has almost always behaved, and that's the way it behaves in
> >     any other language [@] i can think of.
> 
> I agree that it can be surprising to have "for" destroy the object it's
> looping over.  I myself was bitten once by it.  I'm not yet sure if this
> is something that will repeatedly bite.  I suspect it might. :-(

In what context?  Were you iterating over a file or something else?
I'm wondering if this is a problem, perhaps pychecker could generate
a warning?

Neal



From aahz@pythoncraft.com  Fri Jul 19 23:29:38 2002
From: aahz@pythoncraft.com (Aahz)
Date: Fri, 19 Jul 2002 18:29:38 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020719222938.GA23413@panix.com>

On Fri, Jul 19, 2002, Guido van Rossum wrote:
>Ping:
>> 
>> I think the renaming of next() to __next__() is a good idea in any
>> case.  It is distant enough from the other issues that it can be done
>> independently of any decisions about __iter__.
> 
> Yeah, it's just a pain that it's been deployed in Python 2.2 since
> last December, and by the time 2.3 is out it will probably have been
> at least a full year.  Worse, 2.2 is voted to be Python-in-a-Tie,
> giving that particular idiom a very long lifetime.  I simply don't
> think we can break compatibility that easily.  Remember the endless
> threads we've had about the pace of change and stability.  We have to
> live with warts, alas.  And this is a pretty minor one if you ask me.

Is this a Pronouncement, or are we still waiting on the results of the
survey?  Note that several people have suggested a multi-release
strategy for fixing this problem; does that make any difference?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From neal@metaslash.com  Fri Jul 19 23:47:38 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 19 Jul 2002 18:47:38 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
References: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> <20020719222938.GA23413@panix.com>
Message-ID: <3D38970A.2693833E@metaslash.com>

Aahz wrote:
> 
> On Fri, Jul 19, 2002, Guido van Rossum wrote:
> >Ping:
> >>
> >> I think the renaming of next() to __next__() is a good idea in any
> >> case.  It is distant enough from the other issues that it can be done
> >> independently of any decisions about __iter__.
> >
> > Yeah, it's just a pain that it's been deployed in Python 2.2 since
> > last December, and by the time 2.3 is out it will probably have been
> > at least a full year.  Worse, 2.2 is voted to be Python-in-a-Tie,
> > giving that particular idiom a very long lifetime.  I simply don't
> > think we can break compatibility that easily.  Remember the endless
> > threads we've had about the pace of change and stability.  We have to
> > live with warts, alas.  And this is a pretty minor one if you ask me.
> 
> Is this a Pronouncement, or are we still waiting on the results of the
> survey?  Note that several people have suggested a multi-release
> strategy for fixing this problem; does that make any difference?

Would it be good to use __next__() if it exists, else try next()?
This doesn't fix the current 'wart,' however, it could allow
moving closer to the desired end.  It could cause confusion.
For compatibility, one would only need to do:

	next = __next__

or vice versa.

Not sure this is worth it.  But if there is a transition, it could
ease the pain.
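
A minimal sketch of the aliasing idea, assuming an interpreter that
looks up __next__ for iteration (as Python 3 does); the Countdown class
is invented for the example:

```python
class Countdown:
    """Illustrative iterator that answers to both spellings."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    # Compatibility alias: callers using the old name keep working.
    next = __next__
```

Both list(Countdown(3)) and explicit calls to .next() go through the
same method, so the class only has to maintain one implementation.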

Neal



From nas@python.ca  Sat Jul 20 00:22:26 2002
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 19 Jul 2002 16:22:26 -0700
Subject: [Python-Dev] The iterator story
In-Reply-To: <3D389369.948547E0@metaslash.com>; from neal@metaslash.com on Fri, Jul 19, 2002 at 06:32:09PM -0400
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com>
Message-ID: <20020719162226.A22929@glacier.arctrix.com>

Neal Norwitz wrote:
> In what context?  Were you iterating over a file or something else?
> I'm wondering if this is a problem, perhaps pychecker could generate
> a warning?

I was switching between implementing something as a generator and
returning a list.  I was curious why I was getting different behavior
until I realized I was iterating over the result twice.  I don't
think pychecker could warn about such a bug.
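
Roughly the kind of thing that happened, reduced to a sketch (the
function names are made up for illustration):

```python
def squares_list(n):
    """Return the squares as a list: can be iterated any number of times."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield the squares one at a time: can be iterated only once."""
    for i in range(n):
        yield i * i


def total_twice(result):
    """Iterate over result two times and return both sums."""
    return sum(result), sum(result)


list_totals = total_twice(squares_list(4))   # both passes see the data
gen_totals = total_twice(squares_gen(4))     # second pass sees nothing
```

The calling code is identical in both cases; only the second pass over
the generator silently yields no items.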

  Neil



From martin@v.loewis.de  Sat Jul 20 01:02:11 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 20 Jul 2002 02:02:11 +0200
Subject: [Python-Dev] Where's time.daylight???
In-Reply-To: <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net>
 <15672.18628.831787.897474@anthem.wooz.org>
 <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>
 <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3it3bnu64.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I'm going to remove the _XOPEN_SOURCE define; Jeremy and Martin can
> try to figure out what the right thing is for Tru64.

This is the wrong solution; instead, you need to define _GNU_SOURCE in
addition to _XOPEN_SOURCE.

Regards,
Martin




From martin@v.loewis.de  Sat Jul 20 01:06:51 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 20 Jul 2002 02:06:51 +0200
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
Message-ID: <m3eldzntyc.fsf@mira.informatik.hu-berlin.de>

Jack Jansen <Jack.Jansen@oratrix.com> writes:

> I sympathize with the idea of making things more conform to the
> platform standard, on the other hand I'm a bit reluctant to do things
> differently again from what other Pythons do. But, one of the things
> that is sorely missing from Python is a standard place to install
> per-user extension modules, so this might well be the thing that
> triggers inclusion of such functionality into the grand scheme of
> things (including distutils support, etc).

If that is the platform convention, I see no problem following
it. Windows already does things differently from Unix, by using the
registry to compute sys.path.

Regards,
Martin




From guido@python.org  Sat Jul 20 01:30:04 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 20:30:04 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 18:29:38 EDT."
 <20020719222938.GA23413@panix.com>
References: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net>
 <20020719222938.GA23413@panix.com>
Message-ID: <200207200030.g6K0U4P26218@pcp02138704pcs.reston01.va.comcast.net>

> > Yeah, it's just a pain that it's been deployed in Python 2.2 since
> > last December, and by the time 2.3 is out it will probably have been
> > at least a full year.  Worse, 2.2 is voted to be Python-in-a-Tie,
> > giving that particular idiom a very long lifetime.  I simply don't
> > think we can break compatibility that easily.  Remember the endless
> > threads we've had about the pace of change and stability.  We have to
> > live with warts, alas.  And this is a pretty minor one if you ask me.
> 
> Is this a Pronouncement, or are we still waiting on the results of the
> survey?

That is my current opinion.  I'm waiting for the results of the survey
to see if I'll be swayed (but I don't think it's likely).

> Note that several people have suggested a multi-release
> strategy for fixing this problem; does that make any difference?

Such a big gun for such a minor problem.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 20 01:41:18 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 20:41:18 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: Your message of "Fri, 19 Jul 2002 18:47:38 EDT."
 <3D38970A.2693833E@metaslash.com>
References: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy> <200207192029.g6JKTU015005@pcp02138704pcs.reston01.va.comcast.net> <20020719222938.GA23413@panix.com>
 <3D38970A.2693833E@metaslash.com>
Message-ID: <200207200041.g6K0fIX26940@pcp02138704pcs.reston01.va.comcast.net>

> Would it be good to use __next__() if it exists, else try next()?

Then the code in typeobject.c (e.g. resolve_slotdups) would have to
map tp_iternext to *both* __next__ and next.

> This doesn't fix the current 'wart,' however, it could allow
> moving closer to the desired end.  It could cause confusion.
> For compatibility, one would only need to do:
> 
> 	next = __next__
> 
> or vice versa.
> 
> Not sure this is worth it.  But if there is a transition, it could
> ease the pain.

I don't think it's worth it.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 20 01:43:21 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 19 Jul 2002 20:43:21 -0400
Subject: [Python-Dev] Where's time.daylight???
In-Reply-To: Your message of "Sat, 20 Jul 2002 02:02:11 +0200."
 <m3it3bnu64.fsf@mira.informatik.hu-berlin.de>
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net> <15672.18628.831787.897474@anthem.wooz.org> <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net> <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>
 <m3it3bnu64.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net>

> > I'm going to remove the _XOPEN_SOURCE define; Jeremy and Martin can
> > try to figure out what the right thing is for Tru64.
> 
> This is the wrong solution; instead, you need to define _GNU_SOURCE in
> addition to _XOPEN_SOURCE.

Can you check that in?  I'm about to disappear to OSCON for a week.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 20 07:06:29 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 20 Jul 2002 02:06:29 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: Your message of "Mon, 24 Jun 2002 21:33:18 EDT."
 <20020624213318.A5740@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
Message-ID: <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>

> Any chance something like this could make it into the standard python
> library?  It would save a lot of time for lazy people like myself.  :-)
> 
> def heappush(heap, item):
>     pos = len(heap)
>     heap.append(None)
>     while pos:
>         parentpos = (pos - 1) // 2
>         parent = heap[parentpos]
>         if item <= parent:
>             break
>         heap[pos] = parent
>         pos = parentpos
>     heap[pos] = item
> 
> def heappop(heap):
>     endpos = len(heap) - 1
>     if endpos <= 0:
>         return heap.pop()
>     returnitem = heap[0]
>     item = heap.pop()
>     pos = 0
>     while 1:
>         child2pos = (pos + 1) * 2
>         child1pos = child2pos - 1
>         if child2pos < endpos:
>             child1 = heap[child1pos]
>             child2 = heap[child2pos]
>             if item >= child1 and item >= child2:
>                 break
>             if child1 > child2:
>                 heap[pos] = child1
>                 pos = child1pos
>                 continue
>             heap[pos] = child2
>             pos = child2pos
>             continue
>         if child1pos < endpos:
>             child1 = heap[child1pos]
>             if child1 > item:
>                 heap[pos] = child1
>                 pos = child1pos
>         break
>     heap[pos] = item
>     return returnitem

I have read (or at least skimmed) this entire thread now.  After I
reconstructed the algorithm in my head, I went back to Kevin's code; I
admire the compactness of his code.  I believe that this would make a
good addition to the standard library, as a friend of the bisect
module.  The only change I would make would be to make heap[0] the
lowest value rather than the highest.  (That's one thing that I liked
better about François Pinard's version, but a class seems too heavy
for this, just like it is overkill for bisect [*].  Oh, and maybe we
can borrow a few lines of François's description of the algorithm. :-)

I propose to call it heapq.py.  (Got a better name?  Now or never.)

[*] Afterthought: this could be made into a new-style class by adding
something like this to the end of the module:

class heapq(list):
    __slots__ = []
    heappush = heappush
    heappop = heappop

A similar addition could easily be made to the bisect module.  But
this is very different from François' class, which hides the other
list methods.
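
A sketch of what "heap[0] is the lowest value" could look like, assuming
nothing more than reversing the comparisons in the code quoted above.
It is an illustration, not a proposed final implementation:

```python
def heappush(heap, item):
    """Push item onto heap, keeping heap[0] the smallest value."""
    pos = len(heap)
    heap.append(item)
    while pos:
        parentpos = (pos - 1) // 2
        parent = heap[parentpos]
        if parent <= item:
            break
        heap[pos] = parent          # move the larger parent down
        pos = parentpos
    heap[pos] = item


def heappop(heap):
    """Pop and return the smallest value from heap."""
    last = heap.pop()               # raises IndexError on an empty heap
    if not heap:
        return last
    smallest = heap[0]
    pos = 0
    n = len(heap)
    while True:
        childpos = 2 * pos + 1
        if childpos >= n:
            break
        right = childpos + 1
        if right < n and heap[right] < heap[childpos]:
            childpos = right        # pick the smaller of the two children
        if heap[childpos] >= last:
            break
        heap[pos] = heap[childpos]  # move the smaller child up
        pos = childpos
    heap[pos] = last
    return smallest
```

Pushing a handful of values and popping until the list is empty returns
them in ascending order.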

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@comcast.net  Sat Jul 20 07:18:16 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 20 Jul 2002 02:18:16 -0400
Subject: [Python-Dev] Sorting
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDJAGAB.tim.one@comcast.net>

An enormous amount of research has been done on sorting since the last time
I wrote a sort for Python.  Major developments have been in two areas:

1. Adaptive sorting.  Sorting algorithms are usually tested on random
   data, but in real life you almost never see random data.  Python's
   sort tries to catch some common cases of near-order via special-
   casing.  The literature has since defined more than 15 formal
   measures of disorder, and developed algorithms provably optimal in
   the face of one or more of them.  But this is O() optimality, and
   theoreticians aren't much concerned about how big the constant
   factor is.  Some researchers are up front about this, and toward
   the end of one paper with "practical" in its title, the author was
   overjoyed to report that an implementation was only twice as slow
   as a naive quicksort <wink>.

2. Pushing the worst-case number of comparisons closer to the
   information-theoretic limit (ceiling(log2(N!))).
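
For concreteness, the limit in #2 can be computed directly for small N.
A minimal sketch (the function name is made up for illustration):

```python
import math


def min_worst_case_compares(n):
    """ceiling(log2(n!)): lower bound on worst-case comparisons to sort n items."""
    return math.ceil(math.log2(math.factorial(n)))
```

For 5 items this gives 7, i.e. no comparison-based sort can guarantee
finishing in fewer than 7 comparisons.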

I don't care much about #2 -- in experiments conducted when it was new, I
measured the # of comparisons our samplesort hybrid did on random inputs,
and it was never more than 2% over the theoretical lower bound, and
typically closer.  As N grows large, the expected case provably converges to
the theoretical lower bound.  There remains a vanishingly small chance for a
bad case, but nobody has reported one, and at the time I gave up trying to
construct one.


Back on Earth, among Python users the most frequent complaint I've heard is
that list.sort() isn't stable.  Alex is always quick to trot out the
appropriate DSU (Decorate Sort Undecorate) pattern then, but the extra
memory burden for that can be major (a new 2-tuple per list element costs
about 32 bytes, then 4 more bytes for a pointer to it in a list, and 12 more
bytes that don't go away to hold each non-small index).
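
For readers who haven't seen it, a minimal sketch of the DSU pattern,
assuming the 2-tuple decoration described above (the function name is
made up for illustration):

```python
def stable_sorted(items):
    """Decorate-Sort-Undecorate: ties are broken by original position."""
    # Decorate: pair each item with its index (one new 2-tuple per element).
    decorated = [(item, index) for index, item in enumerate(items)]
    # Sort: when two items compare equal, the indexes decide the order.
    decorated.sort()
    # Undecorate: throw the indexes away again.
    return [item for item, index in decorated]


# 1.0, 1 and True all compare equal, so only their positions tell them apart.
result = stable_sorted([2, 1.0, 1, True])
```

Because no two decorated tuples ever compare equal, the result does not
depend on whether the underlying sort is stable.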

After reading all those papers, I couldn't resist taking a crack at a new
algorithm that might be practical, and have something you might call a
non-recursive adaptive stable natural mergesort / binary insertion sort
hybrid.  In playing with it so far, it has two bad aspects compared to our
samplesort hybrid:

+ It may require temp memory, up to 2*N bytes worst case (one pointer each
  for no more than half the array elements).

+ It gets *some* benefit for arrays with many equal elements, but not
  nearly as much as I was able to hack samplesort to get.  In effect,
  partitioning is very good at moving equal elements close to each other
  quickly, but merging leaves them spread across any number of runs.
  This is especially irksome because we're sticking to Py_LT for
  comparisons, so can't even detect a==b without comparing a and b twice
  (and then it's a deduction from that not a < b and not b < a).  Given
  the relatively huge cost of comparisons, it's a timing disaster to do
  that (compare twice) unless it falls out naturally.  It was fairly
  natural to do so in samplesort, but not at all in this sort.

It also has good aspects:

+ It's stable (items that compare equal retain their relative order, so,
  e.g., if you sort first on zip code, and a second time on name, people
  with the same name still appear in order of increasing zip code; this
  is important in apps that, e.g., refine the results of queries based
  on user input).

+ The code is much simpler than samplesort's (but I think I can
  fix that <wink>).

+ It gets benefit out of more kinds of patterns, and without lumpy
  special-casing (a natural mergesort has to identify ascending and
  descending runs regardless, and then the algorithm builds on just
  that).

+ Despite that I haven't micro-optimized it, in the random case it's
  almost as fast as the samplesort hybrid.  In fact, it might
  have been a bit faster had I run tests yesterday (the samplesort
  hybrid got sped up by 1-2% last night).  This one surprised me the
  most, because at the time I wrote the samplesort hybrid, I tried
  several ways of coding mergesorts and couldn't make it as fast.

+ It has no bad cases (O(N log N) is worst case; N-1 compares is best).
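
The zip code / name example can be sketched as follows, assuming a
stable sort is available; the built-in sorted() of current Python is
used here as a stand-in, and the data is invented:

```python
people = [
    ("Smith", "90210"),
    ("Jones", "10001"),
    ("Smith", "10001"),
    ("Jones", "60601"),
]

# First pass: order by zip code.
by_zip = sorted(people, key=lambda person: person[1])
# Second pass: order by name; a stable sort keeps the zip order within a name.
by_name_then_zip = sorted(by_zip, key=lambda person: person[0])
```

With an unstable sort the second pass would be free to scramble the zip
codes within each name.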

Here are some typical timings, taken from Python's sortperf.py, over
identical lists of floats:

Key:
    *sort: random data
    \sort: descending data
    /sort: ascending data
    3sort: ascending data but with 3 random exchanges
    ~sort: many duplicates
    =sort: all equal
    !sort: worst case scenario

That last one was a worst case for the last quicksort Python had before it
grew the samplesort, and it was a very bad case for that.  By sheer
coincidence, turns out it's an exceptionally good case for the experimental
sort:

samplesort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.13   0.01   0.01   0.10   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.23   0.08   0.02   0.24
17  131072   0.54   0.05   0.04   0.49   0.18   0.04   0.53
18  262144   1.18   0.09   0.09   1.08   0.37   0.09   1.16
19  524288   2.58   0.19   0.18   2.34   0.76   0.17   2.52
20 1048576   5.58   0.37   0.36   5.12   1.54   0.35   5.46

timsort
15   32768   0.16   0.01   0.02   0.05   0.14   0.01   0.02
16   65536   0.24   0.02   0.02   0.06   0.19   0.02   0.04
17  131072   0.55   0.04   0.04   0.13   0.42   0.04   0.09
18  262144   1.19   0.09   0.09   0.25   0.91   0.09   0.18
19  524288   2.60   0.18   0.18   0.46   1.97   0.18   0.37
20 1048576   5.61   0.37   0.35   1.00   4.26   0.35   0.74

If it weren't for the ~sort column, I'd seriously suggest replacing the
samplesort with this.  2*N extra bytes isn't as bad as it might sound, given
that, in the absence of massive object duplication, each list element
consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for
the list pointer.  Add 'em all up and that's a 13% worst-case temp memory
overhead.
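
Spelled out, the arithmetic behind that figure, assuming the 4-byte
pointers and 12-byte minimum object size given above (the variable names
are only for illustration):

```python
POINTER_BYTES = 4        # one list slot
MIN_OBJECT_BYTES = 12    # type pointer, refcount and value

# Permanent cost per list element: the object plus its slot in the list.
per_element = MIN_OBJECT_BYTES + POINTER_BYTES

# Worst-case temp memory: one pointer for at most half the elements,
# i.e. 2 bytes per element on average.
temp_per_element = POINTER_BYTES / 2

overhead = temp_per_element / per_element    # fraction of the permanent cost
```

That is 2 bytes on top of 16, or 12.5%, which rounds to the 13% quoted.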




From martin@v.loewis.de  Sat Jul 20 09:59:55 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 20 Jul 2002 10:59:55 +0200
Subject: [Python-Dev] Where's time.daylight???
In-Reply-To: <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net>
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net>
 <15672.18628.831787.897474@anthem.wooz.org>
 <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>
 <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>
 <m3it3bnu64.fsf@mira.informatik.hu-berlin.de>
 <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3heiuzsdw.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Can you check that in?  I'm about to disappear to OSCON for a week.

Done. I have no OSF/1 (aka whatever) system, so I can't really test
whether it still helps on these systems.

Regards,
Martin



From jacobs@penguin.theopalgroup.com  Sat Jul 20 12:11:36 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Sat, 20 Jul 2002 07:11:36 -0400 (EDT)
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEDJAGAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207200707180.32211-100000@penguin.theopalgroup.com>

On Sat, 20 Jul 2002, Tim Peters wrote:
> After reading all those papers, I couldn't resist taking a crack at a new
> algorithm that might be practical, and have something you might call a
> non-recursive adaptive stable natural mergesort / binary insertion sort
> hybrid.

Great work, Tim!  I've got several Python implementations of stable-sorts
that I can now retire.  

> If it weren't for the ~sort column, I'd seriously suggest replacing the
> samplesort with this.

If duplicate keys cannot be more efficiently handled, why not add a
list.stable_sort() method?  That way the user gets to decide if they want
the ~sort tax.  If that case is fixed later, then there is little harm in
having list.sort == list.stable_sort.

> 2*N extra bytes isn't as bad as it might sound, given
> that, in the absence of massive object duplication, each list element
> consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for
> the list pointer.  Add 'em all up and that's a 13% worst-case temp memory
> overhead.

It doesn't bother me in the slightest (and I tend to sort big things).
13% is a reasonable trade-off for stability.

Thanks,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com




From pinard@iro.umontreal.ca  Sat Jul 20 13:24:45 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 20 Jul 2002 08:24:45 -0400
Subject: [Python-Dev] Re: Priority queue (binary heap) python code
In-Reply-To: <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>
References: <20020624213318.A5740@arizona.localdomain>
 <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oqofd2wprm.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> Oh, and maybe we can borrow a few lines of François's description of
> the algorithm. :-)

Borrow liberally!  I would prefer that nothing worthwhile remain un-borrowed
from mine, so I can happily get rid of my copy when the time comes! :-)

> I propose to call it heapq.py.  (Got a better name?  Now or never.)

I like `heapq' as it is not an English common name, like `heap' would be,
so less likely to clash with user chosen variable names!  This principle
should be good in general.

Sub-classing `heapq' from `list' is a good idea!

P.S. - In other languages, I have been using `string' a lot, and this has
been one of the minor irritations when I came to Python, that it forced
me away from that identifier; so I'm now using `text' everywhere, instead.
Another example is the name `socket', which is kind of reserved from the
module name, I never really know how to name variables holding sockets :-).

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From ping@zesty.ca  Sat Jul 20 13:32:41 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sat, 20 Jul 2002 05:32:41 -0700 (PDT)
Subject: [Python-Dev] The iterator story
In-Reply-To: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.44.0207200427520.25751-100000@ziggy>

If you only have ten seconds read this:
---------------------------------------

Guido, i believe i understand your position.  My interpretation is:

    I'd like "iterate destructively" and "iterate non-destructively"
    to be spelled differently.  You don't.

    I'd like to be able to establish conventions so that "x in y"
    doesn't destroy y.  This isn't so important to you.

We have a difference of opinion.  I don't think we have a failure in
understanding.  If the opinions won't change, we might as well move on.
I did not mean to waste your time, only to achieve understanding.


Actual reply follows:
---------------------

On Fri, 19 Jul 2002, Guido van Rossum wrote:
> But I note that there are hybrids, and I think files (at least
> seekable files) fall in the hybrid category.

Indeed, files are unusual.  In the particular way that i've chosen
my definitions, though, classification of files is clear: files
are not containers (there's no non-mutating read) and files are
iterators (due to the behaviour of the read() method).

Files aside, i do agree that hybrids exist.  The dbm and tree examples
you gave indeed mix container and iterator behaviour.  I agree with
you that mixing these things isn't usually a good design.

In some cases you do end up providing both container-like and
iterator-like interfaces.  This is fine.  But then when you use the
object, you ought to be able to know which interface you are using.

The argument in the "iterator story" message is that we should have
a way to say "i want to use the non-destructive interface" and a
way to say "i want to use the destructive interface".  Depending what
makes sense, one can choose to implement either interface, or both.

> For example, while a tape file is a
> container in the sense that reading the data doesn't destroy it, it's
> very heavily geared towards sequential access, and you can't
> realistically have two iterators going over the same tape at once.

Indeed, you can't.  But a tape file object is not a container (if we're
using my definition), because the act of reading changes the tape file
object -- it advances the tape.  It's the same as file.read() -- even
though file.read() doesn't mutate the data on the disk, it does mutate
the file object, and that is what makes the file object not a container.

It's precisely because tapes are too slow for practical random access
that we would want a tape file object to provide an iterator-style
interface and not provide a container-style interface.

> If you're too young to remember

Hee hee.  I've used tapes.  I've used *cassette* tapes, even. :)

> >     The issue is, should "for" be non-destructive?
>
> I don't see the benefit.  We've done this for years and the only
> conceptual problem was the abuse of __getitem__, not the
> destructiveness of the for-loop.
[...]
> >     The issue is, should "in" be non-destructive?
>
> If it can't be helped otherwise, sure, why not?

Obviously we see these "problems" differently.  Having "x in y"
possibly destroy y is scary to me, but no big deal to you.  All right.

> >     still produces "KeyError: 0"!  This oughta be fixed...)
>
> Check the CVS logs.  At one point before 2.2 was released, UserDict
> had a __iter__ method.  But then SF bug 448153 was filed, presenting
> evidence that this broke previously working code.  So a separate
> class, IterableUserDict, was added that has the __iter__ method.

Oh.  :(   Okay.  Thanks for explaining.

> There are a lot of objects that
> have a way to return an iterator (old style using fake __getitem__,
> and new ones using __iter__ and next) that are intended to be looped
> over, once.  I have no desire to deprecate this behavior, since (a) it
> would be a major upheaval for the user community (a lot worse than
> integer division), and (b) I don't see that "fixing" this prevents a
> particular category of programming errors.

As you can tell by now, i think it does prevent a certain category
of errors.  The general description is "mixing up mutating and
non-mutating interfaces".  The closest analogy i can think of is
an alternate world in which "+" and "+=" had the same name, and the
only way you could tell if the left operand would get mutated is
by knowing the implementation of the left-hand object at runtime.

Of course, in real Python you have to trust that the implementation
"+" does not mutate.  But at least we are able to set a convention,
because "+" and "+=" are distinct operators.  In the weird alternate
world where "+" and "+=" are both written "+", you would have no
hope of telling the difference.  We'd look at "x + y" and say
"Will x change?  I don't know."

And so it is with "for x in y": we'd look at that and say "Will y
change?  I don't know."  We have no way of telling whether y is a
container or an iterator, thus no way to establish a convention
about what this should do.  "for x in y" is polymorphic on y, but
this is not how i think polymorphism is supposed to work.
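
To make the worry concrete, here is a minimal sketch in present-day Python (the variable names are illustrative only). The same "x in y" spelling leaves a list untouched but consumes part of an iterator:

```python
numbers = [1, 2, 3, 4]

# Membership test on a container: nothing is consumed.
container_hit = 2 in numbers
after_container = list(numbers)

# The same test on an iterator advances it past the match.
one_shot = iter(numbers)
iterator_hit = 2 in one_shot
after_iterator = list(one_shot)

print(container_hit, after_container)  # True [1, 2, 3, 4]
print(iterator_hit, after_iterator)    # True [3, 4]
```

Nothing at the call site says which of the two behaviours will occur.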

You could say you don't care whether y changes.  (Well, you *are*
saying you don't care.)  Well, okay.  I just want to make sure we both
understand each other and see the issue at hand.  If we do, then it
just comes down to a difference of opinion about how significant a
mixup this is, and so be it.

> >     I believe __iter__ is not a type flag.
[...]
> And I never said it was a type flag.  I'm tired of repeating myself,
> but you keep repeating this broken argument, so I have to keep
> correcting you.

I know you didn't say this.  Please don't be offended.  I apologize
if i seemed to be wilfully ignoring you -- you don't have to repeat
things many times in order to "drive home" your position to me.
I was trying to summarize all the positions (not just yours),
organize them, and explain them all at once.


-- ?!ng




From ping@zesty.ca  Sat Jul 20 13:45:48 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sat, 20 Jul 2002 05:45:48 -0700 (PDT)
Subject: [Python-Dev] Re: The iterator story
In-Reply-To: <20020719120043.A21503@glacier.arctrix.com>
Message-ID: <Pine.LNX.4.44.0207200533500.25751-100000@ziggy>

On Fri, 19 Jul 2002, Neil Schemenauer wrote:
> First, people could implement __iter__ such that it returns an iterator
> that mutates the original object (e.g. a file object __iter__ that
> returns xreadlines).

Yes, but then they would be violating the convention.  The way things
currently stand, we aren't even able to say what the convention *is*.

> Second, it will be confusing to have two different ways of looping over
> things.

It's a difference in perspective.  To me it seems confusing to have
only one way of looping that might do two different things.

But Guido basically agrees with you.  (As in, destructive and
non-destructive looping are not really that different; or, they are
different but it's not worth the bother.)

> Now I want to use this library but I have an iterator, not something
> that implements __iter__.  I would need to create a little wrapper with
> a __iter__ method that returns my object.

Yeah, that's seq().

> To summarize, I agree that "for" mutating the object can be surprising.

The rub is, the only way for it to *not* be surprising is to have a
way to *say* "loop destructively".  If you can't express your
expectations, there's no way to meet them.


-- ?!ng




From ping@zesty.ca  Sat Jul 20 13:58:39 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sat, 20 Jul 2002 05:58:39 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBFAGAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207200549090.25751-100000@ziggy>

On Fri, 19 Jul 2002, Tim Peters wrote:
> "for" did and does work in accord with a simple protocol, and whether that's
> "destructive" depends on how the specific objects involved implement their
> pieces of the protocol, not on the protocol itself.  The same is true of all
> of Python's hookable protocols.

Name any protocol for which the question "does this mutate?" has no answer.

(I ask you to accept that __call__ is a special case.)

> What's so special about "for" that it
> should pretend to deliver purely functional behavior in a highly
> non-functional language?

Who said anything about functional behaviour?  I'm not requiring that
looping *never* mutate.  I just want to be able to tell *whether* it will.


-- ?!ng




From oren-py-d@hishome.net  Sat Jul 20 13:58:51 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 20 Jul 2002 08:58:51 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020720125850.GA5862@hishome.net>

> >     Based on Guido's positive response, in which he asked me to make
> >     an addition to the PEP, i believe Guido agrees with me that
> >     __iter__ is distinct from the protocol of an iterator.  This
> >     surprised me because it runs counter to the philosophy previously
> >     expressed in the PEP.
> 
> I recognize that they are separate protocols.  But because I like the
> for-loop as a convenient way to get all of the elements of an
> iterator, I want iterators to support __iter__.

Is this the only reason iterators are required to support __iter__?

It seems like a strange design decision to put the burden on all iterator 
implementers to write a dummy method returning self instead of just checking 
if tp_iter==NULL in PyObject_GetIter. It's like requiring all class writers 
to write a dummy __str__ method that calls __repr__ instead of implementing 
the automatic fallback to __repr__ in PyObject_Str when no __str__ is 
available.
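
For reference, the "dummy method" in question looks like this at the Python level. This is a sketch using the modern __next__ spelling rather than the 2.2-era next(); the Countdown class is a made-up example.

```python
class Countdown:
    """Yield start, start-1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The boilerplate under discussion: an iterator hands back itself
        # so that it can be used directly in a for-loop.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(iter(countdown) is countdown)  # True
print(list(countdown))               # [3, 2, 1]
```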

	Oren




From aahz@pythoncraft.com  Sat Jul 20 14:00:01 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 20 Jul 2002 09:00:01 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEDJAGAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIEDJAGAB.tim.one@comcast.net>
Message-ID: <20020720130000.GA11845@panix.com>

On Sat, Jul 20, 2002, Tim Peters wrote:
>
> If it weren't for the ~sort column, I'd seriously suggest replacing the
> samplesort with this.  2*N extra bytes isn't as bad as it might sound, given
> that, in the absence of massive object duplication, each list element
> consumes at least 12 bytes (type pointer, refcount and value) + 4 bytes for
> the list pointer.  Add 'em all up and that's a 13% worst-case temp memory
> overhead.

Any reason the list object can't grow a .stablesort() method?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From oren-py-d@hishome.net  Sat Jul 20 14:28:57 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 20 Jul 2002 09:28:57 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020719162226.A22929@glacier.arctrix.com>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com>
Message-ID: <20020720132857.GB5862@hishome.net>

On Fri, Jul 19, 2002 at 04:22:26PM -0700, Neil Schemenauer wrote:
> Neal Norwitz wrote:
> > In what context?  Were you iterating over a file or something else?
> > I'm wondering if this is a problem, perhaps pychecker could generate
> > a warning?
> 
> I was switching between implementing something as a generator and
> returning a list.  I was curious why I was getting different behavior
> until I realized I was iterating over the result twice.  I don't
> think pychecker could warn about such a bug.

That's the scenario that bit me too.  For me it was a little more difficult
to find because it was wrapped in a few layers of chained transformations.
I can't tell by the last element in the chain whether the first one is 
re-iterable or not. 
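
A small sketch of the kind of bug described above, in present-day Python. The function names are hypothetical; the point is only that the consumer makes two passes over its argument.

```python
def evens_as_list(n):
    return [x for x in range(n) if x % 2 == 0]


def evens_as_generator(n):
    for x in range(n):
        if x % 2 == 0:
            yield x


def total_and_count(values):
    total = sum(values)          # first pass
    count = len(list(values))    # second pass: empty if values was one-shot
    return total, count


print(total_and_count(evens_as_list(10)))       # (20, 5)
print(total_and_count(evens_as_generator(10)))  # (20, 0)
```

No error is raised in the second case; the count is just silently wrong.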

One approach to solve this is Ka-Ping Yee's proposal to specify in advance
whether you are expecting an iterator or a re-iterable container using
either 'for x in y' or 'for x from y'.  I don't think this will work.
There's already too much code that uses for x in y where y is an iterator.
Another problem is that a transformation shouldn't care whether its upstream 
source is an iterator or an iterable - it's a generic reusable building 
block.

My suggestion (which was rejected by Guido) was to raise an error when an
iterator's .next() method is called after it raises StopIteration.  This
way, if I try to iterate over the result again at least I'll get an error
like "IteratorExhaustedError" instead of something that is indistinguishable
from an iterator of an empty container. I hate silent errors.

This shouldn't be required from all iterator implementers but if all
built-in iterators supported this (especially generators) it would help a
lot to find such errors.
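
The idea can be approximated today with a wrapper. What follows is a sketch only, written with the modern __next__ spelling; StrictIterator and double are hypothetical names, and IteratorExhaustedError is not a real built-in exception.

```python
class IteratorExhaustedError(Exception):
    """Raised when next() is called again after StopIteration."""


class StrictIterator:
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise IteratorExhaustedError("iterator already exhausted")
        try:
            return next(self._it)
        except StopIteration:
            self._done = True   # report StopIteration exactly once
            raise


def double(source):
    # A "transformation": takes an iterable, returns an iterator.
    for x in source:
        yield 2 * x


source = StrictIterator(x for x in range(3))
first_pass = list(double(source))
try:
    list(double(source))
    second_pass_failed = False
except IteratorExhaustedError:
    second_pass_failed = True

print(first_pass, second_pass_failed)  # [0, 2, 4] True
```

The transformation itself needs no changes; the error from the source propagates through the generator to whoever drives the second pass.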

	Oren

P.S. My definition of a transformation is a function taking one iterable
argument and returning an iterator.  It is usually implemented as a
generator function. 




From oren-py-d@hishome.net  Sat Jul 20 14:39:26 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sat, 20 Jul 2002 09:39:26 -0400
Subject: [Python-Dev] Re: The iterator story
In-Reply-To: <Pine.LNX.4.44.0207200533500.25751-100000@ziggy>
References: <20020719120043.A21503@glacier.arctrix.com> <Pine.LNX.4.44.0207200533500.25751-100000@ziggy>
Message-ID: <20020720133926.GC5862@hishome.net>

On Sat, Jul 20, 2002 at 05:45:48AM -0700, Ka-Ping Yee wrote:
> > To summarize, I agree that "for" mutating the object can be surprising.
> 
> The rub is, the only way for it to *not* be surprising is to have a
> way to *say* "loop destructively".  If you can't express your
> expectations, there's no way to meet them.

It doesn't seem very useful to say "loop destructively" - in these cases I
don't usually care whether it's destructive or not.  It is useful, though, 
to be able to say "loop INdestructively".  

That's how I do it:

def reiter(obj):
    """Return an object's iterator; raise an exception if the object
    does not appear to support multiple iterations."""
    assert not isinstance(obj, file)
    itr = iter(obj)
    assert itr is not obj
    return itr
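
For comparison, here is a sketch of the same check in present-day Python, where there is no `file` builtin and file objects return themselves from iter(), so the identity test alone covers them. Raising TypeError instead of using assert is my choice here, not part of the original.

```python
def reiter(obj):
    """Return a fresh iterator over obj, refusing one-shot iterators."""
    itr = iter(obj)
    if itr is obj:
        # An object that is its own iterator can only be walked once.
        raise TypeError("expected a re-iterable container, got an iterator")
    return itr


print(list(reiter([1, 2, 3])))  # [1, 2, 3]

try:
    reiter(iter([1, 2, 3]))
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

This is still a heuristic: an object can return a new iterator each time and yet mutate shared state underneath.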

	Oren



From aahz@pythoncraft.com  Sat Jul 20 15:09:23 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sat, 20 Jul 2002 10:09:23 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020720132857.GB5862@hishome.net>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net>
Message-ID: <20020720140923.GA18716@panix.com>

On Sat, Jul 20, 2002, Oren Tirosh wrote:
> On Fri, Jul 19, 2002 at 04:22:26PM -0700, Neil Schemenauer wrote:
>> Neal Norwitz wrote:
>>>
>>> In what context?  Were you iterating over a file or something else?
>>> I'm wondering if this is a problem, perhaps pychecker could generate
>>> a warning?
>> 
>> I was switching between implementing something as a generator and
>> returning a list.  I was curious why I was getting different behavior
>> until I realized I was iterating over the result twice.  I don't
>> think pychecker could warn about such a bug.
> 
> That's the scenario that bit me too.  For me it was a little more difficult
> to find because it was wrapped in a few layers of chained transformations.
> I can't tell by the last element in the chain whether the first one is 
> re-iterable or not. 
> 
> My suggestion (which was rejected by Guido) was to raise an error when an
> iterator's .next() method is called after it raises StopIteration.  This
> way, if I try to iterate over the result again at least I'll get an error
> like "IteratorExhaustedError" instead of something that is indistinguishable
> from an iterator of an empty container. I hate silent errors.

I'm still not understanding how this would help.  When a chainable
transformer gets StopIteration, it should immediately return.  What else
do you want to do?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From guido@python.org  Sat Jul 20 15:10:57 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 20 Jul 2002 10:10:57 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: Your message of "Sat, 20 Jul 2002 05:32:41 PDT."
 <Pine.LNX.4.44.0207200427520.25751-100000@ziggy>
References: <Pine.LNX.4.44.0207200427520.25751-100000@ziggy>
Message-ID: <200207201410.g6KEAvY29349@pcp02138704pcs.reston01.va.comcast.net>

> If you only have ten seconds read this:
> ---------------------------------------
> 
> Guido, i believe i understand your position.  My interpretation is:
> 
>     I'd like "iterate destructively" and "iterate non-destructively"
>     to be spelled differently.  You don't.
> 
>     I'd like to be able to establish conventions so that "x in y"
>     doesn't destroy y.  This isn't so important to you.
> 
> We have a difference of opinion.  I don't think we have a failure in
> understanding.  If the opinions won't change, we might as well move on.
> I did not mean to waste your time, only to achieve understanding.

Aye, aye, Sir.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 20 15:13:34 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 20 Jul 2002 10:13:34 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: Your message of "Sat, 20 Jul 2002 08:58:51 EDT."
 <20020720125850.GA5862@hishome.net>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>
 <20020720125850.GA5862@hishome.net>
Message-ID: <200207201413.g6KEDYh29370@pcp02138704pcs.reston01.va.comcast.net>

> > >     Based on Guido's positive response, in which he asked me to make
> > >     an addition to the PEP, i believe Guido agrees with me that
> > >     __iter__ is distinct from the protocol of an iterator.  This
> > >     surprised me because it runs counter to the philosophy previously
> > >     expressed in the PEP.
> > 
> > I recognize that they are separate protocols.  But because I like the
> > for-loop as a convenient way to get all of the elements of an
> > iterator, I want iterators to support __iter__.
> 
> Is this the only reason iterators are required to support __iter__?

Yes.

> It seems like a strange design decision to put the burden on all iterator 
> implementers to write a dummy method returning self instead of just checking 
> if tp_iter==NULL in PyObject_GetIter. It's like requiring all class writers 
> to write a dummy __str__ method that calls __repr__ instead of implementing 
> the automatic fallback to __repr__ in PyObject_Str when no __str__ is 
> available.

I suppose you meant "check for tp_iter==NULL and tp_iternext!=NULL".

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cce@clarkevans.com  Sat Jul 20 17:21:01 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Sat, 20 Jul 2002 12:21:01 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Fri, Jul 19, 2002 at 05:10:45PM -0400
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020720122101.A38901@doublegemini.com>

On Fri, Jul 19, 2002 at 05:10:45PM -0400, Guido van Rossum wrote:
| > The __iter__-On-Iterators Issue:
| > 
| >     Some people have mentioned that the presence of an __iter__()
| >     method is a way of signifying that an object supports the
| >     iterator protocol.  It has been said that this is necessary
| >     because the presence of a "next()" method is not sufficiently
| >     distinguishing.
| 
| Not me.

As I remember the debate last year, Ping is expressing the 
consensus which was reached.  This issue was tied directly, although
not so articulately, to the namespace collision issue.  I remember
being concerned about next() not having leading and trailing __ but
my concerns were put to rest knowing that every iterator had to have
a __iter__ such that __iter__ returned self.   I wasn't on the list
for that long due to time constraints, but this linkage was there
at least for me.

| >     The iteration method is currently called "next()".
| > 
| >     Previous candidates for the name of this method were "next",
| >     "__next__", and "__call__".  After some previous debate,
| >     it was pronounced to be "next()".
| > 
| >     There are concerns that "next()" might collide with existing
| >     methods named "next()".  There is also a concern that "next()"
| >     is inconsistent because it is the only type-slot-method that
| >     does not have a __special__ name.
| > 
| >     The issue is, should it be called "next" or "__next__"?
| 
| That's a separate issue, and cleans up only a small wart that in
| practice hasn't hurt anybody AFAIK.

Today/tomorrow I'll finish piecing together the survey so 
that it clearly articulates the issue (and I'll be sure to
note that you are against the idea).

Best,

Clark

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software



From neal@metaslash.com  Sat Jul 20 17:52:49 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sat, 20 Jul 2002 12:52:49 -0400
Subject: [Python-Dev] Where's time.daylight???
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net>
 <15672.18628.831787.897474@anthem.wooz.org>
 <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>
 <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>
 <m3it3bnu64.fsf@mira.informatik.hu-berlin.de>
 <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net> <m3heiuzsdw.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3D399561.A474C77A@metaslash.com>

"Martin v. Loewis" wrote:

> > Can you check that in?  I'm about to disappear to OSCON for a week.
> 
> Done. I have no OSF/1 (aka whatever) system, so I can't really test
> whether it still helps on these systems.

It doesn't work on dec^w alpha^w compaq ... 

I've got an autoconf patch which works on Linux & OSF:
	http://python.org/sf/584245

There are some test failures I will look at later:

test test_dl crashed -- exceptions.SystemError: module dl requires sizeof(int) == sizeof(long) == sizeof(char*)

test test_nis crashed -- exceptions.SystemError: error return without exception set

test_pwd may have hung which is the last test run

Neal



From tim.one@comcast.net  Sun Jul 21 04:26:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 20 Jul 2002 23:26:44 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEDJAGAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFFAGAB.tim.one@comcast.net>

Quick update.  I left off here:

samplesort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.13   0.01   0.01   0.10   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.23   0.08   0.02   0.24
17  131072   0.54   0.05   0.04   0.49   0.18   0.04   0.53
18  262144   1.18   0.09   0.09   1.08   0.37   0.09   1.16
19  524288   2.58   0.19   0.18   2.34   0.76   0.17   2.52
20 1048576   5.58   0.37   0.36   5.12   1.54   0.35   5.46

timsort
15   32768   0.16   0.01   0.02   0.05   0.14   0.01   0.02
16   65536   0.24   0.02   0.02   0.06   0.19   0.02   0.04
17  131072   0.55   0.04   0.04   0.13   0.42   0.04   0.09
18  262144   1.19   0.09   0.09   0.25   0.91   0.09   0.18
19  524288   2.60   0.18   0.18   0.46   1.97   0.18   0.37
20 1048576   5.61   0.37   0.35   1.00   4.26   0.35   0.74

With a lot of complication (albeit principled complication), timsort now
looks like

15   32768   0.14   0.01   0.01   0.04   0.10   0.01   0.02
16   65536   0.24   0.02   0.02   0.05   0.17   0.02   0.04
17  131072   0.54   0.05   0.04   0.13   0.38   0.04   0.09
18  262144   1.18   0.09   0.09   0.24   0.81   0.09   0.18
19  524288   2.57   0.18   0.18   0.46   1.77   0.18   0.37
20 1048576   5.55   0.37   0.35   0.99   3.81   0.35   0.74

on the same data (tiny improvements in *sort and 3sort, significant
improvement in ~sort, huge improvements for some patterns that aren't
touched by this test).

For contrast and a sanity check, I also implemented Edelkamp and Stiegeler's
"Next-to-m" refinement of weak heapsort.  If you know what heapsort is, this
is weaker <wink>.  In the last decade, Dutton had the bright idea that a
heap is stronger than you need for sorting:  it's enough if you know only
that a parent node's value dominates the right child's values, and then
ensure that the root node has no left child.  That implies the root node has
the maximum value in the (weak) heap.  It doesn't matter what's in the left
child for the other nodes, provided only that they're weak heaps too.  The
weaker requirements allow faster (but trickier) code for maintaining the
weak-heap invariant as sorting proceeds, and in particular it requires far
fewer element comparisons than a (strong) heapsort.  Edelkamp and Stiegeler
complicated this algorithm in several ways to cut the comparisons even more.
I stopped at their first refinement, which does a worst-case number of
comparisons

    N*k - 2**k + N - 2*k

where

    k = ceiling(logbase2(N))

so that even the worst case is very good.  They have other gimmicks to cut
it more (we're close to the theoretical limit here, so don't read too much
into "more"!), but the first refinement proved so far from being promising
that I dropped it:

weakheapsort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.19   0.12   0.11   0.11   0.11   0.11   0.12
16   65536   0.31   0.26   0.23   0.23   0.24   0.23   0.26
17  131072   0.71   0.55   0.49   0.49   0.51   0.48   0.56
18  262144   1.59   1.15   1.03   1.04   1.08   1.02   1.19
19  524288   3.57   2.43   2.18   2.18   2.27   2.14   2.51
20 1048576   8.01   5.08   4.57   4.58   4.77   4.50   5.29

The number of compares isn't the problem with this.  The problem appears to
be heapsort's poor cache behavior, leaping around via multiplying and
dividing indices by 2.  This is exacerbated in weak heapsort because it also
requires allocating a bit vector, to attach a "which of my children should I
think of as being 'the right child'?" flag to each element, and that also
gets accessed in the same kinds of cache-hostile ways at the same time.

The samplesort and mergesort variants access memory sequentially.

What I haven't accounted for is why weakheapsort appears to get a major
benefit from *any* kind of regularity in the input -- *sort is always the
worst case on each line, and by far (note that this implementation does no
special-casing of any kind, so it must be an emergent property of the core
algorithm).  If I were a researcher, I bet I could get a good paper out of
that <wink>.
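
To put a number on "close to the theoretical limit", here is a sketch that evaluates the worst-case formula exactly as quoted above and compares it with the information-theoretic lower bound log2(N!). The function names are mine.

```python
import math


def weak_heapsort_worst_case(n):
    """N*k - 2**k + N - 2*k with k = ceiling(log2(N)), as quoted above."""
    k = (n - 1).bit_length()   # exact integer ceiling(log2(n)) for n >= 1
    return n * k - 2**k + n - 2 * k


def comparison_lower_bound(n):
    """log2(n!), the minimum worst-case compares for any comparison sort."""
    return math.lgamma(n + 1) / math.log(2)


n = 2**20
worst = weak_heapsort_worst_case(n)
bound = comparison_lower_bound(n)
print(worst)   # 20971480
print(f"ratio to lower bound: {worst / bound:.3f}")
```

For N = 2**20 the ratio should come out at roughly 1.08, so the compare count really is not where the time goes.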




From tim.one@comcast.net  Sun Jul 21 06:19:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 01:19:03 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <20020720130000.GA11845@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFIAGAB.tim.one@comcast.net>

[Aahz]
> Any reason the list object can't grow a .stablesort() method?

I'm not sure.  Python's samplesort implementation is right up there among
the most complicated (by any measure) algorithms in the code base, and the
mergesort isn't any simpler anymore.  Yet another large mass of difficult
code can make for a real maintenance burden after I'm dead.  Here, guess
what this does:

static int
gallop_left(PyObject *pivot, PyObject** p, int n, PyObject *compare)
{
	int k;
	int lo, hi;
	PyObject **pend;

	assert(pivot && p && n);
	pend = p+(n-1);
	lo = 0;
	hi = -1;
	for (;;) {
		IFLT(*(pend - lo), pivot)
			break;
		hi = lo;
		lo = (lo << 1) + 1;
		if (lo >= n) {
			lo = n;
			break;
		}
	}
	lo = n - lo;
	hi = n-1 - hi;
	while (lo < hi) {
		int m = (lo + hi) >> 1;
		IFLT(p[m], pivot)
			lo = m+1;
		else
			hi = m;
	}
	return lo;
fail:
	return -1;
}

There are 12 other functions that go into this, some less obscure, some
more.  Change "hi = -1" to "hi = 0" and you'll get a core dump, etc; it's
exceedingly delicate, and because truly understanding it essentially
requires doing a formal correctness proof, it's difficult to maintain; fight
your way to that understanding, and you'll know why it sorts, but still
won't have a clue about why it's so fast.  I'm disinclined to add more code
of this nature unless I can use it to replace code at least as difficult
(which samplesort is).
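
For anyone playing the guessing game: below is a line-by-line Python transliteration, offered as a sketch only. It assumes IFLT(a, b) means "if a < b" (jumping to fail on a comparison error), uses plain < in place of the compare argument, and drops the error path. Under those assumptions, on a sorted slice it gallops in from the right end and then bisects, returning the index of the first element not less than the pivot, which is what bisect.bisect_left returns.

```python
import bisect
import random


def gallop_left(pivot, p):
    n = len(p)
    assert n > 0
    lo, hi = 0, -1
    # Gallop from the right end at offsets 0, 1, 3, 7, ...
    while True:
        if p[n - 1 - lo] < pivot:
            break
        hi = lo
        lo = (lo << 1) + 1
        if lo >= n:
            lo = n
            break
    # Convert offsets-from-the-end into ordinary indices.
    lo = n - lo
    hi = n - 1 - hi
    # Plain binary search in the bracketed range.
    while lo < hi:
        m = (lo + hi) >> 1
        if p[m] < pivot:
            lo = m + 1
        else:
            hi = m
    return lo


random.seed(0)
for _ in range(1000):
    data = sorted(random.randrange(20) for _ in range(random.randrange(1, 40)))
    pivot = random.randrange(-1, 21)
    assert gallop_left(pivot, data) == bisect.bisect_left(data, pivot)
print("matches bisect_left on all trials")
```

Note how the initial hi = -1 turns into index n after the conversion, which covers the case where every element is less than the pivot.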

An irony is that stable sorts are, by definition, pointless unless you *do*
have equal elements, and the many-equal-elements case is the one known case
where the new algorithm is much slower than the current one (indeed, I have
good reason to suspect it's the only such case, and reasons beyond just that
God loves a good joke <wink>).

It's OK by me if this were to become Python's only sort.  Short of that, I'd
be happier contributing the code to a sorting extension module.  There are
other reasons the latter may be a good idea; e.g., if you know you're
sorting C longs, it's not particularly difficult to do that 10x faster than
Python's generic list.sort() can do it; ditto if you know you're comparing
strings; etc.  Exposing the binary insertion sort (which both samplesort and
mergesort use) would also be useful to some people (it's a richer variant of
bisect.insort_right).  I'd prefer that Python-the-language have just one
"really good general sort" built in.




From oren-py-d@hishome.net  Sun Jul 21 06:33:40 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 21 Jul 2002 08:33:40 +0300
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020720140923.GA18716@panix.com>; from aahz@pythoncraft.com on Sat, Jul 20, 2002 at 10:09:23AM -0400
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> <20020720140923.GA18716@panix.com>
Message-ID: <20020721083340.A13156@hishome.net>

On Sat, Jul 20, 2002 at 10:09:23AM -0400, Aahz wrote:
> > That's the scenario that bit me too.  For me it was a little more difficult
> > to find because it was wrapped in a few layers of chained transformations.
> > I can't tell by the last element in the chain whether the first one is 
> > re-iterable or not. 
> > 
> > My suggestion (which was rejected by Guido) was to raise an error when an
> > iterator's .next() method is called after it raises StopIteration.  This
> > way, if I try to iterate over the result again at least I'll get an error
> > like "IteratorExhaustedError" instead of something that is indistinguishable
> > from an iterator of an empty container. I hate silent errors.
> 
> I'm still not understanding how this would help.  When a chainable
> transformer gets StopIteration, it should immediately return.  What else
> do you want to do?

The transformations are fine the way they are.  The problem is the source - if
the source is an exhausted iterator and you ask it for a new iterator it will
happily return itself and report StopIteration on each .next(). This behavior
is indistinguishable from a valid iterator on an empty container.

What I would like is for iterators to return StopIteration exactly once and 
then switch to a different exception.  This way the transformations will not 
need to care whether their upstream source is restartable or not - the 
exception will propagate through the entire chain and notify the consumer at 
the end of the chain that the source at the beginning of the chain is not 
re-iterable.
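
A minimal sketch of the behaviour asked for above, written as a wrapper in
present-day Python 3 syntax (`__next__` rather than the `.next()` of this
thread).  The names `StrictIterator` and `IteratorExhaustedError` are
hypothetical; nothing like this exists in the builtin iterators.

```python
class IteratorExhaustedError(Exception):
    """Hypothetical exception, named after the one floated in the thread."""


class StrictIterator:
    """Wrap an iterable; signal StopIteration once, then complain loudly."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            # Second and later attempts are an error, not an empty result.
            raise IteratorExhaustedError("iterator already exhausted")
        try:
            return next(self._it)
        except StopIteration:
            self._done = True
            raise


source = StrictIterator([1, 2])
first_pass = list(source)          # normal iteration

try:
    list(source)                   # re-iterating the exhausted source
    second_pass_failed = False
except IteratorExhaustedError:
    second_pass_failed = True
```

With a wrapper like this the second pass fails visibly instead of looking
like an empty container.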

I'm not suggesting that all iterator implementers must do this - having it on
just the builtin iterators will be a great help.

Right now I am using tricks like special-casing files and checking if 
iter(x) is x.  It works but I hate it.
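
The `iter(x) is x` trick can be sketched as follows, again in present-day
Python 3.  The helper name `is_one_shot` is made up for illustration; the
check only detects objects that are their own iterator, which is the usual
sign that they can be walked once.

```python
def is_one_shot(x):
    """Return True if x is its own iterator (so it can only be walked once)."""
    return iter(x) is x


data = [1, 2, 3]
it = iter(data)

first_pass = list(it)         # consumes the iterator
second_pass = list(it)        # exhausted: silently yields nothing
empty_pass = list(iter([]))   # a genuinely empty container looks the same

list_is_one_shot = is_one_shot(data)
iterator_is_one_shot = is_one_shot(it)
```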

	Oren




From oren-py-d@hishome.net  Sun Jul 21 06:40:14 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Sun, 21 Jul 2002 08:40:14 +0300
Subject: [Python-Dev] The iterator story
In-Reply-To: <200207201413.g6KEDYh29370@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jul 20, 2002 at 10:13:34AM -0400
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <200207192110.g6JLAjU15146@pcp02138704pcs.reston01.va.comcast.net> <20020720125850.GA5862@hishome.net> <200207201413.g6KEDYh29370@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020721084014.A13189@hishome.net>

On Sat, Jul 20, 2002 at 10:13:34AM -0400, Guido van Rossum wrote:
> > It seems like a strange design decision to put the burden on all iterator 
> > implementers to write a dummy method returning self instead of just checking 
> > if tp_iter==NULL in PyObject_GetIter. It's like requiring all class writers 
> > to write a dummy __str__ method that calls __repr__ instead of implementing 
> > the automatic fallback to __repr__ in PyObject_Str when no __str__ is 
> > available.
> 
> I suppose you meant "check for tp_iter==NULL and tp_iternext!=NULL".

Yes.

Any comments on my analogy of __iter__/next with __str__/__repr__ and the
burden of implementation?

	Oren




From tim.one@comcast.net  Sun Jul 21 06:38:17 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 01:38:17 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207200549090.25751-100000@ziggy>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFJAGAB.tim.one@comcast.net>

[Ping]
> Name any protocol for which the question "does this mutate?" has
> no answer.

Heh -- you must not use Zope much <0.6 wink>.  I'm hard pressed to think of
a protocol where that does have a reliable answer.  Here:

    x1 = y.z
    x2 = y.z

Are x1 and x2 the same object after that?  At least equal?  Did either line
mutate y?  You simply can't know without knowing how y's type implements
__getattr__, and with the introduction of computed attributes (properties)
it's just going to get muddier.

> (I ask you to accept that __call__ is a special case.)

It's not to me -- if a protocol invokes user-defined Python code, there's
nothing you can say about mutability "in general", and people do both use
and abuse that.

>> What's so special about "for" that it should pretend to deliver
>> purely functional behavior in a highly non-functional language?

> Who said anything about functional behaviour?  I'm not requiring that
> looping *never* mutate.  I just want to be able to tell *whether* it
> will.

I don't blame you, and sometimes I'd like to know whether y.z (or "y += z",
etc) mutates y too.  It cuts deeper than loops, so a loop-focused gimmick
seems inadequate to me (provided "something needs to be done about it" at
all -- I'm not sure, but doubt it).




From tim.one@comcast.net  Sun Jul 21 06:55:00 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 01:55:00 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEFLAGAB.tim.one@comcast.net>

[Jack Jansen]
> Oh, it's MarkH appreciation that's wanted! In that case I'll
> gladly chime in, I was afraid it was __declspec(dllexport)
> appreciation. Mark is one cool dude who knows where his towel is!
>
> 199998 to go. Should we start taking a poll who'll be the next
> python-devver we start appreciating when the counter hits zero?

It would have been you, Jack, except Mark was much cleverer about this.  You
make the Mac support so invisible to the rest of us that the only thing we
can ever thank you for is stopping refcount abuse of immortal strings.  Mark
put some sort of Windows gimmick on 79% of the lines in the whole code base,
thus ensuring a never-ending supply of reasons to thank him for getting rid
of it one line at a time <wink>.

i-demand-that-everyone-appreciate-jack-more-too-ly y'rs  - tim




From smurf@noris.de  Sun Jul 21 09:29:30 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Sun, 21 Jul 2002 10:29:30 +0200
Subject: [Python-Dev] Priority queue (binary heap) python code
Message-ID: <p05111709b9601ff9612b@[10.2.6.42]>

Oren Tirosh <oren-py-d@hishome.net>:
>  When I want to sort a list I just use .sort(). I don't care which algorithm
>  is used.

The point in this discussion, though, is that frequently you don't 
need a sorted list. You just need a list which yields all elements in 
order when you pop them.

Heaps are a nice low-overhead implementation of that idea, and 
therefore should be in the standard library.
-- 
Matthias Urlichs



From pinard@iro.umontreal.ca  Sun Jul 21 11:26:55 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 21 Jul 2002 06:26:55 -0400
Subject: [Python-Dev] Re: Priority queue (binary heap) python code
In-Reply-To: <p05111709b9601ff9612b@[10.2.6.42]>
References: <p05111709b9601ff9612b@[10.2.6.42]>
Message-ID: <oqy9c5s7f4.fsf@titan.progiciels-bpi.ca>

[Matthias Urlichs]

> Oren Tirosh <oren-py-d@hishome.net>:

> >  When I want to sort a list I just use .sort(). I don't care which
> >  algorithm is used.

> The point in this discussion, though, is that frequently you don't need
> a sorted list. You just need a list which yields all elements in order
> when you pop them.  Heaps are a nice low-overhead implementation of that
> idea, and therefore should be in the standard library.

This is especially true when you need only the first few elements from the
sorted set, which is a pretty common case in practice.  A blind sort is
not always the optimal solution, when you want to spare some CPU time.
A caricatural example of abuse would be to implement `max' as `sort'
followed by peeking at the last element of the result.

Heaps are also an efficient enough representation if you insert while
sorting, as often happens in simulations.  Someone I know studied
this intensely, and came up with algorithms that did better on average on
his reference benchmark, but with much worse worst cases -- so it depends on
the characteristics of the simulation.  Heaps do quite well on average,
and do acceptably well also in their worst cases.
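
Both uses can be sketched with the `heapq` module as it exists in today's
standard library (where heap[0] is the smallest item).  That is an
assumption of this sketch, not necessarily the code under discussion in
this thread, and the sample data is invented.

```python
import heapq

jobs = [7, 2, 9, 4, 1, 8]

# Only the three smallest are wanted, so no full sort is needed.
smallest_three = heapq.nsmallest(3, jobs)

# Insert while draining, as a simulation's event queue would.
heap = list(jobs)
heapq.heapify(heap)            # heap[0] is now the smallest item
first = heapq.heappop(heap)    # take the smallest
heapq.heappush(heap, 3)        # a new item arrives mid-run
drained = [heapq.heappop(heap) for _ in range(len(heap))]
```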

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From aahz@pythoncraft.com  Sun Jul 21 14:25:50 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 21 Jul 2002 09:25:50 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEFLAGAB.tim.one@comcast.net>
References: <554A408C-9B5F-11D6-9B6B-003065517236@oratrix.com> <LNBBLJKPBEHFEDALKOLCKEFLAGAB.tim.one@comcast.net>
Message-ID: <20020721132550.GC25525@panix.com>

On Sun, Jul 21, 2002, Tim Peters wrote:
>
> i-demand-that-everyone-appreciate-jack-more-too-ly y'rs  - tim

My iBook and OSCON class members thank Jack.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From ping@zesty.ca  Sun Jul 21 14:51:30 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sun, 21 Jul 2002 06:51:30 -0700 (PDT)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEFJAGAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207210618080.1261-100000@ziggy>

On Sun, 21 Jul 2002, Tim Peters wrote:
>     x1 = y.z
>     x2 = y.z
>
> Are x1 and x2 the same object after that?  At least equal?  Did either line
> mutate y?  You simply can't know without knowing how y's type implements
> __getattr__, and with the introduction of computed attributes (properties)
> it's just going to get muddier.

That's not the point.  You could claim that *any* polymorphism in
Python is useless by the same argument.  But Python is not useless;
Python code really is reusable; and that's because there are good
conventions about what the behaviour *should* be.  People who do
really find this upsetting should go use a strongly-typed language.

In general, getting "y.z" should be idempotent, and should not mutate y.
I think everyone would agree on the concept.  If it does mutate y with
visible effects, then the implementor is breaking the convention.

Sure, Python won't prevent you from writing a file-like class where you
write the string "blah" to the file by fetching f.blah and you close the
file by mentioning f[42].  But when users of this class then come running
after you with pointed sticks, i'm not going to fight them off. :)

This is a list of all the type slots accessible from Python, before
iterators (i.e. pre-2.2).  Beside each is the answer to the question:

    Suppose you look at the value of x, then do this operation to x,
    then look at the value of x.  Should we expect the two observed
    values to be the same or different?

    nb_add                              same
    nb_subtract                         same
    nb_multiply                         same
    nb_divide                           same
    nb_remainder                        same
    nb_divmod                           same
    nb_power                            same
    nb_negative                         same
    nb_positive                         same
    nb_absolute                         same
    nb_nonzero                          same
    nb_invert                           same
    nb_lshift                           same
    nb_rshift                           same
    nb_and                              same
    nb_xor                              same
    nb_or                               same
    nb_coerce                           same
    nb_int                              same
    nb_long                             same
    nb_float                            same
    nb_oct                              same
    nb_hex                              same
    nb_inplace_add                      different
    nb_inplace_subtract                 different
    nb_inplace_multiply                 different
    nb_inplace_divide                   different
    nb_inplace_remainder                different
    nb_inplace_power                    different
    nb_inplace_lshift                   different
    nb_inplace_rshift                   different
    nb_inplace_and                      different
    nb_inplace_xor                      different
    nb_inplace_or                       different
    nb_floor_divide                     same
    nb_true_divide                      same
    nb_inplace_floor_divide             different
    nb_inplace_true_divide              different

    sq_length                           same
    sq_concat                           same
    sq_repeat                           same
    sq_item                             same
    sq_slice                            same
    sq_ass_item                         different
    sq_ass_slice                        different
    sq_contains                         same
    sq_inplace_concat                   different
    sq_inplace_repeat                   different

    mp_length                           same
    mp_subscript                        same
    mp_ass_subscript                    different

    bf_getreadbuffer                    same
    bf_getwritebuffer                   same
    bf_getsegcount                      same
    bf_getcharbuffer                    same

    tp_print                            same
    tp_getattr                          same
    tp_setattr                          different
    tp_compare                          same
    tp_repr                             same
    tp_hash                             same
    tp_call                             ?
    tp_str                              same
    tp_getattro                         same
    tp_setattro                         different

In every case except for __call__, there exists a canonical answer.
We all rely on these conventions every time we write a Python program.
And learning these conventions is a necessary part of learning Python.

You can argue, as Guido has, that in the particular case of for-loops
distinguishing between mutating and non-mutating behaviour is not worth
the trouble.  But you can't say that we should give up on the whole
concept *in general*.


-- ?!ng




From aahz@pythoncraft.com  Sun Jul 21 15:41:08 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 21 Jul 2002 10:41:08 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020721083340.A13156@hishome.net>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> <20020720140923.GA18716@panix.com> <20020721083340.A13156@hishome.net>
Message-ID: <20020721144108.GA5608@panix.com>

On Sun, Jul 21, 2002, Oren Tirosh wrote:
> On Sat, Jul 20, 2002 at 10:09:23AM -0400, Aahz wrote:
>>Oren:
>>>
>>> That's the scenario that bit me too.  For me it was a little more
>>> difficult to find because it was wrapped in a few layers of chained
>>> transformations.  I can't tell by the last element in the chain
>>> whether the first one is re-iterable or not.
>>>
>>> My suggestion (which was rejected by Guido) was to raise an
>>> error when an iterator's .next() method is called after it raises
>>> StopIteration.  This way, if I try to iterate over the result again
>>> at least I'll get an error like "IteratorExhaustedError" instead of
>>> something that is indistinguishable from an iterator of an empty
>>> container. I hate silent errors.
>>
>> I'm still not understanding how this would help.  When a chainable
>> transformer gets StopIteration, it should immediately return.  What
>> else do you want to do?
>
> The transformations are fine the way they are.  The problem is the
> source - if the source is an exhausted iterator and you ask it for a
> new iterator it will happily return itself and report StopIteration
> on each .next(). This behavior is indistinguishable from a valid
> iterator on an empty container.

So the problem lies in asking the source for a new iterator, not in
trying to use it.  Making the iterator consumer responsible for handling
this seems like the wrong approach to me -- the consumer *shouldn't* be
able to tell the difference.  If you're breaking that paradigm, you
don't actually have an iterator consumer, you've got something else that
wants to use the iterator interface, *plus* some additional features.

The way Python normally handles issues like this is through
documentation.  (I.e., if your consumer requires an iterable capable of
producing multiple iterators rather than an iterator object, you document
that.)

> Right now I am using tricks like special-casing files and checking if
> iter(x) is x.  It works but I hate it.

You need to write your own wrapper or change the way your consumer works.
Special-casing files inside your consumer is a Bad Idea.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/



From tim.one@comcast.net  Sun Jul 21 21:14:43 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 16:14:43 -0400
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207210618080.1261-100000@ziggy>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHAAGAB.tim.one@comcast.net>

[Ping]
> Name any protocol for which the question "does this mutate?" has
> no answer.

[Tim]
>> Heh -- you must not use Zope much <0.6 wink>.  I'm hard pressed to
>> think of a protocol where that does have a reliable answer.  Here:
>>
>>     x1 = y.z
>>     x2 = y.z
>>
>> Are x1 and x2 the same object after that?  At least equal?  Did
>> either line mutate y?  You simply can't know without knowing how y's
>> type implements __getattr__, and with the introduction of computed
>> attributes (properties) it's just going to get muddier.

[Ping]
> That's not the point.

It answered the question you asked.

> You could claim that *any* polymorphism in Python is useless by the
> same argument.

It's your position that "for" is semi-useless because of the possibility for
mutation.  That isn't my position, and that some people write mutating
__getattr__ (etc) doesn't make y.z (etc) unattractive to me either.

> But Python is not useless; Python code really is reusable;

Provided you play along with code's often-undocumented preconditions,
absolutely.

> and that's because there are good conventions about what the behaviour
> *should* be.  People who do really find this upsetting should go use a
> strongly-typed language.

Sorry, I couldn't follow this part.  It's a fact that mutating __getattr__
(etc) implementations exist, and it's a fact that I'm not much bothered by
it.  I don't suggest they move to a different language, either (assuming
that by "strongly-typed" you meant "statically typed" -- Python is already
strongly typed).

> In general, getting "y.z" should be idempotent, and should not mutate y.
> I think everyone would agree on the concept.  If it does mutate y with
> visible effects, then the implementor is breaking the convention.

No argument, although I have to emphasize that it's *just* "a convention",
and repeat my prediction that the introduction of properties is going to
make this particular convention less reliable in real life over time.

> Sure, Python won't prevent you from writing a file-like class where you
> write the string "blah" to the file by fetching f.blah and you close the
> file by mentioning f[42].  But when users of this class then come running
> after you with pointed sticks, i'm not going to fight them off. :)

While properties aren't going to stop you from saying

    self.transactionid = self.session_manager.newid

and getting a new result each time you do it.  Spelling no-argument method calls
without parens is popular in some other languages, and it's "a feature" that
properties make that easy to spell in Python 2.2 too.
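
A minimal sketch of such a property, in present-day Python 3 decorator
syntax.  The class `SessionManager` and its counter are hypothetical
stand-ins for whatever sits behind `session_manager.newid`.

```python
import itertools


class SessionManager:
    """Hypothetical stand-in for the session_manager in the line above."""

    def __init__(self):
        self._counter = itertools.count(1)

    @property
    def newid(self):
        # Reads like a plain attribute fetch, but advances hidden state.
        return next(self._counter)


mgr = SessionManager()
a = mgr.newid
b = mgr.newid   # same spelling, different result
```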

> This is a list of all the type slots accessible from Python, before
> iterators (i.e. pre-2.2).  Beside each is the answer to the question:
>
>     Suppose you look at the value of x, then do this operation to x,
>     then look at the value of x.  Should we expect the two observed
>     values to be the same or different?
> ...

I don't know why you're bothering with this, but it's got holes.  For
example, some people overly fond <wink> of C++ enjoy overloading "<<" in
highly non-functional ways.  For another, the section on the inplace
operators seems confused; after

    x1 = x
    x += y

there's no single best answer to whether

    x is x1

is true, or to whether the value of x1 before is == to the value of x1
after.  The most popular convention*s* for the inplace operators are

- If x is of a mutable type, then x is x1 after, and the pre- and post-
  values of x1 are !=.

- If x is of an immutable type, then x is not x1 after, and the pre- and
  post- values of x1 are ==.

The second case is forced, but the first one isn't.  In light of all that,
the intended meaning of "different" in

>     nb_inplace_add                      different

is either incorrect, or so weak that it's not worth much.  I suppose you
mean that, in Python code

    x += y

the object bound to the name "x" before the operation most likely has a
different (!=) value than the object bound to the name "x" after the
operation.  That's true, but relies on what the generated code does *with*
the result of nb_inplace_add.  If you just call the method

    x.__iadd__(y)

there's simply no guessing whether x is "different" as a result (it never is
for x of an immutable type, it usually is for x of a mutable type, and
there's no way to tell the difference just by staring at x).
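
The two conventions can be seen side by side in a short sketch (present-day
Python 3; the variable names are only for illustration).  Note that tuples
in current Python define no `__iadd__` at all, so `+=` falls back to `+`
and a rebinding.

```python
# Mutable type: += mutates in place, so the old name sees the change.
x = [1, 2]
x1 = x
x += [3]
list_same_object = x is x1
list_x1_after = list(x1)

# Immutable type: += builds a new object and rebinds; t1 is untouched.
t = (1, 2)
t1 = t
t += (3,)
tuple_same_object = t is t1
tuple_t1_after = t1

# Calling the method directly: the list is mutated and returned.
y = [1, 2]
result = y.__iadd__([3])
method_returned_self = result is y
tuple_has_iadd = hasattr(t, "__iadd__")
```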

>     nb_hex                              same

I sure hope so <wink>.

> ...
> In every case except for __call__, there exists a canonical answer.

If by "canonical" you mean "most common", sure, with at least the exceptions
noted above.

> We all rely on these conventions every time we write a Python program.
> And learning these conventions is a necessary part of learning Python.
>
> You can argue, as Guido has, that in the particular case of for-loops
> distinguishing between mutating and non-mutating behavior is not worth
> the trouble.  But you can't say that we should give up on the whole
> concept *in general*.

To the contrary, in a language with state it's crucial for the programmer to
know when they're mutating state.  If you use a mutating __getattr__, you
better be careful that the code you call doesn't rely on __getattr__ not
mutating; if you use an iterator object, you better be careful that the code
you call doesn't require something stronger than an iterator object.  It's
all the same to me, and as Guido repeated until he got tired of it, the
possibility for "for" and "x in y" (etc) to mutate has always been there,
and has always been used.

I didn't and still don't have any notable real-life problems dealing with
this, although I too have gotten bit when passing a generator-iterator to
code that required a sequence.  I suppose the difference is that I said
"oops! I screwed up!", fixed it, and moved on.  It would have helped most if
Python had a scheme for declaring and enforcing interfaces, *and* I bothered
to use it (doubtful); second-most if the docs for the callee had spelled out
its preconditions better; I doubt it would have helped at all if a variant
spelling of "for" had been used, because I didn't eyeball the body of the
callee first.  As is, I just stuffed the generator-iterator object inside
tuple() at the call site, and everything was peachy.  That took a lot less
effort than reading this thread <0.9 wink>.
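
The tuple() fix can be sketched like this, in present-day Python 3.  The
callee `total_and_count` is a made-up example of code that quietly needs
to walk its argument twice.

```python
def total_and_count(seq):
    """Hypothetical callee that iterates over its argument twice."""
    return sum(seq), sum(1 for _ in seq)


def squares(n):
    for i in range(n):
        yield i * i


# Passing the generator-iterator directly: the second pass sees nothing.
broken = total_and_count(squares(4))

# Wrapping it in tuple() at the call site gives something re-iterable.
fixed = total_and_count(tuple(squares(4)))
```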




From tim.one@comcast.net  Sun Jul 21 21:17:46 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 16:17:46 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <20020721132550.GC25525@panix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHAAGAB.tim.one@comcast.net>

[Tim]
> i-demand-that-everyone-appreciate-jack-more-too-ly y'rs  - tim

[Aahz]
> My iBook and OSCON class members thank Jack.

Great!  You're the most appreciative guy we've got here, Aahz.  I demand that
everyone appreciate you more too!

starting-now-ly y'rs  - tim




From tim.one@comcast.net  Sun Jul 21 22:04:02 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 17:04:02 -0400
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: <m3eldzntyc.fsf@mira.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHDAGAB.tim.one@comcast.net>

[Martin v. Loewis]
> If that is the platform convention, I see no problem following
> it. Windows already does things differently from Unix, by using the
> registry to compute sys.path.

FYI, this is mostly a myth.  In normal operation for most people, Python
never gets any info out of the Windows registry.  The Python path in the
registry is consulted only in unusual situations, when the Python library
can't be found under the directory of the executable that called the
sys.path-setting code.  This can happen when, e.g., Python is embedded in
some other app.

The process is quite involved; the comment block at the top of PC/getpathp.c
is a good summary.  When reading it, note that there normally aren't any
"application paths" in the registry; e.g., the PLabs Windows installer
doesn't create any such beast.




From tdelaney@avaya.com  Mon Jul 22 00:23:55 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 22 Jul 2002 09:23:55 +1000
Subject: [Python-Dev] Single- vs. Multi-pass iterability
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A44F@natasha.auslabs.avaya.com>

> From: Ka-Ping Yee [mailto:ping@zesty.ca]
> 
> It's just not the way i expect for-loops to work.  Perhaps we would
> need to survey people for objective data, but i feel that most people
> would be surprised if
> 
>     for x in y: print x
>     for x in y: print x
> 
> did not print the same thing twice, or if
> 
>     if x in y: print 'got it'
>     if x in y: print 'got it'
> 
> did not do the same thing twice.  I realize this is my own opinion,
> but it's a fairly strong impression i have.
> 
> Well, for a generator, there is no underlying sequence.
> 
>     while 1: print next(gen)
> 
> makes it clear that there is no sequence, but
> 
>     for x in gen: print x
> 
> seems to give me the impression that there is.

I think this is the crux of the matter. You see for: loops as inherently
non-destructive - that they operate on containers. I (and presumably Guido,
though I would never presume to channel him ;) see for: loops as inherently
destructive - that they operate on iterators. That they obtain an iterator
from a container (if possible) is a useful convenience.

Perhaps the terminology is confusing. Consider a queue.

for each person in the queue:
    service the person

Is there anyone who would *not* consider this to be destructive (of the
queue)?

Tim Delaney



From kevin@koconnor.net  Mon Jul 22 00:30:57 2002
From: kevin@koconnor.net (Kevin O'Connor)
Date: Sun, 21 Jul 2002 19:30:57 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Sat, Jul 20, 2002 at 02:06:29AM -0400
References: <20020624213318.A5740@arizona.localdomain> <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020721193057.A1891@arizona.localdomain>

On Sat, Jul 20, 2002 at 02:06:29AM -0400, Guido van Rossum wrote:
> > Any chance something like this could make it into the standard python
> > library?  It would save a lot of time for lazy people like myself.  :-)
> > 
> 
> I have read (or at least skimmed) this entire thread now.  After I
> reconstructed the algorithm in my head, I went back to Kevin's code; I
> admire the compactness of his code.  I believe that this would make a
> good addition to the standard library, as a friend of the bisect
> module.

Thanks!

>The only change I would make would be to make heap[0] the
> lowest value rather than the highest.

I agree this appears more natural, but a priority queue that pops the
lowest priority item is a bit odd.

> I propose to call it heapq.py.  (Got a better name?  Now or never.)
> 
> [*] Afterthought: this could be made into a new-style class by adding
> something like this to the end of module:

Looks good to me.

Thanks again,
-Kevin

-- 
 ------------------------------------------------------------------------
 | Kevin O'Connor                     "BTW, IMHO we need a FAQ for      |
 | kevin@koconnor.net                  'IMHO', 'FAQ', 'BTW', etc. !"    |
 ------------------------------------------------------------------------



From tdelaney@avaya.com  Mon Jul 22 00:40:24 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Mon, 22 Jul 2002 09:40:24 +1000
Subject: [Python-Dev] Priority queue (binary heap) python code
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A450@natasha.auslabs.avaya.com>

> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> On Sat, Jul 20, 2002 at 02:06:29AM -0400, Guido van Rossum wrote:
> 
> >The only change I would make would be to make heap[0] the
> > lowest value rather than the highest.
> 
> I agree this appears more natural, but a priority queue that pops the
> lowest priority item is a bit odd.

I'm in two minds about this. My first thought is that the *first* item
(heap[0]) should be the highest priority.

OTOH, if it were a sorted list, list[0] would return the *lowest* priority.

So I think for consistency heap[0] must return the lowest priority.

Tim Delaney



From greg@cosc.canterbury.ac.nz  Mon Jul 22 01:20:12 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jul 2002 12:20:12 +1200 (NZST)
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020719120043.A21503@glacier.arctrix.com>
Message-ID: <200207220020.g6M0KCM21823@oma.cosc.canterbury.ac.nz>

> Should people prefer to write:
> 
>     for item from iterator:
>         do something
> 
> when they only need to loop over something once?

This shows up a problem with Ping's proposal, I think:
The place where you write the for-loop isn't the place
where you know whether something will be iterated over
more than once or not. How is a library routine going
to know whether a sequence passed to it is going to
be used again later? It's impossible -- global knowledge
of the whole program is needed.

This appears to leave the library writer with two
choices: (1) Use for-in, to be on the safe side,
in case the user doesn't want the sequence destroyed --
but then it can't be used on a destructive iterator, 
even if the caller knows he won't be using it again;
(2) use for-from, and force everyone who calls it to
adapt sequences to iterators before calling.

Either way, things get messy and complicated and
possibly dangerous.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
 



From greg@cosc.canterbury.ac.nz  Mon Jul 22 02:50:49 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jul 2002 13:50:49 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A44F@natasha.auslabs.avaya.com>
Message-ID: <200207220150.g6M1onv22234@oma.cosc.canterbury.ac.nz>

"Delaney, Timothy" <tdelaney@avaya.com>:

> for each person in the queue:
>    service the person

If you actually wrote it that way in Python, it
would probably be a bug. It would be better written:

  while there is someone at the head of the queue:
    service that person
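
One way to spell that loop in present-day Python 3 is sketched below,
assuming the queue is a `collections.deque`; the names in it are invented
for the example.

```python
from collections import deque

queue = deque(["alice", "bob", "carol"])
served = []

# The loop condition re-examines the queue itself on every pass, so
# removing people from it is the explicit purpose of the loop.
while queue:
    person = queue.popleft()
    served.append(person)
```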

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From greg@cosc.canterbury.ac.nz  Mon Jul 22 00:35:38 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jul 2002 11:35:38 +1200 (NZST)
Subject: [Python-Dev] Single- vs. Multi-pass iterability
In-Reply-To: <Pine.LNX.4.44.0207190429410.25751-100000@ziggy>
Message-ID: <200207212335.g6LNZcU21438@oma.cosc.canterbury.ac.nz>

Ka-Ping Yee <ping@zesty.ca>:

> I believe this is where the biggest debate lies: whether "for" should be
> non-destructive.

It's not the for-loop's fault if its argument is of such a nature
that iterating over it destroys it.

Given suitable values for x and y, it's possible for evaluating "x+y"
to be a destructive operation.  Does that mean we should revise the
"+" protocol somehow to prevent this from happening? I don't think so.

This sort of thing is all-pervasive in Python due to its dynamic
nature. It's not something that can be easily "fixed", even if it were
desirable to do so.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From ping@zesty.ca  Mon Jul 22 03:55:22 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sun, 21 Jul 2002 19:55:22 -0700 (PDT)
Subject: [Python-Dev] The iterator story
In-Reply-To: <200207220020.g6M0KCM21823@oma.cosc.canterbury.ac.nz>
Message-ID: <Pine.LNX.4.44.0207211901590.1261-100000@ziggy>

I'm in a bit of a bind.  I know at this point that Guido's already
made up his mind so there's nothing further to be gained by debating
the issue; yet i feel compelled to respond as long as people keep
missing the idea or saying things that don't make sense.

So: this is a clarification, not a push.

I am going to reply to a few messages at once, to reduce the number
of messages that i'm sending on this topic.  If you're planning to
reply on this thread, please read the whole message before replying.

                *               *               *

On Mon, 22 Jul 2002, Greg Ewing wrote:
> This shows up a problem with Ping's proposal, I think:
> The place where you write the for-loop isn't the place
> where you know whether something will be iterated over
> more than once or not.

When you write the for-loop, you decide whether you want
to consume the sequence.  You use the convention and expect
the implementor of the sequence object to adhere to it.

> How is a library routine going
> to know whether a sequence passed to it is going to
> be used again later?

You've got this backwards.  You write the library routine
the way that makes sense, and then you document whether
the sequence gets destroyed or not.  That declaration
becomes part of your interface, and users of your routine
can then determine how to use it safely for their needs.

(Analogy: how does the implementor of file.close() know
whether the caller wants to use the file again later?
Answer: it's not the implementor's job to know that.  We
document what file.close() does, and people only *decide*
to call file.close() when they don't need the file anymore.)

Without a convention to distinguish between destruction
and non-destruction, you can't establish what the library
routine does; so you can't document it; so you can't use
it safely *even* if you trust the implementor.  No
implementation would ever make it possible for your library
routine to claim that it "does <blah> with the elements of
a given sequence without destroying the sequence".

Now if you do have a convention -- yes, you still have to
trust implementors to follow the convention -- but if they
do so, you're okay.

                *               *               *

> This appears to leave the library writer with two
> choices: (1) Use for-in, to be on the safe side,
> in case the user doesn't want the sequence destroyed --
> but then it can't be used on a destructive iterator,

No, it can.  The documentation for the library routine
will state that it wants a sequence.  If the caller wants to
use x and x is an iterator, it passes in seq(x).  No problem.
The caller has thereby declared that it's okay to destroy x.

To make it more obvious what is going on, i should have chosen
a better name; 'seq' was poor.  Let's rename 'seq' to 'consume'.

    consume(i) returns an object x such that iter(x) is i.

So calling 'consume' implies that you are consuming an iterator.
All right.  Then consider:

    for x in consume(y):
        print x

The above is clear that y is being destroyed.  Now consider:

    def printout(sequence):
        for x in sequence:
            print x

If y is an iterator, in my world you would not be able to
call "printout(y)".  You would say "printout(consume(y))",
thus making it clear that y is being destroyed.
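
A minimal sketch of what such a 'consume' wrapper could look like,
assuming only the one-line definition above. The class is hypothetical
(no such built-in exists), and 'printout' here returns a list instead
of printing so the result can be inspected. Note that in Python as it
actually behaves, printout(y) also works directly on an iterator; the
restriction described above belongs to the proposal only.

```python
class consume:
    """Hypothetical wrapper: iter(consume(i)) is i itself."""

    def __init__(self, iterator):
        self._iterator = iterator

    def __iter__(self):
        # Hand back the very same iterator, so iterating uses it up.
        return self._iterator


def printout(sequence):
    collected = []
    for x in sequence:
        collected.append(x)
    return collected


y = iter([1, 2, 3])
wrapped = consume(y)
first = printout(wrapped)    # drains y
second = printout(wrapped)   # y is already exhausted
```

The second call sees nothing, which is the destruction that the
explicit consume(y) at the call site is meant to advertise.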

> (2) use for-from, and force everyone who calls it to
> adapt sequences to iterators before calling.

Since for-in is non-destructive, it is safer, and it is also
more common to have a sequence than an iterator.  So i would
usually choose option 1 rather than 2.

But sure, you can write for-from, if you want.  I mean, if you
decide to accept strings, then users who want to pass in integers
will have to str() them first.  If you decide to accept integers,
then users who want to pass in strings will have to int() them
first.  This is no great dilemma.  We actually like this.

                *               *               *

Hereafter i'll stick to existing syntax, because the business of
introducing syntax isn't really the main point.  I'll use the
alternative i proposed, which is to use the built-in instead.
So we'd say

    for i in consume(it): ...

instead of

    for i from it: ...

Tim Delaney wrote:
> I think this is the crux of the matter. You see for: loops as inherently
> non-destructive - that they operate on containers. I (and presumably
> Guido, though I would never presume to channel him ;) see for: loops as
> inherently destructive - that they operate on iterators. That they obtain
> an iterator from a container (if possible) is a useful convenience.

I believe your interpretation of opinions is correct on all counts.
Except i would point out that for-loops are not always destructive;
most of the time, they are not, and that is why i consider the
destructive behaviour surprising and worth making visible.

> Perhaps the terminology is confusing. Consider a queue.
>
> for each person in the queue:
>     service the person
>
> Is there anyone who would *not* consider this to be destructive (of the
> queue)?

Well, the only reason you can tell is that you can see the context
from the meanings of the words "queue" and "service".  If you said

    for person in consume(queue):
        service(person)

then that would truly be clear, even if you used different variable
names, because the 'consume' built-in expresses that the queue will
be consumed.

                *               *               *

Greg Ewing wrote:
> Given suitable values for x and y, it's possible for evaluating "x+y"
> to be a destructive operation.  Does that mean we should revise the
> "+" protocol somehow to prevent this from happening? I don't think so.

Augh!  I'm just not getting through here.

We all know that the Python philosophy is to trust the implementors of
protocols instead of enforcing behaviour.  That's not the point.

Of course it's POSSIBLE for "x + y" to be destructive.  That doesn't
mean it SHOULD be.  We all know that "x + y" is normally not
destructive, and that's what counts.  That understanding enables me to
implement __add__ in a way that will not screw you over when you use it.

All i'm saying is that there should be a way to *express* safe iteration
(and safe "element in container" tests).

Guido's pronouncement is "Nope.  Don't need it."

Although i disagree, i am willing to respect that.  But please don't
confuse a lack of enforcement with a lack of convention.  Convention
is all we have.


-- ?!ng




From tim.one@comcast.net  Mon Jul 22 04:09:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 21 Jul 2002 23:09:11 -0400
Subject: [Python-Dev] Priority queue (binary heap) python code
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A450@natasha.auslabs.avaya.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHLAGAB.tim.one@comcast.net>

[Guido]
> The only change I would make would be to make heap[0] the
> lowest value rather than the highest.

[Kevin O'Connor]
> I agree this appears more natural, but a priority queue that pops the
> lowest priority item is a bit odd.

So now the fellow who wrote the code to begin with squirms at what will
happen if it's actually put in the std library, and sounds like he would
continue using his own code.

[Delaney, Timothy]
> I'm in two minds about this. My first thought is that the *first* item
> (heap[0]) should be the highest priority.
>
> OTOH, if it were a sorted list, list[0] would return the *lowest*
> priority.

On the third hand, if you're using heaps for sorting (as in a heapsort),
it's far more natural to have a max-heap -- else the sort can't be done
in-place (with a max-heap you pop the largest value, copy it to the last
array slot, pretend the array is one shorter, and trickle what *was* in the
last array slot back into the now-one-smaller max-heap; repeat N-1 times and
you've sorted the array in-place).
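
The in-place procedure in the parenthetical can be sketched as
follows. This is an illustrative textbook heapsort, not code from the
patch under discussion; it uses 0-based indexing and a swap in place
of the separate "pop, copy, trickle" steps, which has the same effect.

```python
def sift_down(a, root, end):
    """Trickle a[root] down within a[:end] to restore the max-heap property."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        # Pick the larger of the two children.
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort(a):
    n = len(a)
    # Build a max-heap bottom-up.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Move the largest value to the last slot, pretend the array is one
    # shorter, and trickle the displaced value back into the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)


data = [5, 1, 4, 2, 8, 7, 3]
heapsort(data)
```

With a min-heap the smallest value would come off first, and there
would be no free slot at the front of the array to put it in.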

On the fourth hand, if you want a *bounded* priority queue, to remember only
the N best-scoring (largest-priority) objects for some fixed N, then
(perhaps paradoxically) a min-heap is what you need.
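
A sketch of why the bounded case wants a min-heap, using the heapq
module that later Python releases ship in the standard library (it
maintains a min-heap in a plain list). The function name 'n_best' is
made up for illustration.

```python
import heapq


def n_best(scores, n):
    """Return the n largest scores, keeping at most n items in memory."""
    if n <= 0:
        return []
    heap = []
    for score in scores:
        if len(heap) < n:
            heapq.heappush(heap, score)
        elif score > heap[0]:
            # heap[0] is the worst of the current best; evict it.
            heapq.heapreplace(heap, score)
    return sorted(heap, reverse=True)


best = n_best([3, 9, 1, 7, 5, 8, 2], 3)
```

The item that has to be found quickly is the weakest survivor, since
that is the one a new candidate must beat, and a min-heap keeps it at
index 0.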

On the fifth hand, if you want to process items in priority order (highest
first) interleaved with entering new items, then you need a max-heap.  I
suspect that's what Kevin does.

> So i think for consistency heap[0] must return the lowest priority.

On the sixth hand, anyone who has implemented a heap in another 0-based
language expects the first slot in the array to be unused, in order to
simplify the indexing (parent = child >> 1 uniformly if the root is at index
1), and to ensure that all nodes on the same level have indices with the
same leading bit (which can be helpful in advanced algorithms -- then, e.g.,
you know that i and j are on the same level of the tree if and only if i&j >
i^j; maybe that's not obvious at first glance <wink>).
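
For readers who want to convince themselves of the bit trick, here is
a small brute-force check, assuming the root is at index 1 as described
above. The helper names are made up for illustration.

```python
def parent(child):
    # With the root at index 1, the parent is always child >> 1.
    return child >> 1


def level(i):
    # The level of a node is fixed by the position of its leading bit.
    return i.bit_length() - 1


def same_level(i, j):
    """True if 1-based heap indices i and j are on the same tree level."""
    return (i & j) > (i ^ j)


# Compare the bit trick against the direct definition for small indices.
checked = all(
    same_level(i, j) == (level(i) == level(j))
    for i in range(1, 257)
    for j in range(1, 257)
)
```

The reasoning: if i and j share a leading bit, that bit survives in
i&j and cancels in i^j; if they do not, the higher leading bit survives
only in i^j.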

Priority queues just aren't a one-size-fits-all thing.




From drifty@bigfoot.com  Mon Jul 22 04:23:03 2002
From: drifty@bigfoot.com (Brett Cannon)
Date: Sun, 21 Jul 2002 20:23:03 -0700 (PDT)
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEHAAGAB.tim.one@comcast.net>
Message-ID: <Pine.SOL.4.44.0207212021410.24674-100000@hailstorm.OCF.Berkeley.EDU>

[Tim Peters]

> [Tim]
> > i-demand-that-everyone-appreciate-jack-more-too-ly y'rs  - tim
>
> [Aahz]
> > My iBook and OSCON class members thank Jack.
>
> Great!  You're the most appreciate guy we've got here, Aahz.  I demand that
> everyone appreciate you more too!
>

I appreciate everyone everywhere for everything.  =)

my-Berkeley-education-has-turned-me-hippie-ly y'rs -Brett




From tim.one@comcast.net  Mon Jul 22 05:01:48 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 00:01:48 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEFFAGAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEHNAGAB.tim.one@comcast.net>

Just FYI.  I ripped out the complications I added to the mergesort variant
that tried to speed many-equal-keys cases, and worked on its core competency
(intelligent merging) instead.  There's a reason <wink>:  this kick started
while investigating ways to speed Zope B-Tree operations when they're used
as sets.  Equal keys are impossible in that context, but intelligent merging
can really help.  So whatever the fate of this sort, some of the code will
live on in Zope's B-Tree routines.

The result is that non-trivial cases of near-order got a nice boost, while
~sort got even slower again.  I added a new test +sort, which replaces the
last 10 values of a sorted array with random values.  samplesort has a
special case for this, limited to a maximum of 15 trailing out-of-order
entries.  timsort has no special case for this but does it significantly
faster than the samplesort hack anyway, has no limit on how many such
trailing entries it can exploit, and couldn't care less whether such entries
are at the front or the end of the array; I expect it would be (just) a
little slower if they were in the middle.  As shown below, timsort does a
+sort almost as fast as for a wholly-sorted array.  Ditto now for 3sort too,
which perturbs order by doing 3 random exchanges in a sorted array.
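
A minimal sketch of how inputs of the +sort and 3sort kinds could be
generated, going only by the descriptions in this message. This is not
the code from sortperf.py; the function names and the use of random
floats are assumptions for illustration.

```python
import random


def plus_sort_input(n, tail=10):
    """Ascending data with the last `tail` values replaced by random ones."""
    data = sorted(random.random() for _ in range(n))
    data[-tail:] = [random.random() for _ in range(tail)]
    return data


def three_sort_input(n, exchanges=3):
    """Ascending data perturbed by a few random exchanges."""
    data = sorted(random.random() for _ in range(n))
    for _ in range(exchanges):
        i = random.randrange(n)
        j = random.randrange(n)
        data[i], data[j] = data[j], data[i]
    return data
```

In both cases almost all of the array is still one long ascending run,
which is the kind of near-order a merge-based sort can exploit.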

It's become a very interesting sort implementation, handling more kinds of
near-order at demonstrably supernatural speed than anything else I'm aware
of.  ~sort isn't an example of near-order.  Quite the contrary, it has a
number of inversions quadratic in N, and N/4 runs; the only reason ~sort
goes faster than *sort now is-- believe it or not --a surprising benefit
from a memory optimization.

Key:
    *sort: random data
    \sort: descending data
    /sort: ascending data
    3sort: ascending, then 3 random exchanges
    +sort: ascending, then 10 random at the end
    ~sort: many duplicates
    =sort: all equal
    !sort: worst case scenario

C:\Code\python\PCbuild>python -O sortperf.py 15 20 1
samplesort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.18   0.02   0.01   0.14   0.01   0.07   0.01   0.17
16   65536   0.24   0.02   0.02   0.22   0.02   0.08   0.02   0.24
17  131072   0.53   0.05   0.04   0.49   0.05   0.18   0.04   0.52
18  262144   1.16   0.09   0.09   1.06   0.12   0.37   0.09   1.13
19  524288   2.53   0.18   0.17   2.30   0.24   0.74   0.17   2.47
20 1048576   5.47   0.37   0.35   5.17   0.45   1.51   0.35   5.34

timsort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.17   0.01   0.01   0.01   0.01   0.14   0.01   0.02
16   65536   0.23   0.02   0.02   0.03   0.02   0.21   0.03   0.04
17  131072   0.53   0.04   0.04   0.05   0.04   0.46   0.04   0.09
18  262144   1.16   0.09   0.09   0.12   0.09   1.01   0.08   0.18
19  524288   2.53   0.18   0.17   0.18   0.18   2.20   0.17   0.36
20 1048576   5.48   0.36   0.35   0.36   0.37   4.78   0.35   0.73




From greg@cosc.canterbury.ac.nz  Mon Jul 22 05:50:02 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jul 2002 16:50:02 +1200 (NZST)
Subject: [Python-Dev] The iterator story
In-Reply-To: <Pine.LNX.4.44.0207211901590.1261-100000@ziggy>
Message-ID: <200207220450.g6M4o2u23472@oma.cosc.canterbury.ac.nz>

Ka-Ping Yee <ping@zesty.ca>:

> When you write the for-loop, you decide whether you want
> to consume the sequence.

As someone pointed out, it's pretty rare that you actually *want* to
consume the sequence. Usually the choice is between "I don't care" and
"The sequence must NOT be consumed".

Of the two varieties of for-loop in your proposal, for-in
obviously corresponds to the "must not be consumed" case,
leading one to suppose that you intend for-from to be used in
the don't-care case. 

But now you seem to be suggesting that library routines
should always use for-in, and that the caller should
convert an iterator to a sequence if he knows it's okay
to consume it:

> Since for-in is non-destructive, it is safer, and it is also
> more common to have a sequence than an iterator.
> ...
> If y is an iterator, in my world you would not be able to
> call "printout(y)".  You would say "printout(consume(y))"

Okay, that seems reasonable -- explicit is better than
implicit. But... consider the following two library
routines:

  def printout1(s):
    for x in s:
      print x

  def printout2(s):
    for x in s:
      for y in s:
        print x, y

Clearly it's okay to call printout1(consume(s)), but it's
NOT okay to call printout2(consume(s)). So we need to document
these requirements:

  def printout1(s):
    "s may be an iterator or sequence"
    for x in s:
      print x

  def printout2(s):
    "s MUST be a sequence, NOT an iterator!"
    for x in s:
      for y in s:
        print x, y

But now there's nothing to enforce these requirements -- no
exception will be raised if you call printout2(consume(s))
by mistake.
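
The silent failure can be reproduced in later Python syntax. In this
sketch iter() on a list stands in for the proposed consume(), and the
routine returns the pairs instead of printing them so that the two
results can be compared.

```python
def printout2(s):
    "s MUST be a sequence, NOT an iterator!"
    pairs = []
    for x in s:
        for y in s:
            pairs.append((x, y))
    return pairs


from_list = printout2([1, 2, 3])            # all 9 pairs
from_iterator = printout2(iter([1, 2, 3]))  # inner loop drains the iterator
```

With the iterator, the outer loop takes 1, the inner loop uses up 2 and
3, and the outer loop then finds nothing left. No exception is raised;
the result is just short.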

To get any safety benefit from your proposed arrangement,
it seems to me that you'd need to write printout1 as

  def printout1(s):
    "s must be an iterator"
    for x from s:
      print x

and then in the (overwhelmingly most common) case of passing it a
sequence, you would need to call it as printout1(iter(s)) -- unless
you allow the for-from protocol to automatically obtain an iterator
from a sequence if possible, the way for-in currently does.

> Greg Ewing wrote:
> > Given suitable values for x and y, it's possible for evaluating "x+y"
> > to be a destructive operation.  Does that mean we should revise the
> > "+" protocol somehow to prevent this from happening? I don't think so.
> 
> Augh!  I'm just not getting through here.

Sorry, I wrote that before I saw your full proposal. I
understand your point of view much better now, and
even sympathise with it to some extent -- something
like the for-from syntax actually passed through my
mind shortly before I saw it in your post. 

There's no doubt that it's very elegant theoretically,
but in thinking through the implications, I'm not sure it
would be all that helpful in practice, and might even
turn out to be a nuisance if it requires putting in a
lot of iter(x) and/or consume(x) calls.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one@comcast.net  Mon Jul 22 07:05:14 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 02:05:14 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEHNAGAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEIBAGAB.tim.one@comcast.net>

One more piece of this puzzle.  It's possible that one of {samplesort,
timsort} would become unboundedly faster as the cost of comparisons
increased over that of Python floats (which all the timings I posted used).
Here's a program that would show this if so, using my local Python, where
lists have an .msort() method:

"""
class SlowCmp(object):
    __slots__ = ['val']

    def __init__(self, val):
        self.val = val

    def __lt__(self, other):
        for i in range(SLOW):
            i*i
        return self.val < other.val

def drive(n):
    from random import randrange
    from time import clock as now

    n10 = n * 10
    L = [SlowCmp(randrange(n10)) for i in xrange(n)]
    L2 = L[:]

    t1 = now()
    L.sort()
    t2 = now()
    L2.msort()
    t3 = now()
    return t2-t1, t3-t2

for SLOW in 1, 2, 4, 8, 16, 32, 64, 128:
    print "At SLOW value", SLOW
    for n in range(1000, 10001, 1000):
        ss, ms = drive(n)
        print "    %6d  %6.2f  %6.2f  %6.2f" % (
            n, ss, ms, 100.0*(ss - ms)/ms)
"""

Here's the tail end of the output, from which I conclude that the number of
comparisons done on random inputs is virtually identical for the two
methods; times vary by a fraction of a percent both ways, with no apparent
pattern (note that time.clock() has better than microsecond resolution on
Windows, so the times going into the % calculation have more digits than are
displayed here):

At SLOW value 32
      1000    0.22    0.22   -0.05
      2000    0.50    0.50    0.10
      3000    0.80    0.80   -0.64
      4000    1.11    1.10    0.71
      5000    1.44    1.45   -0.12
      6000    1.77    1.76    0.72
      7000    2.10    2.09    0.31
      8000    2.43    2.41    0.79
      9000    2.78    2.80   -0.58
     10000    3.13    3.13   -0.01
At SLOW value 64
      1000    0.37    0.38   -1.00
      2000    0.83    0.83    0.20
      3000    1.33    1.33   -0.15
      4000    1.84    1.84    0.05
      5000    2.40    2.39    0.38
      6000    2.95    2.92    0.97
      7000    3.46    3.47   -0.20
      8000    4.04    4.01    0.87
      9000    4.60    4.63   -0.68
     10000    5.19    5.21   -0.33
At SLOW value 128
      1000    0.68    0.67    0.37
      2000    1.52    1.50    0.99
      3000    2.40    2.41   -0.67
      4000    3.35    3.32    1.03
      5000    4.30    4.32   -0.47
      6000    5.32    5.29    0.54
      7000    6.27    6.27    0.04
      8000    7.29    7.25    0.55
      9000    8.37    8.37   -0.03
     10000    9.39    9.43   -0.49




From oren-py-d@hishome.net  Mon Jul 22 07:08:18 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 22 Jul 2002 09:08:18 +0300
Subject: [Python-Dev] The iterator story
In-Reply-To: <20020721144108.GA5608@panix.com>; from aahz@pythoncraft.com on Sun, Jul 21, 2002 at 10:41:08AM -0400
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy> <20020719120043.A21503@glacier.arctrix.com> <3D389369.948547E0@metaslash.com> <20020719162226.A22929@glacier.arctrix.com> <20020720132857.GB5862@hishome.net> <20020720140923.GA18716@panix.com> <20020721083340.A13156@hishome.net> <20020721144108.GA5608@panix.com>
Message-ID: <20020722090818.A5576@hishome.net>

On Sun, Jul 21, 2002 at 10:41:08AM -0400, Aahz wrote:
> > The transformations are fine the way they are.  The problem is the
> > source - if the source is an exhausted iterator and you ask it for a  
> > new iterator it will happily return itself and report StopIteration   
> > on each .next(). This behavior is indistinguishable from a valid
> > iterator on an empty container.                                       
> 
> So the problem lies in asking the source for a new iterator, not in
> trying to use it.  Making the iterator consumer responsible for handling
> this seems like the wrong approach to me -- the consumer *shouldn't* be
> able to tell the difference.  If you're breaking that paradigm, you
> don't actually have an iterator consumer, you've got something else that
> wants to use the iterator interface, *plus* some additional features.

Tuples are very much like lists except that they cannot be modified. A lot
of code that was written with lists in mind can actually use tuples. If you
pass a tuple to a function that tries to use the "additional feature" of
mutability you will get an exception.

Pipes are very much like files except that they cannot be seeked. A lot of
code that was written with files in mind can actually use pipes. If you pass
a pipe to a function that tries to use the "additional feature" of
seekability you will get an exception.

Iterators are very much like iterable containers except that they can only
be iterated once.  A lot of code that was written with containers in mind
can actually use iterators.  If you pass an iterator to a function that
tries to use the "additional feature" of re-iterability you...  will not get 
an exception.  You'll get nonsense results because on the second pass the 
iterator will fail silently and suddenly pretend to be an empty container.
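
A small sketch of the failure mode in later Python syntax. The
two-pass function 'mean_and_max' is made up for illustration.

```python
def mean_and_max(values):
    # First pass: total and count.
    total = 0
    count = 0
    for v in values:
        total += v
        count += 1
    # Second pass: largest element. An exhausted iterator yields nothing
    # here, so `largest` quietly stays None.
    largest = None
    for v in values:
        if largest is None or v > largest:
            largest = v
    return total / count, largest


from_list = mean_and_max([1, 2, 3])
from_iterator = mean_and_max(iter([1, 2, 3]))
```

The list gives a maximum of 3; the iterator gives None for the same
data, with no exception anywhere to point at the second loop.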

Would you say that any code that expects a seekable file or a mutable
sequence is "breaking the paradigm"?  Why should code that expects a
re-iterable container be different from code that uses any other protocol
that has several variations and subsets/supersets?

> The way Python normally handles issues like this is through
> documentation.  (I.e., if your consumer requires an iterable capable of
> producing multiple iterators rather than an iterator object, you document
> that.)

The way Python normally handles issues of code trying to use a protocol that
the object does not support is through *exceptions*.  When a 5000+ line
program produces meaningless results, documentation is not a very helpful
place to start looking for the problem.  An exception gives you an
approximate line number and reason.

If __setitem__ on a tuple was ignored instead of producing an exception or
seek on a pipe failed silently I don't think that anyone would find "don't
do that, then" or "documentation" to be a satisfactory answer.
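
One way a library routine could turn the silent failure into an
exception today is sketched below. The helper 'require_reiterable' is
hypothetical and was not proposed in this thread; it relies on the
convention that an iterator returns itself from iter().

```python
def require_reiterable(obj):
    """Hypothetical guard: reject objects that are their own iterator."""
    if iter(obj) is obj:
        raise TypeError("expected a re-iterable container, got an iterator")
    return obj


def count_pairs(s):
    s = require_reiterable(s)
    return sum(1 for x in s for y in s)


ok = count_pairs([1, 2, 3])

try:
    count_pairs(iter([1, 2, 3]))
    raised = False
except TypeError:
    raised = True
```

This is a check by convention only: an object that returns a fresh
single-use iterator each time would pass it, so it narrows the problem
without eliminating it.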

	Oren




From mwh@python.net  Mon Jul 22 11:03:10 2002
From: mwh@python.net (Michael Hudson)
Date: 22 Jul 2002 11:03:10 +0100
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: barry@zope.com's message of "Fri, 19 Jul 2002 17:48:53 -0400"
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com> <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net> <15672.35141.803094.488541@anthem.wooz.org>
Message-ID: <2m1y9w3wrl.fsf@starship.python.net>

barry@zope.com (Barry A. Warsaw) writes:

> >>>>> "GvR" == Guido van Rossum <guido@python.org> writes:
> 
>     GvR> Traditionally, on Unix per-user extensions are done by
>     GvR> pointing PYTHONPATH to your per-user directory (-ies) in your
>     GvR> .profile.
> 
> Or adding them to sys.path via your $PYTHONSTARTUP file.

That only helps for interactive sessions...

> OTOH, it might be nice if the distutils `install' command had some
> switches to make installing in some of these common alternative
> locations a little easier.  That might dovetail nicely if/when we
> decide to add a site-updates directory to sys.path.

I don't see what's so very difficult about

$ python setup.py install --prefix=$HOME

but maybe I'm odd.

Cheers,
M.

-- 
  $ head -n 2 src/bash/bash-2.04/unwind_prot.c
   /* I can't stand it anymore!  Please can't we just write the
      whole Unix system in lisp or something? */
                                       -- spotted by Rich van der Hoff



From Jack.Jansen@cwi.nl  Mon Jul 22 13:02:15 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 22 Jul 2002 14:02:15 +0200
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: <2m1y9w3wrl.fsf@starship.python.net>
Message-ID: <DD03FB54-9D6A-11D6-8599-0030655234CE@cwi.nl>

On Monday, July 22, 2002, at 12:03 , Michael Hudson wrote:
> I don't see what's so very difficult about
>
> $ python setup.py install --prefix=$HOME

This is what you use if you have built Python yourself, and installed it 
in your home directory.

What I was referring to (as the setup that isn't very well supported 
right now) is the situation where the system admin has built and 
installed Python in, say, /usr/local, and you want to install a 
distutils-based package for your own private use.

Setting PYTHONPATH to be $HOME/lib/python-extensions or something 
similar is what people customarily do to get access to their private 
modules, but there is no standard, and hence also no way for distutils 
to find the pathname and provide an easy interface to do this.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -




From mwh@python.net  Mon Jul 22 13:56:33 2002
From: mwh@python.net (Michael Hudson)
Date: 22 Jul 2002 13:56:33 +0100
Subject: [Python-Dev] Added platform-specific directories to sys.path
In-Reply-To: Jack Jansen's message of "Mon, 22 Jul 2002 14:02:15 +0200"
References: <DD03FB54-9D6A-11D6-8599-0030655234CE@cwi.nl>
Message-ID: <2mptxfncou.fsf@starship.python.net>

Jack Jansen <Jack.Jansen@cwi.nl> writes:

> On Monday, July 22, 2002, at 12:03 , Michael Hudson wrote:
> > I don't see what's so very difficult about
> >
> > $ python setup.py install --prefix=$HOME
> 
> This is what you use if you have built Python yourself, and installed it 
> in your home directory.

In that case, the --prefix arg is unnecessary.

> What I was referring to (as the setup that isn't very well supported 
> right now) is the situation where the system admin has built and 
> installed Python in, say, /usr/local, and you want to install a 
> distutils-based package for your own private use.

That's when I do the above.

> Setting PYTHONPATH to be $HOME/lib/python-extensions or something 
> similar is what people customarily do to get access to their private 
> modules, but there is no standard, and hence also no way for distutils 
> to find the pathname and provide an easy interface to do this.

My setup requires setting $PYTHONPATH too, so it's not ideal, but it
works.

Cheers,
M.

-- 
  Reading Slashdot can [...] often be worse than useless, especially
  to young and budding programmers: it can give you exactly the wrong
  idea about the technical issues it raises.
 -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#reasons



From sholden@holdenweb.com  Mon Jul 22 14:53:04 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Mon, 22 Jul 2002 09:53:04 -0400
Subject: [Python-Dev] Is __declspec(dllexport) really needed on Windows?
References: <Pine.SOL.4.44.0207212021410.24674-100000@hailstorm.OCF.Berkeley.EDU>
Message-ID: <021001c23187$1aa9b230$6300000a@holdenweb.com>

----- Original Message -----
From: "Brett Cannon" <bac@OCF.Berkeley.EDU>
To: "Tim Peters" <tim.one@comcast.net>
Cc: <python-dev@python.org>
Sent: Sunday, July 21, 2002 11:23 PM
Subject: RE: [Python-Dev] Is __declspec(dllexport) really needed on Windows?


> [Tim Peters]
>
> > [Tim]
> > > i-demand-that-everyone-appreciate-jack-more-too-ly y'rs  - tim
> >
> > [Aahz]
> > > My iBook and OSCON class members thank Jack.
> >
> > Great!  You're the most appreciate guy we've got here, Aahz.  I demand that
> > everyone appreciate you more too!
> >
>
> I appreciate everyone everywhere for everything.  =)
>
> my-Berkeley-education-has-turned-me-hippie-ly y'rs -Brett
>

I appreciate the set of all things that are insufficiently appreciated, and
each of its under-appreciated members

but-it-won't-necessarily-make-a-difference-ly y'rs  - steve
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------




From barry@zope.com  Mon Jul 22 15:14:01 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 22 Jul 2002 10:14:01 -0400
Subject: [Python-Dev] Added platform-specific directories to sys.path
References: <57BEAF46-9B5A-11D6-9B6B-003065517236@oratrix.com>
 <200207192123.g6JLN7s15263@pcp02138704pcs.reston01.va.comcast.net>
 <15672.35141.803094.488541@anthem.wooz.org>
 <2m1y9w3wrl.fsf@starship.python.net>
Message-ID: <15676.4905.813038.253158@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    >> Or adding them to sys.path via your $PYTHONSTARTUP file.

    MH> That only helps for interactive sessions...

Yup, which might or might not be good enough.  I'm thinking of the
(X)Emacs arrangement that there are system startup files and user
startup files that are normally always loaded, unless you use a
command line switch to specifically disable them.

    >> OTOH, it might be nice if the distutils `install' command had
    >> some switches to make installing in some of these common
    >> alternative locations a little easier.  That might dovetail
    >> nicely if/when we decide to add a site-updates directory to
    >> sys.path.

    MH> I don't see what's so very difficult about

    MH> $ python setup.py install --prefix=$HOME

Actually, to do it correctly (and quietly) this appears to be the most
accurate way to tell distutils to install a library in an alternative
search path:

% PYTHONPATH=<dir> python setup.py --quiet install --install-lib <dir> \
   --install-purelib <dir>

A bit less intuitive than, say, a standard alternative
user-centric installation directory and a --userdir option to the
install command.

-Barry



From barry@zope.com  Mon Jul 22 16:09:49 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 22 Jul 2002 11:09:49 -0400
Subject: [Python-Dev] Sorting
References: <LNBBLJKPBEHFEDALKOLCIEDJAGAB.tim.one@comcast.net>
 <20020720130000.GA11845@panix.com>
Message-ID: <15676.8253.741856.171571@anthem.wooz.org>

>>>>> "A" == Aahz  <aahz@pythoncraft.com> writes:

    A> Any reason the list object can't grow a .stablesort() method?

Because when a user looks at the methods of a list object and sees
both .sort() and .stablesort() you now need to explain the difference,
and perhaps give some hint as to why you'd want to choose one over the
other.

Maybe the teachers-of-Python in this crowd can give some insight into
whether 1) they'd actually do this or just hand wave past the
difference, or 2) whether it would be a burden to teaching.  I'm
specifically thinking of the non-programmer crowd learning Python.

I would think that most naive uses of list.sort() would expect a
stable sort and wouldn't care much about any performance penalties
involved.  I'd put my own uses squarely in the "naive" camp. ;)
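
To make the distinction concrete, here is a minimal sketch of what a
stable sort buys the naive user.  It assumes a current Python, where
sorted() and list.sort() are documented as stable; the record data is
made up for illustration.  Sorting by name and then by count leaves
equal counts in name order only because the second pass preserves the
relative order of equal elements.

```python
# Hypothetical records: (name, count).
records = [("pear", 3), ("apple", 1), ("fig", 3), ("kiwi", 1)]

# First pass: order by name.
by_name = sorted(records, key=lambda r: r[0])

# Second pass: order by count.  A stable sort keeps records with equal
# counts in the order the first pass left them, i.e. still by name.
by_count_then_name = sorted(by_name, key=lambda r: r[1])

print(by_count_then_name)
```

With an unstable sort the second pass would be free to scramble the
name order within each count, and the two-pass idiom would not work.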

I'd prefer to see

- .sort() actually /be/ a stable sort in the default case

- list objects not be burdened with additional sorting methods (that
  way lies a footing-challenged incline)

- provide a module with more advanced sorting options, with functions
  suitable for list.sort()'s cmpfunc, and with derived classes
  (perhaps in C) of list for better performance.

-Barry



From xscottg@yahoo.com  Mon Jul 22 16:44:15 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 22 Jul 2002 08:44:15 -0700 (PDT)
Subject: [Python-Dev] Sorting
In-Reply-To: <15676.8253.741856.171571@anthem.wooz.org>
Message-ID: <20020722154415.79981.qmail@web40111.mail.yahoo.com>

--- "Barry A. Warsaw" <barry@zope.com> wrote:
> 
> Because when a user looks at the methods of a list object and sees
> both .sort() and .stablesort() you now need to explain the difference,
> and perhaps give some hint as to why you'd want to choose one over the
> other.
> 

Or you could have an optional parameter that defaults to whatever the more
sane value should be (probably stable), and when the user stumbles across
this parameter they stumble across the docs too.

I think Tim's codebloat argument is more compelling.






From skip@mojam.com  Mon Jul 22 17:06:58 2002
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 22 Jul 2002 11:06:58 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200207221606.g6MG6wT20010@12-248-11-90.client.attbi.com>

Bug/Patch Summary
-----------------

262 open / 2681 total bugs (+5)
143 open / 1613 total patches (+15)

New Bugs
--------

OSX IDE behaviour (output to console) (2002-06-24)
	http://python.org/sf/573174
pydoc(.org) does not find file.flush() (2002-06-26)
	http://python.org/sf/574057
Chained __slots__ dealloc segfault (2002-06-26)
	http://python.org/sf/574207
convert_path fails with empty pathname (2002-06-26)
	http://python.org/sf/574235
Automated daily documentation builds (2002-06-26)
	http://python.org/sf/574241
Tex Macro Error (2002-06-27)
	http://python.org/sf/574939
multiple inheritance w/ slots dumps core (2002-06-28)
	http://python.org/sf/575229
Parts of 2.2.1 core use old gc API (2002-06-30)
	http://python.org/sf/575715
os.spawnv() fails with underscores (2002-06-30)
	http://python.org/sf/575770
Negative __len__ provokes SystemError (2002-06-30)
	http://python.org/sf/575773
Inconsistent behaviour in re grouping (2002-07-01)
	http://python.org/sf/576079
Sig11 in cPickle (stack overflow) (2002-07-01)
	http://python.org/sf/576084
Infinite recursion in Pickle (2002-07-02)
	http://python.org/sf/576419
Windows binary missing SSL (2002-07-02)
	http://python.org/sf/576711
os.path.walk behavior on symlinks (2002-07-03)
	http://python.org/sf/576975
inheriting from property and docstrings (2002-07-03)
	http://python.org/sf/576990
Wrong description for PyErr_Restore (2002-07-03)
	http://python.org/sf/577000
Print line number of string if at EOF (2002-07-04)
	http://python.org/sf/577295
** in doc/current/lib/operator-map.html (2002-07-04)
	http://python.org/sf/577513
del __builtins__ breaks out of rexec (2002-07-04)
	http://python.org/sf/577530
System Error with slots and multi-inh (2002-07-05)
	http://python.org/sf/577777
resire readonly memory mapped file (2002-07-05)
	http://python.org/sf/577782
Docs unclear about cleanup. (2002-07-05)
	http://python.org/sf/577793
Explain how to subclass Exception (2002-07-06)
	http://python.org/sf/578180
pthread_exit missing in thread_pthread.h (2002-07-09)
	http://python.org/sf/579116
LibRef 2.2.1, replace zero with False (2002-07-11)
	http://python.org/sf/579991
Subclassing WeakValueDictionary impossib (2002-07-11)
	http://python.org/sf/580107
GC Changes not mentioned in What's New (2002-07-12)
	http://python.org/sf/580462
mimetools module privacy leak (2002-07-12)
	http://python.org/sf/580495
MacOSX python.app build problems (2002-07-12)
	http://python.org/sf/580550
import lock should be exposed (2002-07-13)
	http://python.org/sf/580952
Provoking infinite scanner loops (2002-07-13)
	http://python.org/sf/581080
smtplib.SMTP.ehlo method esmtp_features  (2002-07-13)
	http://python.org/sf/581165
bug in splituser(host) in urllib (2002-07-14)
	http://python.org/sf/581529
pty.spawn - wrong error caught (2002-07-15)
	http://python.org/sf/581698
''.split() docstring clarification (2002-07-15)
	http://python.org/sf/582071
pickle error message unhelpful (2002-07-16)
	http://python.org/sf/582297
lib-dynload/*.so wrong permissions (2002-07-17)
	http://python.org/sf/583206
ConfigParser spaces in keys not read (2002-07-18)
	http://python.org/sf/583248
wrong dest size (2002-07-18)
	http://python.org/sf/583477
gethostbyaddr lag (2002-07-19)
	http://python.org/sf/583975
add way to detect bsddb version (2002-07-21)
	http://python.org/sf/584409
os.getlogin() fails (2002-07-21)
	http://python.org/sf/584566
no doc for os.fsync and os.fdatasync (2002-07-21)
	http://python.org/sf/584695

New Patches
-----------

Deprecate bsddb (2002-05-06)
	http://python.org/sf/553108
Executable .pyc-files with hashbang (2002-06-23)
	http://python.org/sf/572796
(?(id/name)yes|no) re implementation (2002-06-23)
	http://python.org/sf/572936
cgi.py and rfc822.py unquote fixes (2002-06-24)
	http://python.org/sf/573197
Changing owner of symlinks (2002-06-25)
	http://python.org/sf/573770
makesockaddr, use addrlen with AF_UNIX (2002-06-27)
	http://python.org/sf/574707
Make python-mode.el use jython (2002-06-27)
	http://python.org/sf/574747
Make python-mode.el use "jython" interp (2002-06-27)
	http://python.org/sf/574750
list.extend docstring fix (2002-06-27)
	http://python.org/sf/574867
PyTRASHCAN slots deallocation (2002-06-28)
	http://python.org/sf/575073
python-mode patch for ipython support (2002-06-30)
	http://python.org/sf/575774
SSL release GIL (2002-06-30)
	http://python.org/sf/575827
Alternative implementation of interning (2002-07-01)
	http://python.org/sf/576101
Extend PyErr_SetFromWindowsErr (2002-07-02)
	http://python.org/sf/576458
Remove PyArg_Parse() and METH_OLDARGS (2002-07-03)
	http://python.org/sf/577031
Merge xrange() into slice() (2002-07-05)
	http://python.org/sf/577875
fix for problems with test_longexp  (2002-07-06)
	http://python.org/sf/578297
Put IDE scripts in ~/Library (2002-07-08)
	http://python.org/sf/578667
incompatible, but nice strings improveme (2002-07-08)
	http://python.org/sf/578688
Solaris openpty() and forkpty() addition (2002-07-09)
	http://python.org/sf/579433
Shadow Password Support Module (2002-07-09)
	http://python.org/sf/579435
Build MachoPython with 2level namespace (2002-07-10)
	http://python.org/sf/579841
xreadlines caching, file iterator (2002-07-11)
	http://python.org/sf/580331
less restrictive HTML comments (2002-07-12)
	http://python.org/sf/580670
Fix for seg fault on test_re on mac osx (2002-07-12)
	http://python.org/sf/580869
new version of Set class (2002-07-13)
	http://python.org/sf/580995
Canvas "select_item" always returns None (2002-07-14)
	http://python.org/sf/581396
info reader bug (2002-07-14)
	http://python.org/sf/581414
fix to pty.spawn error on Linux (2002-07-15)
	http://python.org/sf/581705
Alternative PyTRASHCAN subtype_dealloc (2002-07-15)
	http://python.org/sf/581742
smtplib.py patch for macmail esmtp auth (2002-07-17)
	http://python.org/sf/583180
make file object an iterator (2002-07-17)
	http://python.org/sf/583235
get python to link on OSF1 (Dec Unix) (2002-07-20)
	http://python.org/sf/584245
yield allowed in try/finally (2002-07-21)
	http://python.org/sf/584626

Closed Bugs
-----------

ihooks on windows and pythoncom (PR#294) (2000-07-31)
	http://python.org/sf/210637
httplib does not check if port is valid (easy to fix?) (2000-12-13)
	http://python.org/sf/225744
httplib problem with '100 Continue' (2001-01-02)
	http://python.org/sf/227361
[windows] os.popen doens't kill subprocess when interrupted (2001-02-06)
	http://python.org/sf/231273
+= not assigning to same var it reads (2001-04-21)
	http://python.org/sf/417930
httplib: multiple Set-Cookie headers (2001-06-12)
	http://python.org/sf/432621
[win32] KeyboardInterrupt Not Caught (2001-07-10)
	http://python.org/sf/439992
Evaluating func_code causing core dump (2001-07-23)
	http://python.org/sf/443866
HTTPSConnect.__init__ too tricky (2001-09-04)
	http://python.org/sf/458463
base n integer to string conversion (2001-09-25)
	http://python.org/sf/465045
Tut: Dict used before dicts explained (2001-11-10)
	http://python.org/sf/480337
SAX Attribute/AttributesNS class missing (2001-11-22)
	http://python.org/sf/484603
Error building info docs (2001-12-20)
	http://python.org/sf/495624
'lambda' documentation in strange place (2001-12-27)
	http://python.org/sf/497109
unicode() docs don't mention LookupError (2002-02-06)
	http://python.org/sf/513666
bogus URLs cause exception in httplib (2002-03-07)
	http://python.org/sf/527064
Nested Scopes bug (Confirmed) (2002-03-10)
	http://python.org/sf/528274
Build unable to import w/gcc 3.0.4 (2002-04-11)
	http://python.org/sf/542737
buffer slice type inconsistant (2002-04-20)
	http://python.org/sf/546434
urllib/httplib vs corrupted tcp/ip stack (2002-04-22)
	http://python.org/sf/547093
Unicode encoders appears to leak references (2002-04-28)
	http://python.org/sf/549731
email.Utils.encode doesn't obey rfc2047 (2002-05-06)
	http://python.org/sf/552957
unittest.TestResult documentation (2002-05-20)
	http://python.org/sf/558278
HTTPSConnection memory leakage (2002-05-22)
	http://python.org/sf/559117
Getting traceback in embedded python. (2002-06-01)
	http://python.org/sf/563338
urllib2 can't cope with error response (2002-06-02)
	http://python.org/sf/563665
compile traceback must include filename (2002-06-05)
	http://python.org/sf/564931
Misleading string constant. (2002-06-12)
	http://python.org/sf/568269
minor improvement to Grammar file (2002-06-13)
	http://python.org/sf/568412
Broken pre.subn() (and pre.sub()) (2002-06-17)
	http://python.org/sf/570057
glob() fails for network drive in cgi (2002-06-19)
	http://python.org/sf/571167
imaplib fetch is broken (2002-06-19)
	http://python.org/sf/571334
Numeric Literal Anomoly (2002-06-19)
	http://python.org/sf/571382
Segmentation fault in Python 2.3 (2002-06-20)
	http://python.org/sf/571885
python-mode IM parses code in docstrings (2002-06-21)
	http://python.org/sf/572341
Memory leak in object comparison (2002-06-22)
	http://python.org/sf/572567

Closed Patches
--------------

Optional memory profiler (2000-08-18)
	http://python.org/sf/401229
Pure Python strptime() (PEP 42) (2001-10-23)
	http://python.org/sf/474274
Unicode support in email.Utils.encode (2001-12-07)
	http://python.org/sf/490456
httplib.py screws up on 100 response (2001-12-31)
	http://python.org/sf/498149
make python-mode play nice with gdb (2002-01-28)
	http://python.org/sf/509975
imputil.py can't import "\r\n" .py files (2002-02-28)
	http://python.org/sf/523944
urllib2.py: fix behavior with proxies (2002-03-08)
	http://python.org/sf/527518
Better AttributeError formatting (2002-03-20)
	http://python.org/sf/532638
RFC 2231 support for email package (2002-04-26)
	http://python.org/sf/549133
Fix for httplib bug with 100 Continue (2002-05-01)
	http://python.org/sf/551273
Py_AddPendingCall doesn't unlock on fail (2002-05-03)
	http://python.org/sf/552161
os.uname() on Darwin space in machine (2002-05-24)
	http://python.org/sf/560311
Remove UserDict from cookie.py (2002-05-31)
	http://python.org/sf/562987
email Parser non-strict mode (2002-06-06)
	http://python.org/sf/565183
Expose _Py_ReleaseInternedStrings (2002-06-06)
	http://python.org/sf/565378
Rationalize DL_IMPORT and DL_EXPORT (2002-06-07)
	http://python.org/sf/566100
Convert slice and buffer to types (2002-06-13)
	http://python.org/sf/568544
Remove support for Win16 (2002-06-16)
	http://python.org/sf/569753
Changes (?P=) with optional backref (2002-06-20)
	http://python.org/sf/571976



From barry@zope.com  Mon Jul 22 17:05:58 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 22 Jul 2002 12:05:58 -0400
Subject: [Python-Dev] Sorting
References: <15676.8253.741856.171571@anthem.wooz.org>
 <20020722154415.79981.qmail@web40111.mail.yahoo.com>
Message-ID: <15676.11622.589953.460393@anthem.wooz.org>

>>>>> "SG" == Scott Gilbert <xscottg@yahoo.com> writes:

    SG> Or you could have an optional parameter that defaults to
    SG> whatever the more sane value should be (probably stable), and
    SG> when the user stumbles across this parameter they stumble
    SG> across the docs too.

    SG> I think Tim's codebloat argument is more compelling.

Except that in

    http://mail.python.org/pipermail/python-dev/2002-July/026837.html

Tim says:

    "Back on Earth, among Python users the most frequent complaint
     I've heard is that list.sort() isn't stable."

and here

    http://mail.python.org/pipermail/python-dev/2002-July/026854.html

Tim seems <wink> to be arguing against stable sort as being the
default due to code bloat.

As Tim's Official Sysadmin, I'm only good at channeling him on one
subject, albeit probably one he'd deem most important to his life:
lunch.  So I'm not sure if he's arguing for or against stable sort
being the default. ;)

-Barry



From skip@pobox.com  Mon Jul 22 17:19:40 2002
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 22 Jul 2002 11:19:40 -0500
Subject: [Python-Dev] Weekly bug report summary
Message-ID: <15676.12444.402113.866101@12-248-11-90.client.attbi.com>

Neal Norwitz asked me what happened to the weekly bug summary mailing.  I've
been off-net a lot and was running it via cron on my laptop Sunday
mornings.  I just ran the bug reporter manually and migrated the database
and script over to the Mojam web server.  With any luck, the script will run
properly next Sunday morning.

Skip



From barry@zope.com  Mon Jul 22 18:24:52 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 22 Jul 2002 13:24:52 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
References: <E17W9rU-0001Nk-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <15676.16356.112688.518256@anthem.wooz.org>

[Diverting to python-dev... -BAW]

    timmie> Update of /cvsroot/python/python/dist/src/Lib/email/test
    timmie> In directory
    timmie> usw-pr-cvs1:/tmp/cvs-serv3289/python/lib/email/test

    | Modified Files:
    | 	test_email_codecs.py 
    | Log Message:
    | Changed import from
    |     from test.test_support import TestSkipped, run_unittest
    | to
    |     from test_support import TestSkipped, run_unittest

    timmie> Otherwise, if the Japanese codecs aren't installed,
    timmie> regrtest doesn't believe the TestSkipped exception raised
    timmie> by this test matches the

    timmie>     except (ImportError, test_support.TestSkipped), msg:

    timmie> it's looking for, and reports the skip as a crash failure
    timmie> instead of as a skipped test.

    timmie> I suppose this will make it harder to run this test
    timmie> outside of regrtest, but under the assumption only Barry
    timmie> does that, better to make it skip cleanly for everyone
    timmie> else.

A better fix, IMO, is to recognize that the `test' package has become
a full fledged standard lib package (a Good Thing, IMO), heed our own
admonitions not to do relative imports, and change the various places
in the test suite that "import test_support" (or equiv) to "import
test.test_support" (or equiv).

I've twiddled the test suite to do things this way, and all the
(expected Linux) tests pass, so I'd like to commit these changes.
Unit test writers need to remember to use test.test_support instead of
just test_support.  We could do something wacky like remove '' from
sys.path if we really cared about enforcing this.  It would also be
good for folks on other systems to make sure I haven't missed a
module.

-Barry



From tim.one@comcast.net  Mon Jul 22 18:28:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 13:28:11 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <15676.11622.589953.460393@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJMAGAB.tim.one@comcast.net>

[Barry Warsaw]
> Except that in
>
>     http://mail.python.org/pipermail/python-dev/2002-July/026837.html
>
> Tim says:
>
>     "Back on Earth, among Python users the most frequent complaint
>      I've heard is that list.sort() isn't stable."

Yes, and because the current samplesort falls back to a stable sort when
lists are small, almost everyone who cares about this and tries to guess
about stability via trying small examples comes to a wrong conclusion.

> and here
>
>     http://mail.python.org/pipermail/python-dev/2002-July/026854.html
>
> Tim seems <wink> to be arguing against stable sort as being the
> default due to code bloat.

I'm arguing there against having two highly complex and long-winded sorting
algorithms in the core.  Pick one.  In favor of samplesort:

+ It can be much faster in very-many-equal-elements cases (note that
  ~sort lists have only 4 distinct values, each repeated N/4 times
  and spread uniformly across the whole list).

+ While it requires some extra memory, that lives on the stack and
  is O(log N).  As a result, it can never raise MemoryError unless
  a comparison function does.

+ It's never had a bug reported against it (so is stable in a different
  sense <wink>).

In favor of timsort:

+ It's stable.

+ The code is more uniform and so potentially easier to grok, and
  because it has no random component is easier to predict (e.g., it's
  certain that it has no quadratic-time cases).

+ It's incredibly faster in the face of many more kinds of mild
  disorder, which I believe are very common in the real world.  As
  obvious examples, you add an increment of new data to an already-
  sorted file, or paste together several sorted files.  timsort
  screams in those cases, but they may as well be random to
  samplesort, and the difference in runtime can easily exceed a
  factor of 10.  A factor of 10 is a rare and wonderful thing in
  algorithm development.

Against timsort:

+ It can require O(N) temp storage, although the constant is small
  compared to object sizes.  That means it can raise MemoryError
  even if a comparison function never does.

+ Very-many-equal-elements cases can be much slower, but that's partly
  because it *is* stable, and preserving the order of equal elements
  is exactly what makes stability hard to achieve in a fast sort
  (samplesort can't be made stable efficiently).
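
The "mild disorder" point above can be probed by counting comparisons
rather than timing.  This is only a sketch under stated assumptions: it
measures whatever sort the running Python ships (in current CPython, a
descendant of timsort), not the 2002 patch, and the helper name
count_comparisons, the list sizes and the seed are all made up for the
example.

```python
import functools
import random


def count_comparisons(data):
    """Return how many comparisons sorted() makes on data."""
    calls = 0

    def cmp(x, y):
        nonlocal calls
        calls += 1
        return (x > y) - (x < y)

    sorted(data, key=functools.cmp_to_key(cmp))
    return calls


rng = random.Random(0)                      # fixed seed, repeatable runs
base = list(range(10000))                   # an already-sorted file
increment = [rng.randrange(10000) for _ in range(100)]

nearly_sorted = base + increment            # new data appended to sorted data
shuffled = nearly_sorted[:]
rng.shuffle(shuffled)                       # same elements, random order

nearly_sorted_cost = count_comparisons(nearly_sorted)
shuffled_cost = count_comparisons(shuffled)
print(nearly_sorted_cost, shuffled_cost)
```

Comparison counts are not the same thing as runtime, but they are
independent of machine load, which makes the gap between the two
inputs easier to see.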

> As Tim's Official Sysadmin, I'm only good at channeling him on one
> subject, albeit probably one he'd deem most important to his life:
> lunch.  So I'm not sure if he's arguing for or against stable sort
> being the default. ;)

All else being equal, a stable sort is a better choice.  Alas, all else
isn't equal.  If Python had no sort method now, I'd pick timsort with scant
hesitation.  Speaking of which, is it time for lunch yet <wink>?




From nas@python.ca  Mon Jul 22 19:18:47 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 22 Jul 2002 11:18:47 -0700
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEJMAGAB.tim.one@comcast.net>; from tim.one@comcast.net on Mon, Jul 22, 2002 at 01:28:11PM -0400
References: <15676.11622.589953.460393@anthem.wooz.org> <LNBBLJKPBEHFEDALKOLCAEJMAGAB.tim.one@comcast.net>
Message-ID: <20020722111847.A3095@glacier.arctrix.com>

Tim Peters wrote:
> Pick one.

I pick timsort.  Stability is nice to have.  It sounds like if you want
a stable sort you will have to pay for it (e.g. ~sort is slower).  The
fact that timsort is faster on partially sorted inputs more than makes
up for it.

  Neil



From tim.one@comcast.net  Mon Jul 22 19:20:25 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 14:20:25 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test
 test_email_codecs.py,1.1,1.2
In-Reply-To: <15676.16356.112688.518256@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJPAGAB.tim.one@comcast.net>

[Barry]
> A better fix, IMO, is to recognize that the `test' package has become
> a full fledged standard lib package (a Good Thing, IMO), heed our own
> admonitions not to do relative imports, and change the various places
> in the test suite that "import test_support" (or equiv) to "import
> test.test_support" (or equiv).
>
> I've twiddled the test suite to do things this way, and all the
> (expected Linux) tests pass, so I'd like to commit these changes.
> Unit test writers need to remember to use test.test_support instead of
> just test_support.  We could do something wacky like remove '' from
> sys.path if we really cared about enforcing this.  It would also be
> good for folks on other systems to make sure I haven't missed a
> module.

Note test/README, which says in part:

"""
NOTE:  Always import something from test_support like so:

    from test_support import verbose

or like so:

    import test_support
    ... use test_support.verbose in the code ...

Never import anything from test_support like this:

    from test.test_support import verbose

"test" is a package already, so can refer to modules it contains without
"test." qualification.  If you do an explicit "test.xxx" qualification, that
can fool Python into believing test.xxx is a module distinct from the xxx
in the current package, and you can end up importing two distinct copies of
xxx.  This is especially bad if xxx=test_support, as regrtest.py can (and
routinely does) overwrite its "verbose" and "use_large_resources"
attributes:  if you get a second copy of test_support loaded, it may not
have the same values for those as regrtest intended.
"""

I don't have a deep understanding of these miserable issues, so settled for
a one-line patch that worked.  The admonition to never import from
test.test_support was a BDFL Pronouncement at the time.
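
The two-copies problem the README describes can be reproduced in
miniature.  The sketch below is an illustration under assumptions: the
names demo_pkg and demo_support are hypothetical stand-ins for test and
test_support, and it is written for a current Python, where the
duplication appears once the package directory itself is on sys.path.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: <tmp>/demo_pkg/demo_support.py
tmp = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w"):
    pass
with open(os.path.join(pkg_dir, "demo_support.py"), "w") as f:
    f.write("verbose = 0\n")

# Put both the parent directory and the package directory on sys.path,
# so the module is reachable under two different names.
sys.path[:0] = [tmp, pkg_dir]
importlib.invalidate_caches()

import demo_pkg.demo_support as qualified
import demo_support as bare

bare.verbose = 1                        # what a test driver might do
print(qualified is bare)                # two distinct module objects
print(qualified.verbose, bare.verbose)  # the setting reached only one copy
```

The same file is loaded twice, once per name, so a setting written to
one copy is invisible through the other.  Things work as long as every
importer uses the same spelling.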

Note that Jack runs tests in ways nobody else does, via importing something
or other from an interactive Python session (Mac Classic doesn't have a
cmdline shell -- something like that).  It's always an adventure trying to
guess how things will break for him, although I'm not sure your suggestion
is (or isn't) relevant to Jack.

I imagine things will work provided that all imports "are the same".  I'm
not sure fiddling all the code is worth it just to save a line of typing in
the email package's test suite.




From tim.one@comcast.net  Mon Jul 22 20:32:09 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 15:32:09 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEJMAGAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEKEAGAB.tim.one@comcast.net>

If you have access to a good library, you'll enjoy reading the original
paper on samplesort; or a scan can be purchased from the ACM:

    Samplesort:  A Sampling Approach to Minimal Storage Tree Sorting
    W. D. Frazer, A. C. McKellar
    JACM, Vol. 17, No. 3, July 1970

As in many papers of its time, the algorithm description is English prose
and raises more questions than it answers, but the mathematical analysis is
extensive.

Two things made me laugh out loud:

1. The largest array they tested had 50,000 elements, because that
   was the practical upper limit given storage sizes at the time.
   Now that's such a tiny case that even in Python it's hard to
   time it accurately.

2. They thought about using a different sort method for small buckets,

      However, the additional storage required for the program would
      reduce the size of the input sequence which could be accommodated,
      and hence it is an open question as to whether or not the
      efficiency of the total sorting process could be improved in
      this way.

In some ways, life was simpler then <wink>.

for-example-i-had-more-hair-ly y'rs  - tim




From barry@zope.com  Mon Jul 22 20:38:16 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 22 Jul 2002 15:38:16 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test
 test_email_codecs.py,1.1,1.2
References: <15676.16356.112688.518256@anthem.wooz.org>
 <LNBBLJKPBEHFEDALKOLCOEJPAGAB.tim.one@comcast.net>
Message-ID: <15676.24360.88972.449273@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

    TP> Note test/README, which says in part:

    TP> """
    TP> NOTE:  Always import something from test_support like so:

    TP>     from test_support import verbose

    TP> or like so:

    |     import test_support
    |     ... use test_support.verbose in the code ...

    TP> Never import anything from test_support like this:

    TP>     from test.test_support import verbose

    TP> "test" is a package already, so can refer to modules it
    TP> contains without "test." qualification.  If you do an explicit
    TP> "test.xxx" qualification, that can fool Python into believing
    TP> test.xxx is a module distinct from the xxx in the current
    TP> package, and you can end up importing two distinct copies of
    TP> xxx.  This is especially bad if xxx=test_support, as
    TP> regrtest.py can (and routinely does) overwrite its "verbose"
    TP> and "use_large_resources" attributes: if you get a second copy
    TP> of test_support loaded, it may not have the same values for
    TP> those as regrtest intended.  """

Yep, but I think those recommendations are out-of-date.  You added
them to the file almost 2 years ago. ;)

Note that the warnings in that README go away when regrtest also
imports test_support from the test package.

    TP> I don't have a deep understanding of these miserable issues,
    TP> so settled for a one-line patch that worked.  The admonition
    TP> to never import from test.test_support was a BDFL
    TP> Pronouncement at the time.

Hmm, I don't know if he considers that admonition to still be in
effect, but I'd like to hope not.  We're discouraging relative imports
these days, and I don't see any deep reason why the regression tests
need to break this rule to function (and indeed, on Unix at least
it doesn't seem to).

    TP> Note that Jack runs tests in ways nobody else does, via
    TP> importing something or other from an interactive Python
    TP> session (Mac Classic doesn't have a cmdline shell -- something
    TP> like that).  It's always an adventure trying to guess how
    TP> things will break for him, although I'm not sure your
    TP> suggestion is (or isn't) relevant to Jack.

I wouldn't presume to know!  So I'll generate a patch, upload it to
SF, and assign it to Jack for review.

    TP> I imagine things will work provided that all imports "are the
    TP> same".

Yes.
    
    TP> I'm not sure fiddling all the code is worth it just to
    TP> save a line of typing in the email package's test suite.

It's a bit uglier than that because Lib/test gets magically
added to sys.path during regrtest by virtue of running "python
Lib/test/regrtest.py".  So to find the "same" test_support module,
you'd probably have to do something more along the lines of

>>> import os
>>> import sys
>>> import test.regrtest
>>> testdir = os.path.dirname(test.regrtest.__file__)
>>> sys.path.insert(0, testdir)
>>> import test_support

blechi-ly y'rs,
-Barry



From rfarrer@avisionone.com  Mon Jul 22 21:13:48 2002
From: rfarrer@avisionone.com (Rick Farrer)
Date: Mon, 22 Jul 2002 15:13:48 -0500
Subject: [Python-Dev] Remove from mailing list
Message-ID: <001001c231bc$4b0310e0$3745fea9@ibm1499>

Please remove me from your mailing list.

Thanks






From tim.one@comcast.net  Tue Jul 23 03:07:57 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 22 Jul 2002 22:07:57 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEKEAGAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAELMAGAB.tim.one@comcast.net>

In an effort to save time on email (ya, right ...), I wrote up a pretty
detailed overview of the "timsort" algorithm.  It's attached.

all-will-be-revealed-ly y'rs  - tim

[attachment: timsort.txt]

/*---------------------------------------------------------------------------
A stable natural mergesort with excellent performance on many flavors of
lightly disordered arrays, and as fast as samplesort on random arrays.

In a nutshell, the main routine marches over the array once, left to right,
alternately identifying the next run, and then merging it into the previous
runs.  Everything else is complication for speed, and some measure of memory
efficiency.

Runs
----
count_run() returns the # of elements in the next run.  A run is either
"ascending", which means non-decreasing:

    a0 <= a1 <= a2 <= ...

or "descending", which means strictly decreasing:

    a0 > a1 > a2 > ...

Note that a run is always at least 2 long, unless we start at the array's
last element.

The definition of descending is strict, because the main routine reverses
a descending run in-place, transforming a descending run into an ascending
run.  Reversal is done via the obvious fast "swap elements starting at each
end, and converge at the middle" method, and that can violate stability if
the slice contains any equal elements.  Using a strict definition of
descending ensures that a descending run contains distinct elements.
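
The run definition above can be sketched in Python.  This is an
illustration, not the C code: the name count_run comes from the text,
but the signature, the (length, descending) return value and the use of
a Python list are assumptions made for the example.  Only "<" is used
for comparisons.

```python
def count_run(a, lo, hi):
    """Return (length, descending) for the run starting at a[lo] in a[lo:hi]."""
    if hi - lo <= 1:
        return hi - lo, False          # 0 or 1 elements left: trivial run
    n = 2
    if a[lo + 1] < a[lo]:
        # Strictly decreasing: stop at the first element that is not smaller.
        while lo + n < hi and a[lo + n] < a[lo + n - 1]:
            n += 1
        return n, True
    # Non-decreasing: stop at the first element that is smaller.
    while lo + n < hi and not (a[lo + n] < a[lo + n - 1]):
        n += 1
    return n, False


data = [5, 4, 4, 3]
length, descending = count_run(data, 0, len(data))
if descending:
    # Safe to reverse in place: a strictly descending run has no equal
    # elements, so reversal cannot reorder equal items.
    data[:length] = data[:length][::-1]
print(length, descending, data)
```

Note how the equal pair (4, 4) ends the descending run after two
elements; that is the strictness rule doing its job.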

If an array is random, it's very unlikely we'll see long runs.  Much of the
rest of the algorithm is geared toward exploiting long runs, and that takes
a fair bit of work.  That work is a waste of time if the data is random, so
if a natural run contains fewer than MIN_MERGE_SLICE elements, the main loop
artificially boosts it to MIN_MERGE_SLICE elements, via binary insertion
sort applied to the right number of array elements following the short
natural run.  In a random array, *all* runs are likely to be MIN_MERGE_SLICE
long as a result, and merge_at() short-circuits the expensive stuff in that
case.
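
The boosting step can be sketched as a binary insertion sort that
starts from an already-ascending prefix.  This is a sketch under
assumptions: the name extend_run and its parameters are hypothetical,
and bisect.bisect_right stands in for the hand-written binary search.
Inserting to the right of equal elements is what keeps the step stable.

```python
import bisect


def extend_run(a, lo, run_end, target_end):
    """Sort a[lo:target_end] in place, given a[lo:run_end] is ascending."""
    for i in range(run_end, target_end):
        pivot = a[i]
        # Find the slot after any elements equal to pivot (keeps stability).
        pos = bisect.bisect_right(a, pivot, lo, i)
        # Shift a[pos:i] one slot to the right, then drop pivot into place.
        a[pos + 1:i + 1] = a[pos:i]
        a[pos] = pivot


values = [1, 3, 5, 4, 2, 9]
extend_run(values, 0, 3, 5)     # natural run is values[0:3]; boost to length 5
print(values)
```

The element at index 5 is left alone: only the slice up to target_end
is touched, matching "the right number of array elements following the
short natural run".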

The Merge Pattern
-----------------
In order to exploit regularities in the data, we're merging on natural
run lengths, and they can become wildly unbalanced.  But that's a Good Thing
for this sort!

Stability constrains permissible merging patterns.  For example, if we have
3 consecutive runs of lengths

    A:10000  B:20000  C:10000

we dare not merge A with C first, because if A, B and C happen to contain
a common element, it would get out of order wrt its occurrence(s) in B.  The
merging must be done as (A+B)+C or A+(B+C) instead.

So merging is always done on two consecutive runs at a time, and in-place,
although this may require some temp memory (more on that later).

When a run is identified, its base address and length are pushed on a stack
in the MergeState struct.  merge_collapse() is then called to see whether
it should merge it with preceding run(s).  We would like to delay merging
as long as possible in order to exploit patterns that may come up later, but
we would like to do merging as soon as possible to exploit that the run just
found is still high in the memory hierarchy.  We also can't delay merging
"too long" because it consumes memory to remember the runs that are still
unmerged, and the stack has a fixed size.

What turned out to be a good compromise maintains two invariants on the
stack entries, where A, B and C are the lengths of the three rightmost
not-yet-merged slices:

1.   A > B+C
2.   B > C

Note that, by induction, #2 implies the lengths of pending runs form a
decreasing sequence.  #1 implies that, reading the lengths right to left,
the pending-run lengths grow at least as fast as the Fibonacci numbers.
Therefore the stack can never grow larger than about log_base_phi(N) entries,
where phi = (1+sqrt(5))/2 ~= 1.618.  Thus a small # of stack slots suffice
for very large arrays.
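
To put numbers on "a small # of stack slots", here is a small Python
calculation of log_base_phi(N).  The helper name is made up for this
example, and the result is only the rough estimate quoted above, not an
exact bound on the stack depth.

```python
import math

PHI = (1 + math.sqrt(5)) / 2      # ~= 1.618


def approx_max_pending_runs(n):
    """Rough estimate: log base phi of n, rounded up (hypothetical helper)."""
    return math.ceil(math.log(n) / math.log(PHI))
```

For an array of 2**32 elements this comes to 47 entries, and for 2**64
elements to 93, so a fixed-size stack of well under 100 slots is enough
for this estimate.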

If A <= B+C, the smaller of A and C is merged with B, and the new run replaces
the A,B or B,C entries; e.g., if the last 3 entries are

    A:30  B:20  C:10

then B is merged with C, leaving

    A:30  BC:30

on the stack.  Or if they were

    A:500  B:400  C:1000

then A is merged with B, leaving

    AB:900  C:1000

on the stack.

In both examples, the stack configuration still violates invariant #2, and
merge_at() goes on to continue merging runs until both invariants are
satisfied.  As an extreme case, suppose we didn't do the MIN_MERGE_SLICE
gimmick, and natural runs were of lengths 128, 64, 32, 16, 8, 4, 2, and 2.
Nothing would get merged until the final 2 was seen, and that would trigger
7 perfectly balanced (both runs involved have the same size) merges.
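
The two invariants and the collapse rule can be simulated on run lengths
alone.  The following Python sketch is an illustration under that
simplification: it tracks only lengths (no array data), and the function
names and the "merges" log are assumptions for the example, not the C
routines themselves.

```python
def merge_collapse(runs, merges):
    """Merge pending run lengths until both invariants hold.

    runs   -- list of pending run lengths, rightmost run last
    merges -- log of (left_length, right_length) pairs, appended to
    With A, B, C the three rightmost lengths, the invariants are
    A > B+C and B > C.
    """
    while len(runs) > 1:
        n = len(runs) - 2                     # index of B
        if n > 0 and runs[n - 1] <= runs[n] + runs[n + 1]:
            # Invariant 1 fails: merge B with the smaller of A and C.
            if runs[n - 1] < runs[n + 1]:
                n -= 1                        # merge A with B instead
        elif runs[n] > runs[n + 1]:
            break                             # both invariants hold
        merges.append((runs[n], runs[n + 1]))
        runs[n:n + 2] = [runs[n] + runs[n + 1]]


def simulate(lengths):
    """Push each run length in turn, collapsing after every push."""
    runs, merges = [], []
    for length in lengths:
        runs.append(length)
        merge_collapse(runs, merges)
    return runs, merges
```

Feeding simulate() the lengths 128, 64, 32, 16, 8, 4, 2, 2 logs no
merges until the last 2 arrives, and then seven merges of equal-sized
runs, matching the extreme case described above.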

The thrust of these rules when they trigger merging is to balance the run
lengths as closely as possible, while keeping a low bound on the number
of runs we have to remember.  This is maximally effective for random data,
where all runs are likely to be of (artificially forced) length
MIN_MERGE_SLICE, and then we get a sequence of perfectly balanced merges.

OTOH, the reason this sort is so good for lightly disordered data has to do
with wildly unbalanced run lengths.

Merge Memory
------------
Merging adjacent runs of lengths A and B in-place is very difficult.
Theoretical constructions are known that can do it, but they're too difficult
and slow for practical use.  But if we have temp memory equal to min(A, B),
it's easy.

If A is smaller, copy A to a temp array, leave B alone, and then we can
do the obvious merge algorithm left to right, from the temp area and B,
starting the stores into where A used to live.  There's always a free area
in the original area comprising a number of elements equal to the number
not yet merged from the temp array (trivially true at the start; proceed
by induction).  The only tricky bit is that if a comparison raises an
exception, we have to remember to copy the remaining elements back in from
the temp area, lest the array end up with duplicate entries from B.

If B is smaller, much the same, except that we need to merge right to left,
starting the stores at the right end of where B used to live.
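
For the "A is smaller" case, a minimal Python sketch might look like
this.  The name merge_lo and the index-based signature are assumptions
for the example; the right-to-left case for a smaller B would mirror it.

```python
def merge_lo(a, lo, mid, hi):
    """Stably merge sorted a[lo:mid] (run A) with sorted a[mid:hi] (run B).

    Only run A is copied to temp storage.  Hypothetical sketch of the
    left-to-right case; uses only "<" for comparisons.
    """
    temp = a[lo:mid]          # temp copy of A
    i = 0                     # next unmerged element of temp
    j = mid                   # next unmerged element of B
    dest = lo                 # next store position
    try:
        while i < len(temp) and j < hi:
            # Take from B only when strictly smaller, so equal
            # elements from A stay ahead of those from B.
            if a[j] < temp[i]:
                a[dest] = a[j]
                j += 1
            else:
                a[dest] = temp[i]
                i += 1
            dest += 1
    finally:
        # The free area a[dest:j] always has room for exactly the
        # unmerged part of temp; copy it back even if "<" raised.
        a[dest:dest + len(temp) - i] = temp[i:]
```

The try/finally mirrors the "tricky bit" mentioned above: whatever
happens during a comparison, the list ends up as a permutation of its
original contents.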

In all, then, we need no more than N/2 temp array slots.

A refinement:  When we're about to merge adjacent runs A and B, we first
do a form of binary search (more on that later) to see where B[0] should
end up in A.  Elements in A preceding that point are already in their final
positions, effectively shrinking the size of A.  Likewise we also search
to see where A[-1] should end up in B, and elements of B after that point
can also be ignored.  This cuts the amount of temp memory needed by the
same amount.  It may not pay, though.

Merge Algorithms
----------------
When merging runs of lengths A and B, if A/2 <= B <= 2*A (i.e., they're
within a factor of two of each other), we do the usual straightforward
one-at-a-time merge.  This can take up to A+B comparisons.  If the data is
random, there's very little potential for doing better than that.  If there
are a great many equal elements, we can do better than that, but there's no
way to know whether there *are* a great many equal elements short of doing a
great many additional comparisons (we only use "<" in sort), and that's
too expensive when it doesn't pay.

If the sizes of A and B are out of whack, we can do much better.  The
Hwang-Lin merging algorithm is very good at merging runs of mismatched
lengths if the data is random, but I believe it would be a mistake to
try that here.  As explained before, if we really do have random data, we're
almost certainly going to stay in the A/2 <= B <= 2*A case.

Instead we assume that wildly different run lengths correspond to *some*
sort of clumpiness in the data.  Without loss of generality, assume A is
the shorter run.  We first look for A[0] in B.  We do this via "galloping",
comparing A[0] in turn to B[0], B[1], B[3], B[7], ..., B[2**j - 1], ...,
until finding the k such that B[2**(k-1) - 1] < A[0] <= B[2**k - 1].  This
takes at most log2(B) comparisons, and, unlike a straight binary search,
favors finding the right spot early in B.  Why that's important may <wink>
become clear later.

After finding such a k, the region of uncertainty is reduced to 2**(k-1) - 1
consecutive elements, and a straight binary search requires exactly k-1
comparisons to nail it.

Now we can copy all the B's up to that point in one chunk, and then copy A[0].
If the data really is clustered, the new A[0] (what was A[1] at the start)
is likely to belong near the start of what remains of the B run.  That's
why we gallop first instead of doing a straight binary search:  if the new
A[0] really is near the start of the remaining B run, galloping will find it
much quicker.  OTOH, if we're wrong, galloping + binary search never takes
more than 2*log2(B) compares, so can't become a disaster.  If the clumpiness
comes in distinct clusters, gallop + binary search also adapts nicely to
that.
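
A small Python sketch of gallop + binary search follows.  The name
gallop_left is an assumption for this example, and the final binary
search is delegated to the standard bisect module rather than written
out; the C code may organize this differently.

```python
from bisect import bisect_left


def gallop_left(key, b):
    """Return the leftmost index at which key could be inserted in sorted b.

    Probes b[0], b[1], b[3], b[7], ..., b[2**j - 1] until the key is
    bracketed, then finishes with a binary search inside the bracket.
    Hypothetical helper; only "<" comparisons are performed.
    """
    n = len(b)
    if n == 0 or not (b[0] < key):
        return 0
    last = 0                  # largest probed index with b[last] < key
    probe = 1                 # next index of the form 2**j - 1
    while probe < n and b[probe] < key:
        last = probe
        probe = 2 * probe + 1
    hi = min(probe, n)
    # Now b[last] < key, and either hi == n or key <= b[hi].
    return bisect_left(b, key, last + 1, hi)
```

When the key belongs near the start of b, the loop stops after only a
few probes, which is the property the text is relying on.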

I first learned about the galloping strategy in a related context; do a
Google search to find this paper available online:

    "Adaptive Set Intersections, Unions, and Differences" (2000)
    Erik D. Demaine, Alejandro López-Ortiz, J. Ian Munro

and its followup(s).
---------------------------------------------------------------------------*/

--Boundary_(ID_G6Ak++OVPb3/j8dlAUI+lw)--



From neal@metaslash.com  Tue Jul 23 04:19:32 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 22 Jul 2002 23:19:32 -0400
Subject: [Python-Dev] More Sorting
Message-ID: <3D3CCB44.4F2592ED@metaslash.com>

Sebastien Keim posted a patch (http://python.org/sf/544113) 
of a merge sort.  I didn't really review it, but it included
test and doc.  So if the bisect module is being added to, 
perhaps someone should review this patch.

Neal



From ping@zesty.ca  Tue Jul 23 05:57:24 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 22 Jul 2002 21:57:24 -0700 (PDT)
Subject: [Python-Dev] Re: The iterator story
In-Reply-To: <200207220450.g6M4o2u23472@oma.cosc.canterbury.ac.nz>
Message-ID: <Pine.LNX.4.44.0207222045270.1261-100000@ziggy>

SYNOPSIS: a slight adjustment to the definition of consume()
yields a simple solution that addresses both the destruction
issue and the multiple-iteration issue, without introducing
any new syntax.


On Mon, 22 Jul 2002, Greg Ewing wrote:
> As someone pointed out, it's pretty rare that you actually *want* to
> consume the sequence. Usually the choice is between "I don't care" and
> "The sequence must NOT be consumed".

Sure, i'll go for that.  What i'm after is the ability to say
"i would like this sequence not to be consumed."

> Of the two varieties of for-loop in your proposal, for-in
> obviously corresponds to the "must not be consumed" case,
> leading one to suppose that you intend for-from to be used in
> the don't-care case.

Right.

> But now you seem to be suggesting that library routines
> should always use for-in, and that the caller should
> convert an iterator to a sequence if he knows it's okay
> to consume it:

The two are semantically equivalent proposals.  I explained
them both in the original message that i posted proposing
the solution.  The 'consume()' library routine is just another
way to express 'for-from' without using new syntax.

However, it is true that 'consume()' is more generally useful.
It would be good to have, whether or not we had new syntax.
I acknowledge that i did not realize this at the time i wrote
the earlier message, or i would have stated the 'consume()'
(then called 'seq()') proposal first and the for-from proposal
second, instead of the opposite.

That is why i am sticking to talking about the no-new-syntax
version of the proposal for now.  I apologize if it seems
that i am asking you to follow a moving target.  I would like
you to recognize, though, that the underlying concept is the
same -- the programmer has to signal when an iterator is being
used like a sequence.

> Okay, that seems reasonable -- explicit is better than
> implicit. But... consider the following two library
> routines:
>
>   def printout1(s):
>     for x in s:
>       print x
>
>   def printout2(s):
>     for x in s:
>       for y in s:
>         print x, y
[...]
> no exception will be raised if you call printout2(consume(s))
> by mistake.

Good point!  Clearly my proposal did not take care of this case.
(But there are solutions below; read on.)

Upon some reflection, though, it seems to me that this problem
is orthogonal to the proposal: forcing the programmer to declare
when destruction is allowed neither solves nor exacerbates the
problem of printout2().  consume() is about destruction, whereas
printout2() is about multiple iteration.

> To get any safety benefit from your proposed arrangement,
> it seems to me that you'd need to write printout1 as
>
>   def printout1(s):
>     "s must be an iterator"
>     for x from s:
>       print x

I'm afraid i don't see how this bears on the problem you just
described.  It still would not be possible to write a safe version
of printout2() in either (a) the world of the current Python with
iterators or (b) a world where for-in does not accept iterators
and consume() has been introduced.

One real solution to this problem is what Oren has been suggesting
all along -- raise an IteratorExhausted exception if you try to fetch
an element from an iterator that has already thrown StopIteration.
In printout2(), this exception would occur on the second time through
the inner loop.  This works, but we can do even better.

After some thought today, i realized that there is a second solution.
Thanks for leading me to it, Greg!  With consume(), the programmer
has declared that the iterator is okay to destroy.  But my definition
of consume() was incomplete.  One slight change solves the problem:

    consume(y) returns x such that iter(x) returns y the
    first time, and raises IteratorConsumedException thereafter.

Now we're all set!  If consume(it) is passed to printout2(), an
exception is raised immediately before any damage is done.  This
detects whether you attempt to *start* the iterator twice, which
makes more sense than detecting whether you hit the *end* of the
iterator twice.

The insight is that protection against multiple iteration belongs
in the implementation of __iter__, not in the iterator itself --
because the iterator doesn't know whether it can be restarted.
The *provider* of the iterator does.
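
To make the adjusted definition concrete, here is one way consume()
could be sketched in Python.  This is an illustration only: consume and
IteratorConsumedException are the names from this proposal, not existing
built-ins, and the class-based implementation is an assumption.

```python
class IteratorConsumedException(Exception):
    """Raised when a consume() wrapper is asked to start iterating twice."""


class consume:
    """Wrap an iterator so that iter(wrapper) works exactly once."""

    def __init__(self, iterator):
        self._iterator = iterator
        self._started = False

    def __iter__(self):
        if self._started:
            raise IteratorConsumedException(
                "this iterator has already been handed out once")
        self._started = True
        return self._iterator


def pairs(s):
    """Same shape as printout2(): iterates over s in a nested loop."""
    result = []
    for x in s:
        for y in s:
            result.append((x, y))
    return result
```

With this sketch, pairs(consume(iter([1, 2, 3]))) raises
IteratorConsumedException as soon as the inner loop tries to start,
while pairs([1, 2, 3]) on a real container works as usual.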

> There's no doubt that it's very elegant theoretically,
> but in thinking through the implications, I'm not sure it
> would be all that helpful in practice, and might even
> turn out to be a nuisance if it requires putting in a
> lot of iter(x) and/or consume(x) calls.

It's not so bad.  You only have to say iter() or consume() in
exceptional cases, where you are specifically writing code to
manipulate iterators.  Everything else looks the same -- except
it's safe.

More importantly, neither iter() nor consume() need to be taught
on the first day of Python.

I think it all comes together quite nicely.  Here it is in summary:

    - Iterators just implement __next__.

    - Containers, and other things that want to be iterated over,
      just implement __iter__.

    - The new built-in routine consume(y) returns x such that iter(x)
      returns y the first time, and raises IteratorConsumedException
      thereafter.

    - (Other objects that only allow one-shot iteration can also raise
      IteratorConsumedException when their __iter__ is called twice.)

Advantages:

    1. "for-in" and "in" are safe to use -- no fear of destruction.

    2. One-shot iterators are safe against multiple iteration.

    3. Iterators don't have to implement a dummy __iter__ method
       returning self.

    4. The implementation of "for" stays exactly as it is now.

    5. Current implementations of iterators continue to work fine,
       if unsafely (but they're already unsafe).

    6. No new syntax.

    7. For-loops continue to work on containers exactly as they
       always have.

    8. Iterators don't have to maintain extra state to know that
       it's time to start throwing IteratorExhausted instead of
       StopIteration.

Items 1, 2, and 3 are distinct improvements over the current state
of affairs.  The only inconvenience is the case where an iterator
is being passed to a routine that expects a container; this is
still pretty rare, and this situation is easy to detect (hence,
the error message from "for" can explain what to do).  In this case,
you have to wrap consume() around the iterator to declare it okay
to consume.  And that's all.

The fact that it takes only a slight adjustment to the earlier proposal
to solve *both* the destruction problem and the multiple-iteration
problem has led me to be even more convinced that this is the "right
answer" -- in the sense that this is how i would design the protocol
if we were starting from scratch.

Now, i know we are not starting from scratch.  And i know Guido has
already said he doesn't want to solve this problem.  But, just in
case you are wondering, the migration path from here to there seems
pretty straightforward to me:

    1. When __next__() is not present, call next() and issue a warning.

    2. In the next version, deprecate next() in favour of __next__().

    3. Add consume() and IteratorConsumedException to built-ins.

    4. Deprecate the dummy __iter__() method on iterators.

    5. Throw a party and consume(mass_quantities).


-- ?!ng

"Most things are, in fact, slippery slopes.  And if you start backing off
from one thing because it's a slippery slope, who knows where you'll stop?"
    -- Sean M. Burke




From xscottg@yahoo.com  Tue Jul 23 07:22:12 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 22 Jul 2002 23:22:12 -0700 (PDT)
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAELMAGAB.tim.one@comcast.net>
Message-ID: <20020723062212.25747.qmail@web40102.mail.yahoo.com>

--- Tim Peters <tim.one@comcast.net> wrote:
> In an effort to save time on email (ya, right ...), I wrote up a pretty
> detailed overview of the "timsort" algorithm.  It's attached.
> 
> all-will-be-revealed-ly y'rs  - tim
>
> [Interesting stuff deleted.]
>

I'm curious if there is any literature that you've come across, or if
you've done any experiments with merging more than two parts at a time.  So
instead of merging like such:

  A B C D E F G H I J K L
  AB  CD  EF  GH  IJ  KL
  ABCD    EFGH    IJKL
  ABCDEFGH        IJKL
  ABCDEFGHIJKL

You were to merge

  A B C D E F G H I J K L
  ABC   DEF   GHI   JKL
  ABCDEF      GHIJKL
  ABCDEFGHIJKL

(I realize that your merges are based on the lengths of the subsequences,
but you get the point.)

My thinking is that many machines (probably yours for instance) have a
cache that is 4-way associative, so merging only 2 blocks at a time might
not be using the cache as well as it could.  Also, changing from merging 2
blocks to 3 or 4 blocks at a time would change the number of passes you
have to make (the log part of N*log(N)).

It's quite possible that this isn't worth the trade off in complexity and
space (or your time :-).  Keeping track of comparisons that you've already
made could get ugly, and your temp space requirement would go from N/2 to
possibly 3N/4...  But since you're diving so deeply into this problem, I
figured I'd throw it out there.

OTOH, this could be close to the speedup that heavily optimized FFT algs
get when they go from radix-2 to radix-4.   Just thinking out loud...









From xscottg@yahoo.com  Tue Jul 23 07:36:11 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 22 Jul 2002 23:36:11 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
Message-ID: <20020723063611.26677.qmail@web40102.mail.yahoo.com>

--0-1908127438-1027406171=:26257
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

The latest version of this PEP will be in CVS, but the most recent copy as
of this message is attached.

I'm posting this to python-dev first to shave off the rough edges.  I'll
post to comp.lang.python after that.

Please don't hesitate to email me directly if you have any questions on it.

Cheers,
    -Scott Gilbert





--0-1908127438-1027406171=:26257
Content-Type: text/plain; name="pep-0296.txt"
Content-Description: pep-0296.txt
Content-Disposition: inline; filename="pep-0296.txt"

PEP: 296
Title: The Buffer Problem
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/07/22 21:03:34 $
Author: xscottg at yahoo.com (Scott Gilbert)
Status: Draft
Type: Standards Track
Created: 12-Jul-2002
Python-Version: 2.3
Post-History:


Abstract

    This PEP proposes the creation of a new standard type and builtin
    constructor called 'bytes'.  The bytes object is an efficiently
    stored array of bytes with some additional characteristics that
    set it apart from several implementations that are similar.


Rationale

    Python currently has many objects that implement something akin to
    the bytes object of this proposal.  For instance the standard
    string, buffer, array, and mmap objects are all very similar in
    some regards to the bytes object.  Additionally, several
    significant third party extensions have created similar objects to
    try and fill similar needs.  Frustratingly, each of these objects
    is too narrow in scope and is missing critical features to make it
    applicable to a wider category of problems.


Specification

    The bytes object has the following important characteristics:

    1. Efficient underlying array storage via the standard C type "unsigned
    char".  This allows fine grain control over how much memory is
    allocated.  With the alignment restrictions designated in the next
    item, it is trivial for low level extensions to cast the pointer
    to a different type as needed.
    
    Also, since the object is implemented as an array of bytes, it is
    possible to pass the bytes object to the extensive library of
    routines already in the standard library that presently work with
    strings.  For instance, the bytes object in conjunction with the
    struct module could be used to provide a complete replacement for
    the array module using only Python script.

    If an unusual platform comes to light, one where there isn't a
    native unsigned 8 bit type, the object will do its best to
    represent itself at the Python script level as though it were an
    array of 8 bit unsigned values.  It is doubtful whether many
    extensions would handle this correctly, but Python script could be
    portable in these cases.

    2. Alignment of the allocated byte array is whatever is promised by the
    platform implementation of malloc.  An extension can create a bytes
    object with any alignment the extension author sees fit.

    This alignment restriction should allow the bytes object to be
    used as storage for all standard C types - including PyComplex
    objects or other structs of standard C types.  Further
    alignment restrictions can be provided by extensions as necessary.

    3. The bytes object implements a subset of the sequence operations
    provided by string/array objects, but with slightly different
    semantics in some cases.  In particular, a slice always returns a
    new bytes object, but the underlying memory is shared between the
    two objects.  This type of slice behavior has been called creating
    a "view".  Additionally, repetition and concatenation are
    undefined for bytes objects and will raise an exception.

    As these objects are likely to find use in high performance
    applications, one motivation for the decision to use view slicing
    is that copying between bytes objects should be very efficient and
    not require the creation of temporary objects.  The following code
    illustrates this:

        # create two 10 Meg bytes objects
        b1 = bytes(10000000)
        b2 = bytes(10000000)

        # copy from part of one to another without creating a 1 Meg temporary
        b1[2000000:3000000] = b2[4000000:5000000]

    Slice assignment where the rvalue is not the same length as the
    lvalue will raise an exception.  However, slice assignment will
    work correctly with overlapping slices (typically implemented with
    memmove).
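
    The proposed bytes type does not exist yet, so the following is
    only an approximation of the intended slicing semantics.  It
    assumes a later Python in which bytearray and memoryview are
    available, and uses them as stand-ins; copy_region is a
    hypothetical helper written for this illustration.

```python
def copy_region(dst, dst_start, src, src_start, count):
    """Copy count bytes between two buffers without a temporary copy.

    memoryview slices are views that share the underlying memory,
    which is the slicing behavior this PEP calls a "view".
    """
    dst_view = memoryview(dst)[dst_start:dst_start + count]
    src_view = memoryview(src)[src_start:src_start + count]
    dst_view[:] = src_view


b1 = bytearray(10)
b2 = bytearray(range(10))
copy_region(b1, 2, b2, 4, 3)

# A slice of a view shares memory with the original buffer.
view = memoryview(b2)[4:6]
b2[4] = 99
shared_value = view[0]
```

    As in the proposal, assigning a slice of a different length
    through a memoryview raises an exception (ValueError) rather than
    resizing the buffer.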

    4. The bytes object will be recognized as a native type by the pickle and
    cPickle modules for efficient serialization.  (In truth, this is
    the only requirement that can't be implemented via a third party
    extension.)

    Partial solutions to address the need to serialize the data stored
    in a bytes-like object without creating a temporary copy of the
    data into a string have been implemented in the past.  The tofile
    and fromfile methods of the array object are good examples of
    this.  The bytes object will support these methods too.  However,
    pickling is useful in other situations - such as in the shelve
    module, or implementing RPC of Python objects, and requiring the
    end user to use two different serialization mechanisms to get an
    efficient transfer of data is undesirable.

    XXX: Will try to implement pickling of the new bytes object in
    such a way that previous versions of Python will unpickle it as a
    string object.

    When unpickling, the bytes object will be created from memory
    allocated from Python (via malloc).  As such, it will lose any
    additional properties that an extension supplied pointer might
    have provided (special alignment, or special types of memory).

    XXX: Will try to make it so that C subclasses of bytes type can
    supply the memory that will be unpickled into.  For instance, a
    derived class called PageAlignedBytes would unpickle to memory
    that is also page aligned.

    On any platform where an int is 32 bits (most of them), it is
    currently impossible to create a string with a length larger than
    can be represented in 31 bits.  As such, pickling to a string will
    raise an exception when the operation is not possible.

    At least on platforms supporting large files (many of them),
    pickling large bytes objects to files should be possible via
    repeated calls to the file.write() method.

    5. The bytes type supports the PyBufferProcs interface, but a bytes object
    provides the additional guarantee that the pointer will not be
    deallocated or reallocated as long as a reference to the bytes
    object is held.  This implies that a bytes object is not resizable
    once it is created, but allows the global interpreter lock (GIL)
    to be released while a separate thread manipulates the memory
    pointed to if the PyBytes_Check(...) test passes.

    This characteristic of the bytes object allows it to be used in
    situations such as asynchronous file I/O or on multiprocessor
    machines where the pointer obtained by PyBufferProcs will be used
    independently of the global interpreter lock.

    Knowing that the pointer can not be reallocated or freed after the
    GIL is released gives extension authors the capability to get true
    concurrency and make use of additional processors for long running
    computations on the pointer.

    6. In C/C++ extensions, the bytes object can be created from a supplied
    pointer and destructor function to free the memory when the
    reference count goes to zero.

    The special implementation of slicing for the bytes object allows
    multiple bytes objects to refer to the same pointer/destructor.
    As such, a refcount will be kept on the actual
    pointer/destructor.  This refcount is separate from the refcount
    typically associated with Python objects.

    XXX: It may be desirable to expose the inner refcounted object as an
    actual Python object.  If a good use case arises, it should be possible
    for this to be implemented later with no loss to backwards compatibility.

    7. It is also possible to mark the bytes object as readonly.  In this
    case it isn't actually mutable, but it does provide the other features
    of a bytes object.

    8. The bytes object keeps track of the length of its data with a Python
    LONG_LONG type.  Even though the current definition for PyBufferProcs
    restricts the length to be the size of an int, this PEP does not propose
    to make any changes there.  Instead, extensions can work around this limit
    by making an explicit PyBytes_Check(...) call, and if that succeeds they
    can make a PyBytes_GetReadBuffer(...) or PyBytes_GetWriteBuffer(...) call
    to get the pointer and full length of the object as a LONG_LONG.

    The bytes object will raise an exception if the standard PyBufferProcs
    mechanism is used and the size of the bytes object is greater than can be
    represented by an integer.

    From Python scripting, the bytes object will be subscriptable with longs
    so the 32 bit int limit can be avoided.

    There is still a problem with the len() function as it is PyObject_Size()
    and this returns an int as well.  As a workaround, the bytes object will
    provide a .length() method that will return a long.

    9. The bytes object can be constructed at the Python scripting level by
    passing an int/long to the bytes constructor with the number of bytes to
    allocate.  For example:

       b = bytes(100000) # alloc 100K bytes

    The constructor can also take another bytes object.  This will be useful
    for the implementation of unpickling, and in converting a read-write bytes
    object into a read-only one.  An optional second argument will be used to
    designate creation of a readonly bytes object.

    10. From the C API, the bytes object can be allocated using any of the
    following signatures:

       PyObject* PyBytes_FromLength(LONG_LONG len, int readonly);
       PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, int readonly,
                void (*dest)(void *ptr, void *user), void* user);
    
    In the PyBytes_FromPointer(...) function, if the dest function pointer is
    passed in as NULL, it will not be called.  This should only be used for
    creating bytes objects from statically allocated space.
    
    The user pointer has been called a closure in other places.  It is a
    pointer that the user can use for whatever purposes.  It will be passed to
    the destructor function on cleanup and can be useful for a number of
    things.  If the user pointer is not needed, NULL should be passed instead.
 
    11. The bytes type will be a new style class as that seems to be where all
    standard Python types are headed.


Contrast to existing types

    The most common way to work around the lack of a bytes object has been to
    simply use a string object in its place.  Binary files, the struct/array
    modules, and several other examples exist of this.  Putting aside the
    style issue that these uses typically have nothing to do with text
    strings, there is the real problem that strings are not mutable, so direct
    manipulation of the data returned in these cases is not possible.  Also,
    numerous optimizations in the string module (such as caching the hash
    value or interning the pointers) mean that extension authors are on very
    thin ice if they try to break the rules with the string object.

    The buffer object seems like it was intended to address the purpose that
    the bytes object is trying to fulfill, but several shortcomings in its
    implementation [1] have made it less useful in many common cases.  The
    buffer object made a different choice for its slicing behavior (it returns
    new strings instead of buffers for slicing and other operations), and it
    doesn't make many of the promises on alignment or being able to release
    the GIL that the bytes object does.

    Also in regards to the buffer object, it is not possible to simply replace
    the buffer object with the bytes object and maintain backwards
    compatibility.  The buffer object provides a mechanism to take the
    PyBufferProcs supplied pointer of another object and present it as its
    own.  Since the behavior of the other object can not be guaranteed to
    follow the same set of strict rules that a bytes object does, it can't be
    used in places that a bytes object could.

    The array module supports the creation of an array of bytes, but it does
    not provide a C API for supplying pointers and destructors to extension
    supplied memory.  This makes it unusable for constructing objects out of
    shared memory, or memory that has special alignment or locking for things
    like DMA transfers.  Also, the array object does not currently pickle.
    Finally since the array object allows its contents to grow, via the extend
    method, the pointer can be changed if the GIL is not held while using it.

    Creating a buffer object from an array object has the same problem of
    leaving an invalid pointer when the array object is resized.

    The mmap object caters to its particular niche, but does not attempt to
    solve a wider class of problems.

    Finally, any third party extension can not implement pickling without
    creating a temporary object of a standard python type.  For example in the
    Numeric community, it is unpleasant that a large array can't pickle
    without creating a large binary string to duplicate the array data.


Backward Compatibility

    The only backwards compatibility problem that the author is aware of
    arises in previous versions of Python that try to unpickle data
    containing the new bytes type.


Reference Implementation

    XXX: Actual implementation is in progress, but changes are still possible
    as this PEP gets further review.

    The following new files will be added to the Python baseline:

        Include/bytesobject.h  # C interface
        Objects/bytesobject.c  # C implementation
        Lib/test/test_bytes.py # unit testing
        Doc/lib/libbytes.tex   # documentation

    The following files will also be modified:

        Include/Python.h       # adding bytesobject.h include file
        Python/bltinmodule.c   # adding the bytes type object
        Modules/cPickle.c      # adding bytes to the standard types
        Lib/pickle.py          # adding bytes to the standard types

    It is possible that several other modules could be cleaned up and
    implemented in terms of the bytes object.  The mmap module comes to mind
    first, but as noted above it would be possible to reimplement the array
    module as a pure Python module.  While it is attractive that this PEP
    could actually reduce the amount of source code by some amount, the author
    feels that this could cause unnecessary risk for breaking existing
    applications and should be avoided at this time.


Additional Notes/Comments

    - Guido van Rossum wondered whether it would make sense to be able
    to create a bytes object from a mmap object.  The mmap object
    appears to support the requirements necessary to provide memory
    for a bytes object.  (It doesn't resize, and the pointer is valid
    for the lifetime of the object.)  As such, a method could be added
    to the mmap module such that a bytes object could be created
    directly from a mmap object.  An initial stab at how this would be
    implemented would be to use the PyBytes_FromPointer() function
    described above and pass the mmap_object as the user pointer.  The
    destructor function would decref the mmap_object for cleanup.

    - Todd Miller notes that it may be useful to have two new functions:
    PyObject_AsLargeReadBuffer() and PyObject_AsLargeWriteBuffer() that are
    similar to PyObject_AsReadBuffer() and PyObject_AsWriteBuffer(), but
    support getting a LONG_LONG length in addition to the void* pointer.
    These functions would allow extension authors to work transparently with
    bytes objects (that support LONG_LONG lengths) and most other buffer like
    objects (which only support int lengths).  These functions could be in
    lieu of, or in addition to, creating specific PyBytes_GetReadBuffer() and
    PyBytes_GetWriteBuffer() functions.

    XXX: The author thinks this is a very good idea as it paves the way for
    other objects to eventually support large (64 bit) pointers, and it should
    only affect abstract.c and abstract.h.  Should this be added above?

    - It was generally agreed that abusing the segment count of the
    PyBufferProcs interface is not a good hack to work around the 31 bit
    limitation of the length.  If you don't know what this means, then you're
    in good company.  Most code in the Python baseline, and presumably in many
    third party extensions, punts when the segment count is not 1.


References

    [1] The buffer interface
        http://mail.python.org/pipermail/python-dev/2000-October/009974.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:






From tim.one@comcast.net  Tue Jul 23 09:30:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 23 Jul 2002 04:30:11 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <20020723062212.25747.qmail@web40102.mail.yahoo.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEMOAGAB.tim.one@comcast.net>

[Scott Gilbert]
> I'm curious if there is any literature that you've come across, or if
> you've done any experiments with merging more than two parts at a
> time.

There's a literal mountain of research on the topic.  I recommend

    "A Meticulous Analysis of Mergesort Programs"
    Jyrki Katajainen, Jesper Larsson Traff

for a careful accounting of all operations that go into one of these beasts.
They got the best results (and much better than quicksort) out of a 4-way
bottom-up mergesort via very tedious code (e.g., it effectively represents
which input run currently has the smallest next key via the program counter,
by way of massive code duplication and oodles of gotos); they were afraid to
write actual code for an 8-way version <wink>.  OTOH, they were sorting
random integers, and, e.g., were delighted to increase the # of comparisons
when that could save a few other "dirt cheap" operations.

> ...
> My thinking is that many machines (probably yours for instance) have a
> cache that is 4-way associative, so merging only 2 blocks at a time might
> not be using the cache as well as it could.  Also, changing from merging 2
> blocks to 3 or 4 blocks at a time would change the number of passes you
> have to make (the log part of N*log(N)).
>
> It's quite possible that this isn't worth the trade off in complexity and
> space (or your time :-).

The real reason it's uninteresting to me is that it has no clear
applicability to the cases this sort aims at:  exploiting significant
pre-existing order of various kinds.  That leads to unbalanced run lengths
when we're lucky, and if I'm merging a 2-element run with a 100,000-element
run, high cache associativity isn't of much use.  From the timings I showed
before, it's clear that "good cases" of pre-existing order take time that
depends almost entirely on just the number of comparisons needed; e.g.,
3sort and +sort were as fast as /sort, where the latter does nothing but N-1
comparisons in a single left-to-right scan of the array.  Comparisons are
expensive enough in Python that doing O(log N) additional comparisons in
3sort and +sort, then moving massive amounts of the array around to fit the
oddballs in place, costs almost nothing more in percentage terms.  Since
these cases are already effectively as fast as a single left-to-right scan,
there's simply no potential remaining for significant gain (unless you can
speed a single left-to-right scan!  that would be way cool).

If you think you can write a sort for random Python arrays faster than the
samplesort hybrid, be my guest:  I'd love to see it!  You should be aware
that I've been making this challenge for years <wink>.

Something to note:  I think you have an overly simple view of Python's lists
in mind.  When we're merging two runs in the timing test, it's not *just*
the list memory that's getting scanned.  The lists contain pointers *to*
float objects.  The float objects have to get read up from memory too, and
there goes the rest of your 4-way associativity.  Indeed, if you read the
comments in Lib/test/sortperf.py, you'll find that it performs horrid
trickery to ensure that =sort and =sort work on physically distinct float
objects; way back when, these particular tests ran much faster, and that
turned out to be partly because, e.g., [0.5] * N constructs a list with N
pointers to a single float object, and that was much easier on the memory
system.  We got a really nice slowdown <wink> by forcing N distinct copies
of 0.5.  In earlier Pythons the comparison also got short-circuited by an
early pointer-equality test ("if they're the same object, they must be
equal"), but that's not done anymore.  A quick run just now showed that
=sort still runs significantly quicker if given a list of identical objects;
the only explanation left for that appears to be cache effects.
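
The shared-object effect is easy to see from Python itself.  A minimal
sketch, assuming CPython 3; the helper names are made up, and
float("0.5") is just one simple way to get distinct objects, not the
trickery sortperf.py actually uses:

```python
def shared_floats(n):
    """n list slots that all point at one and the same float object."""
    return [0.5] * n


def distinct_floats(n):
    """n list slots, each pointing at its own freshly built float object."""
    return [float("0.5") for _ in range(n)]


def count_objects(items):
    """Number of physically distinct objects the list refers to."""
    return len({id(x) for x in items})
```

Both lists compare equal element by element, but the first touches one
float in memory while the second touches n of them.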

> Keeping track of comparisons that you've already made could get ugly,

Most researchers have found that a fancy data structure for this is
counter-productive:  so long as the m in m-way merging isn't ridiculously
large, keeping the head entries in a straight vector with m elements runs
fastest.  But they're not worried about Python's expensive-comparison case.
External sorts using m-way merging with large m typically use a selection
tree much like a heap to reduce the expense of keeping track (see, e.g.,
Knuth for details).
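
A minimal sketch of the heap-based variant, assuming modern Python and
its heapq module.  The name merge_runs is made up, and this is not code
from any of the sort implementations discussed in this thread:

```python
import heapq


def merge_runs(runs):
    """Merge already-sorted runs, keeping the current heads in a heap."""
    heap = []
    for idx, run in enumerate(runs):
        it = iter(run)
        for first in it:
            # idx breaks ties between equal values, so the iterator
            # in the third slot is never compared.
            heap.append((first, idx, it))
            break
    heapq.heapify(heap)

    out = []
    while heap:
        value, idx, it = heap[0]
        out.append(value)
        for nxt in it:
            # Swap the emitted head for the next element of the same run.
            heapq.heapreplace(heap, (nxt, idx, it))
            break
        else:
            # That run is exhausted: drop it from the heap.
            heapq.heappop(heap)
    return out
```

Each output element costs O(log m) comparisons here, against up to m-1
for a linear scan over a plain vector of heads.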

> and your temp space requirement would go from N/2 to possibly 3N/4...
> But since you're diving so deeply into this problem, I figured I'd
> throw it out there.
>
> OTOH, this could be close to the speedup that heavily optimized FFT algs
> get when they go from radix-2 to radix-4.   Just thinking out loud...

I don't think that's comparable.  Moving to radix 4 cuts the total number of
non-trivial complex multiplies an FFT has to do, and non-trivial complex
multiplies are the expensive part of what an FFT does.  In contrast,
boosting the m in m-way merging doesn't cut the number of comparisons needed
at all (to the contrary, if you're not very careful it increases them), and
comparisons are what kill sorting routines in Python.  The elaborate
gimmicks in timsort for doing merges of unbalanced runs do cut the total
number of comparisons needed, and that's where the huge wins come from.
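
A rough way to check the comparison-count claim, assuming modern Python.
This is a toy bottom-up mergesort that picks the smallest head by a naive
linear scan (the "not very careful" case); all names are made up, and it
has nothing to do with timsort's actual merge code:

```python
import random


class Counted:
    """Wrap a value and count every < comparison made on it."""
    count = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.count += 1
        return self.value < other.value


def merge_m(runs):
    """Merge sorted runs by scanning the run heads for the smallest."""
    pos = [0] * len(runs)
    out = []
    while True:
        best = None
        for i, run in enumerate(runs):
            if pos[i] < len(run):
                if best is None or run[pos[i]] < runs[best][pos[best]]:
                    best = i
        if best is None:          # every run is exhausted
            return out
        out.append(runs[best][pos[best]])
        pos[best] += 1


def mergesort(items, m):
    """Bottom-up mergesort that merges m runs at a time."""
    runs = [[x] for x in items]
    while len(runs) > 1:
        runs = [merge_m(runs[i:i + m]) for i in range(0, len(runs), m)]
    return runs[0] if runs else []


def comparisons(n, m, seed=0):
    """Count the comparisons needed to sort n random floats m-way."""
    rng = random.Random(seed)
    data = [Counted(rng.random()) for _ in range(n)]
    Counted.count = 0
    result = mergesort(data, m)
    used = Counted.count
    assert [c.value for c in result] == sorted(c.value for c in data)
    return used
```

Comparing comparisons(1024, 2) with comparisons(1024, 4) shows fewer
passes but more comparisons in total for the 4-way version.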




From ark@research.att.com  Tue Jul 23 14:58:30 2002
From: ark@research.att.com (Andrew Koenig)
Date: 23 Jul 2002 09:58:30 -0400
Subject: [Python-Dev] The iterator story
In-Reply-To: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
References: <Pine.LNX.4.44.0207190133440.25751-100000@ziggy>
Message-ID: <yu99wurm1r7d.fsf@europa.research.att.com>

Ping>     - Iterators provide just one method, __next__().

Ping>     - The built-in next() calls tp_iternext.  For instances,
Ping>       tp_iternext calls __next__.

Ping>     - Objects wanting to be iterated over provide just one method,
Ping>       __iter__().  Some of these are containers, but not all.

Ping>     - The built-in iter(foo) calls tp_iter.  For instances,
Ping>       tp_iter calls __iter__.

Ping>     - "for x in y" gets iter(y) and uses it as an iterator.

Ping>     - "for x from y" just uses y as the iterator.

+1.

Ping>     - We have a nice clean division between containers and iterators.

Ping>     - When you see "for x in y" you know that y is a container.

What if y is a file?  You already said that files are not containers.

Ping>     - When you see "for x from y" you know that y is an iterator.

Ping>     - "for x in y" never destroys y.

Ping>     - "if x in y" never destroys y.

What if y is a file?

Ping> Other notes:

Ping>     - The file problem has a consistent solution.  Instead of writing
Ping>       "for line in file" you write

Ping>         for line from file:
Ping>             print line

Ping>       Being forced to write "from" signals to you that the file is
Ping>       eaten up.  There is no expectation that "for line from file"
Ping>       will work again.

Ah.  So you want to break "for line in file:", which works now?

I'm still +1 as long as there is a transition scheme.
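
The behaviour at issue, as a minimal sketch in present-day Python 3.
io.StringIO stands in for a real file here, and read_twice is a made-up
helper:

```python
import io


def read_twice(source):
    """Loop over source twice and return what each pass produced."""
    first = [line for line in source]
    second = [line for line in source]
    return first, second


lines = ["a\n", "b\n"]                      # a container: survives the loop
file_like = io.StringIO("a\nb\n")           # an iterator: eaten by the loop

container_passes = read_twice(lines)        # both passes see both lines
file_passes = read_twice(file_like)         # the second pass comes up empty
```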

Ping> My Not-So-Ideal Protocol
Ping> ------------------------

Ping> All right.  So new syntax may be hard to swallow.  An alternative
Ping> is to introduce an adapter that turns an iterator into something
Ping> that "for" will accept -- that is, the opposite of iter().

Ping>     - The built-in seq(it) returns x such that iter(x) yields it.

Ping> Then instead of writing

Ping>     for x from it:

Ping> you would write

Ping>     for x in seq(it):

Ping> and the rest would be the same.  The use of "seq" here is what
Ping> would flag the fact that "it" will be destroyed.

I prefer "for x from it:".
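
For reference, the seq() adapter Ping describes is tiny.  A minimal
sketch in present-day Python 3; no such built-in exists, so the class
below is purely hypothetical:

```python
class seq:
    """Hypothetical adapter: iter(seq(it)) hands back the iterator it."""

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        # No copy and no restart: the wrapped iterator itself is returned,
        # so looping over seq(it) consumes it.
        return self._it


numbers = iter([1, 2, 3])
first_pass = [x for x in seq(numbers)]
second_pass = [x for x in seq(numbers)]     # numbers is already used up
```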

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark



From thomas.heller@ion-tof.com  Tue Jul 23 15:18:31 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 23 Jul 2002 16:18:31 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <003c01c23253$d2f80860$e000a8c0@thomasnotebook>

> PEP: 296
> Title: The Buffer Problem

IMO this would better be titled 'The bytes Object'

>     6. In C/C++ extensions, the bytes object can be created from a supplied
>     pointer and destructor function to free the memory when the
>     reference count goes to zero.
> 
>     The special implementation of slicing for the bytes object allows
>     multiple bytes objects to refer to the same pointer/destructor.
>     As such, a refcount will be kept on the actual
>     pointer/destructor.  This refcount is separate from the refcount
>     typically associated with Python objects.
> 

Why is this? Wouldn't it be sufficient if views keep references
to the 'viewed' byte object?

>     8. The bytes object keeps track of the length of its data with a Python
>     LONG_LONG type.  Even though the current definition for PyBufferProcs
>     restricts the length to be the size of an int, this PEP does not propose
>     to make any changes there.  Instead, extensions can work around this limit
>     by making an explicit PyBytes_Check(...) call, and if that succeeds they
>     can make a PyBytes_GetReadBuffer(...) or PyBytes_GetWriteBuffer(...) call to
>     get the pointer and full length of the object as a LONG_LONG.
> 
>     The bytes object will raise an exception if the standard PyBufferProcs
>     mechanism is used and the size of the bytes object is greater than can be
>     represented by an integer.
> 
>     From Python scripting, the bytes object will be subscriptable with longs
>     so the 32 bit int limit can be avoided.
> 
>     There is still a problem with the len() function as it is PyObject_Size()
>     and this returns an int as well.  As a workaround, the bytes object will
>     provide a .length() method that will return a long.
> 
Is this worth the trouble?
(Hm, 64-bit platforms with 32-bit integers remind me of the broken
DOS/Windows 3.1 platforms with near/far/huge pointers).

>     9. The bytes object can be constructed at the Python scripting level by
>     passing an int/long to the bytes constructor with the number of bytes to
>     allocate.  For example:
> 
>        b = bytes(100000) # alloc 100K bytes
> 
>     The constructor can also take another bytes object.  This will be useful
>     for the implementation of unpickling, and in converting a read-write bytes
>     object into a read-only one.  An optional second argument will be used to
>     designate creation of a readonly bytes object.

> 
>     10. From the C API, the bytes object can be allocated using any of the
>     following signatures:
> 
>        PyObject* PyBytes_FromLength(LONG_LONG len, int readonly);
>        PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, int readonly,
>                 void (*dest)(void *ptr, void *user), void* user);
>     
>     In the PyBytes_FromPointer(...) function, if the dest function pointer is
>     passed in as NULL, it will not be called.  This should only be used for
>     creating bytes objects from statically allocated space.
>     
>     The user pointer has been called a closure in other places.  It is a
>     pointer that the user can use for whatever purposes.  It will be passed to
>     the destructor function on cleanup and can be useful for a number of
>     things.  If the user pointer is not needed, NULL should be passed instead.

Shouldn't there be constructors to create a view of a bytes/view object,
or are we supposed to create them by slicing?

>     11. The bytes type will be a new style class as that seems to be where all
>     standard Python types are headed.
 
Good.

Thanks,

Thomas




From xscottg@yahoo.com  Tue Jul 23 16:40:55 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 23 Jul 2002 08:40:55 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <003c01c23253$d2f80860$e000a8c0@thomasnotebook>
Message-ID: <20020723154055.54251.qmail@web40106.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> > PEP: 296
> > Title: The Buffer Problem
> 
> IMO this would better be titled 'The bytes Object'
> 

Part of the title was just me being cute, but apparently this problem has a
long history and has been referred to as "The Buffer Problem" many times in
the past.  Plus when I first submitted it, I wasn't sure the name "bytes"
was going to stick.


> 
> Why is this? Wouldn't it be sufficient if views keep references
> to the 'viewed' byte object?
> 

They do, but the referenced "inner-thing" needs its own reference count to
know how many "bytes-views" are sharing it.  When a bytes-view gets cleaned
up, it decrefs the reference count of the inner-thing it is referring to,
and if the reference count goes to zero, the bytes-view calls the
destructor for the inner-thing.
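
A minimal sketch of that bookkeeping, written in Python rather than C for
brevity.  The class and attribute names are made up, release() stands in
for what deallocating a view would do, and none of this is the actual
implementation:

```python
class Inner:
    """Hypothetical shared record: the memory, its destructor, a count."""

    def __init__(self, memory, destructor=None, user=None):
        self.memory = memory
        self.destructor = destructor
        self.user = user
        self.refcount = 0


class BytesView:
    """Hypothetical stand-in for the proposed bytes object."""

    def __init__(self, inner, start, stop):
        self.inner = inner
        self.start = start
        self.stop = stop
        inner.refcount += 1          # one more view shares the memory

    def __getitem__(self, index):
        # Only plain slices are handled in this sketch.
        lo, hi, step = index.indices(self.stop - self.start)
        if step != 1:
            raise ValueError("only contiguous slices are supported")
        return BytesView(self.inner, self.start + lo,
                         self.start + max(lo, hi))

    def tobytes(self):
        return bytes(self.inner.memory[self.start:self.stop])

    def release(self):
        """Drop this view's share; the last one out runs the destructor."""
        inner, self.inner = self.inner, None
        if inner is None:
            return
        inner.refcount -= 1
        if inner.refcount == 0 and inner.destructor is not None:
            inner.destructor(inner.memory, inner.user)
```

Slicing never copies: it only creates another view onto the same Inner
and bumps its count.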


>
> > and this returns an int as well.  As a workaround, the bytes object
> > will provide a .length() method that will return a long.
> > 
> Is this worth the trouble?
> (Hm, 64-bit platforms with 32-bit integers remind me of the broken
> DOS/Windows 3.1 platforms with near/far/huge pointers).
> 

I think most 64 bit platforms actually have a 32 bit int.  Some of them
(like the Alpha) have a 64 bit long, but Python has made extensive use of
the int type in the PyBufferProcs interface and elsewhere.  So if we want
to make full use of large memory machines (I do), something has to be done.
 The only way to reliably get a 64 bit integer on these platforms is to use
the "long long" type or __int64 on Windows (spelled LONG_LONG in Python).

Note that the .length() method will return a Python long, not a C long.
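
The C integer widths on a given platform can be checked from Python.  A
minimal sketch, assuming present-day Python 3 and its ctypes module; the
two helper names are made up:

```python
import ctypes


def c_integer_bits():
    """Width in bits of the C integer types discussed above."""
    types = [
        ("int", ctypes.c_int),
        ("long", ctypes.c_long),
        ("long long", ctypes.c_longlong),
    ]
    return {name: 8 * ctypes.sizeof(ctype) for name, ctype in types}


def fits_in_c_int(length):
    """True if length can be stored in a signed C int on this platform."""
    bits = 8 * ctypes.sizeof(ctypes.c_int)
    return -(1 << (bits - 1)) <= length < (1 << (bits - 1))
```

On a platform where int is 32 bits, fits_in_c_int(2**31) is False, which
is exactly the length an int-based interface cannot report.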


> 
> Shouldn't there be constructors to create a view of a bytes/view object,
> or are we supposed to create them by slicing?
> 

Item 9 in the PEP talks about this.  Maybe I'll add some text to make this
more clear.



Cheers,
    -Scott


__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com



From thomas.heller@ion-tof.com  Tue Jul 23 19:04:00 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 23 Jul 2002 20:04:00 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723154055.54251.qmail@web40106.mail.yahoo.com>
Message-ID: <027f01c23273$528d1240$e000a8c0@thomasnotebook>

> > 
> > Why is this? Wouldn't it be sufficient if views keep references
> > to the 'viewed' byte object?
> > 
> 
> They do, but the referenced "inner-thing" needs its own reference count to
> know how many "bytes-views" are sharing it.  When a bytes-view gets cleaned
> up, it decrefs the reference count of the inner-thing it is referring to,
> and if the reference count goes to zero, the bytes-view calls the
> destructor for the inner-thing.
> 
Hm, I thought the 'inner-thing' is a python object (with its own
refcount) itself. Isn't the 'inner-thing' the bytes object owning
the allocated memory? And the 'outer-things' (the views) simply
viewing slices of this memory?

Thomas




From xscottg@yahoo.com  Tue Jul 23 19:33:02 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 23 Jul 2002 11:33:02 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <027f01c23273$528d1240$e000a8c0@thomasnotebook>
Message-ID: <20020723183302.98153.qmail@web40102.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> > 
> > They do, but the referenced "inner-thing" needs its own reference
> count to
> > know how many "bytes-views" are sharing it.  When a bytes-view gets
> cleaned
> > up, it decrefs the reference count of the inner-thing it is referring
> to,
> > and if the reference count goes to zero, the bytes-view calls the
> > destructor for the inner-thing.
> > 
> Hm, I thought the 'inner-thing' is a python object (with its own
> refcount) itself. Isn't the 'inner-thing' the bytes object owning
> the allocated memory? And the 'outer-things' (the views) simply
> viewing slices of this memory?
> 

The outer-thing is definitely the "bytes object", since that's what people
will work with directly.  It has to be a true Python object in all its
glory.

The inner-thing _could_ be a Python object (and Guido suggested that maybe
it should be), but that's an implementation detail.  I don't know why
anyone would want to work with the inner-thing directly.  However, one good
use case and I'll be sold on the idea.

I'll definitely add some verbiage to clarify this in the next revision.

Cheers,
    -Scott





From jmiller@stsci.edu  Tue Jul 23 19:47:02 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Tue, 23 Jul 2002 14:47:02 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723183302.98153.qmail@web40102.mail.yahoo.com>
Message-ID: <3D3DA4A6.6040802@stsci.edu>

Scott Gilbert wrote:

>--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
>
>>>They do, but the referenced "inner-thing" needs its own reference
>>>
>>count to
>>
>>>know how many "bytes-views" are sharing it.  When a bytes-view gets
>>>
>>cleaned
>>
>>>up, it decrefs the reference count of the inner-thing it is referring
>>>
>>to,
>>
>>>and if the reference count goes to zero, the bytes-view calls the
>>>destructor for the inner-thing.
>>>
>>Hm, I thought the 'inner-thing' is a python object (with its own
>>refcount) itself. Isn't the 'inner-thing' the bytes object owning
>>the allocated memory? And the 'outer-things' (the views) simply
>>viewing slices of this memory?
>>
>
>The outer-thing is definitely the "bytes object", since that's what people
>will work with directly.  It has to be a true Python object in all its
>glory.
>
>The inner-thing _could_ be a Python object (and Guido suggested that maybe
>it should be), but that's an implementation detail.  I don't know why
>
>
>anyone would want to work with the inner-thing directly.  However, one good
>use case and I'll be sold on the idea.
>
Letting the inner-thing be a mmap would enable slices of a mmap as views 
as opposed to strings.  We'd certainly like this for numarray, 
especially if it meant pickling efficiency for mmap based arrays.
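
For readers coming to this thread later: present-day Python 3 offers
roughly this through memoryview, which is a different mechanism from the
bytes object proposed in this PEP.  A minimal sketch, with a made-up
function name:

```python
import mmap


def mmap_slice_is_a_view():
    """Slice a memoryview of an mmap and write through the slice."""
    mapping = mmap.mmap(-1, 16)        # anonymous, writable 16-byte mapping
    mapping[:] = bytes(range(16))
    whole = memoryview(mapping)
    part = whole[4:8]                  # a view onto bytes 4..7, no copy made
    part[0] = 255                      # write through the view...
    seen = mapping[4]                  # ...and read it back from the mapping
    copied = mapping[4:8]              # slicing the mmap itself copies to bytes
    # Views must be released before the mapping can be closed.
    part.release()
    whole.release()
    mapping.close()
    return seen, copied
```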

>
>
>I'll definitely add some verbiage to clarify this in the next revision.
>
>Cheers,
>    -Scott
>
>
>
>


-- 
Todd Miller 			jmiller@stsci.edu
STSCI / SSG			(410) 338 4576





From thomas.heller@ion-tof.com  Tue Jul 23 20:59:18 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 23 Jul 2002 21:59:18 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723183302.98153.qmail@web40102.mail.yahoo.com>
Message-ID: <030901c23283$6e2be430$e000a8c0@thomasnotebook>

> > > 
> > > They do, but the referenced "inner-thing" needs its own reference
> > count to
> > > know how many "bytes-views" are sharing it.  When a bytes-view gets
> > cleaned
> > > up, it decrefs the reference count of the inner-thing it is referring
> > to,
> > > and if the reference count goes to zero, the bytes-view calls the
> > > destructor for the inner-thing.
> > > 
> > Hm, I thought the 'inner-thing' is a python object (with its own
> > refcount) itself. Isn't the 'inner-thing' the bytes object owning
> > the allocated memory? And the 'outer-things' (the views) simply
> > viewing slices of this memory?
> > 
> 
> The outer-thing is definitely the "bytes object", since that's what people
> will work with directly.  It has to be a true Python object in all its
> glory.
> 
> The inner-thing _could_ be a Python object (and Guido suggested that maybe
> it should be), but that's an implementation detail.  I don't know why
> anyone would want to work with the inner-thing directly.  However, one good
> use case and I'll be sold on the idea.
> 
> I'll definitely add some verbiage to clarify this in the next revision.
> 
I've quickly read the pep again.
I see no mention of an 'inner object' and an 'outer object'
there, so I would recommend you try to explain this (if you want to stay
with this decision).

OTOH, your 'inner thing' has a refcount, an (optional) destructor
which is a kind of closure, instance variables (memory pointer,
readonly flag), so there is not too much missing for a full
python object.

Could the 'inner thing' have the same type as the 'outer thing':
the inner thing being a full view of itself, and the outer thing
probably a view viewing only a slice of the inner thing?

Thomas



From greg@cosc.canterbury.ac.nz  Tue Jul 23 23:29:22 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jul 2002 10:29:22 +1200 (NZST)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020723183302.98153.qmail@web40102.mail.yahoo.com>
Message-ID: <200207232229.g6NMTM609792@oma.cosc.canterbury.ac.nz>

Scott Gilbert <xscottg@yahoo.com>:

> The inner-thing _could_ be a Python object (and Guido suggested that
> maybe it should be), but that's an implementation detail.

In that case, unless there's some reason for it
*not* to be a Python object, you might as well
make it one and take advantage of all the
Python refcount machinery.
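
A minimal sketch of what that buys, in present-day Python 3.  Owner and
View are made-up names, and weakref.finalize merely stands in for the
destructor; the prompt cleanup relies on CPython's reference counting:

```python
import weakref


class Owner:
    """Hypothetical inner object: an ordinary Python object owning memory."""

    def __init__(self, memory, on_free):
        self.memory = memory
        # Runs on_free once the last reference to this Owner goes away.
        weakref.finalize(self, on_free)


class View:
    """Hypothetical outer object: a window onto an Owner's memory."""

    def __init__(self, owner, start, stop):
        self.owner = owner           # a plain reference keeps the owner alive
        self.start = start
        self.stop = stop

    def slice(self, lo, hi):
        return View(self.owner, self.start + lo, self.start + hi)

    def tobytes(self):
        return bytes(self.owner.memory[self.start:self.stop])
```

No separate counter is needed: each view simply holds a reference, and
the interpreter frees the Owner when the last view disappears.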

<plug>
If you use Pyrex for the implementation, making
Python objects will be dead easy!
</plug>

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From jason-exp-1028157503.aebc46@mastaler.com  Wed Jul 24 00:18:58 2002
From: jason-exp-1028157503.aebc46@mastaler.com (jason-exp-1028157503.aebc46@mastaler.com)
Date: Tue, 23 Jul 2002 17:18:58 -0600
Subject: [Python-Dev] Re: Where's time.daylight???
References: <E17VbDN-00015m-00@usw-pr-cvs1.sourceforge.net> <15672.18628.831787.897474@anthem.wooz.org>
 <200207191732.g6JHWJD28040@pcp02138704pcs.reston01.va.comcast.net>
 <200207191910.g6JJAUJ32606@pcp02138704pcs.reston01.va.comcast.net>
 <m3it3bnu64.fsf@mira.informatik.hu-berlin.de>
 <200207200043.g6K0hMJ27043@pcp02138704pcs.reston01.va.comcast.net>
 <m3heiuzsdw.fsf@mira.informatik.hu-berlin.de>
Message-ID: <hheldu6nj1.fsf@nightshade.la.mastaler.com>

martin@v.loewis.de (Martin v. Loewis) writes:

> I have no OSF/1 (aka whatever) system

http://www.testdrive.compaq.com/

-- 
(http://tmda.net/)




From xscottg@yahoo.com  Wed Jul 24 09:13:26 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Wed, 24 Jul 2002 01:13:26 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <030901c23283$6e2be430$e000a8c0@thomasnotebook>
Message-ID: <20020724081326.995.qmail@web40107.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
>
> I've quickly read the pep again.
> I see no mention of an 'inner object' and an 'outer object'
> there, so I would recommend you try to explain this (if you want to stay
> with this decision).
>

This is just the terminology I was using to try and communicate with you. 
The outer thing is the bytes object (which is generally interesting to
users), and the inner thing is an implementation detail.  Like I said, I'll
add more text on this in the next revision since it seems to be causing
confusion.

> 
> OTOH, your 'inner thing' has a refcount, an (optional) destructor
> which is a kind of closure, instance variables (memory pointer,
> readonly flag), so there is not too much missing for a full
> python object.
>

I still haven't heard a good reason to expose the inner thing to user code
yet though.  So even if the inner thing is a PyObject, who would know? 
It's probably better for maintenance to use something everyone is already
familiar with, so I'll probably do it for that reason.

> 
> Could the 'inner thing' have the same type as the 'outer thing':
> the inner thing being a full view of itself, and the outer thing
> probably a view viewing only a slice of the inner thing?
> 

It might be.  However, I'm afraid this will lead to some ugly special cases
when the view is the inner thing versus when the view is referring to some
other thing.  It's probably cleaner to make a clear distinction between the
two and stick with it throughout.  

(I'm growing to dislike this "thing" terminology....)









From xscottg@yahoo.com  Wed Jul 24 09:22:29 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Wed, 24 Jul 2002 01:22:29 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <3D3DA4A6.6040802@stsci.edu>
Message-ID: <20020724082229.94975.qmail@web40105.mail.yahoo.com>

--- Todd Miller <jmiller@stsci.edu> wrote:
>
> Letting the inner-thing be a mmap would enable slices of a mmap as views 
> as opposed to strings.  We'd certainly like this for numarray, 
> especially if it meant pickling efficiency for mmap based arrays.
>

The first version of the PEP I sent to you directly didn't have this, but
the latest version I posted to python-dev mentions it briefly.  It seems
both you and Guido came up with the same idea regarding mmap.

The current strategy is to add a method to the mmap module that would
return a bytes object from an mmap object.  I would like it to be able to
pickle too.  (Which probably means the new method in the mmap module will
return a class derived from bytes, and not the bytes base class.)

However, this is sort of orthogonal to the PEP.  If the bytes object makes
it in, but the mmap enhancements get left out, a third party extension
could implement the mmap_to_bytes function and still make use of the
efficient pickling by deriving from the bytes object.


Cheers,
    -Scott




From xscottg@yahoo.com  Wed Jul 24 10:05:12 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Wed, 24 Jul 2002 02:05:12 -0700 (PDT)
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEMOAGAB.tim.one@comcast.net>
Message-ID: <20020724090512.2485.qmail@web40110.mail.yahoo.com>

--- Tim Peters <tim.one@comcast.net> wrote:
> 
>     "A Meticulous Analysis of Mergesort Programs"
>     Jyrki Katajainen, Jesper Larsson Traff
> 

Thanks for the cool reference.  I read a bit of it last night.  I ought to
know by now that there really isn't much new under the sun...

> 
> The real reason it's uninteresting to me is that it has no clear
> applicability to the cases this sort aims at:  exploiting significant
> pre-existing order of various kinds.
> [...]
> (unless you can speed a single left-to-right scan!  that would be way 
> cool).
> 

Do a few well placed prefetch instructions buy you anything?  The MMU could
be grabbing your next pointer while you're doing your current comparison. 
And of course you could implement it as a macro that evaporates for
whatever platforms you didn't care to implement it on.  (I need to look it
up, but I'm pretty sure you could do this for both VC++ and gcc on recent
x86s.)

>
> If you think you can write a sort for random Python arrays faster than
> the
> samplesort hybrid, be my guest:  I'd love to see it!  You should be aware
> that I've been making this challenge for years <wink>.
>

You're remarkably good at taunting me.  :-)  I've spent a little time on a
few of these optimization challenges that get posted.  One of these days
I'll best you...  (not this day though)


> 
> Something to note:  I think you have an overly simple view of Python's
> lists in mind.
>

No, I think I understand the model.  I just assumed the objects pointed to
would be scattered pretty randomly through memory.  So statistically
they'll step on the same cache lines as your list once in a while, but on
average that would matter less than the adjacent slots in the list.  I'm
frequently wrong about stuff like this though...


Cheers,
    -Scott




From jmiller@stsci.edu  Wed Jul 24 12:21:41 2002
From: jmiller@stsci.edu (Todd Miller)
Date: Wed, 24 Jul 2002 07:21:41 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020724082229.94975.qmail@web40105.mail.yahoo.com>
Message-ID: <3D3E8DC5.7040906@stsci.edu>

Scott Gilbert wrote:

>--- Todd Miller <jmiller@stsci.edu> wrote:
>
>>Letting the inner-thing be a mmap would enable slices of a mmap as views 
>>as opposed to strings.  We'd certainly like this for numarray, 
>>especially if it meant pickling efficiency for mmap based arrays.
>>
>
>The first version of the PEP I sent to you directly didn't have this, but
>the latest version I posted to python-dev mentions it briefly.  It seems
>both you and Guido came up with the same idea regarding mmap.
>
Yeah, I saw that in your response.  Sorry.   FWIW,  anything I say here
should be regarded as a reflection of STSCI's current technical goals as 
channeled by me, and not necessarily "my ideas".   Exploiting mmapping 
has been a pretty long standing goal here at STSCI.

>
>
>The current strategy is to add a method to the mmap module that would
>return a bytes object from an mmap object.  I would like it to be able to
>pickle too.  (Which probably means the new method in the mmap module will
>probably return a class derived from bytes, and not the bytes base class.)
>
This runs pretty wide of my current mental ruts, but it sounds like 
conservative design, so great.

>
>However, this is sort of orthogonal to the PEP.  If the bytes object makes
>it in, but the mmap enhancements get left out, a third party extension
>could implement the mmap_to_bytes function and still make use of the
>efficient pickling by deriving from the bytes object.
>
I understand.  That sounds excellent.

>
>
>Cheers,
>    -Scott
>
>
>
Back to numarray,
Todd
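
The difference between a slice that is a view and a slice that is a
string (a copy) can be sketched in present-day Python.  This is only an
illustration under assumptions: memoryview did not exist when this thread
was written and merely stands in for the proposed bytes view, and the
anonymous 16-byte map and the name demo_mmap_view are made up for the demo.

```python
import mmap


def demo_mmap_view():
    """Contrast a copying slice of an mmap with a memoryview slice (a view)."""
    m = mmap.mmap(-1, 16)      # anonymous, zero-filled 16-byte mapping
    copy = m[4:8]              # slicing the mmap itself copies the bytes out
    mv = memoryview(m)         # buffer view over the mapping, no copy
    view = mv[4:8]             # slicing the view gives another view
    view[:] = b"abcd"          # writes through to the mapping
    result = (copy, m[4:8])    # the earlier copy is unaffected by the write
    view.release()             # views must be released before closing the map
    mv.release()
    m.close()
    return result


print(demo_mmap_view())
```

The copy taken before the write keeps its old contents, while the write
through the view is visible in the mapping itself.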




From thomas.heller@ion-tof.com  Wed Jul 24 12:38:00 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 24 Jul 2002 13:38:00 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <048301c23306$90992090$e000a8c0@thomasnotebook>

Let me ask some questions about platforms with 32-bit
integers and 64-bit longs:

>     2. Alignment of the allocated byte array is whatever is promised by the
>     platform implementation of malloc.

On these platforms, does malloc() accept an unsigned long argument
for the requested size?

> [...]
>     8. The bytes object keeps track of the length of its data with a Python
>     LONG_LONG type.
> [...]
>     From Python scripting, the bytes object will be subscriptable with longs
>     so the 32 bit int limit can be avoided.

How is indexing done in C? Can you index these byte arrays by longs?

>     9. The bytes object can be constructed at the Python scripting level by
>     passing an int/long to the bytes constructor with the number of bytes to
>     allocate.  For example:
> 
>        b = bytes(100000) # alloc 100K bytes
> 
>     The constructor can also take another bytes object.  This will be useful
>     for the implementation of unpickling, and in converting a read-write bytes
>     object into a read-only one.  An optional second argument will be used to
>     designate creation of a readonly bytes object.
> 
>     10. From the C API, the bytes object can be allocated using any of the
>     following signatures:
> 
>        PyObject* PyBytes_FromLength(LONG_LONG len, int readonly);
>        PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len, int readonly,
>                 void (*dest)(void *ptr, void *user), void* user);

Hm, if 'bytes' is a new style class, these functions should
require a 'PyObject *type' parameter as well. OTOH, new style
classes are usually created by calling their *type*, so you
should describe the signature of the byte type's tp_call.
(It may be possible to supply variations of the above functions
for convenience as well.)

>     The array module supports the creation of an array of bytes, but it does
>     not provide a C API for supplying pointers and destructors to extension
>     supplied memory.  This makes it unusable for constructing objects out of
>     shared memory, or memory that has special alignment or locking for things
>     like DMA transfers.  Also, the array object does not currently pickle.
>     Finally since the array object allows its contents to grow, via the extend
>     method, the pointer can be changed if the GIL is not held while using it.
...or if any code is executed which may change the array object, even
if the GIL is held!

Thomas
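
The hazard described above (a growing array can move its storage while
someone still holds the old pointer) can be sketched in present-day
Python.  This is a sketch under assumptions: it relies on the behaviour of
current CPython, where an array refuses to resize while a buffer is
exported, which is not what existed when this thread was written.  The
function name is made up for the demo.

```python
import array


def resize_while_exported():
    """Try to grow an array while a memoryview still exports its buffer."""
    a = array.array("B", b"abc")
    view = memoryview(a)           # holds a pointer into the array's storage
    try:
        a.extend(b"xyz")           # growing could move the storage
    except BufferError:
        blocked = True             # current CPython refuses the resize
    else:
        blocked = False
    finally:
        view.release()             # drop the export
    a.extend(b"xyz")               # with no exports left, resizing is allowed
    return blocked, a.tobytes()


print(resize_while_exported())
```

A fixed-size bytes object sidesteps the problem entirely, because its
storage never moves after creation.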



From xscottg@yahoo.com  Wed Jul 24 17:01:58 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Wed, 24 Jul 2002 09:01:58 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <048301c23306$90992090$e000a8c0@thomasnotebook>
Message-ID: <20020724160158.34860.qmail@web40112.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> Let me ask some questions about platforms with 32-bit
> integers and 64-bit longs:
> 
> >     2. Alignment of the allocated byte array is whatever is promised by
> >     the platform implementation of malloc.
> 
> On these platforms, does malloc() accept an unsigned long argument
> for the requested size?
> 

At the moment, the only 64 bit platform that I have easy access to is
Tru64/Alpha.  That version of malloc takes a size_t which is a 64 bit
quantity.

I believe most semi-sane platforms will use a size_t as argument for
malloc, and I believe most semi-sane platforms will have a size_t that is
the same number of bits as a pointer for that platform.


> > [...]
> >     8. The bytes object keeps track of the length of its data with a
> >     Python LONG_LONG type.
> > [...]
> >     From Python scripting, the bytes object will be subscriptable with
> >     longs so the 32 bit int limit can be avoided.
> 
> How is indexing done in C?

Indexing is done by grabbing the pointer and length via a call like:

    int PyObject_AsLargeReadBuffer(PyObject* bo, unsigned char** ptr,
                                   LONG_LONG* len);

Note that the name could be different depending on whether it ends up in
abstract.h or bytesobject.h.

> Can you index these byte arrays by longs?

You could index it via a long, but using a LONG_LONG is safer.  My
understanding is that on Win64 a long will only be 32 bits even though
void* is 64 bits.  So for that platform, LONG_LONG will be a typedef for
__int64 which is 64 bits.

None of this matters for 32 bit platforms.  All 32 bit platforms that I
know of have sizeof(int) == sizeof(long) == sizeof(void*) == 4.  So even if
you wanted to subscript with a long or LONG_LONG, the pointer could only
point to something about 2 Gigs (31 bits) in size.
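
The type sizes discussed above can be checked on whatever platform is at
hand.  A minimal sketch, assuming the ctypes module (which postdates this
thread) reports the same sizes as the platform C compiler; the helper name
c_type_sizes is made up for the demo.

```python
import ctypes


def c_type_sizes():
    """Return sizeof() in bytes for the C types discussed in this thread."""
    types = {
        "int": ctypes.c_int,
        "long": ctypes.c_long,
        "long long": ctypes.c_longlong,
        "size_t": ctypes.c_size_t,
        "void*": ctypes.c_void_p,
    }
    return {name: ctypes.sizeof(ctype) for name, ctype in types.items()}


sizes = c_type_sizes()
for name, size in sizes.items():
    print(f"sizeof({name}) = {size}")
```

On a Win64 build one would expect long to come out as 4 and void* as 8,
which is the case that makes a plain long unsafe as an index type.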


> > 
> >     10. From the C API, the bytes object can be allocated using any of
> >     the following signatures:
> > 
> >        PyObject* PyBytes_FromLength(LONG_LONG len, int readonly);
> >        PyObject* PyBytes_FromPointer(void* ptr, LONG_LONG len,
> >                int readonly, void (*dest)(void *ptr, void *user),
> >                void* user);
> 
> Hm, if 'bytes' is a new style class, these functions should
> require a 'PyObject *type' parameter as well. OTOH, new style
> classes are usually created by calling their *type*, so you
> should describe the signature of the byte type's tp_call.
> (It may be possible to supply variations of the above functions
> for convenience as well.)
> 

I consider these to be the minimum convenience functions that are necessary
for the functionality I'd like to see.  I'll follow the conventions for
creating a new style class for PyBytesObject to the letter, and any other
variations of the above convenience functions can be added as needed. 
(It's easier to add stuff than take it away...)


Cheers,
    -Scott




From tim@zope.com  Wed Jul 24 18:49:10 2002
From: tim@zope.com (Tim Peters)
Date: Wed, 24 Jul 2002 13:49:10 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020724160158.34860.qmail@web40112.mail.yahoo.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHIEKODHAA.tim@zope.com>

[Scott Gilbert]
> At the moment, the only 64 bit platform that I have easy access to is
> Tru64/Alpha.  That version of malloc takes a size_t which is a 64 bit
> quantity.
>
> I believe most semi-sane platforms will use a size_t as argument for
> malloc,

That much is required by the C standard, so you can rely on it.

> and I believe most semi-sane platforms will have a size_t that is
> the same number of bits as a pointer for that platform.

The std is silent on this; it's true on 64-bit Linux and Win64, so "good
enough".

>> Can you index these byte arrays by longs?

> You could index it via a long, but using a LONG_LONG is safer.  My
> understanding is that on Win64 a long will only be 32 bits even though
> void* is 64 bits.

Right.

> So for that platform, LONG_LONG will be a typedef for __int64 which is 64
> bits.

Also on Win32:  LONG_LONG is a 64-bit integral type on Win32 and Win64.

> None of this matters for 32 bit platforms.

?  Win32 has always supported "large files" and "large mmaps" (where large
means 64-bit capacity), and most 32-bit flavors of Unix do too.  It's a
x-platform mess, though.

> All 32 bit platforms that I know of have sizeof(int) == sizeof(long) ==
> sizeof(void*) == 4.

Same here.

> So even if you wanted to subscript with a long or LONG_LONG, the pointer
> could only point to something about 2 Gigs (31 bits) in size.

That depends on how it's implemented; on a 32-bit box, supporting a
LONG_LONG subscript may require some real pain, but isn't impossible.  For
example, Python manages to support 64-bit "subscripts" to f.seek() on the
major 32-bit boxes right now.
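
The f.seek() point can be tried directly.  A minimal sketch in
present-day Python, assuming a platform and filesystem with large-file
support; the helper name and the 5 GiB offset are made up for the demo.

```python
import tempfile


def seek_past_4gb():
    """Seek to an offset that does not fit in 32 bits and report the position."""
    offset = 5 * 2**30                 # 5 GiB, larger than 2**32
    with tempfile.TemporaryFile() as f:
        f.seek(offset)                 # nothing is written, so the file stays empty
        return f.tell()


print(seek_past_4gb())
```

The file offset is a 64-bit quantity here even when pointers are not,
which is why seeking and indexing memory are separate questions.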



From tim.one@comcast.net  Thu Jul 25 00:01:32 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 24 Jul 2002 19:01:32 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEIBAGAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEMAHAB.tim.one@comcast.net>

FYI, I've been poking at this in the background.  The ~sort regression is
vastly reduced, via removing special-casing and adding more general
adaptivity (if you read the timsort.txt file, the special case for run
lengths within a factor of 2 of each other went away, replaced by a more
intelligent mix of one-pair-at-a-time versus galloping modes).

*sort lost about 1% as a result (one-pair-at-a-time is maximally effective
for *sort, but in a random mix every now and again the "switch to the less
efficient (for it) galloping mode" heuristic triggers by blind luck).

There's also a significant systematic regression in timsort's +sort case,
although it remains faster (and much more general) than samplesort's
special-casing of it; also a mix of small regressions and speedups in 3sort.
These are because, to simplify experimenting, I threw out the "copy only the
shorter run" gimmick, always copying the left run instead.  That hurts +sort
systematically, as instead of copying just the 10 oddball elements at the
end, it copies the very long run of N-10 elements instead (and as many as
N-1 temp pointers can be needed, up from N/2).  That's all repairable, it's
just a PITA to do it.

C:\Code\python\PCbuild>python -O sortperf.py 15 20 1
samplesort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.18   0.01   0.02   0.11   0.01   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.25   0.02   0.08   0.02   0.24
17  131072   0.53   0.05   0.04   0.49   0.05   0.18   0.04   0.52
18  262144   1.16   0.09   0.09   1.06   0.12   0.37   0.09   1.14
19  524288   2.53   0.18   0.17   2.30   0.24   0.75   0.17   2.47
20 1048576   5.48   0.37   0.35   5.18   0.45   1.52   0.35   5.35

timsort
 i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.17   0.01   0.02   0.01   0.01   0.05   0.01   0.02
16   65536   0.24   0.02   0.02   0.02   0.02   0.09   0.02   0.04
17  131072   0.54   0.05   0.04   0.05   0.05   0.19   0.04   0.09
18  262144   1.17   0.09   0.09   0.10   0.10   0.38   0.09   0.18
19  524288   2.56   0.18   0.17   0.20   0.20   0.79   0.17   0.36
20 1048576   5.54   0.37   0.35   0.37   0.41   1.62   0.35   0.73

In short, there's no real "speed argument" against this anymore (as I said
in the first msg of this thread, the ~sort regression was serious -- it's an
important case; turns out galloping is very effective at speeding it too,
provided that dumbass premature special-casing doesn't stop galloping from
trying <wink>).
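
For readers who have not seen timsort.txt, the galloping idea can be
sketched as follows.  This is a simplified illustration, not the code from
the patch: the name gallop_right, the probe sequence 1, 3, 7, 15, ... and
the use of the bisect module are all assumptions made for the demo.

```python
from bisect import bisect_right


def gallop_right(key, run):
    """Return i such that every run[:i] <= key and every run[i:] > key.

    `run` must be sorted.  Probes at offsets 1, 3, 7, 15, ... until the key
    is bracketed, then finishes with a binary search inside the bracket.
    """
    n = len(run)
    if n == 0 or key < run[0]:
        return 0
    last, ofs = 0, 1                   # invariant: run[last] <= key
    while ofs < n and run[ofs] <= key:
        last = ofs
        ofs = ofs * 2 + 1              # widen the probe distance exponentially
    hi = min(ofs, n)                   # either hi == n or run[hi] > key
    return bisect_right(run, key, last + 1, hi)


run = [1, 2, 2, 3, 5, 8, 13, 21]
print([gallop_right(k, run) for k in (0, 2, 4, 21, 99)])
```

When the answer lies near the start of the run this costs only a few
comparisons, and it never costs much more than a plain binary search,
which is what makes it attractive for partially ordered data.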



From xscottg@yahoo.com  Thu Jul 25 00:22:54 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Wed, 24 Jul 2002 16:22:54 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHIEKODHAA.tim@zope.com>
Message-ID: <20020724232254.86946.qmail@web40101.mail.yahoo.com>

--- Tim Peters <tim@zope.com> wrote:
> 
> > So for that platform, LONG_LONG will be a typedef for __int64 which is
> > 64 bits.
> 
> Also on Win32:  LONG_LONG is a 64-bit integral type on Win32 and Win64.
> 

Yep.  I was trying to contrast that on most platforms LONG_LONG is an alias
for "long long", but on Windows (32 or 64) it's going to be an __int64.


> 
> > So even if you wanted to subscript with a long or LONG_LONG, the
> > pointer could only point to something about 2 Gigs (31 bits) in size.
> 
> That depends on how it's implemented; on a 32-bit box, supporting a
> LONG_LONG subscript may require some real pain, but isn't impossible. 
> For
> example, Python manages to support 64-bit "subscripts" to f.seek() on the
> major 32-bit boxes right now.
> 

I should have been more clear.  I was referring specifically to working
with pointers:

    datum = *(pointer + offset);
or:
    datum = pointer[offset];


Just so there is no confusion, you aren't suggesting that the bytes PEP
should provide a mechanism to support chunks of memory larger than 4 Gigs
on 32 bit platforms right?

I think the bytes object could be a part of the solution to that problem,
at least I know how I would do that under Win32, but I'd rather not kluge
up the interface to the bytes object to support it directly.


Cheers,
    -Scott




From guido@python.org  Thu Jul 25 01:04:51 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 24 Jul 2002 20:04:51 -0400
Subject: [Python-Dev] Powerpoint slide for keynotes available
Message-ID: <200207250004.g6P04pP20522@pcp02138704pcs.reston01.va.comcast.net>

I've put the powerpoint slides for my keynotes at EuroPython and OSCON
on the web.  If someone can donate PDF that would be great (the HTML
generated by Powerpoint sucks too much to be worth it IMO).

http://www.python.org/doc/essays/ppt/

(scroll to end)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Thu Jul 25 05:37:00 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 25 Jul 2002 00:37:00 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020724232254.86946.qmail@web40101.mail.yahoo.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEFPAHAB.tim.one@comcast.net>

[Scott Gilbert]
> ...
> I should have been more clear.  I was referring specifically to working
> with pointers:
>
>     datum = *(pointer + offset);
> or:
>     datum = pointer[offset];

Na, my fault -- I fit in the email between other things, and hadn't read the
whole thread up to that point.  It was clear enough in context.

> Just so there is no confusion, you aren't suggesting that the bytes PEP
> should provide a mechanism to support chunks of memory larger than 4 Gigs
> on 32 bit platforms right?

It depends on how insane you are.  It sure as heck doesn't *sound* like this
is the bytes object's problem to solve, but then if people want their data
sorted they shouldn't let it get out of order to begin with either <wink>.

> I think the bytes object could be a part of the solution to that problem,
> at least I know how I would do that under Win32, but I'd rather not kluge
> up the interface to the bytes object to support it directly.

I agree.



From thomas.heller@ion-tof.com  Thu Jul 25 08:45:44 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 25 Jul 2002 09:45:44 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <BIEJKCLHCIOIHAGOKOLHIEKODHAA.tim@zope.com>
Message-ID: <011201c233af$4883dd50$e000a8c0@thomasnotebook>

> [Scott Gilbert]
> > At the moment, the only 64 bit platform that I have easy access to is
> > Tru64/Alpha.  That version of malloc takes a size_t which is a 64 bit
> > quantity.
> >
> > I believe most semi-sane platforms will use a size_t as argument for
> > malloc,
> 
[Tim]
> That much is required by the C standard, so you can rely on it.
> 
> > and I believe most semi-sane platforms will have a size_t that is
> > the same number of bits as a pointer for that platform.
> 
> The std is silent on this; it's true on 64-bit Linux and Win64, so "good
> enough".
> 
> >> Can you index these byte arrays by longs?
> 
> > You could index it via a long, but using a LONG_LONG is safer.  My
> > understanding is that on Win64 a long will only be 32 bits even though
> > void* is 64 bits.
> 
> Right.

So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
any platform, and so the index should be of type size_t instead of
int, long, or LONG_LONG (aka __int64 in some places)?

Thomas



From thomas.heller@ion-tof.com  Thu Jul 25 09:07:43 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 25 Jul 2002 10:07:43 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <014b01c233b2$5a9bf240$e000a8c0@thomasnotebook>

What if we would 'fix' the buffer interface?

Extend the PyBufferProcs structure by new fields:

    typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
    typedef size_t (*getlargewritebufferproc)(PyObject *, void **);

    typedef struct {
            getreadbufferproc bf_getreadbuffer;
            getwritebufferproc bf_getwritebuffer;
            getsegcountproc bf_getsegcount;
            getcharbufferproc bf_getcharbuffer;
            /* new fields */
            getlargereadbufferproc bf_getlargereadbufferproc;
            getlargewritebufferproc bf_getlargewritebufferproc;
    } PyBufferProcs;


The new fields are present if the Py_TPFLAGS_HAVE_GETLARGEBUFFER flag
is set in the object's type. Py_TPFLAGS_HAVE_GETLARGEBUFFER implies
the Py_TPFLAGS_HAVE_GETCHARBUFFER flag.

These functions have the same semantics Scott describes: they must
only be implemented by types that only return addresses which remain
valid as long as the Python 'source' object is alive.

Python strings, unicode strings, mmap objects, and maybe other types
would expose the large buffer interface, but the array type would
*not*. We could also change the name from 'large buffer interface'
to something more sensible, currently I don't have a better name.

Thomas



From oren-py-d@hishome.net  Thu Jul 25 11:01:58 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Thu, 25 Jul 2002 06:01:58 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <011201c233af$4883dd50$e000a8c0@thomasnotebook>
References: <BIEJKCLHCIOIHAGOKOLHIEKODHAA.tim@zope.com> <011201c233af$4883dd50$e000a8c0@thomasnotebook>
Message-ID: <20020725100157.GA34465@hishome.net>

On Thu, Jul 25, 2002 at 09:45:44AM +0200, Thomas Heller wrote:
> > >> Can you index these byte arrays by longs?
> > 
> > > You could index it via a long, but using a LONG_LONG is safer.  My
> > > understanding is that on Win64 a long will only be 32 bits even though
> > > void* is 64 bits.
> > 
> > Right.
> 
> So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
> any platform, and so the index should be of type size_t instead of
> int, long, or LONG_LONG (aka __int64 in some places)?

The obvious type to index byte arrays would be ptrdiff_t.

If (char*)-(char*)==ptrdiff_t then (char*)+ptrdiff_t==(char*)

	Oren


From tim@zope.com  Thu Jul 25 16:23:05 2002
From: tim@zope.com (Tim Peters)
Date: Thu, 25 Jul 2002 11:23:05 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <011201c233af$4883dd50$e000a8c0@thomasnotebook>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOENBDHAA.tim@zope.com>

[Thomas Heller]
> So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
> any platform,

Last I knew, there were dozens of platforms besides Linux and Windows
<wink>.  Like I said, no relationship is defined here.  C99 standardizes a
uintptr_t typedef for an unsigned integer type with "enough bits" so that

    (void*)(uintptr_t)p == p

for any legit pointer p of type void*, but only standarizes its name, not
its existence (a conforming implementation isn't required to supply a
typedef with this name).  Such a type *is* required to compile Python,
though, and pyport.h defines our own Py_uintptr_t (as a synonym for the
platform uintptr_t if it exists, else to the smallest integer type it can
find that looks big enough, else a compile-time #error).

> and so the index should be of type size_t instead of
> int, long, or LONG_LONG (aka __int64 in some places)?

Try to spell out exactly what it is you think this index should be capable
of representing; e.g., what's your most extreme use case?



From tim.one@comcast.net  Thu Jul 25 16:44:22 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 25 Jul 2002 11:44:22 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020725100157.GA34465@hishome.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHMENDDHAA.tim.one@comcast.net>

[Oren Tirosh]
> The obvious type to index byte arrays would be ptrdiff_t.
>
> If (char*)-(char*)==ptrdiff_t then (char*)+ptrdiff_t==(char*)

Alas, the standard only says that ptrdiff_t *is* the type of the result of
pointer subtraction, not that it *suffices* for that purpose; it explicitly
warns that the true result of subtracting two pointers may not be
representable in that type (in which case the behavior is undefined).  In a
similar way, C says the result of adding int to int *is* int, but doesn't
guarantee the result type (int) is sufficient to represent the true result
(and, indeed, in the int case it often isn't).

It may be safer to stick with size_t, since size_t isn't as obscure (lightly
used and/or misunderstood) as ptrdiff_t.
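
The range problem can be made concrete with a little arithmetic.  A
minimal sketch in present-day Python, assuming ctypes.c_ssize_t is a fair
stand-in for a signed pointer-difference type on the platform at hand; the
helper name is made up for the demo.

```python
import ctypes


def ptrdiff_wraparound():
    """Show that the largest unsigned size does not fit the signed type."""
    bits = 8 * ctypes.sizeof(ctypes.c_ssize_t)
    size_max = 2**bits - 1             # largest value of the unsigned type
    signed_max = 2**(bits - 1) - 1     # largest value of the signed type
    # ctypes does no overflow checking, so the value is silently truncated.
    wrapped = ctypes.c_ssize_t(size_max).value
    return size_max > signed_max, wrapped


print(ptrdiff_wraparound())
```

The signed type loses the top half of the range, so a difference between
two far-apart addresses can come out negative.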



From jeremy@alum.mit.edu  Thu Jul 25 12:04:39 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jul 2002 07:04:39 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHMENDDHAA.tim.one@comcast.net>
References: <20020725100157.GA34465@hishome.net>
 <BIEJKCLHCIOIHAGOKOLHMENDDHAA.tim.one@comcast.net>
Message-ID: <15679.56135.465947.542871@slothrop.zope.com>

We could have an #if test on PTRDIFF_MIN and PTRDIFF_MAX and refuse to
compile if they don't have reasonable values.

Jeremy



From yozh@mx1.ru  Thu Jul 25 17:03:38 2002
From: yozh@mx1.ru (Stepan Koltsov)
Date: Thu, 25 Jul 2002 20:03:38 +0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
Message-ID: <20020725160337.GA8999@banana.mx1.ru>

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

Hi, all.

I wrote a PEP; its number is 295, and it is attached.
It should be posted somewhere for discussion, so here it is.
Please look at it and say what you think.

-- 
mailto: Stepan Koltsov <yozh@mx1.ru>

--/9DWx/yDrRhgMJTb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="pep-0295.txt"

PEP: 295
Title: Interpretation of multiline string constants
Version: $Revision: 1.1 $
Last-Modified: $Date: 2002/07/22 20:45:07 $
Author: yozh@mx1.ru (Stepan Koltsov)
Status: Draft
Type: Standards Track
Created: 22-Jul-2002
Python-Version: 3.0
Post-History:

Abstract

    This PEP describes an interpretation of multiline string constants
    for Python.  It suggests stripping spaces after newlines and
    stripping a newline if it is first character after an opening
    quotation.


Rationale

    This PEP proposes an interpretation of multiline string constants
    in Python.  Currently, the value of a string constant is all the
    text between the quotation marks, possibly with escape sequences
    substituted, e.g.:

        def f():
            """
            la-la-la
            limona, banana
            """
        
        def g():
            return "This is \
            string"
        
        print repr(f.__doc__)
        print repr(g())
    
    prints:
    
        '\n\tla-la-la\n\tlimona, banana\n\t'
        'This is \tstring'
    
    This PEP suggests two things:

	- first: ignore the first character after the opening quotation
	  if it is a newline
	- second: ignore in string constants all spaces and tabs up to
	  the first non-whitespace character, but no more than the
	  current indentation.

    After applying this, the previous program will print:
    
        'la-la-la\nlimona, banana\n'
        'This is string'
    
    To get this result, the previous program could be rewritten for
    current Python as follows (note that this gives the same result
    under the new string interpretation):
    
        def f():
            """\
        la-la-la
        limona, banana
        """
        
        def g():
            return "This is \
        string"
    
    Or stripping can be done with library routines at runtime (as
    pydoc does), but this decreases program readability.
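
    The runtime alternative mentioned above can be sketched with the
    standard library.  This is only an illustration in present-day
    Python, not part of the proposal: it assumes textwrap.dedent as
    the library routine, and the function name query is made up.

```python
import textwrap


def query():
    """Build an indented multiline string and strip its indentation at runtime."""
    sql = """
        SELECT * FROM car
        WHERE model = 'i525'
        """
    # dedent removes the common leading whitespace; lstrip drops the
    # newline that directly follows the opening quotation marks.
    return textwrap.dedent(sql).lstrip("\n")


print(repr(query()))
```

    The result matches what the proposal would produce at parse time,
    at the cost of an extra call at every use.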


Implementation

    I'll say nothing about CPython, Jython or Python.NET.
    
    In original Python, there is no info about the current indentation
    (in spaces) at compile time, so space and tab stripping should be
    done at parse time.  Currently no flags can be passed to the
    parser in program text (like from __future__ import xxx).  I
    suggest enabling or disabling this feature at Python compile
    time depending on the CPP flag Py_PARSE_MULTILINE_STRINGS.


Alternatives

    The new interpretation could instead be selected with the flags
    'i' and 'o' on string constants, like
    
        i"""
        SELECT * FROM car
        WHERE model = 'i525'
        """ is in new style,
        
        o"""SELECT * FROM employee
        WHERE birth < 1982
        """ is in old style, and
        
        """
        SELECT employee.name, car.name, car.price FROM employee, car
        WHERE employee.salary * 36 > car.price
        """ is in new style after Python-x.y.z and in old style otherwise.
    
    Also this feature can be disabled if the string is raw, i.e. if the
    flag 'r' is specified.


Copyright

    This document has been placed in the Public Domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:

--/9DWx/yDrRhgMJTb--


From thomas.heller@ion-tof.com  Thu Jul 25 17:22:15 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 25 Jul 2002 18:22:15 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <BIEJKCLHCIOIHAGOKOLHOENBDHAA.tim@zope.com>
Message-ID: <04c501c233f7$70a0a4b0$e000a8c0@thomasnotebook>

From: "Tim Peters" <tim@zope.com>
> [Thomas Heller]
> > So isn't the conclusion that sizeof(size_t) == sizeof(void *) on
> > any platform,
> 
> Last I knew, there were dozens of platforms besides Linux and Windows
> <wink>.  Like I said, no relationship is defined here.  C99 standardizes a
> uintptr_t typedef for an unsigned integer type with "enough bits" so that
> 
>     (void*)(uintptr_t)p == p
> 
> for any legit pointer p of type void*, but only standardizes its name, not
> its existence (a conforming implementation isn't required to supply a
> typedef with this name).  Such a type *is* required to compile Python,
> though, and pyport.h defines our own Py_uintptr_t (as a synonym for the
> platform uintptr_t if it exists, else to the smallest integer type it can
> find that looks big enough, else a compile-time #error).
> 
> > and so the index should be of type size_t instead of
> > int, long, or LONG_LONG (aka __int64 in some places)?
> 
> Try to spell out exactly what it is you think this index should be capable
> of representing; e.g., what's your most extreme use case?
> 
*I* have no use for this at the moment.
I was just trying to understand the (let's call it) large
byte-array support in Scott's proposal on 64-bit platforms,
and how to program portably on 64-bit and 32-bit platforms.

Assuming we have a large enough byte array
  unsigned char *ptr;
and want to use it in C, for example get a certain byte:

  unsigned char mybyte = ptr[my_index];

What should the type of my_index be? IIRC, Scott proposed LONG_LONG,
but wouldn't this be a pain on 32-bit platforms?

Thomas



From guido@python.org  Thu Jul 25 17:32:24 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 12:32:24 -0400
Subject: [Python-Dev] Powerpoint slide for keynotes available
References: <200207250004.g6P04pP20522@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <009b01c233f8$de6bade0$7f00a8c0@pacbell.net>

I wrote:
> If someone can donate PDF that would be great (the HTML
> generated by Powerpoint sucks too much to be worth it IMO).
> 
> http://www.python.org/doc/essays/ppt/
> 
> (scroll to end)

I've received about 5 offers of PDF.  The first one is now on the web.
Mark Hadfield won the race. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido@python.org  Thu Jul 25 17:41:02 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 12:41:02 -0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
References: <20020725160337.GA8999@banana.mx1.ru>
Message-ID: <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net>

> I wrote a PEP, its number is 295, it is in attachment.
> It should be posted somewhere to be discussed so it is here.
> Please, look at it and say what you think.

This is an incompatible change. Your PEP does not address
how to deal with this at all. I will be forced to reject it unless
you come up with a transition strategy (in fact, I don't even want
to consider your proposal unless you deal with this).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From xscottg@yahoo.com  Thu Jul 25 18:00:03 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 25 Jul 2002 10:00:03 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <04c501c233f7$70a0a4b0$e000a8c0@thomasnotebook>
Message-ID: <20020725170003.93924.qmail@web40107.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
>
> *I* have no use for this at the moment.
> I was just trying to understand the (let's call it) large
> byte-array support in Scott's proposal on 64-bit platforms,
> and how to program portably on 64-bit and 32-bit platforms.
> 
> Assuming we have a large enough byte array
>   unsigned char *ptr;
> and want to use it in C, for example get a certain byte:
> 
>   unsigned char mybyte = ptr[my_index];
> 
> What should the type of my_index be? IIRC, Scott proposed LONG_LONG,
> but wouldn't this be a pain on 32-bit platforms?
> 

Ok, now I understand where you're coming from.  If nobody has an
objection or can point to a supported platform where it won't work, I'll
switch it to size_t.












From yozh@mx1.ru  Thu Jul 25 18:06:40 2002
From: yozh@mx1.ru (Stepan Koltsov)
Date: Thu, 25 Jul 2002 21:06:40 +0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
In-Reply-To: <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net>
References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net>
Message-ID: <20020725170640.GA10350@banana.mx1.ru>

On Thu, Jul 25, 2002 at 12:41:02PM -0400, Guido van Rossum wrote:
> > I wrote a PEP, its number is 295, it is in attachment.
> > It should be posted somewhere to be discussed so it is here.
> > Please, look at it and say what you think.
> 
> This is an incompatible change. Your PEP does not address
> how to deal with this at all. I will be forced to reject it unless
> you come up with a transition strategy (in fact, I don't even want
> to consider your proposal unless you deal with this).

For most strings this change will not change program result (for
example number of spaces doesn't matter in SQL queries). For others
I suggested (in section 'Alternatives') flags 'i' and 'o' for string
constants.

-- 
mailto: Stepan Koltsov <yozh@mx1.ru>


From fredrik@pythonware.com  Thu Jul 25 18:27:07 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 25 Jul 2002 19:27:07 +0200
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net> <20020725170640.GA10350@banana.mx1.ru>
Message-ID: <009f01c23400$8259c200$ced241d5@hagrid>

Stepan Koltsov wrote:

> > This is an incompatible change. Your PEP does not address
> > how to deal with this at all. I will be forced to reject it unless
> > you come up with a transition strategy (in fact, I don't even want
> > to consider your proposal unless you deal with this).
> 
> For most strings this change will not change program result

and how on earth do you know that?

> (for example number of spaces doesn't matter in SQL queries).

so why do all your examples use SQL queries?

> For others I suggested (in section 'Alternatives') flags 'i' and 'o'
> for string constants.

if you want to interpret multiline strings in a different way, why
cannot you just do like everyone else, and use a function?

    mystring = SQL("""
        blablabla
    """)

(as a bonus, that approach makes it trivial to embed files, images,
xml structures, etc...)
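
A minimal sketch of what such a function could look like, assuming (as claimed in the quoted text) that the amount of whitespace does not matter in the query. The body of SQL() here is hypothetical; the name is only the one used in the snippet above.

```python
def SQL(text):
    """Hypothetical helper: collapse every run of whitespace to one space."""
    return " ".join(text.split())

query = SQL("""
    SELECT name
      FROM users
     WHERE id = 1
""")
print(query)
```

Note that this naive version would also collapse whitespace inside quoted SQL string literals, so it only suits queries where that is harmless.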

a big -1 from here.

</F>



From guido@python.org  Thu Jul 25 18:51:01 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 13:51:01 -0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net> <20020725170640.GA10350@banana.mx1.ru>
Message-ID: <004a01c23403$d9494880$7f00a8c0@pacbell.net>

> > > I wrote a PEP, its number is 295, it is in attachment.
> > > It should be posted somewhere to be discussed so it is here.
> > > Please, look at it and say what you think.
> > 
> > This is an incompatible change. Your PEP does not address
> > how to deal with this at all. I will be forced to reject it unless
> > you come up with a transition strategy (in fact, I don't even want
> > to consider your proposal unless you deal with this).
> 
> For most strings this change will not change program result (for
> example number of spaces doesn't matter in SQL queries). For others
> I suggested (in section 'Alternatives') flags 'i' and 'o' for string
> constants.

You are proposing a language change.  Because of the grave consequences
of such changes you have to explain why you cannot obtain the desired
results with the existing language.  You have completely failed to
provide a motivation for your PEP so far.  If you want your PEP to
be considered you must provide a motivation first.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From yozh@mx1.ru  Thu Jul 25 18:55:28 2002
From: yozh@mx1.ru (Stepan Koltsov)
Date: Thu, 25 Jul 2002 21:55:28 +0400
Subject: [Python-Dev] PEP 295 - Interpretation of multiline string constants
In-Reply-To: <009f01c23400$8259c200$ced241d5@hagrid>
References: <20020725160337.GA8999@banana.mx1.ru> <00e301c233fa$121fa6e0$7f00a8c0@pacbell.net> <20020725170640.GA10350@banana.mx1.ru> <009f01c23400$8259c200$ced241d5@hagrid>
Message-ID: <20020725175528.GA11100@banana.mx1.ru>

On Thu, Jul 25, 2002 at 07:27:07PM +0200, Fredrik Lundh wrote:
> > > This is an incompatible change. Your PEP does not address
> > > how to deal with this at all. I will be forced to reject it unless
> > > you come up with a transition strategy (in fact, I don't even want
> > > to consider your proposal unless you deal with this).
> > 
> > For most strings this change will not change program result
> 
> and how on earth do you know that?

I've seen output of `grep -rwC '"""' Python/Lib/` and
`egrep -rwC '= *"""' Python/Lib/`. Most strings are
docstrings ;-)

> > (for example number of spaces doesn't matter in SQL queries).
> 
> so why do all your examples use SQL queries?

Because I saw this defect of Python first when I wrote SQL queries.

def f():
	q = """my
	query""" % vars
	if debug:
		print q # looks bad

> > For others I suggested (in section 'Alternatives') flags 'i' and 'o'
> > for string constants.
> 
> if you want to interpret multiline strings in a different way, why
> cannot you just do like everyone else, and use a function?
> 
>     mystring = SQL("""
>         blablabla
>     """)

Functions don't know current indentation.

> (as a bonus, that approach makes it trivial to embed files, images,
> xml structures, etc...)
> 
> a big -1 from here.

:-(

-- 
mailto: Stepan Koltsov <yozh@mx1.ru>


From mcherm@destiny.com  Thu Jul 25 19:01:42 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Thu, 25 Jul 2002 14:01:42 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
Message-ID: <3D403D06.6010802@destiny.com>

Stepan Koltsov writes:
 > I wrote a PEP, its number is 295, it is in attachment.
       [... PEP on stripping newline and preceding spaces in
            multi-line string literals ...]

I see two motivations for the proposals in this PEP, and propose 
alternative solutions for each. NONE of my alternative solutions 
requires ANY modification to the Python language.

--------

Motivation 1 -- Lining up line 1 of multi-line quotes:

Scenario: - Use of string with things "lined up" neatly

     >>> def someFunction():
     ...     aMultiLineString = """Foo  X   1.0
     ... Bar  Y   2.5
     ... Baz  Z  15.0
     ... Spam Q  38.9
     ... """

     Notice how line 1 doesn't line up neatly with lines 2-4 because of
     the indenting as well as the text assigning it to a variable. This
     is annoying, and makes it awkward to read.

Solution: - Use a backslash to escape an initial newline

     >>> def someFunction():
     ...     aMultiLineString = """\
     ... Foo  X   1.0
     ... Bar  Y   2.5
     ... Baz  Z  15.0
     ... Spam Q  38.9
     ... """

     Notice that now everything lines up neatly. And we don't need to
     modify Python at all for this to work.
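
The effect of the backslash can be checked directly. The sketch below (variable names are illustrative) compares a literal that escapes the initial newline with one that does not:

```python
with_backslash = """\
Foo  X   1.0
Bar  Y   2.5
"""

without_backslash = """
Foo  X   1.0
Bar  Y   2.5
"""

# The escaped version starts with the first data line; the other one
# starts with an empty line.
print(repr(with_backslash))
print(repr(without_backslash))
```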

--------

Motivation 2 - Maintaining Indentation

Scenario: - Outdenting misleads the eye

     >>> class SomeClass:
     ...     def visitFromWaiter(self):
     ...         if self.seated:
     ...             self.silverware = ['fork','spoon']
     ...             self.menu = """Spam
     ... Spam and Eggs
     ... Spam on Rye
     ... """
     ...             self.napkin = DirtyNapkin()

     Notice how the indentation makes it quite clear when we are inside a
     class, a method, or a flow-control statement by merely watching the
     left-hand margin. But this is crudely interrupted by the multi-line
     string.

Solution: - Process the multi-line string through a function

     >>> class SomeClass:
     ...     def visitFromWaiter(self):
     ...         if self.seated:
     ...             self.silverware = ['fork','spoon']
     ...             self.menu = stripIndent( """\
     ...                 Spam
     ...                 Spam and Eggs
     ...                 Spam on Rye
     ...                 """ )
     ...             self.napkin = DirtyNapkin()

     where stripIndent() has been defined as:

     >>> def stripIndent( s ):
     ...     indent = len(s) - len(s.lstrip())
     ...     sLines = s.split('\n')
     ...     resultLines = [ line[indent:] for line in sLines ]
     ...     return '\n'.join( resultLines )

     Notice how it is now NICELY indented, at the expense of a tiny
     little 4-line function. Of course, there are faster and safer
     ways to write stripIndent() (I, personally, would use a version
     that checked that each line started with identical indentation
     and raised an exception otherwise), but this version illustrates
     the idea while being very, very readable.
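
One possible shape for the stricter variant mentioned above is sketched here. The function name and the choice of ValueError are assumptions made for illustration, not code from the original message; whitespace-only lines are treated as empty rather than as errors.

```python
def strip_indent_strict(s):
    """Remove the first line's indentation from every line of s.

    Raises ValueError if a non-blank line does not start with that
    same indentation.
    """
    lines = s.split('\n')
    first = lines[0]
    indent = first[:len(first) - len(first.lstrip())]
    result = []
    for number, line in enumerate(lines, 1):
        if not line.strip():
            result.append('')      # whitespace-only lines become empty
        elif line.startswith(indent):
            result.append(line[len(indent):])
        else:
            raise ValueError('line %d is not indented like line 1' % number)
    return '\n'.join(result)
```

Called on the menu string from the example, it returns the three menu lines flush left, each followed by a newline.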

----

In conclusion, I propose you use simpler methods available WITHIN the 
language for solving this problem, rather than proposing a PEP to modify 
the language itself.

-- Michael Chermside



From xscottg@yahoo.com  Thu Jul 25 18:59:50 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 25 Jul 2002 10:59:50 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <014b01c233b2$5a9bf240$e000a8c0@thomasnotebook>
Message-ID: <20020725175950.8766.qmail@web40103.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> What if we would 'fix' the buffer interface?
> 

This gets us part of the way there, but still has shortcomings.  For one I,
and people more significant than me, would still need a type that
implemented the bytes object behavior.  Everything but efficient pickling
_could_ be done with third party extensions, but ignoring pickling (which I
don't want to do), then we'd still have several significant third parties
reinventing the same wheel.  To me at least, this feels like a battery that
should be included.


> Extend the PyBufferProcs structure by new fields:
> 
>     typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
>     typedef size_t (*getlargewritebufferproc)(PyObject *, void **);
> 

How would you designate failure/exceptions?  size_t is unsigned everywhere
I can find it, so it can't return a negative number on failure.  I guess
the void** could be filled in with NULL.

>
>     typedef struct {
>             getreadbufferproc bf_getreadbuffer;
>             getwritebufferproc bf_getwritebuffer;
>             getsegcountproc bf_getsegcount;
>             getcharbufferproc bf_getcharbuffer;
>             /* new fields */
>             getlargereadbufferproc bf_getlargereadbufferproc;
>             getlargewritebufferproc bf_getlargewritebufferproc;
>     } PyBufferProcs;
> 
> 
> The new fields are present if the Py_TPFLAGS_HAVE_GETLARGEBUFFER flag
> is set in the object's type. Py_TPFLAGS_HAVE_GETLARGEBUFFER implies
> the Py_TPFLAGS_HAVE_GETCHARBUFFER flag.
> 
> These functions have the same semantics Scott describes: they must
> only be implemented by types that only return addresses which are valid as
> long as the Python 'source' object is alive.
> 
> Python strings, unicode strings, mmap objects, and maybe other types
> would expose the large buffer interface, but the array type would
> *not*. We could also change the name from 'large buffer interface'
> to something more sensible, currently I don't have a better name.
> 

I've been trying to keep the proposal as unintrusive as possible while
still implementing the functionality needed.  Adding more flags/members to
PyObjects and modifying string, unicode, mmap, ... feels like a more
intrusive change to me.  I'm open to the idea, but I'm not ready to retract
the current proposal.  Then there is still the problem of needing something
like a bytes object as mentioned above.


Cheers,
    -Scott


__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com


From thomas.heller@ion-tof.com  Thu Jul 25 20:47:56 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Thu, 25 Jul 2002 21:47:56 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020725175950.8766.qmail@web40103.mail.yahoo.com>
Message-ID: <05d501c23414$2c15c650$e000a8c0@thomasnotebook>

From: "Scott Gilbert" <xscottg@yahoo.com>
> --- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> > What if we would 'fix' the buffer interface?
> > 
> 
> This gets us part of the way there, but still has shortcomings.  For one I,
> and people more significant than me, would still need a type that
> implemented the bytes object behavior.

Sure, the extension of the buffer interface is only part of the
picture. The bytes type is still needed as well.

The extension I proposed is motivated by these thoughts:

It would enable some of Python's builtin objects to
expose the interface extension by supplying two
trivial functions for each in the extended tp_as_buffer slot.

The new functions expose a 'safe buffer interface', where
there are guarantees about the lifetime of the pointer. So
your bytes object can be a view of these builtin objects
as well.

It dismisses the segment count of the normal buffer interface.

>   Everything but efficient pickling
> _could_ be done with third party extensions, but ignoring pickling (which I
> don't want to do), then we'd still have several significant third parties
> reinventing the same wheel.  To me at least, this feels like a battery that
> should be included.
> 

I don't think my proposal prevents this.

> 
> > Extend the PyBufferProcs structure by new fields:
> > 
> >     typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
> >     typedef size_t (*getlargewritebufferproc)(PyObject *, void **);
> > 
> 
> How would you designate failure/exceptions?  size_t is unsigned everywhere
> I can find it, so it can't return a negative number on failure.  I guess
> the void** could be filled in with NULL.
> 

Details, not yet fleshed out completely. Store NULL in the void **,
use ptrdiff_t instead of size_t, or something else. Or return
((size_t)-1) on failure. Or return -1 on failure, and fill out
a size_t pointer:
  typedef int (*getlargereadwritebufferproc)(PyObject *, size_t *, void **);


> > Python strings, unicode strings, mmap objects, and maybe other types
> > would expose the large buffer interface, but the array type would
> > *not*. We could also change the name from 'large buffer interface'
> > to something more sensible, currently I don't have a better name.

Maybe it should be renamed 'safe buffer interface extension' instead
of 'large buffer interface' (it could be large as well)?

> 
> I've been trying to keep the proposal as unintrusive as possible while
> still implementing the functionality needed.  Adding more flags/members to
> PyObjects and modifying string, unicode, mmap, ... feels like a more
> intrusive change to me.  I'm open to the idea, but I'm not ready to retract
> the current proposal.  Then there is still the problem of needing something
> like a bytes object as mentioned above.

The advantage (IMO) is that it defines a new protocol to get the
pointer to the internal byte array on objects instead of
requiring that these objects are instances of a special type
or subtype thereof.

> 
> __________________________________________________
> Do You Yahoo!?

No, I google. ;-)

Thomas




From tim.one@comcast.net  Thu Jul 25 20:51:44 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 25 Jul 2002 15:51:44 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020725175950.8766.qmail@web40103.mail.yahoo.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEOEDHAA.tim.one@comcast.net>

[Scott Gilbert]
> ...
> How would you designate failure/exceptions?  size_t is unsigned everywhere
> I can find it,

Right, and the std requires that size_t resolve to an unsigned type, so
that's reliable.

> so it can't return a negative number on failure.

The usual dodge is to return (and test against)

    (size_t)-1

in that case.  If the caller sees that the result is (size_t)-1, then it
also needs to call PyErr_Occurred() to see whether it's a normal, or error,
return value (and if it is an error case, the routine had to have set a
Python exception, so that PyErr_Occurred() returns true then).

> I guess the void** could be filled in with NULL.

Sounds easier to me <wink>.
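
Why the (size_t)-1 dodge needs the second check can be seen from Python. This is a sketch only: it assumes the ctypes module from later Python versions, and is_error() is a hypothetical stand-in for the caller-side test, with exception_pending playing the role of PyErr_Occurred().

```python
import ctypes

# ctypes integer types do no overflow checking, so -1 wraps around the
# same way the C cast (size_t)-1 does.
SIZE_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_size_t)) - 1
ERROR_SENTINEL = ctypes.c_size_t(-1).value

# The sentinel is also the largest legitimate size_t value.
print(ERROR_SENTINEL == SIZE_MAX)

def is_error(result, exception_pending):
    """Hypothetical check: sentinel value AND a pending exception."""
    return result == ERROR_SENTINEL and exception_pending
```

Because the sentinel is itself a representable size, the return value alone cannot distinguish failure from a (very large) success.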



From tdelaney@avaya.com  Thu Jul 25 23:27:23 2002
From: tdelaney@avaya.com (Delaney, Timothy)
Date: Fri, 26 Jul 2002 08:27:23 +1000
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string
 constants
Message-ID: <B43D149A9AB2D411971300B0D03D7E8BF0A488@natasha.auslabs.avaya.com>

> From: Michael Chermside [mailto:mcherm@destiny.com]
>
> In conclusion, I propose you use simpler methods available WITHIN the 
> language for solving this problem, rather than proposing a 
> PEP to modify 
> the language itself.

In fact, the simplest mechanism is to declare all multi-line string literals
at module scope.

Presumably all such literals are supposed to be constants (docstrings are a
special exception, but there are already rules for those in terms of how
they should be displayed).
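
A sketch of that approach, reusing the class from the earlier example. The MENU name and the __init__ method are additions made only so the snippet runs on its own.

```python
# Module scope: the literal starts at the left margin without looking
# out of place, and it is defined once as a constant.
MENU = """\
Spam
Spam and Eggs
Spam on Rye
"""

class SomeClass:
    def __init__(self):
        self.seated = True
        self.menu = None

    def visitFromWaiter(self):
        if self.seated:
            self.menu = MENU   # no outdented literal inside the method
```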

This is a highly incompatible change with very high risk of breaking code.
This is not a -1 or some such - this is a "cannot even be considered unless
you can make it backwards compatible with all uses of multiline strings"
which is of course impossible (since the whole purpose of the PEP is to
modify such strings).

When I first read this PEP I thought it was something that had been
suggested to someone, and it was being proposed in order to be rejected. It's
obvious from later posts that that is not the case, and Stepan is having
trouble understanding why such a PEP would be rejected out of hand.

You might find support for a library function which performed the
transformation that you desire (if there's a good enough use case for it).
Personally, I don't think there is - too many times that one particular
transformation will be "almost, but not quite what I want" in which case I
need to roll my own anyway.

Tim Delaney


From ping@zesty.ca  Thu Jul 25 22:03:51 2002
From: ping@zesty.ca (Ka-Ping Yee)
Date: Thu, 25 Jul 2002 14:03:51 -0700 (PDT)
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEEMAHAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207251402320.1261-100000@ziggy>

On Wed, 24 Jul 2002, Tim Peters wrote:
> In short, there's no real "speed argument" against this anymore (as I said
> in the first msg of this thread, the ~sort regression was serious -- it's an
> important case; turns out galloping is very effective at speeding it too,
> provided that dumbass premature special-casing doesn't stop galloping from
> trying <wink>).

This is fantastic work, Tim.  I'm all for switching over to timsort as
the one standard sort method.


-- ?!ng

"Most things are, in fact, slippery slopes.  And if you start backing off
from one thing because it's a slippery slope, who knows where you'll stop?"
    -- Sean M. Burke



From python-dev@zesty.ca  Thu Jul 25 23:33:38 2002
From: python-dev@zesty.ca (Ka-Ping Yee)
Date: Thu, 25 Jul 2002 15:33:38 -0700 (PDT)
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string  constants
In-Reply-To: <B43D149A9AB2D411971300B0D03D7E8BF0A488@natasha.auslabs.avaya.com>
Message-ID: <Pine.LNX.4.44.0207251532250.13624-100000@ziggy>

On Fri, 26 Jul 2002, Delaney, Timothy wrote:
> You might find support for a library function which performed the
> transformation that you desire (if there's a good enough use case for it).

inspect.getdoc(object) provides this, for docstrings.  There's no
function in the library to do this in general to any string, though.
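
A small illustration of what inspect.getdoc() does with an indented docstring; the waiter() function exists only for this example.

```python
import inspect

def waiter():
    """Spam
    Spam and Eggs
    Spam on Rye
    """

# getdoc() removes the indentation shared by the continuation lines
# and drops trailing blank lines.
print(inspect.getdoc(waiter))
```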


-- ?!ng

"Mathematics isn't about what's true.  It's about what can be concluded
from what."



From tim.one@comcast.net  Fri Jul 26 02:05:54 2002
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 25 Jul 2002 21:05:54 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEEMAHAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEJNAHAB.tim.one@comcast.net>

[Tim]
> ...
> There's also a significant systematic regression in timsort's +sort case,
> ... also a mix of small regressions and speedups in 3sort.
> These are because, to simplify experimenting, ...(and as many as
> N-1 temp pointers can be needed, up from N/2).  That's all repairable,
> it's just a PITA to do it.

It's repaired, and those glitches went away:

> timsort
>  i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
> 15   32768   0.17   0.01   0.02   0.01   0.01   0.05   0.01   0.02
> 16   65536   0.24   0.02   0.02   0.02   0.02   0.09   0.02   0.04
> 17  131072   0.54   0.05   0.04   0.05   0.05   0.19   0.04   0.09
> 18  262144   1.17   0.09   0.09   0.10   0.10   0.38   0.09   0.18
> 19  524288   2.56   0.18   0.17   0.20   0.20   0.79   0.17   0.36
> 20 1048576   5.54   0.37   0.35   0.37   0.41   1.62   0.35   0.73

Now at

  15   32768   0.17   0.01   0.01   0.01   0.02   0.09   0.01   0.03
  16   65536   0.24   0.02   0.02   0.02   0.02   0.09   0.02   0.04
  17  131072   0.53   0.05   0.04   0.05   0.05   0.18   0.04   0.09
  18  262144   1.17   0.09   0.09   0.10   0.09   0.38   0.09   0.18
  19  524288   2.56   0.18   0.18   0.19   0.19   0.78   0.17   0.36
  20 1048576   5.53   0.37   0.35   0.36   0.37   1.60   0.35   0.74

In other news, an elf revealed that Perl is moving to an adaptive stable
mergesort(!!!harmonic convergence!!!), and sent some cleaned-up source code.
The comments reference a non-existent paper, but if I change the title and
the year I find it here:

   Optimistic sorting and information theoretic complexity.
   Peter McIlroy.
   SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms), pp
   467-474, Austin, Texas, 25-27 January 1993.

Jeremy got that for me, and it's an extremely relevant paper.  What I've
been calling galloping he called "exponential search", and the paper has
some great analysis, pretty much thoroughly characterizing the set of
permutations for which this kind of approach is helpful, and even optimal.
It's a large set <wink>.
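
For readers unfamiliar with the idea, here is a minimal sketch of exponential search ("galloping") over a sorted list. It is an illustration of the general technique only, not the timsort code or the Perl code; the name gallop_left and its signature are made up for this example.

```python
from bisect import bisect_left

def gallop_left(key, a, lo=0):
    """Return the leftmost index i >= lo with a[i] >= key (len(a) if none).

    a must be sorted.  Probes at offsets 1, 3, 7, 15, ... from lo, then
    finishes with a binary search inside the last gap.
    """
    n = len(a)
    if lo >= n or a[lo] >= key:
        return lo
    last = lo                      # invariant: a[last] < key
    step = 1
    while last + step < n and a[last + step] < key:
        last += step
        step *= 2
    hi = min(last + step, n)       # either hi == n or a[hi] >= key
    return bisect_left(a, key, last + 1, hi)
```

When the answer lies close to lo, only a handful of comparisons are needed, which is why the approach pays off when merging runs whose elements tend to cluster.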

Amazingly, citeseer finds only one reference to this paper, also from 1993,
and despite all the work done on adaptive sorting since then.  So it's
either essentially unknown in the research community, was shot full of holes
(but then people would have delighted in citing it just to rub that in <0.5
wink>), or was quickly superseded by a better result (but then ditto!).

I'll leave that a mystery.

I haven't had time yet to study the Perl code.  The timsort algorithm is
clearly more frugal with memory:  worst-case N/2 temp pointers needed, and,
e.g., in +sort it only needs (at most) 10 temp pointers (independent of N).
That may or may not be good, though, depending on whether the Perl algorithm
makes more effective use of the memory hierarchy; offhand I don't think it
does.  OTOH, timsort has 4 flavors of galloping and 2 flavors of binary
search and 2 merge routines, because the memory-saving gimmick can require
merging "from the left" or "from the right", depending on which run is
smaller.  Doubling the number of helper routines is what "PITA" meant in the
quote at the start <wink>.

One more bit of news:  cross-box performance of this stuff is baffling.
Nobody else has tried timsort yet (unless someone who asked for the code
tried an earlier version), but there are Many Mysteries just looking at the
numbers for /sort under current CVS Python.  Recall that /sort is the case
where the data is already sorted:  it does N-1 compares in one scan, and
that's all.  For an array with 2**20 distinct floats that takes 0.35 seconds
on my Win98SE 866MHz Pentium box, compiled w/ MSVC6.  On my Win2K 866MHz
Pentium box, compiled w/ MSVC6, it takes 0.58(!) seconds, and indeed all the
sort tests take incredibly much longer on the Win2K box.  On Fred's faster
Pentium box (I forget exactly how fast, >900MHz and <1GHz), using gcc, the
sort tests take a lot less time than on my Win2K box, but my Win98SE box is
still faster.

Another Mystery (still with the current samplesort):  on Win98SE, !sort is
always a bit faster than *sort.  On Win2K and on Fred's box, it's always a
bit slower.  I'm leaving that a mystery too.  I haven't tried timsort on
another box yet, and given that my home machine may be supernaturally fast,
I'm never going to <wink>.



From xscottg@yahoo.com  Fri Jul 26 02:33:30 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Thu, 25 Jul 2002 18:33:30 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <05d501c23414$2c15c650$e000a8c0@thomasnotebook>
Message-ID: <20020726013330.31053.qmail@web40111.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> From: "Scott Gilbert" <xscottg@yahoo.com>
> > --- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> > > What if we would 'fix' the buffer interface?
> > > 
> > For one I, and people more significant than me, would still need a
> > type that implemented the bytes object behavior.
> 
> Sure, the extension of the buffer interface is only part of the
> picture. The bytes type is still needed as well.
> 
> The extension I proposed is motivated by these thoughts:
> 
> It would enable some of Python's builtin objects to
> expose the interface extension by supplying two
> trivial functions for each in the extended tp_as_buffer slot.
> 
> The new functions expose a 'safe buffer interface', where
> there are guarantees about the lifetime of the pointer. So
> your bytes object can be a view of these builtin objects
> as well.
> 
> It dismisses the segment count of the normal buffer interface.
> 
[...]
> > 
> > I've been trying to keep the proposal as unintrusive as possible while
> > still implementing the functionality needed.  Adding more flags/members
> > to PyObjects and modifying string, unicode, mmap, ... feels like a more
> > intrusive change to me.  I'm open to the idea, but I'm not ready to
> > retract the current proposal.  Then there is still the problem of 
> > needing something like a bytes object as mentioned above.
> 
> The advantage (IMO) is that it defines a new protocol to get the
> pointer to the internal byte array on objects instead of
> requiring that these objects are instances of a special type
> or subtype thereof.
> 

I like your idea for adding the flags and methods to create a "safe buffer
interface".  As you note, string, unicode, mmap, and possibly other things
could implement these methods and return a (possibly large) pointer that
could be manipulated after the GIL is released.  Of course the pickleable
bytes object falls into that category too.

It seems to me that we have two independent proposals.  Do you see any
reason why they shouldn't be two separate PEPs?  I don't see any reason to
piggyback them into one.  They're related in topic, but neither seems to
rely on the other in any way.












From guido@python.org  Fri Jul 26 04:16:36 2002
From: guido@python.org (Guido van Rossum)
Date: Thu, 25 Jul 2002 23:16:36 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
References: <B43D149A9AB2D411971300B0D03D7E8BF0A488@natasha.auslabs.avaya.com>
Message-ID: <00a901c23452$daae16c0$7f00a8c0@pacbell.net>

My mails to Stepan Koltsov have been bouncing (after the first one
apparently went through).  Assuming he's not subscribed to python-dev,
he may not be aware of our responses.  What to do?  Simply reject it
in absentia?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From rfarrer@avisionone.com  Fri Jul 26 04:24:41 2002
From: rfarrer@avisionone.com (Rick Farrer)
Date: Thu, 25 Jul 2002 22:24:41 -0500
Subject: [Python-Dev] Please remove me from the mailing list
Message-ID: <000a01c23453$fc4b04e0$3745fea9@ibm1499>

Please remove me from the mailing list.

rf@avisionone.com

Thanks,
Rick





From cce@clarkevans.com  Fri Jul 26 05:37:10 2002
From: cce@clarkevans.com (Clark C . Evans)
Date: Fri, 26 Jul 2002 00:37:10 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <20020723063611.26677.qmail@web40102.mail.yahoo.com>; from xscottg@yahoo.com on Mon, Jul 22, 2002 at 11:36:11PM -0700
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <20020726003709.C17944@doublegemini.com>

| Abstract
| 
|     This PEP proposes the creation of a new standard type and builtin
|     constructor called 'bytes'.  The bytes object is an efficiently
|     stored array of bytes with some additional characteristics that
|     set it apart from several implementations that are similar.

This is great.  Python currently lacks two "standard" programming
objects which most languages have: (a) timestamp, and (b) binary.
This addresses the second.  This will greatly help YAML data
interoperability among other programming languages such 
as Java, Ruby, etc.

Best,

Clark
Yo! Check out YAML
Serialization for the masses!
http://yaml.org

-- 
Clark C. Evans                   Axista, Inc.
http://www.axista.com            800.926.5525
XCOLLA Collaborative Project Management Software


From sholden@holdenweb.com  Fri Jul 26 14:15:33 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Fri, 26 Jul 2002 09:15:33 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline stringconstants
References: <B43D149A9AB2D411971300B0D03D7E8BF0A488@natasha.auslabs.avaya.com> <00a901c23452$daae16c0$7f00a8c0@pacbell.net>
Message-ID: <127c01c234a6$867957f0$6300000a@holdenweb.com>

----- Original Message -----
From: "Guido van Rossum" <guido@python.org>
To: <python-dev@python.org>
Sent: Thursday, July 25, 2002 11:16 PM
Subject: Re: [Python-Dev] Re: PEP 295 - Interpretation of multiline
stringconstants


> My mails to Stepan Koltsov have been bouncing (after the first one
> apparently went through).  Assuming he's not subscribed to python-dev,
> he may not be aware of our responses.  What to do?  Simply reject it
> in absentia?
>

Well, at least that way he'll see it's been rejected from the PEP listing.
You can always direct him to the Mailman archives when his mail comes back
on line.

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From mal@lemburg.com  Fri Jul 26 08:35:07 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jul 2002 09:35:07 +0200
Subject: [Python-Dev] Sorting
References: <LNBBLJKPBEHFEDALKOLCGEJNAHAB.tim.one@comcast.net>
Message-ID: <3D40FBAB.8090909@lemburg.com>

Tim Peters wrote:
> One more bit of news:  cross-box performance of this stuff is baffling.
> Nobody else has tried timsort yet (unless someone who asked for the code
> tried an earlier version), but there are Many Mysteries just looking at the
> numbers for /sort under current CVS Python.  Recall that /sort is the case
> where the data is already sorted:  it does N-1 compares in one scan, and
> that's all.  For an array with 2**20 distinct floats that takes 0.35 seconds
> on my Win98SE 866MHz Pentium box, compiled w/ MSVC6.  On my Win2K 866MHz
> Pentium box, compiled w/ MSVC6, it takes 0.58(!) seconds, and indeed all the
> sort tests take incredibly much longer on the Win2K box.  On Fred's faster
> Pentium box (I forget exactly how fast, >900MHz and <1GHz), using gcc, the
> sort tests take a lot less time than on my Win2K box, but my Win98SE box is
> still faster.
> 
> Another Mystery (still with the current samplesort):  on Win98SE, !sort is
> always a bit faster than *sort.  On Win2K and on Fred's box, it's always a
> bit slower.  I'm leaving that a mystery too.  I haven't tried timsort on
> another box yet, and given that my home machine may be supernaturally fast,
> I'm never going to <wink>.

I can give it a go on my AMD boxes if you send me the code.
They tend to show surprising results as you know :-)

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From thomas.heller@ion-tof.com  Fri Jul 26 15:28:50 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 26 Jul 2002 16:28:50 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
Message-ID: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>

Here is the draft PEP for the ideas posted here.

Regards,

Thomas
--------

PEP: xxx
Title: The Safe Buffer Interface
Version: $Revision: $
Last-Modified: $Date: 2002/07/26 14:19:38 $
Author: theller@python.net (Thomas Heller)
Status: Draft
Type: Standards Track
Created: 26-Jul-2002
Python-Version: 2.3
Post-History: 26-Jul-2002


Abstract

    This PEP proposes an extension to the buffer interface called the
    'safe buffer interface'.

    The safe buffer interface fixes the flaws of the 'old' buffer
    interface as defined in Python versions up to and including 2.2:

    The lifetime of the retrieved pointer is clearly defined.

    The buffer size is returned as a 'size_t' data type, which allows
    access to 'large' buffers on platforms where sizeof(int) !=
    sizeof(void *).


Specification

    The 'safe' buffer interface exposes new functions which return the
    size and the pointer to the internal memory block of any python
    object which chooses to implement this interface.

    The size and pointer returned must be valid as long as the object
    is alive (has a positive reference count).  So, only objects which
    never reallocate or resize the memory block are allowed to
    implement this interface.

    The safe buffer interface omits the memory segment model which is
    present in the old buffer interface - only a single memory block
    can be exposed.


Implementation

    Define a new flag in Include/object.h:

        /* PyBufferProcs contains bf_getsafereadbuffer
           and bf_getsafewritebuffer */
        #define Py_TPFLAGS_HAVE_GETSAFEBUFFER (1L<<15)


    This flag would be included in Py_TPFLAGS_DEFAULT:

        #define Py_TPFLAGS_DEFAULT  ( \
                             ....
                             Py_TPFLAGS_HAVE_GETCHARBUFFER | \
                             Py_TPFLAGS_HAVE_GETSAFEBUFFER | \
                             ....
                            0)


    Extend the PyBufferProcs structure by new fields in
    Include/object.h:

        typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
        typedef size_t (*getlargewritebufferproc)(PyObject *, void **);

        typedef struct {
                getreadbufferproc bf_getreadbuffer;
                getwritebufferproc bf_getwritebuffer;
                getsegcountproc bf_getsegcount;
                getcharbufferproc bf_getcharbuffer;
                /* safe buffer interface functions */
                getsafereadbufferproc bf_getsafereadbuffer;
                getsafewritebufferproc bf_getsafewritebuffer;
        } PyBufferProcs;


    The new fields are present if the Py_TPFLAGS_HAVE_GETSAFEBUFFER
    flag is set in the object's type.

    XXX Py_TPFLAGS_HAVE_GETSAFEBUFFER implies the
    Py_TPFLAGS_HAVE_GETCHARBUFFER flag.

    On success, the getsafereadbufferproc and getsafewritebufferproc
    functions return the size in bytes of the memory block and fill in
    the passed void * pointer.  If these functions fail - either
    because an error occurs or no memory block is exposed - they must
    set the void * pointer to NULL and raise an exception.  The return
    value is undefined in these cases and should not be used.


Backward Compatibility

    There are no backward compatibility problems.


Reference Implementation

    Will be uploaded to the sourceforge patch manager by the author.


Additional Notes/Comments

    It may be a good idea to expose the following convenience functions:

        int PyObject_AsSafeReadBuffer(PyObject *obj,
                                      void **buffer,
                                      size_t *buffer_len);

        int PyObject_AsSafeWriteBuffer(PyObject *obj,
                                       void **buffer,
                                       size_t *buffer_len);

    These functions return 0 on success, set buffer to the memory
    location and buffer_len to the length of the memory block in
    bytes. On failure, they return -1 and set an exception.


    Python strings, unicode strings, mmap objects, and maybe other
    types would expose the safe buffer interface, but the array type
    would *not*, because its memory block may be reallocated during
    its lifetime.
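
    The reallocation hazard can be sketched from the Python side.  The
    sketch below is only an illustration and relies on facilities of
    current CPython that postdate this draft: memoryview and the
    BufferError raised on resize are assumptions relative to the code
    base discussed here, and can_grow_while_exported is a made-up
    helper name.  It shows an array refusing to grow while a view of
    its memory block is outstanding:

```python
import array

def can_grow_while_exported(obj):
    """Return True if obj can be resized while a view of its memory exists."""
    view = memoryview(obj)          # borrow the object's memory block
    try:
        obj.append(1)               # growing may force a reallocation
        return True
    except BufferError:
        return False                # resize refused: the view would dangle
    finally:
        view.release()              # give the memory block back

numbers = array.array('b', [1, 2, 3])
print(can_grow_while_exported(numbers))   # resize is refused while exported

numbers.append(4)                   # fine once the view has been released
print(len(numbers))
```

    A type whose block never moves has no such conflict, which is the
    property the safe buffer interface asks implementers to guarantee.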


References

    [1] The buffer interface
        http://mail.python.org/pipermail/python-dev/2000-October/009974.html
    [2] The Buffer Problem
        http://www.python.org/peps/pep-0296.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:




From ville.vainio@swisslog.com  Fri Jul 26 08:11:41 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Fri, 26 Jul 2002 10:11:41 +0300
Subject: [Python-Dev] Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org>
Message-ID: <3D40F62D.7000106@swisslog.com>

>     where stripIndent() has been defined as:
>
>     >>> def stripIndent( s ):
>     ...     indent = len(s) - len(s.lstrip())
>     ...     sLines = s.split('\n')
>     ...     resultLines = [ line[indent:] for line in sLines ]
>     ...     return '\n'.join( resultLines )


Something like this should really be available somewhere in the standard 
library (string module [yeah, predeprecation, I know], string method). 
Everybody needs this kind of functionality, and probably more often than 
many of the other string methods (title, swapcase come to mind).
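
A minimal sketch of the kind of helper being asked for, assuming the
margin to remove is the smallest indentation of any non-blank line (an
assumption; the quoted stripIndent measures only the start of the
string).  The name strip_indent is made up for illustration, and
textwrap.dedent from the standard library is used only as a
cross-check:

```python
import textwrap

def strip_indent(text):
    """Remove the common left margin from every line of text.

    Assumption: the margin is the smallest indentation found on any
    non-blank line; blank lines are ignored when measuring it.
    """
    lines = text.split('\n')
    indents = [len(line) - len(line.lstrip())
               for line in lines if line.strip()]
    margin = min(indents) if indents else 0
    return '\n'.join(line[margin:] for line in lines)

sample = """
    first line
      indented further
    last line
"""
print(strip_indent(sample))
print(strip_indent(sample) == textwrap.dedent(sample))
```

Relative indentation inside the string survives; only the shared
margin is removed.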

-- Ville



From xscottg@yahoo.com  Fri Jul 26 16:01:09 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Fri, 26 Jul 2002 08:01:09 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
Message-ID: <20020726150109.4104.qmail@web40111.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> Here is the draft PEP for the ideas posted here.
> 
[...]

I like it.  :-)

> 
>         typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
>         typedef size_t (*getlargewritebufferproc)(PyObject *, void **);
> 

I'm sure this is a cut-and-pasto for 

          typedef size_t (*getsafereadbufferproc)(PyObject *, void **);
          typedef size_t (*getsafewritebufferproc)(PyObject *, void **);










From thomas.heller@ion-tof.com  Fri Jul 26 16:06:55 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 26 Jul 2002 17:06:55 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020726150109.4104.qmail@web40111.mail.yahoo.com>
Message-ID: <089d01c234b6$15385220$e000a8c0@thomasnotebook>

From: "Scott Gilbert" <xscottg@yahoo.com>
> > Here is the draft PEP for the ideas posted here.
> > 
> [...]
> 
> I like it.  :-)
:-)

> 
> >         typedef size_t (*getlargereadbufferproc)(PyObject *, void **);
> >         typedef size_t (*getlargewritebufferproc)(PyObject *, void **);
> 
> I'm sure this is a cut-and-pasto for 
> 
>           typedef size_t (*getsafereadbufferproc)(PyObject *, void **);
>           typedef size_t (*getsafewritebufferproc)(PyObject *, void **);
> 
Exactly. Everything is named safebuffer instead of largebuffer.

Thanks,

Thomas



From mwh@python.net  Fri Jul 26 10:44:45 2002
From: mwh@python.net (Michael Hudson)
Date: 26 Jul 2002 10:44:45 +0100
Subject: [Python-Dev] Sorting
In-Reply-To: Tim Peters's message of "Thu, 25 Jul 2002 21:05:54 -0400"
References: <LNBBLJKPBEHFEDALKOLCGEJNAHAB.tim.one@comcast.net>
Message-ID: <2meldq3jsi.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

> One more bit of news:  cross-box performance of this stuff is baffling.
> Nobody else has tried timsort yet (unless someone who asked for the code
> tried an earlier version), but there are Many Mysteries just looking at the
> numbers for /sort under current CVS Python.

If you put the code somewhere, I'll try it on my PPC iBook (not today,
as it's at home, but soon).

I'd thank you for working on this, but you're clearly enjoying it an
unhealthy amount already <wink>.

Cheers,
M.

-- 
  ZAPHOD:  You know what I'm thinking?
    FORD:  No.
  ZAPHOD:  Neither do I.  Frightening isn't it?
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 11


From jacobs@penguin.theopalgroup.com  Fri Jul 26 16:18:58 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 26 Jul 2002 11:18:58 -0400 (EDT)
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEJNAHAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.44.0207261116580.8238-100000@penguin.theopalgroup.com>

On Thu, 25 Jul 2002, Tim Peters wrote:
> One more bit of news:  cross-box performance of this stuff is baffling.

I'll run tests on the P4 Xeon, Alpha (21164A, 21264), AMD Elan 520, and
maybe a few Sparcs, and whatever else I can get my hands on.

Just let me know where I can snag the code + test script.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From pinard@iro.umontreal.ca  Fri Jul 26 16:05:39 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 26 Jul 2002 11:05:39 -0400
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
In-Reply-To: <3D40F62D.7000106@swisslog.com>
References: <20020725194802.22949.82629.Mailman@mail.python.org>
 <3D40F62D.7000106@swisslog.com>
Message-ID: <oq4rem34xo.fsf@carouge.sram.qc.ca>

[Ville Vainio]

> >     where stripIndent() has been defined as:
> >
> >     >>> def stripIndent( s ):
> >     ...     indent = len(s) - len(s.lstrip())
> >     ...     sLines = s.split('\n')
> >     ...     resultLines = [ line[indent:] for line in sLines ]
> >     ...     return '\n'.join( resultLines )


> Something like this should really be available somewhere in the standard
> library (string module [yeah, predeprecation, I know], string
> method). Everybody needs this kind of functionality, and probably more often
> than many of the other string methods (title, swapcase come to mind).

Strange.  I did a lot of Python programming, and never needed this.

In fact, I like my doc-strings and other triple-quoted strings flushed left.
So, I can see them in the code exactly as they will appear on the screen.
If I used artificial margins in Python so my doc-strings appeared to be
indented more than the surrounding, and wrote my code this way, it would
appear artificially constricted on the left once printed.  It's not worth it.

For me, it is best to always use """\ as the opening triple-quote,
and write flushed left until the closing """.  As most long strings end
with a new line, the closing """ is usually flushed left just as well.
My opinion is that it is nice this way.  Don't touch the thing! :-)
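
A small sketch of that style, with a made-up usage() function and help
text chosen only for illustration:

```python
def usage():
    # The backslash after the opening quotes suppresses the first newline,
    # and the text is written flush left exactly as it should be printed.
    return """\
Usage: prog [options] FILE
  -h    show this help
"""

print(usage(), end='')
```

The string begins with "Usage" rather than a newline, and no margin
needs stripping afterwards.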

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



From oren-py-d@hishome.net  Fri Jul 26 09:15:07 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 26 Jul 2002 11:15:07 +0300
Subject: [Python-Dev] Iteration - my summary
Message-ID: <20020726111507.A28836@hishome.net>

There has been some lively discussion about the iteration protocols
lately. My impression of the opinions on the list so far is this:

It could have been semantically cleaner. There is a blurred boundary
between the iterable-container and iterator protocols. Perhaps next should
have been called __next__. Perhaps iterators should not have been required
to implement an __iter__ method returning self. With the benefit of
hindsight the protocols could have been designed better.

But there is nothing fundamentally broken about iteration. Nothing that
justifies any serious change that would break backward compatibility and 
require a transition plan.

A remaining sore spot is re-iterability. Iterators being their own
iterators is ok by itself. StopIteration being a sink state is ok by
itself. When they are combined they result in hard-to-trace silent errors
because an exhausted iterator is indistinguishable from an empty
container. This happens in real code, not in some contrived examples. It
is clear to me that this issue needs to be addressed in some way, but
without a complete redesign of the iteration protocols. My proposal of
raising an exception on calling .next() after StopIteration has been
rejected by Guido. Here's another approach:

Proposal: new built-in function reiter()

def reiter(obj):
    """reiter(obj) -> iterator

Get an iterator from an object. If the object is already an iterator a
TypeError exception will be raised. For all Python built-in types it is
guaranteed that if this function succeeds the next call to reiter() will
return a new iterator that produces the same items unless the object is
modified. Non-builtin iterable objects which are not iterators SHOULD
support multiple iteration returning the same items."""

    it = iter(obj)
    if it is obj:
        raise TypeError('Object is not re-iterable')
    return it


Example:

def cartprod(a,b):
   """ Generate the cartesian product of two sources. """
   for x in a:
       for y in reiter(b):
           yield x,y

This function should raise an exception if object b is a generator or some
other non re-iterable object. List comprehensions should use the C API
equivalent of reiter for sources other than the first.
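
The failure mode and the effect of the check can be sketched as
follows.  This is an illustration only; cartprod_plain and
cartprod_checked are names invented here to contrast the two
behaviours, and reiter is repeated so the example runs on its own:

```python
def reiter(obj):
    """Return a fresh iterator, refusing objects that are their own iterator."""
    it = iter(obj)
    if it is obj:
        raise TypeError('Object is not re-iterable')
    return it

def cartprod_plain(a, b):
    for x in a:
        for y in b:              # silently yields nothing on later passes
            yield x, y

def cartprod_checked(a, b):
    for x in a:
        for y in reiter(b):      # raises instead of going quiet
            yield x, y

pairs = list(cartprod_plain([1, 2], iter('ab')))
print(pairs)                     # the pairs for x == 2 are missing

try:
    list(cartprod_checked([1, 2], iter('ab')))
except TypeError as exc:
    print('rejected:', exc)

print(list(cartprod_checked([1, 2], 'ab')))
```

With a real container as the second source, the checked version
produces all four pairs.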

This solution is less than perfect. It requires explicit attention by the 
programmer and is less comprehensive than the other solutions proposed but I 
think it's better than nothing.

A related issue is iteration of files. It is an exception to the guarantee 
made in the docstring above. My impression is that people generally agree 
that file objects are more iterator-like than container-like because they 
are stateful cursors. However, making files into iterators is not as simple 
as adding a next method that calls readline and raises StopIteration on EOF.
This implementation would lose the performance benefit from the readahead 
buffering done in the xreadlines object.
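
For reference, the simple approach looks roughly like this.  It is a
sketch only: LineIterator is a made-up wrapper, io.StringIO stands in
for a real file, and the method is spelled __next__ as in current
Python rather than next as in this thread.  It has no readahead
buffer, which is exactly the performance drawback noted above:

```python
import io

class LineIterator:
    """Wrap a file-like object so that it is its own iterator."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def __iter__(self):
        return self                  # an iterator is its own iterator

    def __next__(self):
        line = self.fileobj.readline()
        if not line:                 # readline() returns '' at end of file
            raise StopIteration
        return line

lines = LineIterator(io.StringIO('one\ntwo\n'))
print(list(lines))
print(list(lines))                   # already exhausted: stateful cursor
```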

The way I see file object iteration is that the file object and xreadlines 
object abuse the iterable-container<->iterator relationship to produce a 
cursor-without-readahead-buffer<->cursor-with-readahead-buffer relationship.
I don't like objects pretending to be something they're not.

I can finish my xreadlines caching patch that makes a file into an iterator 
with an embedded xreadlines object. Perhaps it's not the most elegant 
solution but I don't see any real problems with it.  

I am also thinking about implementing line buffering inside the file object 
that can finally get rid of the whole fgets/getc_unlocked multiplatform mess
and make xreadlines unnecessary. The problem here is that readahead is not 
exactly a transparent operation. More on this later.

	Oren



From yozh@mx1.ru  Fri Jul 26 17:05:59 2002
From: yozh@mx1.ru (Stepan Koltsov)
Date: Fri, 26 Jul 2002 20:05:59 +0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
In-Reply-To: <00a901c23452$daae16c0$7f00a8c0@pacbell.net>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A488@natasha.auslabs.avaya.com> <00a901c23452$daae16c0$7f00a8c0@pacbell.net>
Message-ID: <20020726160559.GA24120@banana.mx1.ru>

On Thu, Jul 25, 2002 at 11:16:36PM -0400, Guido van Rossum wrote:
> My mails to Stepan Koltsov have been bouncing (after the first one
> apparently went through).  Assuming he's not subscribed to python-dev,
> he may not be aware of our responses.  What to do?  Simply reject it
> in absentia?

I don't understand what is happening with my DNS, but I am a subscriber to
this mailing list and I read it sometimes.

So...  What do you (and others) think about just adding flag 'i' to string
constants (that will strip indentation etc.)?  This doesn't affect
existing code, but it will be useful (at least for me ;-)  Motivation
was posted here by Michael Chermside, but I don't like his solutions.

-- 
mailto: Stepan Koltsov <yozh@mx1.ru>


From tim.one@comcast.net  Fri Jul 26 17:02:34 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 26 Jul 2002 12:02:34 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <Pine.LNX.4.44.0207261116580.8238-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELFAHAB.tim.one@comcast.net>

Apart from fine-tuning and rewriting the doc file, I think the mergesort is
done.  I'm confident that if any bugs remain, I haven't seen them <wink>.  A
patch against current CVS listobject.c is here:

    http://www.python.org/sf/587076

Simple instructions for timing exactly the same data I've posted times
against are in the patch description (you already have sortperf.py -- it's
in Lib/test).  This patch doesn't replace samplesort, it adds a new .msort()
method, to make comparative timings easier.  It also adds an .hsort() method
for weak heapsort, because I forgot to delete that code after I gave up on
it <wink>.

X-platform samplesort timings are interesting as well as samplesort versus
mergesort timings.  Timings against "real life" sort jobs are especially
interesting.  Attaching results to the bug report sounds like a good idea to
me, so we get a coherent record in one place.

Thanks in advance!



From mal@lemburg.com  Fri Jul 26 17:58:49 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jul 2002 18:58:49 +0200
Subject: [Python-Dev] Sorting
References: <LNBBLJKPBEHFEDALKOLCKELFAHAB.tim.one@comcast.net>
Message-ID: <3D417FC9.6030308@lemburg.com>

Tim Peters wrote:
> Apart from fine-tuning and rewriting the doc file, I think the mergesort is
> done.  I'm confident that if any bugs remain, I haven't seen them <wink>.  A
> patch against current CVS listobject.c is here:
> 
>     http://www.python.org/sf/587076
> 
> Simple instructions for timing exactly the same data I've posted times
> against are in the patch description (you already have sortperf.py -- it's
> in Lib/test).  This patch doesn't replace samplesort, it adds a new .msort()
> method, to make comparative timings easier.  It also adds an .hsort() method
> for weak heapsort, because I forgot to delete that code after I gave up on
> it <wink>.
> 
> X-platform samplesort timings are interesting as well as samplesort versus
> mergesort timings.  Timings against "real life" sort jobs are especially
> interesting.  Attaching results to the bug report sounds like a good idea to
> me, so we get a coherent record in one place.
> 
> Thanks in advance!

Here's the result for AMD Athlon 1.2GHz/Linux/gcc:

Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1
  i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.07   0.00   0.01   0.09   0.01   0.03   0.01   0.08
16   65536   0.18   0.02   0.02   0.19   0.03   0.07   0.02   0.20
17  131072   0.43   0.05   0.04   0.46   0.05   0.18   0.05   0.48
18  262144   0.99   0.09   0.10   1.04   0.13   0.40   0.09   1.11
19  524288   2.23   0.19   0.21   2.32   0.24   0.83   0.20   2.46
20 1048576   4.96   0.40   0.40   5.41   0.47   1.72   0.40   5.46

without patch:

Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1
  i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.08   0.01   0.01   0.09   0.01   0.03   0.00   0.09
16   65536   0.20   0.02   0.01   0.20   0.03   0.07   0.02   0.20
17  131072   0.46   0.06   0.02   0.45   0.05   0.20   0.04   0.49
18  262144   0.99   0.09   0.10   1.09   0.11   0.40   0.12   1.12
19  524288   2.33   0.20   0.20   2.30   0.24   0.83   0.19   2.47
20 1048576   4.89   0.40   0.41   5.37   0.48   1.71   0.38   6.22

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/




From tim.one@comcast.net  Fri Jul 26 18:22:04 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 26 Jul 2002 13:22:04 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <3D417FC9.6030308@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOELPAHAB.tim.one@comcast.net>

[MAL]
> Here's the result for AMD Athlon 1.2GHz/Linux/gcc:
>
> Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1
>   i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
> 15   32768   0.07   0.00   0.01   0.09   0.01   0.03   0.01   0.08
> 16   65536   0.18   0.02   0.02   0.19   0.03   0.07   0.02   0.20
> 17  131072   0.43   0.05   0.04   0.46   0.05   0.18   0.05   0.48
> 18  262144   0.99   0.09   0.10   1.04   0.13   0.40   0.09   1.11
> 19  524288   2.23   0.19   0.21   2.32   0.24   0.83   0.20   2.46
> 20 1048576   4.96   0.40   0.40   5.41   0.47   1.72   0.40   5.46
>
> without patch:
>
> Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1
>   i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
> 15   32768   0.08   0.01   0.01   0.09   0.01   0.03   0.00   0.09
> 16   65536   0.20   0.02   0.01   0.20   0.03   0.07   0.02   0.20
> 17  131072   0.46   0.06   0.02   0.45   0.05   0.20   0.04   0.49
> 18  262144   0.99   0.09   0.10   1.09   0.11   0.40   0.12   1.12
> 19  524288   2.33   0.20   0.20   2.30   0.24   0.83   0.19   2.47
> 20 1048576   4.89   0.40   0.41   5.37   0.48   1.71   0.38   6.22

I assume you didn't read the instructions in the patch description:

    http://www.python.org/sf/587076

The patch doesn't change anything about how list.sort() works, so what
you've shown us is the timing variance on your box across two identical
runs.  To time the new routine, you need to (temporarily) change L.sort() to
L.msort() in sortperf.py's doit() function.  It's a one-character change,
but an important one <wink>.
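
The idea of switching the method under test can be sketched with a
small stand-alone harness.  This is not sortperf.py; time_sort and its
method parameter are invented for illustration, and 'msort' is only
available in an interpreter built with the patch (elsewhere the lookup
raises AttributeError):

```python
import random
import time

def time_sort(data, method='sort'):
    """Time one in-place sort of a copy of data; return elapsed seconds."""
    work = list(data)                      # leave the caller's list untouched
    sorter = getattr(work, method)         # 'sort', or 'msort' with the patch
    start = time.perf_counter()
    sorter()
    return time.perf_counter() - start

n = 2 ** 15
random_data = [random.random() for _ in range(n)]   # random input
sorted_data = sorted(random_data)                   # already sorted input

for label, data in (('random', random_data), ('sorted', sorted_data)):
    print(label, '%.4f' % time_sort(data))
```

Running the same data through both method names is what makes the two
sets of timings comparable.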



From tim.one@comcast.net  Fri Jul 26 18:50:30 2002
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 26 Jul 2002 13:50:30 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <3D418884.2090509@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEMEAHAB.tim.one@comcast.net>

[MAL]
> Dang. Why don't you distribute a ZIP file which can be dumped
> onto the standard Python installation ?

A zip file containing what?  And which "standard Python installation"?  If
someone is on Python-Dev but can't deal with a one-file patch against CVS,
I'm not sure what to conclude, except that I don't want to deal with them at
this point <wink>.

> Here's the .msort() version:
>
> Python/Tim-Python> ./python -O sortperf.py 15 20 1
>  i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
> 15   32768   0.08   0.01   0.01   0.01   0.01   0.03   0.00   0.02
> 16   65536   0.17   0.02   0.02   0.02   0.02   0.07   0.02   0.06
> 17  131072   0.41   0.05   0.04   0.05   0.04   0.16   0.04   0.09
> 18  262144   0.95   0.10   0.10   0.10   0.10   0.33   0.10   0.20
> 19  524288   2.17   0.20   0.21   0.20   0.21   0.66   0.20   0.44
> 20 1048576   4.85   0.42   0.40   0.41   0.41   1.37   0.41   0.84

Thanks!  That's more like it.  So far I've got the only known box where ~sort
is slower under msort (two other sets of timings were attached to the patch;
I'll paste yours in too, merging in the smaller numbers from your first
report).



From mal@lemburg.com  Fri Jul 26 18:36:04 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jul 2002 19:36:04 +0200
Subject: [Python-Dev] Sorting
References: <LNBBLJKPBEHFEDALKOLCOELPAHAB.tim.one@comcast.net>
Message-ID: <3D418884.2090509@lemburg.com>

Tim Peters wrote:
> [MAL]
> 
>>Here's the result for AMD Athlon 1.2GHz/Linux/gcc:
>>
>>without patch:
>>
>>Python/Tim-Python> ./python -O Lib/test/sortperf.py 15 20 1
>>  i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
>>15   32768   0.08   0.01   0.01   0.09   0.01   0.03   0.00   0.09
>>16   65536   0.20   0.02   0.01   0.20   0.03   0.07   0.02   0.20
>>17  131072   0.46   0.06   0.02   0.45   0.05   0.20   0.04   0.49
>>18  262144   0.99   0.09   0.10   1.09   0.11   0.40   0.12   1.12
>>19  524288   2.33   0.20   0.20   2.30   0.24   0.83   0.19   2.47
>>20 1048576   4.89   0.40   0.41   5.37   0.48   1.71   0.38   6.22
> 
> 
> I assume you didn't read the instructions in the patch description:
> 
>     http://www.python.org/sf/587076
> 
> The patch doesn't change anything about how list.sort() works, so what
> you've shown us is the timing variance on your box across two identical
> runs.  To time the new routine, you need to (temporarily) change L.sort() to
> L.msort() in sortperf.py's doit() function.  It's a one-character change,
> but an important one <wink>.

Dang. Why don't you distribute a ZIP file which can be dumped
onto the standard Python installation ?

Here's the .msort() version:

Python/Tim-Python> ./python -O sortperf.py 15 20 1
  i    2**i  *sort  \sort  /sort  3sort  +sort  ~sort  =sort  !sort
15   32768   0.08   0.01   0.01   0.01   0.01   0.03   0.00   0.02
16   65536   0.17   0.02   0.02   0.02   0.02   0.07   0.02   0.06
17  131072   0.41   0.05   0.04   0.05   0.04   0.16   0.04   0.09
18  262144   0.95   0.10   0.10   0.10   0.10   0.33   0.10   0.20
19  524288   2.17   0.20   0.21   0.20   0.21   0.66   0.20   0.44
20 1048576   4.85   0.42   0.40   0.41   0.41   1.37   0.41   0.84

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From thomas.heller@ion-tof.com  Fri Jul 26 19:17:10 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 26 Jul 2002 20:17:10 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <0a7101c234d0$a8c463c0$e000a8c0@thomasnotebook>

[sorry if you see this twice, didn't seem to get through
the first time]

If the safe buffer PEP would be accepted and implemented,
here's my proposal for the bytes object.

The bytes object uses the safe buffer interface to gain
access to the byte array it exposes.

The bytes type would probably accept the following arguments:

  PyObject *type - the (bytes) type or subtype to create
  PyObject *obj - the object exposing the safe buffer interface
  size_t offset - starting offset of obj's memory block
  size_t length - number of bytes to use (0 for all)

and maybe a flag requesting read or read/write access.

A convention could be that if a NULL is passed for obj,
then the bytes object itself allocates a memory block
of length length.

Of course the bytes object itself would also expose the safe
buffer interface. And slicing, but not repetition.
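
The argument conventions above can be sketched in Python.  This is an
illustration under stated assumptions, not the proposed C constructor:
make_bytes_view and its keyword names are made up, memoryview (a later
facility) stands in for the safe buffer interface, None plays the role
of NULL, and obj is assumed to be a byte-oriented object such as a
string of bytes or a bytearray:

```python
def make_bytes_view(obj=None, offset=0, length=0, writable=False):
    """Sketch of the argument conventions described above.

    obj      -- object exposing its memory, or None to allocate a new block
    offset   -- starting offset into obj's memory block
    length   -- number of bytes to use (0 means "all remaining bytes")
    writable -- request read/write access instead of read-only
    """
    if obj is None:
        obj = bytearray(length)            # freshly allocated, zero-filled block
        offset, length = 0, len(obj)
    view = memoryview(obj)
    if length == 0:
        length = len(view) - offset
    if offset < 0 or length < 0 or offset + length > len(view):
        raise ValueError('offset/length outside the memory block')
    if writable and view.readonly:
        raise TypeError('object does not allow write access')
    window = view[offset:offset + length]
    return window if writable else window.toreadonly()

data = bytearray(b'hello world')
tail = make_bytes_view(data, offset=6, writable=True)
tail[0:1] = b'W'                           # writes through to data
print(bytes(tail), data)
```

The view shares memory with the original object, so no bytes are
copied when taking the offset/length window.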

Isn't the above sufficient (provided that we somehow
add the pickle stuff into this picture)?

Thomas




From thomas.heller@ion-tof.com  Fri Jul 26 18:46:33 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Fri, 26 Jul 2002 19:46:33 +0200
Subject: [Python-Dev] PEP 296 - The Buffer Problem
References: <20020723063611.26677.qmail@web40102.mail.yahoo.com>
Message-ID: <098c01c234cc$621a78f0$e000a8c0@thomasnotebook>

If the safe buffer PEP would be accepted and implemented,
here's my proposal for the bytes object.

The bytes object uses the safe buffer interface to gain
access to the byte array it exposes.

The bytes type would probably accept the following arguments:

  PyObject *type - the (bytes) type or subtype to create
  PyObject *obj - the object exposing the safe buffer interface
  size_t offset - starting offset of obj's memory block
  size_t length - number of bytes to use (0 for all)

and maybe a flag requesting read or read/write access.

A convention could be that if a NULL is passed for obj,
then the bytes object itself allocates a memory block
of length length.

Of course the bytes object itself would also expose the safe
buffer interface. It would support slicing, but not repetition.

Isn't the above sufficient (provided that we somehow
add the pickle stuff into this picture)?

Thomas



From guido@python.org  Fri Jul 26 21:48:30 2002
From: guido@python.org (Guido van Rossum)
Date: Fri, 26 Jul 2002 16:48:30 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
In-Reply-To: Your message of "Fri, 26 Jul 2002 20:05:59 +0400."
 <20020726160559.GA24120@banana.mx1.ru>
References: <B43D149A9AB2D411971300B0D03D7E8BF0A488@natasha.auslabs.avaya.com> <00a901c23452$daae16c0$7f00a8c0@pacbell.net>
 <20020726160559.GA24120@banana.mx1.ru>
Message-ID: <200207262048.g6QKmU123924@pcp02138704pcs.reston01.va.comcast.net>

> So...  What do you (and others) think about just adding a flag 'i' to
> string constants (that will strip indentation etc.)?  This doesn't affect
> existing code, but it will be useful (at least for me ;-)  Motivation
> was posted here by Michael Chermside, but I don't like his solutions.

And I don't like your proposal.  Sorry, but I really don't think the
syntax should be changed for something that's so trivial to code if
you need it.
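
For reference, one way to code this without new syntax, sketched with
textwrap.dedent from the standard library.  The helper name strip_indent is
made up for illustration; a hand-rolled loop over the lines would do just as
well.

```python
import textwrap


def strip_indent(text):
    """Drop leading newlines, then remove the indentation common to all lines."""
    return textwrap.dedent(text.lstrip("\n"))


def usage():
    # The literal can be indented to match the surrounding code.
    return strip_indent("""
        Usage: frob [options] file
          -v   verbose
        """)
```

Relative indentation inside the string (the two extra spaces before -v) is
kept; only the common margin is removed.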

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nhodgson@bigpond.net.au  Sat Jul 27 01:51:39 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Sat, 27 Jul 2002 10:51:39 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
Message-ID: <028501c23507$c46ebcb0$3da48490@neil>

Thomas Heller:

>     The size and pointer returned must be valid as long as the object
>     is alive (has a positive reference count).  So, only objects which
>     never reallocate or resize the memory block are allowed to
>     implement this interface.

   I'd prefer an interface that allows for reallocation but has an explicit
locked state during which the buffer must stay still. My motivation comes
from the data structures implemented in Scintilla (an editor component),
which could be exposed through this buffer interface to other code. The most
important type in Scintilla (as in many editors) is a split (or gapped)
buffer. Upon receiving a lock call, it could collapse the gap and return a
stable pointer to its contents and then revert to its normal behaviour on
receiving an unlock.
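
A toy sketch of the lock/unlock idea for a gapped buffer, to make the
proposal concrete.  This is not Scintilla's implementation; the class
GapBuffer and its method names are hypothetical, and memoryview stands in
for "a stable pointer to its contents".

```python
class GapBuffer:
    """Toy split (gapped) buffer with an explicit locked state."""

    def __init__(self, data=b"", gap_size=16):
        self._buf = bytearray(data) + bytearray(gap_size)
        self._gap_start = len(data)       # gap is [_gap_start, _gap_end)
        self._gap_end = len(self._buf)
        self._locks = 0

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos):
        # Shift text across the gap so that the gap starts at `pos`.
        gap = self._gap_end - self._gap_start
        if pos < self._gap_start:
            self._buf[pos + gap:self._gap_end] = self._buf[pos:self._gap_start]
        elif pos > self._gap_start:
            self._buf[self._gap_start:pos] = self._buf[self._gap_end:pos + gap]
        self._gap_start, self._gap_end = pos, pos + gap

    def insert(self, pos, data):
        if self._locks:
            raise BufferError("buffer is locked; it must stay still")
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        if len(data) > self._gap_end - self._gap_start:
            # Grow the gap; this is the reallocation a lock must forbid.
            extra = len(data)
            self._buf[self._gap_end:self._gap_end] = bytearray(extra)
            self._gap_end += extra
        self._move_gap(pos)
        self._buf[pos:pos + len(data)] = data
        self._gap_start += len(data)

    def lock(self):
        """Move the gap to the end and return a contiguous view of the text."""
        if not self._locks:
            self._move_gap(len(self))
        self._locks += 1
        return memoryview(self._buf)[:len(self)]

    def unlock(self):
        if not self._locks:
            raise RuntimeError("unlock without a matching lock")
        self._locks -= 1
```

Between lock() and unlock() every insert is refused, so the returned view
stays valid; after unlock() the buffer goes back to moving its gap freely.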

   Neil




From xscottg@yahoo.com  Sat Jul 27 03:26:38 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Fri, 26 Jul 2002 19:26:38 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <098c01c234cc$621a78f0$e000a8c0@thomasnotebook>
Message-ID: <20020727022638.86727.qmail@web40101.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> If the safe buffer PEP were accepted and implemented,
> here's my proposal for the bytes object.
> 
> The bytes object uses the safe buffer interface to gain
> access to the byte array it exposes.
> 
> The bytes type would probably accept the following arguments:
> 
>   PyObject *type - the (bytes) type or subtype to create
>   PyObject *obj - the object exposing the safe buffer interface
>   size_t offset - starting offset of obj's memory block
>   size_t length - number of bytes to use (0 for all)
> 
> and maybe a flag requesting read or read/write access.
> 
> A convention could be that if a NULL is passed for obj,
> then the bytes object itself allocates a memory block
> of length length.
> 
> Of course the bytes object itself would also expose the safe
> buffer interface. It would support slicing, but not repetition.
> 
> Isn't the above sufficient (provided that we somehow
> add the pickle stuff into this picture)?
> 

It's probably sufficient but more than necessary.  In particular,
supporting the safe buffer protocol makes sense to me (if that gets
accepted), but I'm not eager to immediately support the obj pointer as you
describe above.

We've gotten side-tracked a bit when describing the "view behavior" for the
slicing operations on a bytes object.  It was not my intent that the bytes
object typically be used to create views into other Python objects.  That
whole discussion was an attempt to describe the slicing behavior.  From my
perspective, describing the whole inner-thing and outer-thing stuff was to
explain the implementation.  Think of the bytes object as a mutable string
with some additional restrictions, and that's what I have in mind.
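
To illustrate what "view behavior" for slicing means in practice, here is a
small sketch.  It assumes that a view-style slice shares memory with its
parent rather than copying it, and it uses Python's later bytearray and
memoryview types purely as stand-ins for the proposed bytes object.

```python
data = bytearray(b"abcdefgh")

copy = bytes(data)[2:6]        # string-style slice: an independent copy
view = memoryview(data)[2:6]   # view-style slice: shares memory with data

data[2] = ord("X")             # mutate the underlying block

copy_after = copy              # unchanged by the mutation
view_after = view.tobytes()    # reflects the mutation
```

The copy keeps its old contents while the view sees the change, which is the
distinction the inner-thing/outer-thing discussion was trying to describe.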

The mmap example is sort of a retrofit since mmap should probably have been
implemented via something like bytes in the first place (to get the bytes
style slicing among other things), not because I think there are a lot of
objects that you would want to wrap up in bytes views.

The existing buffer object is ok for creating views, and truthfully I don't
know how often it is really used for that.  What I (and I think others)
need is more like a pickleable-mutable-reliable-byte-string.  I'm not eager
to grow bytes into a superset object.

Even if I'm wrong about the need for this, at the very least, the
additional functionality can be added later.  I really just want to push
through a simple, usable, bytes object for the time being.  We can easily
add, we can't easily take away.









From xscottg@yahoo.com  Sat Jul 27 03:40:12 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Fri, 26 Jul 2002 19:40:12 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <028501c23507$c46ebcb0$3da48490@neil>
Message-ID: <20020727024012.85905.qmail@web40107.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> Thomas Heller:
> 
> >     The size and pointer returned must be valid as long as the object
> >     is alive (has a positive reference count).  So, only objects which
> >     never reallocate or resize the memory block are allowed to
> >     implement this interface.
> 
> I'd prefer an interface that allows for reallocation but has an explicit
> locked state during which the buffer must stay still. My motivation comes
> from the data structures implemented in Scintilla (an editor component),
> which could be exposed through this buffer interface to other code. The
> most important type in Scintilla (as in many editors) is a split (or 
> gapped) buffer. Upon receiving a lock call, it could collapse the gap and
> return a stable pointer to its contents and then revert to its normal 
> behaviour on receiving an unlock.
> 

A couple of questions come to mind:

First, could this be implemented by a gapped_buffer object that implements
the locking functionality you want, but that returns simple buffers to work
with when the object is locked?  In other words, do we need to add this
extra functionality up in the core protocol when it can be implemented
specifically the way Scintilla (cool editor by the way) wants it to be in
the Scintilla-specific extension?


Second, if you are using mutexes to do this stuff, you'll have to be very
careful about deadlock.  I imagine:

  thread 1:
      grab the object lock
      grab the object pointer
      release the GIL
      do some work
      acquire the GIL # deadlock

  thread 2:
      acquire the GIL
      try to resize the object # requires no outstanding locks

Thread 2 needs to make sure no objects are holding the object lock when it
does the resize, but thread 1 can't acquire the GIL until thread 2 gives it
up.  Both are stuck.
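
The scenario above can be sketched in Python as follows.  This is only a
model: gil and object_lock are ordinary threading.Lock objects standing in
for the interpreter lock and the per-object lock, and the timeouts and
events are assumptions added so that the demonstration terminates instead
of hanging.

```python
import threading


def run_demo(timeout=0.2):
    gil = threading.Lock()            # stand-in for the interpreter lock
    object_lock = threading.Lock()    # stand-in for the object's buffer lock
    t1_working = threading.Event()
    t2_holds_gil = threading.Event()
    t1_gave_up = threading.Event()
    t2_gave_up = threading.Event()
    outcome = {}

    def thread1():
        with gil:
            object_lock.acquire()     # grab the object lock (and pointer)
        t1_working.set()              # GIL released; "do some work"
        t2_holds_gil.wait()
        # Try to take the GIL back; a real thread would block here forever.
        got = gil.acquire(timeout=timeout)
        if got:
            gil.release()
        outcome["thread 1 reacquired GIL"] = got
        t1_gave_up.set()
        t2_gave_up.wait()             # keep the object lock until thread 2 quits
        object_lock.release()

    def thread2():
        t1_working.wait()
        with gil:
            t2_holds_gil.set()
            # The resize needs the object lock to be free.
            got = object_lock.acquire(timeout=timeout)
            if got:
                object_lock.release()
            outcome["thread 2 resized"] = got
            t2_gave_up.set()
            t1_gave_up.wait()         # keep the GIL until thread 1 quits

    threads = [threading.Thread(target=thread1),
               threading.Thread(target=thread2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome
```

Each thread holds the lock the other one needs until that other thread has
given up, so both attempts time out.  Without the timeouts neither would
ever return.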

If you choose not to implement the locks with true mutexes, then you're
probably going to end up polling and that's bad too.

Is there a way out of this?  This is part of the reason I didn't want to
put a lock state into the bytes object.









From ask@perl.org  Sat Jul 27 06:40:33 2002
From: ask@perl.org (Ask Bjoern Hansen)
Date: Fri, 26 Jul 2002 22:40:33 -0700 (PDT)
Subject: [Python-Dev] python.org/switch/
Message-ID: <20020726223911.T70962-100000@onion.valueclick.com>

As presented on the Perl Lightning talks here at OSCON: Switch
movies.

You guys will dig Nathan's (nat.mov and nat.mpg).

http://www.perl.org/tpc/2002/movies/switch/

;-)

  - ask

-- 
ask bjoern hansen, http://askbjoernhansen.com/   !try; do();



From tim.one@comcast.net  Sat Jul 27 09:02:48 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 27 Jul 2002 04:02:48 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOELPAHAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPBAHAB.tim.one@comcast.net>

    http://www.python.org/sf/587076

has collected timings on 5 boxes so far.  I also noted that msort() gets a
32% speedup on my box when sorting a 1.33-million line snapshot of the
Python-Dev archive.  This is a puzzler to account for, since you wouldn't
think there's significant pre-existing lexicographic order in a file like
that.  McIlroy noted similar results from experiments on text, PostScript
and C source files in his adaptive mergesort (which is why I tried sorting
Python-Dev to begin with), but didn't offer a hypothesis.

Performance across platforms is a hoot so far, with Neal's box even seeing a
~6% speedup on *sort.

Skip's Pentium III acts most like my Pentium III, which shouldn't be
surprising.  Ours are the only reports where !sort is faster than *sort for
samplesort, and also where ~sort under samplesort is faster than ~sort under
timsort.

~sort (only 4 distinct values, repeated N/4 times) remains the most puzzling
of the tests by far.  Relative to its performance under samplesort,

    sf userid    ~sort speedup under timsort (negative means slower)
    ---------    ---------------------------------------------------
    montanaro    -23%
    tim_one      - 6%
    jacobs99     +18%
    lemburg      +25%
    nascheme     +30%

Maybe it's a big win for AMD boxes, and a mixed bag for Intel boxes.  Or
maybe it's a win for newer boxes, and a loss for older boxes.  Or maybe it's
a bigger win the higher the clock rate (it hurt the most on the slowest box,
and helped the most on the fastest).  Since it ends up doing a sequence of
perfectly balanced merges from start to finish, I thought perhaps it has to
do with OS and/or chip intelligence in read-ahead cache optimizations -- but
*sort also ends up doing a sequence of perfectly balanced merges, and
doesn't behave at all like ~sort across boxes.  ~sort does exercise the
galloping code much more than other tests (*sort rarely gets into galloping
mode; ~sort never gets out of galloping mode), so maybe it really has most
to do with cache design.

Whatever, it's starting to look like a no-brainer -- except for the
extremely mixed ~sort results, the numbers so far are great.
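
For readers who want to poke at the ~sort case themselves, here is a rough
sketch of how such inputs can be built and how comparisons can be counted.
The function names are made up, the data generators only approximate what
sortperf.py does (*sort is taken to mean random data), and the sorted()
builtin of a current Python is used, so nothing here reproduces the
samplesort timings quoted above.

```python
import random
from functools import cmp_to_key


def make_star(n, rng):
    """*sort input: n random floats."""
    return [rng.random() for _ in range(n)]


def make_tilde(n, rng):
    """~sort input: a block of 4 random values repeated n // 4 times."""
    block = [rng.random() for _ in range(4)]
    return block * (n // 4)


def count_compares(data):
    """Sort a copy of data; return (sorted list, number of comparisons)."""
    calls = 0

    def cmp(a, b):
        nonlocal calls
        calls += 1
        return (a > b) - (a < b)

    return sorted(data, key=cmp_to_key(cmp)), calls


if __name__ == "__main__":
    rng = random.Random(0)
    for name, make in (("*sort", make_star), ("~sort", make_tilde)):
        _, calls = count_compares(make(1 << 15, rng))
        print(name, calls, "comparisons")
```

Comparison counts are independent of the hardware, so they are a useful
complement to wall-clock timings when results differ this much across
boxes.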



From mal@lemburg.com  Sat Jul 27 09:54:35 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jul 2002 10:54:35 +0200
Subject: [Python-Dev] Sorting
References: <LNBBLJKPBEHFEDALKOLCMEMEAHAB.tim.one@comcast.net>
Message-ID: <3D425FCB.2010104@lemburg.com>

This is a multi-part message in MIME format.
--------------080204010802000409070906
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit

Tim Peters wrote:
> [MAL]
> 
>>Dang. Why don't you distribute a ZIP file which can be dumped
>>onto the standard Python installation ?
> 
> 
> A zip file containing what?  And which "standard Python installation"?  If
> someone is on Python-Dev but can't deal with a one-file patch against CVS,
> I'm not sure what to conclude, except that I don't want to deal with them at
> this point <wink>.

Point taken ;-) I meant something like this:

Here's a ZIP file. To install, take your standard Python CVS
download, unzip the file on top of it, then run

echo "With .sort()"
./python -O sortperf.py 15 20 1
echo "With .msort()"
./python -O timsortperf.py 15 20 1

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/

--------------080204010802000409070906
Content-Type: application/zip;
 name="tim.zip"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="tim.zip"

XdeELtD27lv7oG9sDPFN1xVG/qRy3/U5kXvJXrgp723d2u5wzH2qX63w5yupLDFVh874Ved8
rxISYFzGOfYGoxZ+/jm7pZUOmUixfe73Blw2f5cPnpS41SUf4aIoem7WeXY0OfWMNPlma+PQ
WAljS+LqnlPR0McH2TTipzX2AF6ejDNRv2GNWwBGbMsyLRhqksRXbVpYncfndrIq8HFVWDg3
13jUG9k9PWGdiBPaMTGnNnVhm+bFoh3onmyIZU22fNwo8q1p5s6ceyDDJtgM19EMPSzYCIzX
YGpwVUOvx0xegUikvBI3nHX1ipIqGk131d5HetnWXrxIVzsJQP67WZhzjUFPWF0bTLQNN80t
WU1vcHSF1aidjqWz1IiUv0lLRnt2r9XECDsa6deezu1u/J/A8jZTg0+R6peaTJooAtPEra9P
rxHrP1cUv5Ekfq24zbsaYuk5bPpTZywnHgzcFr+piG5Z+2Pxmle6JJK6IlUhJMBnTrN7yzRO
Mdx1xmyOU2blsqsaWpqZnvcYLY5O3JW7qk3vWYYzTYaoH0Y9+dtDcjVW1U8uxxZs40iI2Mt7
x8VFz+mxe4uJvNHGepidN7f4izGVckVXZ7qrQHjrD3KQaKqQkJCw5TO4Ixcmkpu5tLAZ06Fi
StiutDQtaF+ArWgq3UyOj6VzT2KmDSDexsXrLtneVNdSEw3hXjRev55wXHr1Uw2+3ezNVFPs
W1b9zSz7wabgR+dzMHJKAjhXeM4MlHiR3Do1WeSjcM0Oc6pOoawV6Z50EMdYFR2TG05OSSW8
siGZuypcpqn9vIq8SaLhsHzpZSBGVHoOEhMz35jJAzcyJVODUldTihoW+yz4n7FKXCH46ly/
StUx1lZrOZR98pEfLR0n7TzulJzORAK97FpgE+NxFTWF/Ko7SKqMqqzBlGD/NdZp/0a31Hn7
X/XKmYY+Ix0kutrNLDoUgM/GuE4r3EGLO/L6xSgrhV4MslUcPqvo7EYaGbUxPvT39dycgUpN
X0rEBR6/QdDzcfbk1dPXrRYEO+P1XjWxsL8aZ+j07+dQrbkzssabduP4o06sXD5vYLnYpkz5
+wPC4zw6o7euOZ4N5xpOaD/llHDAYPExXL32dDKOor5u1WX7kKzIj36gX3hRYqjU1qQG1Qic
1u/+XNmJ6kDD80XKx0ydpaK6ur+O8AEFcMNEYr3+BflBDvvweXHRLh2iXSzn7afWLGv5di86
sdLf3H7ajC5f70y7LXctknu1bGDEI1+6XG/mdf0VTWUjTVhXmjmls3pQC1xcsx2YWvQj7Oq5
51WH3vs1utIvZcWy4cZGUN4c8WIWz2n/ODw6eaS0E7mo021EXes5Oj48+INX7qCFh6k17+WV
YIFw7ZPx00n5rt1GFHhHE8d7Gy6qVYyAAUMUm4RdfLVA7g2Z/sHl0eGfwCkewNIpTDFgs6nA
uezsZpdU7iK1grxfjTaFJ6zN4KUd58bgjV1L/wnAzeFo4MPN4Hrd+rCxzrWEmfLfvwAqhpvz
KKbk83l0KlAYHooE60xd9UMNslnqhGAgXHa+1w/41GHhV0F2g/8nYTuOKxXVZnnTsSpM5vlw
PlwgB43+6mZKCIUhC0vZhYNBckZcJqs7ida011PPwL1M1+HSf7eFc6t1mYrAK9KFFOUg2sKV
YgB7bh3lW7IeOpyacrVh4kejIp9VZj2acmok+Z3MMViyAESSGB1GU6qIRkGD0VX3whswMocz
wWBjROOzN94uF6qZn0ztQI5HeY67ry9GMakxNUFy/tbP5ItOIy+6eAK2GXKMKlESpI0/GmN+
Plqin7xo+nDhNZWjwJrJAl9Ej9Dr487UB7pvMXL/+Pyg46WVVyfCfh6dLXZdUL16ac4tl6WK
+WRT7Y1KK4ne1WFZ1J9tv7nqmqg4d6Qz+fJ6prf+BfjJpxhi2rzTlDFzmpcy7XIOJgTX3Fq8
zo0+ZSEVZ6I6NNp4nkIKryLgVYjl6XWk0lpyhFIJ3YprzdPPlHieLqFQinSnFRnVeaDKTo/H
Taffh8mq+1V9bldBKjyYR6FyYbxeTToPgTjnWJGo/T17f4H3F/X3R3qVXA0RLF+TIIOOhjm0
ZPpPX+/asXOOZHXS7l7m7VO+0EFc6EFzoYM/RIUeLCn0PG7p1pJCP8RjerhkTD/ELT2stCTy
Y342kt0TLyQk4KO8xLlxQhWYQtnlUJTGOo2boLY7/L652fb4c6FRgLg8ALIX1Su6EeZY+bnt
dU0tdPCHr58feIVGvHfrFI0DqRAyN7gPlQaeHzQ0gClW6tusnf7IsDLcVcAp5O9wSbc6yka3
UOB/OW5ZiDIFNm3R6vas7cvJ1Hu77sOXrRhoUrayuMAlK8j/27/KXg/HFXc/HjrHQ5G7Utrs
LlhYLv8NF+EmahnGDMdKyHJEJonBjfB+xDKZzp5qUu57V3jorirREjUFcHyTIGq3c2qSO13N
8TUt3M28WL1GQdPLo/ExbUaAiHjpLdnzyRkcoZDrr4AoxQxnTmvLBECm5mKdCrPowCbnv9Sh
oG14cTCeLq6gPlMFFSK0qB30ftDKQPrxXtvwNdxnqgdg11DMYoVyJnWCTmtWuHRFyAyKHL+9
UXFejJCfb3g0LMqjK3OJBtpaAEIf0QRHJw6DF1Ng36wsLhd6T3l0Y6XiuYgMUkBk8ctFu8Mm
eOmjpraZwitqgXxBkykurv9vn5ia+saJNFgsjryn+CI+u2RCTxTjTMLH50Zdks6fOJjXMFCx
prRIXKx9fIW6YtMru34osE5rayrRn3vtXu+hwCKXlbKyvYfzvx4q+6HrWXLwvqlX0reOveLn
p1z9I3KfnUCRfAG0c19zlM76Q2HPZ5qh9PmPB1CCt6MNpCosH4rUzbw2pFLIAy5SySQpfFT5
SmfhPfPdNHtopbOtrjPmrVe+wc5XKgec7HB1v9PwmrPS4yhQr7g8yc/mmnuvyvhEbE1MjjCX
BOeGCw/j+kxtfTHdF0cMoxm0O5HroZtwQ9LoIRN764KpGHDw+vDJ64NnfpdCWsS8laDIrhhB
JX8cMs/aEEoGfS3OcEzK9tn2DVQA5ZozadK0o9YF+lADVYZUlwuqp4Xuhx5VAaodpQ6WfpYf
wgEmm5P5juPtAltTOVwIHTaKfoEEfZOJKuED+SVoeAOgVq4ZLJeQteHHyFoVCWMopLJVK9Ny
u0sLmQNN9RSkN/LHbbzx8+lF5VCUAsHKax2oq8jpBTp5A0P231YdpViVXfMhEtJSW+5+OfiP
4upiMhvMzax7eqGmXYrweMSPbrYGubcppAfDuVUjjOHM1/mZ1NykJd+qVqvqRba6qX4mWsSg
7W8SfKE/UriXk5N8fhKtvdfbN2s5YCWPlRzZhPVUXDor0RjMzav1FX959f3k6PDV6x//tP9j
W7fM4WBy1G2tPt3Ux7a21QGXaXtK35AmlQM0mrSiOS59K/pILGcb+qx7SM5PbZIXHeKNMxHW
WlXDmm9VH9sUELtZNEQzwNkQLUCOxWpNTidT3578br9hsbfSzEPlUKU51RbR6u3cbMm7Otm0
bVICwxU7tS60uu/FdE+01Xei5lXanSjRxc1+wjCwUAMYpNcIClDcuuYe8pZC5BBmyzpajhKp
fW/chfrkuy5Us3tdFyF9cGh9vrx5Z2AKMFGNmsFDDazrT55nL5/uPzpYr9WHTc1XpoHtjUhd
SEv7lk0wD3Koz/BTpqbtZlYOarErTsZ8+bbRSVA/PeN9X9/LUcHNqNd/zZVOtVb+tqoou9pd
kd336LHlw+2grH7pZtmzg9e/P3zRzcKO+tBFVcXOpqr6xVWVie7/+MOrbhbQXhvQ3SOETq0t
1Vb0czSAsBm1vuC59F7vXt53s5Vq77ZDtKriatPY9UvUa0B7N23sv8ZZc/+Gmh6/tSKxr6ki
P8QVPda6wRKRmkfLT12t+vyFTjVCS21BfYXq1fWuzxqoHFpq5fGy2vzQrdTWOifL6pwkdXTA
rILDRBYTfz7YJbtIclHQ7eBDjNKO0VbUnruDy/PnitpCXWDMv2Kvxq5L+2zZse/kayx+BZtJ
y4rsdpTLGK2oPmpRIRhymoais2JaREX1sbkomWjfPwmvK1ctygPYN6oX2rqychZIWejmO/7A
tqa1uHsRt95UyToJlaJ+pLz8F8q77NUxTDSddSMAhyVjmT0gbcrJ2yVAsjIOrmnNAF7gQ0JI
WVsJKXUIpIhgkvH0l3LVXjsUST471lfYG9OquGLIxgrNZEpSvcnAlnWW8DZ/bSyrrND8rD8/
mg2nKR+6nqV86LpKA85vQC1zKp/yg3LbTKM9VJFIvu+/Onz64vkPbS9IOHZd+eshwp5MnPlB
xBnKk2TE9iocIBpwRjETUr3NShqQXq8dCwvsz2lErIzlAQRxCA+NYlmDhvnXHz6tOfXxm6AG
Wjhf4PCQwwf/R2mjHEjF0KVL90pYJ+dXGr8b0oN1Je4P+gzN+nFw2XZvHRNdcUYhk00BLluz
Ia3pmNZ0UGvRqNRdV4XNCvw+uEFExfUqnaS4wRT+FLRhVW4Wjxxxfbm4f2+Mu2XqtkS/jCGo
ZC/Ag4+Uzie2/Ya22BQZARTeeGBr4FUBK05J5JGAIrkutVSx3DRBm2vqgY9I8VqG0Kp6C9fu
Nf6YvMLeVo2kaJYXlzvPOE1Skrq9t0FaJUX+BHIRPyuD+w+iH+78CYZ1Hc7fn5RoLP8Xn8dn
05R/KJ1wiQ7UDreAYil7+kZF/7zb70BAOM8OC71N8VCvU3z6Jt/t6xe1x1HvVNCioiFaTaPu
9B66Qs5BIhpIo9ohAZyDu9P00zk8VfbJFAYy10WRBS4m1h6uv8tn/fxdAV/8oSaFo22QNL2b
vVdS1UguW5GLwVZUDjMKUF3BUB2JCxfe6eMDTgQuhlJpPemCURK9UBxgynr4G9ElG3wWGTjW
1ztG25jvHdmuHsUNV33R6IrmIDXLeavRdHgEBT1uusZNYLi4xGgwL9mRsat7/FzDdeCvbnbY
pUQ9oeHfcdrX0HV8UejZDDXB0HUknqXZ+3s9Rd6zGwFX9j6cFDWS75uQRXgv8B46A3hjX/Zt
ZcVVkPV0uc8+tNz/qjDA+sYLgLV0iBDDJHV3kpdh3MuH3UuYkesGT0C5Uw04laj8eg9qZy6P
1WomhBVvdBguEjtDzUs26sF09MYB1R29oo7jcz2ocQMaRBOwyyx+PDhw3w0PotsJKmwLNOZz
xIx9nByIEFMhB2GPV9xqSIbCoHHwPJ7MxvmikRHgkq/y/LexkDFow9r+l9V/29zZ2pr/ZbXD
G30wyNXgfJYpyUsNUHBFNxxMSLjRiMg46s81U3TdStb8mglE/mVJdPEqvO3HUypJ47m4C+f+
baAmnQWDaAzm4WMyM4fitJFxfMmZ+JEZ3oqLNlBmWXiRWhdYZFrkAfw3u7sWJhpbS5AN5UHF
ZWm9YxBTCNlBkwWZvN1UHFo325BhfpU1iNA5ylwaeFXPWnwJkv/rceCfRqahH6gXM5gN6/x6
OV/CsPuWqUKMwfRrkJilNCapFzEhVcLzJUUEs/PHDoXP8ikyvla1X2N9fY3yq0FT4yWMrtf6
1BVFroxpXaIoRluo1xoCE1nKDn9/sP+9LOyT1+01rcFCgsDYK6tq32qteIwNm6rT1TJMw6kX
VelQBnoHj6qghB7as+qRpgLsRZgWH03ZJkX5yIJbXe5Tff2uEJAuZvUP82UfnEsa+5wV01mi
CZyFHvFUry6rZPYFfFqr6i27cUGvzEyK2iInJd3Co2BkcOuGfvHYMBmkJ6vP3eYdOWOUxWx4
JLLJvsCkWwHeZCn0Gr7IYPtn3jcTrlUvHz/d/+GV7LHH+398+jr7OYte/n7/TweHPzzKfua2
8q9/ty904T9fHviR4OZw8qXZitcChj4HiiLZStu5Owfsdm98U+6F1Ug2EZ2GA+D5GGAe+fVG
WBBe1oGBoHFU1TuHQ0tQKEbjx2O9svfM8dXMsBPVtDf1yuMCONjwQdbUDSV5jyxL9bdCvRrK
InXVDC0t+9TYA9qKANGGRjasE54CVPFkCETCYgi6b9TB4ZonDmmx58VFKAQNcIrsjw6/L0ah
AC9yM30zfO++P5tpKkw1nFwUejWYu6/MvBqv4EU2ngyGx1fOmwrz2EM8L28nbDEC+WyO7LLw
OjsbTzX739lc7xIVkrsYLs5wfya+jq8yhhnSUXKWD+fwASnRCu4lzEt41VzJb1z7By8r3mmX
R0NQP0RzwEJyQentkQw2XzD1RrgSc26OY5yJ900eXe3RqeuJeYfY9Yn9fJANhQ5jynrzZ2gM
I+cxt9BbzTS21qK3N+lR0qAlrwSTT6bqj38TZwGcrKvWpzm19m0VcOunRmENGV8dj+X6uKBg
Pa2M7IZ21Np8ug22smY76o2qXmNBbahv9sXlxtMb9dlsOr3hTD9mOtV/8Tg/bjWt11liML3R
GJsMlzepeFP7JTS8lfaEywp4Xt8CS12b6qbQWsP/54yiddDGNs5rDKPXV/xc42hiuL4+Z+kn
c9Ai+bhWunJKzBxR63yEtd7qwqXwUcXxPzqm/06MNUbx54nd82n3qmIcX4LXXr4xulGNhOmu
Nve/nM/uflFuugKSL8A81/jjj3DNzQd66P9fiX32LDFCql68Ptj1iaTXZY+vu+TT5QCXEAWC
owL2rYjBbf6XgYZkT5z79pJS8k94uvol7AlFa5mBD0y9HPl71QBH+SCbkH7Vr/RqcuYYIEPr
/cdlXuY/TorR+sA54bN6SjarKeg4/lpTLi3xz2hw5sCRnLbO0JjGwEuU/4y8a2nswyMa2dMu
u9laZQ7XRkZY28NF76EDtFlngrZNHVVcGXlqiN/ESrRWDuOxvf5x/9F/pO7WsVf3MEmyxfut
3UzcudKuQ5NATrr54/O4I6T28o7ebrz6IZHbtHjdmM7ufbRzvf+PBzwT0gFSNYhvBXBosHIK
FFfT+WIjveOS1A4Yqmz/CiYajKrODd6L97pkEWgRVG8Z4Ksu8X/dS8PANJOfu9BlEecQV7QJ
8/Ne5k0wMqy0huobx+2pBG2/azDWu03mQ0Kqeui/drO4EW67jY34lbrsBGCac0HkFb2nARNB
Ady0YWxKVfHxOuJzcy7P0XdnugN9X3VrCDq4Gh3MsA6xhPF96Vp3uvFhNDzy7VWPx7H/Iq/i
E7Cqjo13dKNKNmn5iyhgkw83YQKrn27M7SXf/lcxeRUOw3H9NeYuoaDdT+Dv0kX6RJaubYTQ
c/4xdezGXJ3JdMbfpeXxJinsWcAKW/fPzO8Jf/b/AFBLAwQUAAAACAA6Vvss8K2lrOUGAACU
EQAACwAVAHNvcnRwZXJmLnB5VVQJAAOwXkI9p15CPVV4BAD0AWQAjVdtb9s2EP6uX3FVEFRO
HNV2ugwzlgEF9oIB2Qas+eYaAS1RNmuJVEkqtvehv313JCVLtjfUSGCZvDvey8PnTnEcf1Ta
Qs11oXTFZMbBcmPTKPrIOVRMyGQEuAWZqnA7h1JIDuYgLdunTsayVVMyy4OcamzdWHDWyEwc
x1EkqppOMQfTPlpR8fZZo11Vtb8qps2GlZ0gr+pClJ2wMlFkc3jsNtI1t/ScC52MoijnhTNY
lIpZk8jRPAL8oBt/c9toCQxDMGinABlOBi8LQsJiMobpKCWnSesKfuOSa2aFXHdSBvi+5tKI
Vz4Go8BucGmnBeYNn3lFKQCr8CByD60GU8y5DOgnz6zShxTg94I0vBwrNWf5AY2je2YMwsLn
Bh2lVW842FmxbEu+UjXMpimKsj2YwUpgzkmskJgiZdKa2U36WWEZbT6GWOvryUMewzXIkZOz
+uAT5JRqUsLYkkKS8Cr2Qnyf8drC73/9orXSR3mN4j6Faahht8NNU1rcXuiAC0Eu71FszbEo
y05y4MDFhQue7VrP+p8AnDRvqjrxDoxR8VywqNOsVAYRe/mcP5Xkg51CSFaWF9wSBWqcL/9n
GO0HC9NIvElbDOfcCfqElKMgH+b89FMzY6ITrVCoMVRmPdSstZAW4ozJt9ZjNsYcUUrnsRP3
9S4Nn5/Xsk0w3oM86Sf2QkKv4KOHJuHY4zJNjwotIDweppPR0E0xBFaLmiFO6DI9Bu8Wc7Ec
7Oa8/K8tUkw1f+X6DAJeI+V7y2WekODozCqtukVMPEdCKhGUXm8Ej4/gb7v2XOPXPSkVZWM2
SQgUmTA1NkemSMO6F8qVsMlTELITojkkSsputg2+PqUGeTD8sNNLIqHK1w/prKC7ntjpnZ2M
xp4Z+ud15K2PPPkc1sA4zq45z13BiDYN8eYr00I1Boz4hxukeNJ7Rhpzv4FpDrObG9ErMjqA
20yvm4pLvJSegkc91dA1cmHqkh2QZMfOxtiRHClTjESqziemNQmhKyQUolKepWtFscumWmF9
YSfsxlOsKku1cwIaeURbwc3cn39DNudtL8iZZW75k1/OuckQDKTZbb3zW+x85/5kZ0yHS7hv
reMF3RCY/ZW9vSw9nbTizDrncdfJf/Xy2IcPkDd1KTIskzf1GEyVJfAvDbZPWnzjF3dKYx/J
mMES4UlUvagttvumLUNAQps8WWQbuIWYdGPf+jeuivHNp3f3t18f38RLD7OiIlZI4uuZgevv
TUxacP1g4hu6E87oqA9IkkcwJrFArompdPEIdbxgNOSFIx9QI5vCjz+C6JaeAj10Pb7baZE/
y9Ej1+cSBJEM0O/Dv7vS/sYhYTkcHM+4QBFH2U8D2eO6Q0bUY8Gf1YXiD4kQ+1V1OJLh/SkX
Tr+BDMXsG4SeFmK6HNPXbIni7nvsVy+Ecn8ayt8c7ybNh4jIkiGiEKfufg2mqGNsgiasn7B4
k/mJG3e4tHTDQX94CIPCSTqwN1zy7vbUuw/ayXumUp6ojnfk1Ct4P/SJiP1p8X4+7BSEsye4
gUTCu3fwfpjNK/hV6ZAOXnJiNkMMteJEY8hFGbbv1Wec9XCYo45KHIaX3CDgJUqdGGNISYXI
BA0aGMAuPfOkYnVSsmqVM9jP4e5uj7W7hM2vQ2xSZINEtRSB8+eHNQ74Y8rZtwWSRqcOsRVG
t7ibpN8tMVHykj+P/+/PM43POHFh0GprsDNsOSzukf9xEh/DxP1N3c/7JY3MOLcwQyMFyx11
9ExR7TG3qsw58khV+2CwLyhJcPjSiGxLzoxhtxHIao3hvrtUPBdM9iypdjBH5hx7tFMnqkSe
l70sMTd4Qy1elT3mBmekAtPjQDM7JS2EtNu/o6Du3P+oRzph9DgKjvozVUCc8nctvH2EJtGi
i/ovvqzVTLMVvh2BT7AwPTNGrCWBjUmLYCswOspXATssAmevBAVWUXBIphdqHg6/CL43g2I7
OvaThn+V7IaMP/AnteK1xpN21PWEfFVbrIcrrsm0qG0YEP6Sx+Fh3o0sJIUBYzl0e1ued6oT
NANJzw0JYhnpH1/cRl7hl73V7KjicudBgZMGD+AI9Lb2b4JKp8PmeYWQLhhNyGeHuP0t8ff0
O/9MND2buGfMN02BePjrYjpfzvvoGwQconSDTSfkrDpzuJocDS2PRekfMOsf4A+xw2ThnJeJ
4uCDGIienzJbnnSf3kn3pye1OdKYkTaXLreFxifNCRg0Q3W+nGnvKYHnr2p02alP/P/RrYGH
HyYPPyBF7XHi2DAcAdj5i1doR+Rdsvfburu1W7yr29ntdHRm/gpcyXPeDmu9WvUG7CjCPL28
SFbxlxd6VXj78kLBv7y89W77KxL9C1BLAwQUAAAACAA2VvssFCAPP+YGAACVEQAADgAVAHRp
bXNvcnRwZXJmLnB5VVQJAAOnXkI9p15CPVV4BAD0AWQAjVdtb9s2EP6uX3FVEFROHNV2ugwz
lgEF9oIB2Qas+eYaAS1RNmuJVEkqtvehv313JCVLtjfUSGCZvDvey8PnTnEcf1TaQs11oXTF
ZMbBcmPTKPrIOVRMyGQEuAWZqnA7h1JIDuYgLdunTsayVVMyy4OcamzdWHDWyEwcx1EkqppO
MQfTPlpR8fZZo11Vtb8qps2GlZ0gr+pClJ2wMlFkc3jsNtI1t/ScC52MoijnhTNYlIpZk8jR
PAL8oBt/c9toCQxDMGinABlOBi8LQsJiMobpKCWnSesKfuOSa2aFXHdSBvi+5tKIVz4Go8Bu
cGmnBeYNn3lFKQCr8CByD60GU8y5DOgnz6zShxTg94I0vBwrNWf5AY2je2YMwsLnBh2lVW84
2FmxbEu+UjXMpimKsj2YwUpgzkmskJgiZdKa2U36WWEZbT6GWOvryUMewzXIkZOz+uAT5JRq
UsLYkkKS8Cr2Qnyf8drC73/9orXSR3mN4j6Faahht8NNU1rcXuiAC0Eu71FszbEoy05y4MDF
hQue7VrP+p8AnDRvqjrxDoxR8VywqNOsVAYRe/mcP5Xkg51CSFaWF9wSBWqcL/9nGO0HC9NI
vElbDOfcCfqElKMgH+b89FMzY6ITrVCoMVRmPdSstZAW4ozJt9ZjNsYcUUrnsRP39S4Nn5/X
sk0w3oM86Sf2QkKv4KOHJuHY4zJNjwotIDweppPR0E0xBFaLmiFO6DI9Bu8Wc7Ec7Oa8/K8t
Ukw1f+X6DAJeI+V7y2WekODozCqtukVMPEdCKhGUXm8Ej4/gb7v2XOPXPSkVZWM2SQgUmTA1
NkemSMO6F8qVsMlTELITojkkSsputg2+PqWVQSIMv+z0kkwo8/VDOivosid2emcno7Gnhv6B
HXvrI1E+hzUwjrRrznNXMeJNQ8T5yrRQjQEj/uEGOZ70npHH3G9gmsPs5kb0qowO4DbT66bi
Em+l5+BRTzW0jVyYumQHZNmxszF2LEfKFCOxqvOJaU1C6AoJhaiUp+laUeyyqVZYYNgJu/Ec
q8pS7ZyARiLRVnAz9+ffkM152wxyZplb/uSXc24yRANpdlvv/BY737k/2RnT4RLuW+t4QzeE
Zn9nby9LTyetOLPOedx18l+9PDbiA+RNXYoMy+RNPQZTZQn8S4P9kxbf+MWd0thIMmawRHgS
VS9qi+2+acsQkNAmTxbZBm4hJt3Y9/6Nq2J88+nd/e3Xxzfx0sOsqIgWkvh6ZuD6exOTFlw/
mPiGLoUzOuoDkuQRjEkskGxiKl08Qh0vGA2J4UgI1Mmm8OOPILqlp8APXZPvdlrkz3L0yDW6
BEEkA/T78O/utL9yyFgOB8czLnDEUfbTQPa47pAR9WjwZ3Wh+EMmxIZVHY5seH9KhtNvYEMx
+wahp4WYLsf0NVuiuPse+9ULodyfhvI3x7tJAyIismSIKMSpu1+DMeoYm6AR6ycs3mR+4sYd
Li3ddNCfHsKkcJIObA6XvLs99e6DdvKeqZQnquMdOfUK3g99ImZ/WryfD1sF4ewJbiCR8O4d
vB9m8wp+VTqkg5ecmM0QQ6040RhyUYb9e/UZhz2c5qilEofhJTcIeIlSJ8YYUlIhMkGTBgaw
S888qVidlKxa5Qz2c7i722PtLmHz6xCbFNkgUS1F4AD6YY0T/phy9m2BpNGpQ2yF0S3uJul3
S0yUvOTP4//780zzM45cGLTaGuwMWw6Le+R/HMXHMHF/U/fzfkkzMw4uzNBMwXJHHT1TVHvM
rSpzjjxS1T4Y7AtKEhy+NCLbkjNj2G0EslpjuO8uFc8Fkz1Lqp3MkTnHHu3UiSqR52UvS8xN
3lCLV2WPucEhqcD0ONDMTkkLIe327yioO/c/6pFOmD2OgqP+UBUQp/xdC68foUm06KL+i29r
NdNsha9H4BMsTM+MEWtJYGPSItgKjI7yVcAOi8DZK0GBVRQckumFmofDL4LvzaDYjo79pOHf
Jbsh4w/8Sa14rfGkHXU9IV/VFuvhimsyLWobBoS/5HF4mHcjC0lhwFgO3d6W553qBM1A0nND
glhG+sc3t5FX+GVvNTuquNx5UOCkwQM4Ar2t/aug0umweV4hpAtGI/LZIW5/S/w9/c4/E03P
Ju4Z801jIB7+upjOl/M++gYBhyjdYNMJOavOHK4mR0PLY1H6B8z6B/hD7DBZOOdlojj4IAai
56fMlifdp3fS/elJbY40ZqTNpcttofFJcwIGzVCdL2fae0rg+bsaXXbqE/9/dGvg4YfJww9I
UXucODYMRwB2/uYV2hF5l+z9tu5u7Rbv6nZ2Ox2dmb8CV/Kct8Nar1a9ATuKME8vL5JV/OWF
3hXevrxQ8C8vb73b/opE/wJQSwECFwMUAAAACAAUnPos3wmbRCpTAADhHgEAFAANAAAAAAAB
AAAApIEAAAAAT2JqZWN0cy9saXN0b2JqZWN0LmNVVAUAA7iHQT1VeAAAUEsBAhcDFAAAAAgA
Olb7LPCtpazlBgAAlBEAAAsADQAAAAAAAQAAAKSBcVMAAHNvcnRwZXJmLnB5VVQFAAOwXkI9
VXgAAFBLAQIXAxQAAAAIADZW+ywUIA8/5gYAAJURAAAOAA0AAAAAAAEAAACkgZRaAAB0aW1z
b3J0cGVyZi5weVVUBQADp15CPVV4AABQSwUGAAAAAAMAAwDeAAAAu2EAAAAA
--------------080204010802000409070906--



From skip@pobox.com  Sat Jul 27 16:32:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 27 Jul 2002 10:32:20 -0500
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEPBAHAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOELPAHAB.tim.one@comcast.net>
 <LNBBLJKPBEHFEDALKOLCCEPBAHAB.tim.one@comcast.net>
Message-ID: <15682.48388.755636.474915@12-248-11-90.client.attbi.com>

    Tim> Skip's Pentium III acts most like my Pentium III, which shouldn't
    Tim> be surprising.

    ...

    Tim>     sf userid    ~sort speedup under timsort (negative means slower)
    Tim>     ---------    ---------------------------------------------------
    Tim>     montanaro    -23%
    Tim>     tim_one      - 6%
    Tim>     jacobs99     +18%
    Tim>     lemburg      +25%
    Tim>     nascheme     +30%

I should point out that my PIII is in a laptop.  I don't know if it's a
so-called mobile Pentium or not.  /proc/cpuinfo reports:

    processor       : 0
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 8
    model name      : Pentium III (Coppermine)
    stepping        : 1
    cpu MHz         : 451.030
    cache size      : 256 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 2
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
    bogomips        : 897.84

It also has separate 16KB L1 I and D caches.  From what I was able to glean
from a quick glance at a Katmai vs. Coppermine article, the Coppermine's L2
cache is full-speed, on-chip, with a 256-bit wide connection and 8-way set
associative cache.

Does any of that help explain why my results are similar to Tim's?

Skip


From tim.one@comcast.net  Sat Jul 27 21:22:47 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 27 Jul 2002 16:22:47 -0400
Subject: [Python-Dev] Sorting
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEPBAHAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEACAIAB.tim.one@comcast.net>

[Tim]
> ...
> I also noted that msort() gets a 32% speedup on my box when sorting a
> 1.33-million line snapshot of the Python-Dev archive.  This is a puzzler
> to account for, since you wouldn't think there's significant pre-existing
> lexicographic order in a file like that.  McIlroy noted similar results
> from experiments on text, PostScript and C source files in his adaptive
> mergesort (which is why I tried sorting Python-Dev to begin with), but
> didn't offer a hypothesis.

Just a note to clarify what "the puzzle" here is.  msort() may or may not be
faster than sort() on random input on a given box due to platform quirks,
but that isn't relevant in this case.  What McIlroy noted is that the total
# of compares done in these cases was significantly less than log2(N!).
That just can't happen (except with vanishingly small probability) if the
input data is randomly ordered, and under any comparison-based sorting
method.
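
To make the bound concrete, here is a minimal sketch (not the measurement
code used above) that counts the compares done by today's built-in sort and
sets them against log2(N!).  The Counted wrapper and the count_compares and
log2_factorial helpers are names invented for this illustration, and the
sort being measured is whatever the running Python ships, not the 2002
msort or samplesort builds.

```python
import math
import random


class Counted:
    """Wrap a value and count every < comparison made on it (hypothetical helper)."""

    calls = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.calls += 1
        return self.value < other.value


def count_compares(values):
    """Sort a wrapped copy of values and return how many compares were done."""
    Counted.calls = 0
    sorted(Counted(v) for v in values)
    return Counted.calls


def log2_factorial(n):
    """Return log2(n!), using lgamma(n + 1) == ln(n!)."""
    return math.lgamma(n + 1) / math.log(2)


n = 10000
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable
bound = log2_factorial(n)

print("log2(N!)           : %.0f" % bound)
print("compares, shuffled : %d" % count_compares(shuffled))
print("compares, presorted: %d" % count_compares(range(n)))
```

On shuffled input the count should land close to the bound; on presorted
input it should collapse to roughly N-1, which is the kind of gap that
pre-existing order opens up.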

The only hypothesis I have is that, for a stable sort, all the instances of
a given element are, by definition of stability, already "in sorted order".
So, e.g., "\n" is a popular line in text files, and all the occurrences of
"\n" are already sorted.  msort can exploit that -- and seemingly does.
This doesn't necessarily contradict that ~sort happens to run slower on my
box under msort, because ~sort is such an extreme case.

OK, if I remove all but the first occurrence of each unique line, the # of
lines drops to about 600,000.  The speedup msort enjoys also drops, to 6.8%.
So exploiting duplicates does appear to account for the bulk of it, but not
all of it.

If, after removing duplicates, I run that through random.shuffle() before
sorting, msort suffers an 8% slowdown(!) relative to samplesort.

If I shuffle first but don't remove duplicates, msort enjoys a 10% speedup.

So it's clear that msort is getting a significant advantage out of the
duplicates, but it's not at all clear what else it's exploiting -- only that
there is something else, and that it's significant.  How many times has
someone posted an alphabetical list of Python's keywords <wink>?



From guido@python.org  Sat Jul 27 22:56:30 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 27 Jul 2002 17:56:30 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
In-Reply-To: Your message of "Mon, 22 Jul 2002 15:38:16 EDT."
 <15676.24360.88972.449273@anthem.wooz.org>
References: <15676.16356.112688.518256@anthem.wooz.org> <LNBBLJKPBEHFEDALKOLCOEJPAGAB.tim.one@comcast.net>
 <15676.24360.88972.449273@anthem.wooz.org>
Message-ID: <200207272156.g6RLuU826463@pcp02138704pcs.reston01.va.comcast.net>

> It's a bit uglier than that, since Lib/test gets magically
> added to sys.path during regrtest by virtue of running "python
> Lib/test/regrtest.py".

Perhaps regrtest.py can specifically remove its own directory from
sys.path?  (Please don't just remove sys.path[0] or ''; look in
sys.argv[0] and deduce from there.)
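
A rough sketch of that idea, in current Python syntax.  The function name
remove_script_dir is made up for illustration and is not part of
regrtest.py; it only shows deducing the directory from argv[0] rather than
blindly dropping sys.path[0] or ''.

```python
import os
import sys


def remove_script_dir(path_list, argv0):
    """Return path_list without the entries that name the script's directory."""
    script_dir = os.path.abspath(os.path.dirname(argv0))
    kept = []
    for entry in path_list:
        # '' stands for the current directory, so resolve it before comparing.
        if os.path.abspath(entry or os.curdir) == script_dir:
            continue
        kept.append(entry)
    return kept


# Hypothetical use at the top of a test driver script:
#     sys.path[:] = remove_script_dir(sys.path, sys.argv[0])
```

With this approach '' is only dropped when the current directory really is
the script's directory; a plain comparison of absolute paths is assumed to
be good enough here, with no symlink or case normalization.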

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sat Jul 27 22:51:50 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 27 Jul 2002 17:51:50 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
In-Reply-To: Your message of "Mon, 22 Jul 2002 13:24:52 EDT."
 <15676.16356.112688.518256@anthem.wooz.org>
References: <E17W9rU-0001Nk-00@usw-pr-cvs1.sourceforge.net>
 <15676.16356.112688.518256@anthem.wooz.org>
Message-ID: <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net>

> A better fix, IMO, is to recognize that the `test' package has become
> a full fledged standard lib package (a Good Thing, IMO), heed our own
> admonitions not to do relative imports, and change the various places
> in the test suite that "import test_support" (or equiv) to "import
> test.test_support" (or equiv).

Good idea.

> I've twiddled the test suite to do things this way, and all the
> (expected Linux) tests pass, so I'd like to commit these changes.

You've done this by now, right?  Fine.

> Unit test writers need to remember to use test.test_support instead of
> just test_support.  We could do something wacky like remove '' from
> sys.path if we really cared about enforcing this.  It would also be
> good for folks on other systems to make sure I haven't missed a
> module.

Perhaps it would be a good idea for test_support (and perhaps some
other crucial testing support modules) to add something at the top
like this?

   if __name__ != "test.test_support":
      raise ImportError, "test_support must be imported from the test package"

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Jul 27 23:17:39 2002
From: guido@python.org (Guido van Rossum)
Date: Sat, 27 Jul 2002 18:17:39 -0400
Subject: [Python-Dev] More Sorting
In-Reply-To: Your message of "Mon, 22 Jul 2002 23:19:32 EDT."
 <3D3CCB44.4F2592ED@metaslash.com>
References: <3D3CCB44.4F2592ED@metaslash.com>
Message-ID: <200207272217.g6RMHdA00500@pcp02138704pcs.reston01.va.comcast.net>

> Sebastien Keim posted a patch (http://python.org/sf/544113) 
> of a merge sort.  I didn't really review it, but it included
> test and doc.  So if the bisect module is being added to, 
> perhaps someone should review this patch.

It doesn't strike me as a "fundamental" algorithm like bisection or
heap sort.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Sun Jul 28 06:48:26 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 28 Jul 2002 01:48:26 -0400
Subject: [Python-Dev] RE: companies data for sorting comparisons
In-Reply-To: <KJEOLDOPMIDKCMJDCNDPEEJDCDAA.altis@semi-retired.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEANAIAB.tim.one@comcast.net>

Kevin Altis kindly forwarded a 1.5MB XML database with about 6600 company
records.  A record looks like this after running his script to turn them
into Python dicts:

  {'Address': '395 Page Mill Road\nPalo Alto, CA 94306',
   'Company': 'Agilent Technologies Inc.',
   'Exchange': 'NYSE',
   'NumberOfEmployees': '41,000',
   'Phone': '(650) 752-5000',
   'Profile': 'http://biz.yahoo.com/p/a/a.html',
   'Symbol': 'A',
   'Web': 'http://www.agilent.com'}

It appears to me that the XML file is maintained by hand, in order of ticker
symbol.  But people make mistakes when alphabetizing by hand, and there are
37 indices i such that

    data[i]['Symbol'] > data[i+1]['Symbol']

So it's "almost sorted" by that measure, with a few dozen glitches -- and
msort should be able to exploit this!  I think this is an important case of
real-life behavior.  The proper order of Yahoo profile URLs is also strongly
correlated with ticker symbol, while both the company name and web address
look weakly correlated, so there's hope that msort can get some benefit on
those too.
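
The glitch count above can be computed with a few lines like these.  This is
only a sketch: count_descents is a name invented here, and the sample
records are made up rather than taken from the real file.

```python
def count_descents(records, key):
    """Count indices i where records[i][key] > records[i + 1][key]."""
    return sum(
        1
        for left, right in zip(records, records[1:])
        if left[key] > right[key]
    )


# Made-up, nearly sorted sample with two out-of-order neighbours.
data = [{"Symbol": s} for s in ["A", "AA", "AAPL", "AAP", "ABC", "ABT", "ABB"]]
print(count_descents(data, "Symbol"))  # prints 2
```

A count that is tiny relative to the number of records is a cheap hint that
the data consists of a few long ascending runs.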

Here are runs sorting on all field names, building a DSU tuple list to sort
via

    values = [(x.get(fieldname), x) for x in data]

Each field sort was run 5 times under sort, and under msort.  So 5 times are
reported for each sort, reported in milliseconds, and listed from quickest
to slowest:

Sorting on field 'Address' -- 6589 records
    via  sort:  43.03  43.35  43.37  43.54  44.14
    via msort:  45.15  45.16  45.25  45.26  45.30

Sorting on field 'Company' -- 6635 records
    via  sort:  40.41  40.55  40.61  42.36  42.63
    via msort:  30.68  30.80  30.87  30.99  31.10

Sorting on field 'Exchange' -- 6579 records
    via  sort: 565.28 565.49 566.70 567.12 567.45
    via msort: 573.29 573.61 574.55 575.34 576.46

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort: 120.15 120.24 120.26 120.31 122.58
    via msort: 134.25 134.29 134.50 134.74 135.09

Sorting on field 'Phone' -- 6589 records
    via  sort:  53.76  53.80  53.81  53.82  56.03
    via msort:  56.05  56.10  56.19  56.21  56.86

Sorting on field 'Profile' -- 6635 records
    via  sort:  58.66  58.71  58.84  59.02  59.50
    via msort:   8.74   8.81   8.98   8.99   8.99

Sorting on field 'Symbol' -- 6635 records
    via  sort:  39.92  40.11  40.19  40.38  40.62
    via msort:   6.49   6.52   6.53   6.72   6.73

Sorting on field 'Web' -- 6632 records
    via  sort:  47.23  47.29  47.36  47.45  47.45
    via msort:  37.12  37.27  37.33  37.42  37.89

So the hopes are realized:  msort gets huge benefit from the nearly-sorted
Symbol field, also huge benefit from the correlated Profile field, and
highly significant benefit from the weakly correlated Company and Web
fields.  K00L!

The Exchange field sort is so bloody slow because there are few distinct
Exchange values, and whenever there's a tie on those the tuple comparison
routine tries to break it by comparing the dicts.  Note that I warned about
this kind of thing a week or two ago, in the context of trying to implement
priority queues by storing and comparing (priority, object) tuples -- it can
be a timing disaster if priorities are ever equal.
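
One common way around this for priority queues is to put a sequence number
between the priority and the object, so a tie never reaches the object
itself.  The sketch below assumes the standard heapq and itertools modules
of a current Python; the PriorityQueue class is a hypothetical wrapper
written for this note.

```python
import heapq
import itertools


class PriorityQueue:
    """Min-priority queue that never compares the stored items themselves."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # strictly increasing tie-breaker

    def push(self, priority, item):
        # Equal priorities are resolved by the counter, never by item.
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def pop(self):
        priority, _, item = heapq.heappop(self._heap)
        return priority, item

    def __len__(self):
        return len(self._heap)


queue = PriorityQueue()
queue.push(2, {"task": "write docs"})
queue.push(1, {"task": "fix bug"})
queue.push(1, {"task": "review patch"})
while queue:
    print(queue.pop())
```

Items with equal priority come out in insertion order.  In current Python
the counter is also needed for correctness, since comparing two dicts with
< raises TypeError.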

The other fields (Phone, etc) are in essentially random order, and msort is
systematically a bit slower on all of those.  Note that these are all string
comparisons.  I don't think it's a coincidence that msort went from a major
speedup on the Python-Dev task, to a significant slowdown, when I removed
all duplicate lines and shuffled the corpus first.

Only part of this can be accounted for by # of comparisons.  On a given
random input, msort() may do fewer or more comparisons than sort(), but
doing many trials suggests that sort() has a small edge in # of compares on
random data, on the order of 1 or 2%.  This is easy to believe, since msort
does a few things it *knows* won't repay the cost if the order happens to be
random.  These are well worth it, since they're what allow msort to get huge
wins when the data isn't random.

But that's not enough to account for things like the >10% higher runtime in
the NumberOfEmployees sort.  I can't reproduce this magnitude of systematic
slowdown when doing random sorts on floats or ints, so I conclude it has
something to do with string compares.  Unlike int and float compares, a
string compare takes variable time, depending on how long the common prefix
is.  I'm not aware of specific research on this topic, but it's plausible to
me that partitioning may be more effective than merging at reducing the
number of comparisons specifically involving "nearly equal" elements.
Indeed, the fastest string-sorting methods I know of move heaven and earth
to avoid redundant prefix comparisons, and do so by partitioning.

Turns out <wink> that doesn't account for it, though.  Here are the total
number of comparisons (first number on each line) done for each sort, and
the sum across all string compares of the number of common prefix characters
(second number on each line):

Sorting on field 'Address' -- 6589 records
    via  sort: 76188 132328
    via msort: 76736 131081

Sorting on field 'Company' -- 6635 records
    via  sort: 76288 113270
    via msort: 56013 113270

Sorting on field 'Exchange' -- 6579 records
    via  sort: 34851 207185
    via msort: 37457 168402

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort: 76167 116322
    via msort: 76554 112436

Sorting on field 'Phone' -- 6589 records
    via  sort: 75972 278188
    via msort: 76741 276012

Sorting on field 'Profile' -- 6635 records
    via  sort: 76049 1922016
    via msort:  8750  233452

Sorting on field 'Symbol' -- 6635 records
    via  sort: 76073 73243
    via msort:  8724 16424

Sorting on field 'Web' -- 6632 records
    via  sort: 76207 863837
    via msort: 58811 666852

Contrary to prediction, msort always got the smaller "# of equal prefix
characters" total, even in the Exchange case, where it did nearly 10% more
total comparisons.
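
Counts of this kind can be gathered with a wrapper along the following
lines.  This is a simplified sketch, not the instrumentation used for the
numbers above: CountedStr and measure are invented names, it hooks only the
< operator on bare strings rather than the tuple comparisons of the DSU
list, and it runs on the built-in sort of whatever Python executes it.

```python
import os


class CountedStr:
    """String wrapper that tallies compares and shared prefix lengths."""

    compares = 0
    prefix_chars = 0

    def __init__(self, text):
        self.text = text

    def __lt__(self, other):
        CountedStr.compares += 1
        # commonprefix works character by character on plain strings.
        shared = os.path.commonprefix([self.text, other.text])
        CountedStr.prefix_chars += len(shared)
        return self.text < other.text


def measure(strings):
    """Sort the strings and return (number of compares, total prefix chars)."""
    CountedStr.compares = 0
    CountedStr.prefix_chars = 0
    sorted(CountedStr(s) for s in strings)
    return CountedStr.compares, CountedStr.prefix_chars


urls = ["http://biz.yahoo.com/p/%s.html" % name for name in "dacbe"]
print(measure(urls))
```

Long shared prefixes, as in the Profile URLs, make each compare cost more,
which is why the second number can matter as much as the first.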

Python's string compare goes fastest if the first two characters aren't the
same, so maybe sort() gets a systematic advantage there?  Nope.  Turns out
msort() gets out early 17577 times on that basis when doing
NumberOfEmployees, but sort() only gets out early 15984 times.

I conclude that msort is at worst only a tiny bit slower when doing
NumberOfEmployees, and possibly a little faster.  The only measure that
doesn't agree with that conclusion is time.clock() -- but time is such a
subjective thing I won't let that bother me <wink>.



From tim.one@comcast.net  Sun Jul 28 09:07:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 28 Jul 2002 04:07:12 -0400
Subject: [Python-Dev] RE: companies data for sorting comparisons
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEANAIAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBAAIAB.tim.one@comcast.net>

[Tim]
> ...
> Sorting on field 'NumberOfEmployees' -- 6531 records
>     via  sort: 120.15 120.24 120.26 120.31 122.58
>     via msort: 134.25 134.29 134.50 134.74 135.09
> ...
> [where the # of comparisons done is]
> Sorting on field 'NumberOfEmployees' -- 6531 records
>     via  sort: 76167 ...
>     via msort: 76554 ...
> ...
> [and various hypotheses for why it's >10% slower anyway don't pan out]
> ...
> I conclude that msort is at worst only a tiny bit slower when doing
> NumberOfEmployees, and possibly a little faster.  The only measure that
> doesn't agree with that conclusion is time.clock() -- but time is such a
> subjective thing I won't let that bother me <wink>.

It's the dicts again.  NumberOfEmployees isn't always unique, and in
particular it's missing in 6635-6531 = 104 records, so that

    values = [(x.get(fieldname), x) for x in data]

includes 104 tuples with a None first element.  Comparing a pair of those
gets resolved by comparing the dicts, and dict comparison ain't cheap.
Building the DSU tuple-list via

    values = [(x.get(fieldname), i, x) for i, x in enumerate(data)]

instead leads to

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort:  47.47  47.50  47.54  47.66  47.75
    via msort:  48.21  48.23  48.43  48.81  48.85

which gives both methods a huge speed boost, and cuts .sort's speed
advantage much closer to its small advantage in total # of comparisons.  I
expect it's just luck of the draw as to which method is going to end up
comparing tuples with equal first elements more often, and msort apparently
does in this case (and those comparisons are more expensive, because they
have to go on to invoke int compare too).

A larger lesson:  even if Python gets a stable sort and advertises stability
(we don't have to guarantee it even if it's there), there may *still* be
strong "go fast" reasons to include an object's index in its DSU tuple.
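
As a sketch of that advice in current Python: sort_by_field below is a
hypothetical helper, and substituting "" for a missing field is an
assumption made here so that absent values still sort first (None and str
no longer compare in Python 3).

```python
def sort_by_field(records, fieldname):
    """Sort dict records on one field using a DSU list with an index tie-breaker."""
    decorated = [
        (rec.get(fieldname) or "", i, rec)  # index i settles every tie
        for i, rec in enumerate(records)
    ]
    decorated.sort()  # the dicts in the last slot are never compared
    return [rec for _, _, rec in decorated]


records = [
    {"Company": "B", "NumberOfEmployees": "10"},
    {"Company": "A"},
    {"Company": "C", "NumberOfEmployees": "10"},
    {"Company": "D"},
]
for rec in sort_by_field(records, "NumberOfEmployees"):
    print(rec["Company"])
```

Because the index is unique, ties on the field keep their original order
regardless of whether the underlying sort is stable, and the comparison
never falls through to the dicts.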

tickledly y'rs  - tim



From fredrik@pythonware.com  Sun Jul 28 09:30:32 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 28 Jul 2002 10:30:32 +0200
Subject: [Python-Dev] RE: companies data for sorting comparisons
References: <LNBBLJKPBEHFEDALKOLCCEBAAIAB.tim.one@comcast.net>
Message-ID: <008b01c23611$0bc24460$ced241d5@hagrid>

tim wrote:    

> A larger lesson:  even if Python gets a stable sort and advertises stability
> (we don't have to guarantee it even if it's there)

if we guarantee it, all python implementors must provide one.

how hard is it to implement a reasonably good stable sort from
scratch?

I can think of lots of really stupid ways to do it on top of existing
sort code, which might be a reason to provide two different sort
methods: sort (fast) and stablesort (guaranteed, but maybe not
as fast as sort).  in CPython, both names can map to timsort.

(shouldn't you be writing a paper on this, btw?  or start a sort
blog ;-)

</F>



From nhodgson@bigpond.net.au  Sun Jul 28 14:00:06 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Sun, 28 Jul 2002 23:00:06 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020727024012.85905.qmail@web40107.mail.yahoo.com>
Message-ID: <003701c23636$b25b0a80$3da48490@neil>

Scott Gilbert:

> First, could this be implemented by a gapped_buffer object that implements
> the locking functionality you want, but that returns simple buffers to
> work with when the object is locked?  In other words, do we need to add
> this extra functionality up in the core protocol when it can be implemented
> specifically the way Scintilla (cool editor by the way) wants it to be in
   (Thanks)
> the Scintilla specific extension?

   Would this mean that the explicit locking completely defines the validity
of the address or is the address valid until the 'view' buffer object is
garbage collected? I would like the gapped_buffer to be put back into gapped
mode as soon as possible and depending on the lifetime of a view buffer
object is not that robust in the face of alternate Python implementations
that use non-reference-counted GC implementations (Jython / Python .Net).

> Second, if you are using mutexes to do this stuff, you'll have to be very
> careful about deadlock.

   By locking, I want to change state on the buffer from having a gap and
allowing resizes to having a static size and address which will remain valid
until an unlock. The lock and unlock are not treating the buffer as a mutex
(I'd call the operations 'acquire' and 'release' then) although mutexes may
be needed for safety in the lock and unlock implementations. It is likely
that the lock and unlock would be counted (it can be locked twice and then
won't be expandable until it is unlocked twice) and that exceptions would be
thrown for length changing operations while locked.
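
The counted lock/unlock behaviour described above could be sketched roughly as follows. This is only an illustration: LockableBuffer and LockedError are hypothetical names, the gap itself is not modelled, and no thread safety is attempted.

```python
class LockedError(Exception):
    """Raised when a length-changing operation is tried while locked."""


class LockableBuffer:
    """Hypothetical stand-in for the gapped buffer (no real gap here)."""

    def __init__(self, initial=b""):
        self._data = bytearray(initial)
        self._locks = 0

    def lock(self):
        # Counted: each lock() needs a matching unlock().
        self._locks += 1

    def unlock(self):
        if self._locks == 0:
            raise RuntimeError("unlock() without matching lock()")
        self._locks -= 1

    def insert(self, pos, text):
        if self._locks:
            raise LockedError("buffer is locked; size cannot change")
        self._data[pos:pos] = text

    def contents(self):
        return bytes(self._data)


buf = LockableBuffer(b"hello")
buf.lock()
buf.lock()
buf.unlock()
try:
    buf.insert(5, b" world")
    blocked = False
except LockedError:
    blocked = True          # still locked once, so the insert is refused
buf.unlock()
buf.insert(5, b" world")    # fully unlocked, so resizing works again
```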

   If you think my particular use is out of the scope of what you are trying
to achieve then that is fine.

   Neil



From neal@metaslash.com  Sun Jul 28 15:03:13 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sun, 28 Jul 2002 10:03:13 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test
 test_email_codecs.py,1.1,1.2
References: <E17W9rU-0001Nk-00@usw-pr-cvs1.sourceforge.net>
 <15676.16356.112688.518256@anthem.wooz.org> <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D43F9A1.C4D491A3@metaslash.com>

Guido van Rossum wrote:
> 
> > A better fix, IMO, is to recognize that the `test' package has become
> > a full fledged standard lib package (a Good Thing, IMO), heed our own
> > admonitions not to do relative imports, and change the various places
> > in the test suite that "import test_support" (or equiv) to "import
> > test.test_support" (or equiv).
> 
> Good idea.

Shouldn't this also be done for from XXX import YYY?

    grep test_support `find Lib -name '*.py'` | \
	egrep -v '(from test |test\.test_support)' | grep import

Neal


From guido@python.org  Sun Jul 28 16:17:17 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 28 Jul 2002 11:17:17 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
In-Reply-To: Your message of "Sun, 28 Jul 2002 10:03:13 EDT."
 <3D43F9A1.C4D491A3@metaslash.com>
References: <E17W9rU-0001Nk-00@usw-pr-cvs1.sourceforge.net> <15676.16356.112688.518256@anthem.wooz.org> <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net>
 <3D43F9A1.C4D491A3@metaslash.com>
Message-ID: <200207281517.g6SFHHS16631@pcp02138704pcs.reston01.va.comcast.net>

[Barry]
> > > A better fix, IMO, is to recognize that the `test' package has become
> > > a full fledged standard lib package (a Good Thing, IMO), heed our own
> > > admonitions not to do relative imports, and change the various places
> > > in the test suite that "import test_support" (or equiv) to "import
> > > test.test_support" (or equiv).

[Guido]
> > Good idea.

[Neal]
> Shouldn't this also be done for from XXX import YYY?
> 
>     grep test_support `find Lib -name '*.py'` | \
> 	egrep -v '(from test |test\.test_support)' | grep import

Good catch!  Looks like Barry hardly scratched the surface of this.
I *thought* his checkin which claimed to fix this throughout Lib/test
was a tad small. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Jul 28 16:23:41 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 28 Jul 2002 11:23:41 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
In-Reply-To: Your message of "Sun, 28 Jul 2002 11:17:17 EDT."
 <200207281517.g6SFHHS16631@pcp02138704pcs.reston01.va.comcast.net>
References: <E17W9rU-0001Nk-00@usw-pr-cvs1.sourceforge.net> <15676.16356.112688.518256@anthem.wooz.org> <200207272151.g6RLpoi26443@pcp02138704pcs.reston01.va.comcast.net> <3D43F9A1.C4D491A3@metaslash.com>
 <200207281517.g6SFHHS16631@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207281523.g6SFNfm16682@pcp02138704pcs.reston01.va.comcast.net>

> [Neal]
> > Shouldn't this also be done for from XXX import YYY?
> > 
> >     grep test_support `find Lib -name '*.py'` | \
> > 	egrep -v '(from test |test\.test_support)' | grep import

[me]
> Good catch!  Looks like Barry hardly scratched the surface of this.
> I *thought* his checkin which claimed to fix this throughout Lib/test
> was a tad small. :-(

Neal, Barry: on second thought, DON'T FIX THIS YET!

I'd like to have a discussion with Barry about the motivation for
this.  I need to at least understand why Barry thinks he needs this,
and reconcile this with my earlier position that relative imports were
compulsory in this case.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Sun Jul 28 16:49:27 2002
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 28 Jul 2002 17:49:27 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172
References: <E17Yppl-0005f3-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <010901c2364e$5ce6be10$ced241d5@hagrid>

> SF patch #577031, remove PyArg_Parse() since it's deprecated

> ! v = PyNumber_Float(v);
> ! if (!v)
>   return -1;

> v = PyNumber_Int(v);
> ! if (!v)
>   return -1;

umm.

doesn't PyNumber_Float and PyNumber_Int convert its argument to
a float/integer, if it's not already the right type?

in earlier versions of Python, "%g" % "1.0" raised a TypeError.  does
it still do that with this patch in place?

</F>



From neal@metaslash.com  Sun Jul 28 17:13:12 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sun, 28 Jul 2002 12:13:12 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects
 stringobject.c,2.171,2.172
References: <E17Yppl-0005f3-00@usw-pr-cvs1.sourceforge.net> <010901c2364e$5ce6be10$ced241d5@hagrid>
Message-ID: <3D441818.83FD38F7@metaslash.com>

Fredrik Lundh wrote:
> 
> > SF patch #577031, remove PyArg_Parse() since it's deprecated
> 
> > ! v = PyNumber_Float(v);
> > ! if (!v)
> >   return -1;
> 
> > v = PyNumber_Int(v);
> > ! if (!v)
> >   return -1;
> 
> umm.
> 
> doesn't PyNumber_Float and PyNumber_Int convert its argument to
> a float/integer, if it's not already the right type?

Yes.

> in earlier versions of Python, "%g" % "1.0" raised a TypeError.  does
> it still do that with this patch in place?

No. :-(  That wasn't an intentional change.  The intent was
to convert an int/long to a double in the case of '%g' et al and
from a double to an int in the case of '%d'.

What is the best way to fix this?  If I call PyNumber_Check()
before this code, the behaviour is the same as before.

Neal


From neal@metaslash.com  Sun Jul 28 17:29:33 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sun, 28 Jul 2002 12:29:33 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects
 stringobject.c,2.171,2.172
References: <E17Yppl-0005f3-00@usw-pr-cvs1.sourceforge.net> <010901c2364e$5ce6be10$ced241d5@hagrid> <3D441818.83FD38F7@metaslash.com>
Message-ID: <3D441BED.43049678@metaslash.com>

Neal Norwitz wrote:
> 
> Fredrik Lundh wrote:
> >
> > > SF patch #577031, remove PyArg_Parse() since it's deprecated
> >
> > > ! v = PyNumber_Float(v);
> > > ! if (!v)
> > >   return -1;
> >
> > > v = PyNumber_Int(v);
> > > ! if (!v)
> > >   return -1;
> >
> > umm.
> >
> > doesn't PyNumber_Float and PyNumber_Int convert its argument to
> > a float/integer, if it's not already the right type?
> 
> Yes.
> 
> > in earlier versions of Python, "%g" % "1.0" raised a TypeError.  does
> > it still do that with this patch in place?
> 
> No. :-(  That wasn't an intentional change.  The intent was
> to convert an int/long to a double in the case of '%g' et al and
> from a double to an int in the case of '%d'.
> 
> What is the best way to fix this?  

To answer my own question, it appears that I should use 
PyFloat_AsDouble() and PyInt_AsLong() and check for an error.
I don't know why I didn't do this before.  This restores the
original behaviour.

I'll check this in later.  Let me know if I screwed up again.

I'll also update the tests to check for the exception.
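
A rough Python-level sketch of what such a check might look like (this is not the test that was checked in; format_raises_type_error is a hypothetical helper, and the results shown are for present-day Python):

```python
def format_raises_type_error(fmt, value):
    """Return True if 'fmt % value' raises TypeError."""
    try:
        fmt % value
    except TypeError:
        return True
    return False


string_to_g = format_raises_type_error("%g", "1.0")   # strings are rejected
string_to_d = format_raises_type_error("%d", "1")
int_to_g = format_raises_type_error("%g", 1)          # int -> double is fine
float_to_d = format_raises_type_error("%d", 1.8)      # float -> int is fine
```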

Neal


From guido@python.org  Sun Jul 28 17:37:39 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 28 Jul 2002 12:37:39 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172
In-Reply-To: Your message of "Sun, 28 Jul 2002 12:13:12 EDT."
 <3D441818.83FD38F7@metaslash.com>
References: <E17Yppl-0005f3-00@usw-pr-cvs1.sourceforge.net> <010901c2364e$5ce6be10$ced241d5@hagrid>
 <3D441818.83FD38F7@metaslash.com>
Message-ID: <200207281637.g6SGbd816840@pcp02138704pcs.reston01.va.comcast.net>

> Fredrik Lundh wrote:
> > 
> > > SF patch #577031, remove PyArg_Parse() since it's deprecated
> > 
> > > ! v = PyNumber_Float(v);
> > > ! if (!v)
> > >   return -1;
> > 
> > > v = PyNumber_Int(v);
> > > ! if (!v)
> > >   return -1;
> > 
> > umm.
> > 
> > doesn't PyNumber_Float and PyNumber_Int convert its argument to
> > a float/integer, if it's not already the right type?
> 
> Yes.
> 
> > in earlier versions of Python, "%g" % "1.0" raised a TypeError.  does
> > it still do that with this patch in place?
> 
> No. :-(  That wasn't an intentional change.  The intent was
> to convert an int/long to a double in the case of '%g' et al and
> from a double to an int in the case of '%d'.
> 
> What is the best way to fix this?  If I call PyNumber_Check()
> before this code, the behaviour is the same as before.

Revert the change.

I don't believe PyNumber_Check() is the right thing to use here at all.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Jul 28 17:38:43 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 28 Jul 2002 12:38:43 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172
In-Reply-To: Your message of "Sun, 28 Jul 2002 12:29:33 EDT."
 <3D441BED.43049678@metaslash.com>
References: <E17Yppl-0005f3-00@usw-pr-cvs1.sourceforge.net> <010901c2364e$5ce6be10$ced241d5@hagrid> <3D441818.83FD38F7@metaslash.com>
 <3D441BED.43049678@metaslash.com>
Message-ID: <200207281638.g6SGch016860@pcp02138704pcs.reston01.va.comcast.net>

> To answer my own question, it appears that I should use 
> PyFloat_AsDouble() and PyInt_AsLong() and check for an error.
> I don't know why I didn't do this before.  This restores the
> original behaviour.

Good!

> I'll check this in later.  Let me know if I screwed up again.
> 
> I'll also update the tests to check for the exception.

Great!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Sun Jul 28 18:21:11 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 28 Jul 2002 13:21:11 -0400
Subject: [Python-Dev] RE: companies data for sorting comparisons
In-Reply-To: <008b01c23611$0bc24460$ced241d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECCAIAB.tim.one@comcast.net>

[Tim]
>> A larger lesson:  even if Python gets a stable sort and
>> advertises stability (we don't have to guarantee it even if
>> it's there)

[/F]
> if we guarantee it, all python implementors must provide one.

Or a middle ground, akin to CPython's semi-reluctant guarantees of refcount
semantics for "timely" finalization.  A great many CPython users appear
quite happy to rely on this even though the language doesn't guarantee it.

> how hard is it to implement a reasonably good stable sort from
> scratch?

A straightforward mergesort using a temp vector of size N is dead easy, and
reasonably good (O(N log N) worst case).  There aren't any other major N log
N sorts that are naturally stable, nor even any I know of (and I know of a
lot <wink>) that can be made stable without materializing list indices (or a
moral equivalent).  Insertion sort is naturally stable, but is O(N**2)
expected case, so is DOA.
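
For illustration, a minimal sketch of such a straightforward mergesort with a size-N temp vector. It is not timsort or any code from the patch; merge_sort and its key parameter are names chosen for this example.

```python
def merge_sort(items, key=lambda x: x):
    """Sort 'items' in place with a top-down mergesort; stable."""
    temp = [None] * len(items)              # the size-N temp vector

    def sort(lo, hi):                       # sorts items[lo:hi]
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        sort(lo, mid)
        sort(mid, hi)
        i, j, k = lo, mid, lo
        while i < mid and j < hi:
            # Take from the right run only when it is strictly smaller,
            # so equal elements keep their original order.
            if key(items[j]) < key(items[i]):
                temp[k] = items[j]
                j += 1
            else:
                temp[k] = items[i]
                i += 1
            k += 1
        # Copy whichever run still has elements left.
        while i < mid:
            temp[k] = items[i]
            i += 1
            k += 1
        while j < hi:
            temp[k] = items[j]
            j += 1
            k += 1
        items[lo:hi] = temp[lo:hi]

    sort(0, len(items))


pairs = [(1, "a"), (0, "b"), (1, "c"), (0, "d")]
merge_sort(pairs, key=lambda p: p[0])
```

Stability comes entirely from the strict "<" in the merge step: on a tie the element from the left run wins.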

> I can think of lots of really stupid ways to do it on top of existing
> sort code, which might be a reason to provide two different sort
> methods: sort (fast) and stablesort (guaranteed, but maybe not
> as fast as sort).  in CPython, both names can map to timsort.

I don't want to see two sort methods on the list object, for reasons
explained before.  You've always been able to *get* a stable sort in Python
via materializing the list indices in a 2-tuple, as in Alex's "stable sort"
DSU recipe:

    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234

People overly <wink> concerned about portability can stick to that.
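
The idea of materializing the indices in a 2-tuple can be sketched like this (written from the description above, not copied from the recipe; stable_sorted is a hypothetical name):

```python
def stable_sorted(items):
    """Return a sorted copy; ties keep their original relative order."""
    # Decorate each item with its position, so that equal items are
    # ordered by where they started, whatever the underlying sort does.
    decorated = [(item, index) for index, item in enumerate(items)]
    decorated.sort()
    return [item for item, index in decorated]


# 1.0 == 1 and 2 == 2.0, so only the original positions break the ties.
result = stable_sorted([2, 1.0, 1, 2.0])
result_types = [type(x).__name__ for x in result]
```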

> (shouldn't you be writing a paper on this, btw?

I don't think there's anything truly new here, although the combination of
gimmicks may be unique.  timsort.txt is close enough to a paper anyway, but
better in that it only tells you useful things; the McIlroy paper covers all
the rest <wink>.

> or start a sort blog ;-)

That must be some sort of web thing, hence beyond my limited abilities.



From tim.one@comcast.net  Sun Jul 28 18:52:33 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 28 Jul 2002 13:52:33 -0400
Subject: [Python-Dev] RE: companies data for sorting comparisons
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBAAIAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECDAIAB.tim.one@comcast.net>

Turns out there was one comparison per merge step I wasn't extracting
maximum value from.  Changing the code to suck all I can out of it doesn't
make a measurable difference on sortperf results, except for a tiny
improvement on ~sort on my box, but makes a difference on the Exchange case
of Kevin's data.  Here using

    values = [(x.get(fieldname), i, x) for i, x in enumerate(data)]

as the list to sort, and times are again in milliseconds:

Sorting on field 'Address' -- 6589 records
    via  sort:  41.24  41.39  41.41  41.42  86.71
    via msort:  42.90  43.01  43.07  43.15  43.75

Sorting on field 'Company' -- 6635 records
    via  sort:  40.24  40.34  40.42  40.43  42.58
    via msort:  30.42  30.45  30.58  30.66  30.66

Sorting on field 'Exchange' -- 6579 records
    via  sort:  59.64  59.70  59.71  59.72  59.81
    via msort:  27.06  27.11  27.19  27.29  27.54

Sorting on field 'NumberOfEmployees' -- 6531 records
    via  sort:  47.61  47.65  47.73  47.75  47.76
    via msort:  48.55  48.57  48.61  48.73  48.92

Sorting on field 'Phone' -- 6589 records
    via  sort:  48.00  48.03  48.32  48.32  48.39
    via msort:  49.60  49.64  49.68  49.79  49.85

Sorting on field 'Profile' -- 6635 records
    via  sort:  58.63  58.70  58.80  58.85  58.92
    via msort:   8.47   8.48   8.51   8.59   8.68

Sorting on field 'Symbol' -- 6635 records
    via  sort:  39.93  40.13  40.16  40.28  41.37
    via msort:   6.20   6.23   6.23   6.43   6.98

Sorting on field 'Web' -- 6632 records
    via  sort:  46.75  46.77  46.86  46.87  47.05
    via msort:  36.44  36.66  36.69  36.69  36.96

'Profile' is slower than the rest for samplesort because the strings it's
comparing are Yahoo URLs with a long common prefix -- the compares just take
longer in that case.  I'm not sure why 'Exchange' takes so long for
samplesort (it's a case with lots of duplicate primary keys, but the
distribution is highly skewed, not uniform as in ~sort).  In all cases now,
msort is a major-to-killer win, or a small (but real) loss.

I'll upload a new patch and new timsort.txt next.  Then I'm taking a week
off!  No, I wish it were for fun <wink/sigh>.



From Jack.Jansen@oratrix.com  Sun Jul 28 22:03:30 2002
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Sun, 28 Jul 2002 23:03:30 +0200
Subject: [Python-Dev] python.org/switch/
In-Reply-To: <20020726223911.T70962-100000@onion.valueclick.com>
Message-ID: <782E9B13-A26D-11D6-83B1-003065517236@oratrix.com>

On Saturday, July 27, 2002, at 07:40, Ask Bjoern Hansen wrote:

>
> As presented on the Perl Lightning talks here at OSCON: Switch
> movies.
>
> You guys will dig Nathan's (nat.mov and nat.mpg).
>
> http://www.perl.org/tpc/2002/movies/switch/

They're all pretty good, but I think I liked David best, he 
actually seemed to mean what he said:-)
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From greg@cosc.canterbury.ac.nz  Mon Jul 29 00:43:05 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 29 Jul 2002 11:43:05 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
Message-ID: <200207282343.g6SNh55G016683@kuku.cosc.canterbury.ac.nz>

Thomas Heller <thomas.heller@ion-tof.com>:

>    This PEP proposes an extension to the buffer interface called the
>    'safe buffer interface'.

I don't understand the need for this. The C-level buffer
interface is already safe as long as you use it properly --
which means using it to fetch the pointer each time it's
needed.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From barry@zope.com  Mon Jul 29 00:51:38 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Sun, 28 Jul 2002 19:51:38 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib/email/test test_email_codecs.py,1.1,1.2
References: <15676.16356.112688.518256@anthem.wooz.org>
 <LNBBLJKPBEHFEDALKOLCOEJPAGAB.tim.one@comcast.net>
 <15676.24360.88972.449273@anthem.wooz.org>
 <200207272156.g6RLuU826463@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15684.33674.169550.228083@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> It's a bit uglier than that, since Lib/test gets
    >> magically added to sys.path during regrtest by virtue of
    >> running "python Lib/test/regrtest.py".

    GvR> Perhaps regrtest.py can specifically remove its own directory
    GvR> from sys.path?  (Please don't just remove sys.path[0] or '';
    GvR> look in sys.argv[0] and deduce from there.)

Good idea:

-------------------- snip snip --------------------
import os, sys
mydir = os.path.dirname(sys.argv[0])
if mydir in sys.path:  # avoid ValueError if it isn't on the path
    sys.path.remove(mydir)
-------------------- snip snip --------------------

I also followed up to Guido privately, re: the motivation for this
change.  Also, Neal's right, I missed some of the relative imports of
test_support and I'm ready to commit those fixes once Guido gives the
go-ahead.

-Barry


From xscottg@yahoo.com  Mon Jul 29 00:57:12 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 28 Jul 2002 16:57:12 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207282343.g6SNh55G016683@kuku.cosc.canterbury.ac.nz>
Message-ID: <20020728235712.41025.qmail@web40112.mail.yahoo.com>

--- Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
> Thomas Heller <thomas.heller@ion-tof.com>:
> 
> >    This PEP proposes an extension to the buffer interface called the
> >    'safe buffer interface'.
> 
> I don't understand the need for this. The C-level buffer
> interface is already safe as long as you use it properly --
> which means using it to fetch the pointer each time it's
> needed.
> 

This is not my PEP, but let me defend it anyway.

The need for this derives from wanting to do more than one thing at a time
in Python (multiple processors with multiple threads, asynchronous I/O, DMA
transfers, ???).

One thread grabs the pointer from the "safe buffer interface" and then
releases the GIL while it works on that pointer.  Now another thread is
free to acquire the GIL and run concurrently with the first.  (The
asynchronous I/O case applies even on single processor machines...)

I believe you were the one to explain to me why an extension can't release
the GIL while it works with the PyBufferProcs acquired pointer.  This PEP
tries to allow the extension to do just that.
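
As a loose Python-level analogy (this is how bytearray and memoryview behave in present-day CPython, long after this thread; it is not the interface the PEP proposes): while a view on the buffer is outstanding, the object refuses to move or resize its storage, so a pointer obtained earlier stays valid.

```python
data = bytearray(b"spam")
view = memoryview(data)        # exports the buffer; its storage is pinned

try:
    data.extend(b" and eggs")  # would require reallocating the storage
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view[0:4] = b"SPAM"            # same-size writes through the view are fine
view.release()                 # drop the export explicitly

data.extend(b" and eggs")      # resizing is allowed again
```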








From zhaoqiang@neusoft.com  Mon Jul 29 01:15:41 2002
From: zhaoqiang@neusoft.com (zhaoq)
Date: Mon, 29 Jul 2002 08:15:41 +0800
Subject: [Python-Dev] Please remove me from the mailing list
References: <000a01c23453$fc4b04e0$3745fea9@ibm1499>
Message-ID: <010701c23695$633acc60$4a01010a@xpprofessional>

This is a multi-part message in MIME format.

--Boundary_(ID_yqfil81HVg0jUZY0v5uUNg)
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: 7BIT

Please remove me from the mailing list

zhaoqiang@neusoft.com

thanks
  ----- Original Message ----- 
  From: Rick Farrer 
  To: Python-Dev@python.org 
  Sent: Friday, July 26, 2002 11:24 AM
  Subject: [Python-Dev] Please remove me from the mailing list


  Please remove me from the mailing list.

  rf@avisionone.com

  Thanks,
  Rick


--Boundary_(ID_yqfil81HVg0jUZY0v5uUNg)--


From xscottg@yahoo.com  Mon Jul 29 01:29:57 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 28 Jul 2002 17:29:57 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <003701c23636$b25b0a80$3da48490@neil>
Message-ID: <20020729002957.74716.qmail@web40101.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> 
>    Would this mean that the explicit locking completely defines the
> validity of the address or is the address valid until the 'view' buffer 
> object is garbage collected? I would like the gapped_buffer to be put 
> back into gapped mode as soon as possible and depending on the lifetime 
> of a view buffer object is not that robust in the face of alternate
> Python implementations that use non-reference-counted GC implementations
> (Jython / Python .Net).
>

If you're worried about exactly when the object is released, you could add
a specific release() method to your object indicating that you don't intend
to use it anymore.

My point was that, with Thomas Heller's safe buffer protocol (or my bytes
object), you would have a pointer that could be manipulated independently
of the GIL, but that putting locking semantics into your gapped_buffer is
something you could add on top without complicating the core.

In other words, his PEP (or mine) allows you to do something you couldn't
necessarily do previously, and it doesn't sound like there is anything you
want to do that you won't be able to.
 
> 
>    By locking, I want to change state on the buffer from having a gap and
> allowing resizes to having a static size and address which will remain
> valid until an unlock. The lock and unlock are not treating the buffer as
> a mutex (I'd call the operations 'acquire' and 'release' then) although
> mutexes may be needed for safety in the lock and unlock implementations.
> It is likely that the lock and unlock would be counted (it can be locked
> twice and then won't be expandable until it is unlocked twice) and that
> exceptions would be thrown for length changing operations while locked.
> 

You could easily implement a counting (recursive) mutex as described
above, and it might be the case that throwing an exception on the length
changing operations keeps the deadlock from occurring.  I'm still a bit
confused though.  When thread A locks (acquires) the buffer, and thread B
tries to do a resize and it generates an exception, what is thread B
supposed to do next?  I assume that the resize was due to something like
the user typing somewhere in the buffer.  From a user interface point of
view, you can't just ignore their request to insert text.  Would you just
try the same operation again after catching the exception?  How long would
you wait?

>
>    If you think my particular use is out of the scope of what you are
> trying to achieve then that is fine.
> 

It is definitely up to Thomas Heller to decide what he wants his scope to
be, and I don't want to step on his toes at all.  Especially since the
reason for his PEP getting written is that I didn't want to add this stuff
to mine. :-)

I'm just trying to point out two things:

  1) With his PEP, there is a way to get the behavior you desire without
adding the complexity to the core of Python.  And with recursive/counting
mutexes, the behavior you want is getting more complicated.  The "safe
buffer protocol" is likely to cater to a wide class of users.  I could be
wrong, but the "lockable gapped buffer protocol" probably appeals to a much
smaller set.

  2) Any time you go from one lock (mutex, GIL, semaphore) to multiple
locks, you can introduce deadlock states.  Without my understanding your
design fully, your use case sounds to me like it either has the potential
for deadlock, or the potential for polling.  There are ways to avoid this
of course, but then everyone has to follow a more complicated set of rules
(for instance build a hierarchy describing the order of locks to acquire). 
Since Thomas's PEP doesn't introduce any new types of locks, it sidesteps
these problems.
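
To make the "hierarchy of locks" rule concrete, here is a small sketch assuming two hypothetical locks with a fixed rank. Both threads name the locks in opposite orders, but every acquisition goes through the same ordering, so they cannot deadlock on each other.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

# Hypothetical hierarchy: the lower rank is always acquired first.
RANK = {id(lock_a): 0, id(lock_b): 1}


def acquire_in_order(*locks):
    """Acquire locks sorted by rank; return them in acquisition order."""
    ordered = sorted(locks, key=lambda lock: RANK[id(lock)])
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    """Release in reverse order of acquisition."""
    for lock in reversed(ordered):
        lock.release()


state = {"count": 0}


def worker(first, second):
    for _ in range(1000):
        held = acquire_in_order(first, second)
        try:
            state["count"] += 1
        finally:
            release_all(held)


# Opposite argument orders: this could deadlock if each thread simply
# acquired the locks in the order given.
t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
```

The cost is the one noted above: every piece of code that takes more than one lock has to know about, and follow, the ranking.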



Cheers,
    -Scott




From rfarrer@avisionone.com  Mon Jul 29 01:41:57 2002
From: rfarrer@avisionone.com (Rick Farrer)
Date: Sun, 28 Jul 2002 19:41:57 -0500
Subject: [Python-Dev] Remove from mailing list
Message-ID: <000c01c23698$bf2e32c0$3745fea9@ibm1499>

This is a multi-part message in MIME format.

------=_NextPart_000_0009_01C2366E.D543FBA0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

For the last time. Please remove me from your mailing list!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


------=_NextPart_000_0009_01C2366E.D543FBA0--



From greg@cosc.canterbury.ac.nz  Mon Jul 29 03:13:23 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 29 Jul 2002 14:13:23 +1200 (NZST)
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172
In-Reply-To: <3D441818.83FD38F7@metaslash.com>
Message-ID: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>

Neal Norwitz <neal@metaslash.com>:

> The intent was to convert an int/long to a double in the case of
> '%g' et al and from a double to an int in the case of '%d'.

Are you sure the latter part of that is a good idea?  As a general
principle, I don't think float->int conversions should be done
automatically. What is the Python philosophy on that?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From neal@metaslash.com  Mon Jul 29 03:31:39 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Sun, 28 Jul 2002 22:31:39 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects
 stringobject.c,2.171,2.172
References: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>
Message-ID: <3D44A90B.421E97DA@metaslash.com>

Greg Ewing wrote:
> 
> Neal Norwitz <neal@metaslash.com>:
> 
> > The intent was to convert an int/long to a double in the case of
> > '%g' et al and from a double to an int in the case of '%d'.
> 
> Are you sure the latter part of that is a good idea?  As a general
> principle, I don't think float->int conversions should be done
> automatically. What is the Python philosophy on that?

This is consistent with versions back to 1.5.2:

	Python 1.5.2 (#1, Jul  5 2001, 03:02:19)  [GCC 2.96 20000731 
			(Red Hat Linux 7.1 2 on linux-i386
	Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
	>>> '%d' % 1.8
	'1'

Neal


From guido@python.org  Mon Jul 29 03:40:35 2002
From: guido@python.org (Guido van Rossum)
Date: Sun, 28 Jul 2002 22:40:35 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects stringobject.c,2.171,2.172
In-Reply-To: Your message of "Mon, 29 Jul 2002 14:13:23 +1200."
 <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>
References: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>
Message-ID: <200207290240.g6T2eZH25272@pcp02138704pcs.reston01.va.comcast.net>

> > The intent was to convert an int/long to a double in the case of
> > '%g' et al and from a double to an int in the case of '%d'.
> 
> Are you sure the latter part of that is a good idea?  As a general
> principle, I don't think float->int conversions should be done
> automatically. What is the Python philosophy on that?

I fully agree, but unfortunately, in a dark past, I was given a patch
that did many good things, but as a side effect, made the PyArg_Parse*
family silently truncate floats to ints.  Two examples:

>>> "%d" % 3.14
'3'
>>> a = []
>>> a.insert(0.9, 42)
>>> a
[42]
>>> 

I find the second example more aggravating than the first.  This
touches upon a recent discussion, where one of the suggestions was to
use __index__ rather than __int__ in this case.  I think that's not
the right solution; perhaps instead, floats and float-like types
should support __truncate__ and __round__ to convert them to ints in
certain ways.  (Of course then we can argue about whether to round to
even, and what to do if the float is so large that its smallest unit
of precision is larger than one.)
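
To make the distinction concrete, here is a small sketch. It assumes a
present-day Python 3 interpreter rather than the one under discussion in
this thread, and the helper name as_index is hypothetical. It contrasts
explicit truncation and rounding with a conversion that only accepts
values usable as an index without loss.

```python
import math
import operator

def as_index(value):
    """Hypothetical helper: convert to an index only if that is lossless."""
    return operator.index(value)   # raises TypeError for floats

# Explicit conversions state which behaviour is wanted.
truncated = math.trunc(0.9)   # drops the fractional part
rounded = round(0.9)          # rounds to the nearest integer
half_even = round(2.5)        # Python 3 rounds exact halves to even

a = []
a.insert(as_index(0), 42)     # an int is accepted

try:
    as_index(0.9)             # a float is refused, not silently truncated
    float_rejected = False
except TypeError:
    float_rejected = True
```

With this split, the caller has to choose between truncating and rounding
before a float can reach an index position.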

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Mon Jul 29 03:47:28 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 29 Jul 2002 14:47:28 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <20020728235712.41025.qmail@web40112.mail.yahoo.com>
Message-ID: <200207290247.g6T2lSHV017233@kuku.cosc.canterbury.ac.nz>

Scott Gilbert <xscottg@yahoo.com>:

> The need for this derives from wanting to do more than one thing at a time
> in Python (multiple processors with multiple threads, asynchronous I/O, DMA
> transfers, ???).

In any situation like that, you should be using some form
of locking on the object concerned. The Python buffer
interface is not the right place to deal with these
issues.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@comcast.net  Mon Jul 29 03:55:45 2002
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 28 Jul 2002 22:55:45 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects
 stringobject.c,2.171,2.172
In-Reply-To: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDEAIAB.tim.one@comcast.net>

[Neal Norwitz]
> The intent was to convert an int/long to a double in the case of
> '%g' et al and from a double to an int in the case of '%d'.

[Greg Ewing]
> Are you sure the latter part of that is a good idea?  As a general
> principle, I don't think float->int conversions should be done
> automatically. What is the Python philosophy on that?

The philosophy for format codes is looser than elsewhere, else, e.g.,

    "%s" % object

would raise TypeError whenever object was a number or list, etc.  I've often
used %d with floats when I want them rounded to int and don't want to bother
remembering how to trick a float format into suppressing the decimal point.
Unfortunately, that's not quite what %d does (it truncates).  Whatever, %s
is like invoking str(), %r like invoking repr(), %d like invoking long(),
and %g/e/f like invoking float() (although these are variants of long() and
float() that refuse string arguments -- that's the exception that makes the
rule easy to remember <wink>).
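
A short sketch of the correspondence described above, assuming a current
Python 3 interpreter (where long() has been folded into int()). It shows
that %d truncates rather than rounds, and one way to get rounding without
a decimal point.

```python
value = 1.8

as_d = "%d" % value        # truncates toward zero, like int(value)
as_f0 = "%.0f" % value     # rounds, and prints no decimal point
as_s = "%s" % [1, 2]       # like str([1, 2])
as_r = "%r" % "x"          # like repr("x")

# Unlike int("3"), the %d code refuses a string argument.
try:
    "%d" % "3"
    string_refused = False
except TypeError:
    string_refused = True
```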



From skip@pobox.com  Mon Jul 29 04:07:06 2002
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 28 Jul 2002 22:07:06 -0500
Subject: [Python-Dev] Remove from mailing list
In-Reply-To: <000c01c23698$bf2e32c0$3745fea9@ibm1499>
References: <000c01c23698$bf2e32c0$3745fea9@ibm1499>
Message-ID: <15684.45402.132334.108285@localhost.localdomain>

    Rick> For the last time. Please remove me from your mailing list!

Try sending a note to python-dev-admin@python.org.  Better yet, try using
the interface Mailman provides for you:

    http://mail.python.org/mailman/listinfo/python-dev

-- 
Skip Montanaro
skip@pobox.com
consulting: http://manatee.mojam.com/~skip/resume.html


From aahz@pythoncraft.com  Mon Jul 29 04:17:24 2002
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 28 Jul 2002 23:17:24 -0400
Subject: [Python-Dev] Floats as indexes
In-Reply-To: <200207290240.g6T2eZH25272@pcp02138704pcs.reston01.va.comcast.net>
References: <200207290213.g6T2DN2U017001@kuku.cosc.canterbury.ac.nz> <200207290240.g6T2eZH25272@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729031724.GA20797@panix.com>

On Sun, Jul 28, 2002, Guido van Rossum wrote:
>
> >>> "%d" % 3.14
> '3'
> >>> a = []
> >>> a.insert(0.9, 42)
> >>> a
> [42]
> >>> 
> 
> I find the second example more aggravating than the first.  This
> touches upon a recent discussion, where one of the suggestions was
> to use __index__ rather than __int__ in this case.  I think that's
> not the right solution; perhaps instead, floats and float-like types
> should support __truncate__ and __round__ to convert them to ints in
> certain ways.  (Of course then we can argue about whether to round to
> even, and what to do if the float is so large that its smallest unit
> of precision is larger than one.)

Blech.  I believe that floats and similar objects should never be
implicitly converted to indexes.  There are too many ways for silent
errors to get propagated.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From xscottg@yahoo.com  Mon Jul 29 04:23:03 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Sun, 28 Jul 2002 20:23:03 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207290247.g6T2lSHV017233@kuku.cosc.canterbury.ac.nz>
Message-ID: <20020729032303.28931.qmail@web40108.mail.yahoo.com>

--- Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
> 
> > The need for this derives from wanting to do more than one thing at a
> > time in Python (multiple processors with multiple threads, asynchronous
> > I/O, DMA transfers, ???).
> 
> In any situation like that, you should be using some form
> of locking on the object concerned. The Python buffer
> interface is not the right place to deal with these
> issues.
> 

I humbly disagree with you, and I like his proposal.  His PEP is simple and
the locking business could lead to a mess if everyone involved is not very
careful.

However, I'll let him champion his PEP.  I've got my own stuff to worry
about, and this is part of why I didn't want to add new protocol to the PEP
I've been working on.


Cheers,
    -Scott


__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com


From martin@v.loewis.de  Mon Jul 29 07:39:48 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 29 Jul 2002 08:39:48 +0200
Subject: [Python-Dev] Please remove me from the mailing list
In-Reply-To: <010701c23695$633acc60$4a01010a@xpprofessional>
References: <000a01c23453$fc4b04e0$3745fea9@ibm1499>
 <010701c23695$633acc60$4a01010a@xpprofessional>
Message-ID: <m31y9nox57.fsf@mira.informatik.hu-berlin.de>

zhaoq <zhaoqiang@neusoft.com> writes:

> Please remove me from the mailing list

You have subscribed yourself by deliberate action, so you need to
actively unsubscribe yourself as well.

What mailing list are you talking about, anyway?

Regards,
Martin


From ville.vainio@swisslog.com  Mon Jul 29 09:10:58 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Mon, 29 Jul 2002 11:10:58 +0300
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <oq4rem34xo.fsf@carouge.sram.qc.ca>
Message-ID: <3D44F892.6090401@swisslog.com>

François Pinard wrote:

>>>    >>> def stripIndent( s ):
>>>    ...     indent = len(s) - len(s.lstrip())
>>>    ...     sLines = s.split('\n')
>>>    ...     resultLines = [ line[indent:] for line in sLines ]
>>>    ...     return ''.join( resultLines )
>>>
>
>>Something like this should really be available somewhere in the standard
>>library (string module [yeah, predeprecation, I know], string
>>
>In fact, I like my doc-strings and other triple-quoted strings flushed left.
>So, I can see them in the code exactly as they will appear on the screen.
>
Enabling one to strip the indentation wouldn't hurt this practice of
yours one bit (nobody forces you to use it). To my eyes left-flushing
the blocks disrupts the natural "flow" of the code, and breaks the
intuitive block structure of the program.

>If I used artificial margins in Python so my doc-strings appeared to be
>indented more than the surrounding, and wrote my code this way, it would
>appear artificially constricted on the left once printed.  It's not worth.
>
Could you explain what you mean by artificially constricted? Of course
only the amount of space in the left margin would be removed,
indentation would work exactly the same.

Which one looks better:
++++++++++++++++++++++++
def usage():
    if 1:
        print """\
        You should have done this
        and that
        """.stripindent()
+++++++++++++++++++++++++
def usage():
    if 1:
        print """\
You should have done this
and that
"""
++++++++++++++++++++++++++

When you are scanning code, the non-stripindent version of the 3-quoted
string jumps at your face as a "top-level" construct, even if it is only
associated with the usage() function.

>My opinion is that it is nice this way.  Don't touch the thing! :-)
>
Again, the change would not influence your code or practices one bit.

-- Ville



From mal@lemburg.com  Mon Jul 29 10:02:43 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jul 2002 11:02:43 +0200
Subject: [Python-Dev] Re: Multiline string constants, include in the standard
 library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <oq4rem34xo.fsf@carouge.sram.qc.ca> <3D44F892.6090401@swisslog.com>
Message-ID: <3D4504B3.3030608@lemburg.com>

Ville Vainio wrote:
> Which one looks better:
> ++++++++++++++++++++++++
> def usage():
>    if 1:
>        print """\
>        You should have done this
>        and that
>        """.stripindent()
> +++++++++++++++++++++++++
> def usage():
>    if 1:
>        print """\
> You should have done this
> and that
> """
> ++++++++++++++++++++++++++
> 
> When you are scanning code, the non-stripindent version of the 3-quoted 
> string jumps at your face as a "top-level" construct, even if it is only 
> associated with the usage() function.

I think everybody has their own way of formatting multi-line
strings and/or comments. There's no one-fits-all strategy.

So instead of trying to find a compromise, why don't you write up
a flexible helper function for the new textwrap module ?

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From ville.vainio@swisslog.com  Mon Jul 29 10:27:37 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Mon, 29 Jul 2002 12:27:37 +0300
Subject: [Python-Dev] Re: Multiline string constants, include in the standard
 library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <oq4rem34xo.fsf@carouge.sram.qc.ca> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com>
Message-ID: <3D450A89.7050400@swisslog.com>

M.-A. Lemburg wrote:

> I think everybody has their own way of formatting multi-line
> strings and/or comments. There's no one-fits-all strategy.

Yep, but having a standard solution available for one very sensible
strategy would be nice.

>
> So instead of trying to find a compromise, why don't you write up
> a flexible helper function for the new textwrap module ?

I don't think there is all that much implementation to do: 
inspect.getdoc() already has an implementation that seems to do the 
right thing, it's just that the stripping is embedded into the getdoc 
function, instead of having it available as a separate function.
textwrap might be a good place to put it, considering that the string 
module is going away - even if no actual wrapping takes place.

--------------------------------------------------
def getdoc(object):
    """Get the documentation string for an object.

    All tabs are expanded to spaces.  To clean up docstrings that are
    indented to line up with blocks of code, any whitespace that can be
    uniformly removed from the second line onwards is removed."""
    try:
        doc = object.__doc__
    except AttributeError:
        return None
    if not isinstance(doc, (str, unicode)):
        return None
    try:
        lines = string.split(string.expandtabs(doc), '\n')
    except UnicodeError:
        return None
    else:
        margin = None
        for line in lines[1:]:
            content = len(string.lstrip(line))
            if not content: continue
            indent = len(line) - content
            if margin is None: margin = indent
            else: margin = min(margin, indent)
        if margin is not None:
            for i in range(1, len(lines)): lines[i] = lines[i][margin:]
        return string.join(lines, '\n')
------------------------------------------
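
The margin logic in getdoc() could be pulled out into a standalone
function along the following lines. This is only a sketch: the name
strip_margin is hypothetical, it is written for a current Python 3
interpreter, and unlike getdoc() it treats the first line like every
other line.

```python
def strip_margin(text):
    """Remove the whitespace margin common to all non-blank lines.

    Hypothetical helper; tabs are expanded to spaces first, as getdoc() does.
    """
    lines = text.expandtabs().split('\n')
    margin = None
    for line in lines:
        content = len(line.lstrip())
        if not content:
            continue                      # blank lines do not affect the margin
        indent = len(line) - content
        margin = indent if margin is None else min(margin, indent)
    if margin:
        lines = [line[margin:] for line in lines]
    return '\n'.join(lines)


usage = """\
        You should have done this
          and that
        """
stripped = strip_margin(usage)
```

Relative indentation inside the block survives; only the shared left
margin is removed. Current Python releases also ship textwrap.dedent(),
which covers similar ground.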



From mal@lemburg.com  Mon Jul 29 10:44:37 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jul 2002 11:44:37 +0200
Subject: [Python-Dev] Re: Multiline string constants, include in the standard
 library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <oq4rem34xo.fsf@carouge.sram.qc.ca> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com> <3D450A89.7050400@swisslog.com>
Message-ID: <3D450E85.5090806@lemburg.com>

Ville Vainio wrote:
> M.-A. Lemburg wrote:
> 
>> I think everybody has their own way of formatting multi-line
>> strings and/or comments. There's no one-fits-all strategy.
> 
> 
> Yep, but having a standard solution available for one very sensible
> strategy would be nice.
> 
>>
>> So instead of trying to find a compromise, why don't you write up
>> a flexible helper function for the new textwrap module ?
> 
> 
> I don't think there is all that much implementation to do: 
> inspect.getdoc() already has an implementation that seems to do the 
> right thing, it's just that the stripping is embedded into the getdoc 
> function, instead of having it available as a separate function.
> textwrap might be a good place to put it, considering that the string 
> module is going away - even if no actual wrapping takes place.

Oh, I think it is worthwhile applying some optional wrapping
for overly long doc-strings as well. But there you go again:
people simply don't match up when it comes to text formatting.
It's all a matter of taste and style (e.g. in the US it is
very common to indent the first line of a paragraph while in
most of Europe it is not).

How about starting with a simple textwrap.dedent() API and then
moving on towards the full monty textwrap.reformat() API with tons
of options ?!

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From ville.vainio@swisslog.com  Mon Jul 29 11:22:54 2002
From: ville.vainio@swisslog.com (Ville Vainio)
Date: Mon, 29 Jul 2002 13:22:54 +0300
Subject: [Python-Dev] Re: Multiline string constants, include in the standard
 library?
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <oq4rem34xo.fsf@carouge.sram.qc.ca> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com> <3D450A89.7050400@swisslog.com> <3D450E85.5090806@lemburg.com>
Message-ID: <3D45177E.6030503@swisslog.com>

M.-A. Lemburg wrote:

> How about starting with a simple textwrap.dedent() API and then
> moving on towards the full monty textwrap.reformat() API with tons of 
> options ?!

Fine with me - at least w/ dedent() everyone can agree with the right 
behaviour (except handling of the first line?), and it would be general 
enough to be useful for everybody (no need for options/customization) - 
hence the justification for a position in the std lib. I haven't had 
much use for intricate wrapping/reformatting yet, but I guess I will 
once it hits the std lib ;-).

-- Ville




From rwgk@yahoo.com  Mon Jul 29 14:02:00 2002
From: rwgk@yahoo.com (Ralf W. Grosse-Kunstleve)
Date: Mon, 29 Jul 2002 06:02:00 -0700 (PDT)
Subject: [Python-Dev] pickling of large arrays
Message-ID: <20020729130200.73932.qmail@web20201.mail.yahoo.com>

We are using Boost.Python to expose reference-counted C++ container
types (similar to std::vector<>) to Python. E.g.:

from arraytbx import shared
d = shared.double(1000000) # double array with a million elements
c = shared.complex_double(100) # std::complex<double> array
# and many more types, incl. several custom C++ types

We need a way to pickle these arrays. Since they can easily be
converted to tuples we could just define functions like:

  def __getstate__(self):
    return tuple(self)

However, since the arrays are potentially huge this could incur
a large overhead (e.g. a tuple of a million Python floats).
Next idea:

  def __getstate__(self):
    return iter(self)

Unfortunately (but not unexpectedly) pickle is telling me:
'can't pickle iterator objects'

Attached is a short Python script (tested with 2.2.1) with a prototype
implementation of a pickle helper ("piece_meal") for large arrays.
piece_meal's __getstate__ converts a block of a given size to a Python
list and returns a tuple with that list and a new piece_meal instance
which knows how to generate the next chunk. I.e. piece_meal instances
are created recursively until the input sequence is exhausted. The
corresponding __setstate__ puts the pieces back together again
(uncomment the print statement to see the pieces).

I am wondering if a similar mechanism could be used to enable pickling
of iterators, or maybe special "pickle_iterators", which would
immediately enable pickling of our large arrays or any other object
that can be iterated over (e.g. Numpy arrays which are currently
pickled as potentially huge strings). Has this been discussed already?
Are there better ideas?

Ralf


import pickle

class piece_meal:

  block_size = 4

  def __init__(self, sequence, position):
    self.sequence = sequence
    self.position = position

  def __getstate__(self):
    next_position = self.position - piece_meal.block_size
    if (next_position <= 0):
      return (self.sequence[:self.position], 0)
    return (self.sequence[next_position:self.position],
            piece_meal(self.sequence, next_position))

  def __setstate__(self, state):
    #print "piece_meal:", state
    if (state[1] == 0):
      self.sequence = state[0]
    else:
      self.sequence = state[1].sequence + state[0]

class array:

  def __init__(self, n):
    self.elems = [i for i in xrange(n)]

  def __getstate__(self):
    return piece_meal(self.elems, len(self.elems))

  def __setstate__(self, state):
    self.elems = state.sequence

def exercise():
  for i in xrange(11):
    a = array(i)
    print a.elems
    s = pickle.dumps(a)
    b = pickle.loads(s)
    print b.elems

if (__name__ == "__main__"):
  exercise()


From nhodgson@bigpond.net.au  Mon Jul 29 14:52:40 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Mon, 29 Jul 2002 23:52:40 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com>
Message-ID: <00c601c23707$35819a20$3da48490@neil>

Scott Gilbert:

> You could easily implement a counting (recursive) mutex as described
> above, and it might be the case that throwing an exception on the length
> changing operations keeps the dead lock from occurring.  I'm still a bit
> confused though.

   Not as confused as I am. I don't think deadlocks or threads are that
relevant to me. The most likely situations in which I would use the buffer
interface are performing large I/O operations without copying, or
performing asynchronous I/O to load or save documents while continuing to
run styling or linting tasks. I think it's likely that the pieces of code
accessing the buffer will not be real threads, but will instead be
cooperating contexts within a single-threaded UI framework, so using
semaphores will not be possible.

>   1) With his PEP, there is a way to get the behavior you desire without
> adding the complexity to the core of Python.  And with recursive/counting
> mutexes, the behavior you want is getting more complicated.

   I don't want counting mutexes. I'm not defining behaviour that needs
them.

> The "safe
> buffer protocol" is likely to cater to a wide class of users.  I could be
> wrong, but the "lockable gapped buffer protocol" probably appeals to a
> much smaller set.

   It's not that a "lockable gapped buffer protocol" is needed. It is that
the problem with the old buffer was that the lifetime of the pointer is not
well defined. The proposal changes that by making the lifetime of the
pointer be the same as the underlying object. This restricts the set of
objects that can be buffers to statically sized objects. I'd prefer that
dynamically resizable objects be able to be buffers.

>   2) Any time you go from one lock (mutex, GIL, semaphore) to multiple
> locks, you can introduce deadlock states.

   My defined behaviour was "Upon receiving a lock call, it could collapse
the gap and return a stable pointer to its contents and then revert to its
normal behaviour on receiving an unlock". Where is a semaphore involved?
Without a semaphore (or equivalent) there can be no deadlock.
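
For comparison, the buffer protocol in current Python 3 (which postdates
this thread and is not the design proposed here) behaves much like the
lock/unlock scheme described above: while a memoryview is exported, a
bytearray refuses to change size, and releasing the view lifts the
restriction. A minimal sketch:

```python
buf = bytearray(b"abc")

view = memoryview(buf)        # "lock": the memory is now pinned
try:
    buf.extend(b"def")        # resizing while exported is refused
    resize_blocked = False
except BufferError:
    resize_blocked = True

view[0] = ord("A")            # writes through the view remain allowed
view.release()                # "unlock": give the pointer back

buf.extend(b"def")            # resizing works again
```

No semaphore is involved: a conflicting resize raises an exception at
once instead of blocking, so there is nothing to deadlock on.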

   Neil



From guido@python.org  Mon Jul 29 15:19:00 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 10:19:00 -0400
Subject: [Python-Dev] Re: Multiline string constants, include in the standard library?
In-Reply-To: Your message of "Mon, 29 Jul 2002 12:27:37 +0300."
 <3D450A89.7050400@swisslog.com>
References: <20020725194802.22949.82629.Mailman@mail.python.org> <3D40F62D.7000106@swisslog.com> <oq4rem34xo.fsf@carouge.sram.qc.ca> <3D44F892.6090401@swisslog.com> <3D4504B3.3030608@lemburg.com>
 <3D450A89.7050400@swisslog.com>
Message-ID: <200207291419.g6TEJ0m26497@pcp02138704pcs.reston01.va.comcast.net>

> > I think everybody has their own way of formatting multi-line
> > strings and/or comments. There's no one-fits-all strategy.
> 
> Yep, but having a standard solution available for one very sensible
> strategy would be nice.

Can you move this discussion to c.l.py please?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Mon Jul 29 15:34:46 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 16:34:46 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil>
Message-ID: <06f301c2370d$16941060$e000a8c0@thomasnotebook>

[Scott]
> > The "safe
> > buffer protocol" is likely to cater to a wide class of users.  I could be
> > wrong, but the "lockable gapped buffer protocol" probably appeals to a
> > much smaller set.
> 
[Neil]
>    It's not that a "lockable gapped buffer protocol" is needed. It is that
> the problem with the old buffer was that the lifetime of the pointer is not
> well defined. The proposal changes that by making the lifetime of the
> pointer be the same as the underlying object.

That's exactly what *I* need, ...

>  This restricts the set of
> objects that can be buffers to statically sized objects. I'd prefer that
> dynamically resizable objects be able to be buffers.
> 

..., but I understand Neil's requirements.

Can they be fulfilled by adding some kind of UnlockObject()
call to the 'safe buffer interface', which should mean 'I won't
use the pointer received by getsaferead/writebufferproc any more'?

Thomas




From guido@python.org  Mon Jul 29 16:00:51 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 11:00:51 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Fri, 26 Jul 2002 16:28:50 +0200."
 <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
References: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>
Message-ID: <200207291500.g6TF0pM26852@pcp02138704pcs.reston01.va.comcast.net>

Thomas,

I like your PEP.  Could you clean it up (changing 'large' into 'safe'
etc.) and send it to Barry?  Some comments:

> Backward Compatibility
> 
>     There are no backward compatibility problems.

That's a simplification of the truth -- you're adding two new fields
to an existing struct.  But the flag bit you add lets old and new
versions of the struct be distinguished.

>     It may be a good idea to expose the following convenience functions:
> 
>         int PyObject_AsSafeReadBuffer(PyObject *obj,
>                                       void **buffer,
>                                       size_t *buffer_len);
> 
>         int PyObject_AsSafeWriteBuffer(PyObject *obj,
>                                        void **buffer,
>                                        size_t *buffer_len);
> 
>     These functions return 0 on success, set buffer to the memory
>     location and buffer_len to the length of the memory block in
>     bytes. On failure, they return -1 and set an exception.

Please make these a mandatory part of the proposal.

Please also try to summarize the discussion so far here.  My personal
opinion: locking seems the wrong approach, given the danger of
deadlock; Scintilla can use the existing buffer protocol, assuming its
buffer doesn't move as long as you don't release the GIL and don't
make calls into Scintilla.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jul 29 16:09:22 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 11:09:22 -0400
Subject: [Python-Dev] pickling of large arrays
In-Reply-To: Your message of "Mon, 29 Jul 2002 06:02:00 PDT."
 <20020729130200.73932.qmail@web20201.mail.yahoo.com>
References: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
Message-ID: <200207291509.g6TF9MX26908@pcp02138704pcs.reston01.va.comcast.net>

> We are using Boost.Python to expose reference-counted C++ container
> types (similar to std::vector<>) to Python. E.g.:
> 
> from arraytbx import shared
> d = shared.double(1000000) # double array with a million elements
> c = shared.complex_double(100) # std::complex<double> array
> # and many more types, incl. several custom C++ types
> 
> We need a way to pickle these arrays. Since they can easily be
> converted to tuples we could just define functions like:
> 
>   def __getstate__(self):
>     return tuple(self)
> 
> However, since the arrays are potentially huge this could incur
> a large overhead (e.g. a tuple of a million Python floats).
> Next idea:
> 
>   def __getstate__(self):
>     return iter(self)
> 
> Unfortunately (but not unexpectedly) pickle is telling me:
> 'can't pickle iterator objects'
> 
> Attached is a short Python script (tested with 2.2.1) with a prototype
> implementation of a pickle helper ("piece_meal") for large arrays.

That's a neat trick; unfortunately, it only helps when the pickle is
being written directly to disk; when it is returned as a string, you
still get the entire array in memory.
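
One way to keep memory bounded when writing straight to a file is to emit
each block as its own pickle and read the blocks back until the file is
exhausted. The sketch below assumes a current Python 3 interpreter;
dump_in_blocks and load_blocks are hypothetical names, not part of the
pickle module or of the script quoted above.

```python
import io
import pickle

def dump_in_blocks(sequence, fileobj, block_size=4):
    """Write the sequence as a series of small independent pickles."""
    for start in range(0, len(sequence), block_size):
        pickle.dump(list(sequence[start:start + block_size]), fileobj)

def load_blocks(fileobj):
    """Yield the elements back, reading one block at a time."""
    while True:
        try:
            block = pickle.load(fileobj)
        except EOFError:
            return                        # no more pickles in the file
        for item in block:
            yield item

# io.BytesIO stands in for a real file opened in binary mode.
f = io.BytesIO()
dump_in_blocks(range(10), f)
f.seek(0)
restored = list(load_blocks(f))
```

Only one block is held as a Python list at any time on either side, at
the cost of the file no longer being a single pickle.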

> piece_meal's __getstate__ converts a block of a given size to a Python
> list and returns a tuple with that list and a new piece_meal instance
> which knows how to generate the next chunk. I.e. piece_meal instances
> are created recursively until the input sequence is exhausted. The
> corresponding __setstate__ puts the pieces back together again
> (uncomment the print statement to see the pieces).
> 
> I am wondering if a similar mechanism could be used to enable pickling
> of iterators, or maybe special "pickle_iterators", which would
> immediately enable pickling of our large arrays or any other object
> that can be iterated over (e.g. Numpy arrays which are currently
> pickled as potentially huge strings). Has this been discussed already?

I think pickling iterators is the wrong idea.  An iterator doesn't
represent data, it represents a single pass over data.  Iterators may
represent infinite series.
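
The single-pass nature is easy to see in a small sketch (current Python 3
assumed): once consumed, an iterator has nothing left to save, and pickle
refuses generator objects outright.

```python
import pickle

data = [1, 2, 3]
it = iter(data)

first_pass = list(it)     # consumes the iterator
second_pass = list(it)    # nothing left; the data itself is untouched

gen = (x * x for x in data)
try:
    pickle.dumps(gen)
    generator_pickled = True
except TypeError:
    generator_pickled = False
```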

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Mon Jul 29 16:51:25 2002
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 29 Jul 2002 11:51:25 -0400
Subject: [Python-Dev] pickling of large arrays
In-Reply-To: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
References: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
Message-ID: <20020729155125.GA5765@panix.com>

On Mon, Jul 29, 2002, Ralf W. Grosse-Kunstleve wrote:
>
> We need a way to pickle these arrays. 

See PEP 296 and read the back discussion on python-dev in the archives.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From guido@python.org  Mon Jul 29 17:05:49 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 12:05:49 -0400
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: Your message of "Fri, 26 Jul 2002 19:26:38 PDT."
 <20020727022638.86727.qmail@web40101.mail.yahoo.com>
References: <20020727022638.86727.qmail@web40101.mail.yahoo.com>
Message-ID: <200207291605.g6TG5o428945@pcp02138704pcs.reston01.va.comcast.net>

> Even if I'm wrong about the need for this, at the very least, the
> additional functionality can be added later.  I really just want to push
> through a simple, usable, bytes object for the time being.  We can easily
> add, we can't easily take away.

Hi Scott,

I've followed this discussion and it looks like the PEP is ready for
another round of refinements based upon the discussion (e.g. to use
size_t).  Do you have time to do that?

And then the next thing would be a prototype implementation.

I like where this is going!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From xscottg@yahoo.com  Mon Jul 29 17:39:13 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 09:39:13 -0700 (PDT)
Subject: [Python-Dev] PEP 296 - The Buffer Problem
In-Reply-To: <200207291605.g6TG5o428945@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729163913.46117.qmail@web40102.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
> 
> I've followed this discussion and it looks like the PEP is ready for
> another round of refinements based upon the discussion (e.g. to use
> size_t).  Do you have time to do that?
> 
> And then the next thing would be a prototype implementation.
> 
> I like where this is going!
> 

Very cool.  I'm glad to hear it.  I'll integrate the new changes to the
text tonight and post the next version to python-dev and comp.lang.python
tomorrow.  Implementation is in progress, but not far enough along that I
can swag a done date yet.  It shouldn't take too long, but like you
indicated before, I may need some help on doing the pickling correctly.


Cheers,
    -Scott



From mcherm@destiny.com  Mon Jul 29 17:42:08 2002
From: mcherm@destiny.com (Michael Chermside)
Date: Mon, 29 Jul 2002 12:42:08 -0400
Subject: [Python-Dev] Re: PEP 295 - Interpretation of multiline string constants
Message-ID: <3D457060.4060505@destiny.com>

> So...  What you (and others) think about just adding flag 'i' to string
> constants (that will strip indentation etc.)?  This doesn't affect
> existing code, but it will be useful (at least for me ;-)  Motivation
> was posted here by Michael Chermside, but I don't like his solutions.

Please understand that the motivation I posted was an attempt to 
describe YOUR possible motivation for desiring the change. I wouldn't 
like this feature, myself. I was just trying to point out that it could 
all be achieved with somewhere between 1 character and 5 lines worth of 
code.
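
As an illustration of how little code this takes, here is a minimal
sketch using textwrap.dedent from the standard library.  It is not
necessarily one of the solutions proposed earlier in this thread, and
the function name usage() is made up for the example.

```python
import textwrap


def usage():
    # The literal is indented to line up with the surrounding code;
    # dedent() strips the leading whitespace common to every line.
    text = """\
        Usage: prog [options]
          -h  show this help
        """
    return textwrap.dedent(text)


print(usage())
```

The relative indentation of the second line is kept; only the common
margin is removed.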

The solution to this (so-called) "problem" simply does not belong in the 
language itself, despite the fact that you don't like my solutions.

However, if you have a particular reason why you don't like these 
solutions, send me an email (don't CC the list), and I'll see if I can 
come up with a different solution you DO like.

-- Michael Chermside




From xscottg@yahoo.com  Mon Jul 29 17:45:32 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 09:45:32 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <00c601c23707$35819a20$3da48490@neil>
Message-ID: <20020729164532.48588.qmail@web40110.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> Scott Gilbert:
> 
> > You could easily implement a counting (recursive) mutex as
> > described above, and it might be the case that throwing an exception
> > on the length changing operations keeps the deadlock from occurring.
> > I'm still a bit confused though.
> 
>    Not as confused as I am. I don't think deadlocks or threads are that
> relevant to me. The most likely situation in which I would use the
> buffer interface is performing large I/O operations without copying, or
> performing asynchronous I/O to load or save documents while
> continuing to run styling or linting tasks. I think it's likely that the
> pieces of code accessing the buffer will not be real threads, but instead
> be cooperating contexts within a single-threaded UI framework so using
> semaphores will not be possible.
> 

What happens when you've locked the buffer and passed a pointer to the I/O
system for an asynchronous operation, but before that operation has
completed, your main program wants to resize the buffer due to a user
generated event?

I had written responses/questions to other parts of your message, but I
found that I was just asking the same question above over and over, so I've
chopped them out.  If you can explain this to me, and there aren't any
problems with deadlock or polling, then I'll quit interfering and let you
and Thomas decide if you really think the locking semantics are useful to a
wide enough audience that it should be included in the core.


> 
>    I don't want counting mutexes. I'm not defining behavior that needs
> them.
> 

You said you wanted the locks to keep a count, so that you could call
acquire() multiple times and have the buffer not truly become unlocked
until release() was called the same number of times.  I'm willing to adopt
any terminology you want for the purpose of this discussion.  I think I
understand the semantics of the counting operation, but I want to
understand more about what actually happens when the buffer is locked.
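
To pin down the counting semantics in question, here is a minimal
sketch.  The class CountedLock and its locked property are hypothetical
names, not part of any proposed interface, and the sketch deliberately
says nothing about what a locked buffer should refuse to do.

```python
class CountedLock:
    """Hypothetical lock count: unlocked only after matching releases."""

    def __init__(self):
        self._count = 0

    def acquire(self):
        # Every acquire() adds one outstanding hold on the buffer.
        self._count += 1

    def release(self):
        if self._count == 0:
            raise RuntimeError("release() without matching acquire()")
        self._count -= 1

    @property
    def locked(self):
        return self._count > 0


lock = CountedLock()
lock.acquire()
lock.acquire()
lock.release()
print(lock.locked)   # one acquire() is still outstanding
lock.release()
print(lock.locked)   # the counts now match
```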







From thomas.heller@ion-tof.com  Mon Jul 29 17:52:17 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 18:52:17 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <082b01c234b0$c33564e0$e000a8c0@thomasnotebook>  <200207291500.g6TF0pM26852@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <091701c23720$4c3c3310$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> Thomas,
> 
> I like your PEP.  Could you clean it up (changing 'large' into 'safe'
> etc.) and send it to Barry?  Some comments:
Great.
I have changed it per your requests, and also included Greg's and
Neil's points.

Thomas



From xscottg@yahoo.com  Mon Jul 29 17:54:19 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 09:54:19 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <06f301c2370d$16941060$e000a8c0@thomasnotebook>
Message-ID: <20020729165419.31643.qmail@web40111.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> 
> > This restricts the set of objects that can be buffers to statically
> > sized objects. I'd prefer that dynamically resizable objects be able to
> > be buffers.
> > 
> 
> ..., but I understand Neil's requirements.
> 
> Can they be fulfilled by adding some kind of UnlockObject()
> call to the 'safe buffer interface', which should mean 'I won't
> use the pointer received by getsaferead/writebufferproc any more'?
> 

I assume this means any call to getsafereadpointer()/getsafewritepointer()
will increment the lock count.  So the UnlockObject() calls will be
mandatory.  Either that, or you'll have an explicit LockObject() call as
well.  What behavior should happen when a resize is attempted while the
lock count is positive?






From thomas.heller@ion-tof.com  Mon Jul 29 18:03:30 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 19:03:30 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com>
Message-ID: <093b01c23721$dd908680$e000a8c0@thomasnotebook>

From: "Scott Gilbert" <xscottg@yahoo.com>
> 
> --- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> > 
> > > This restricts the set of objects that can be buffers to statically
> > > sized objects. I'd prefer that dynamically resizable objects be able to
> > > be buffers.
> > > 
> > 
> > ..., but I understand Neil's requirements.
> > 
> > Can they be fulfilled by adding some kind of UnlockObject()
> > call to the 'safe buffer interface', which should mean 'I won't
> > use the pointer received by getsaferead/writebufferproc any more'?
> > 
> 
> I assume this means any call to getsafereadpointer()/getsafewritepointer()
> will increment the lock count.  So the UnlockObject() calls will be
> mandatory.  Either that, or you'll have an explicit LockObject() call as
> well.  What behavior should happen when a resize is attempted while the
> lock count is positive?

This question is not difficult to answer;-) The resize should fail.
That's the only possibility.
Whether this can be handled robustly enough by the object is another
question.
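
For comparison only: the buffer protocol that later Python versions
adopted (PEP 3118) behaves this way for bytearray.  While a memoryview
exports the bytearray's memory, a resize raises BufferError.  The sketch
below shows that later behaviour; it is not the interface being
discussed in this thread.

```python
buf = bytearray(b"abc")
view = memoryview(buf)      # export a pointer into buf's memory

try:
    buf.extend(b"def")      # would have to reallocate the buffer
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()              # "I won't use the pointer any more"
buf.extend(b"def")          # the resize is allowed again

print(resized_while_exported, buf)
```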

Probably all of this is too complicated to be solved by the
safe buffer interface, and it should be left out?

Thomas



From guido@python.org  Mon Jul 29 18:03:55 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 13:03:55 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Mon, 29 Jul 2002 09:54:19 PDT."
 <20020729165419.31643.qmail@web40111.mail.yahoo.com>
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com>
Message-ID: <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>

> --- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> > 
> > > This restricts the set of objects that can be buffers to statically
> > > sized objects. I'd prefer that dynamically resizable objects be able to
> > > be buffers.
> > > 
> > 
> > ..., but I understand Neil's requirements.
> > 
> > Can they be fulfilled by adding some kind of UnlockObject()
> > call to the 'safe buffer interface', which should mean 'I won't
> > use the pointer received by getsaferead/writebufferproc any more'?
> > 
> 
> I assume this means any call to getsafereadpointer()/getsafewritepointer()
> will increment the lock count.  So the UnlockObject() calls will be
> mandatory.  Either that, or you'll have an explicit LockObject() call as
> well.  What behavior should happen when a resize is attempted while the
> lock count is positive?

I don't like where this is going.  Let's not add locking to the buffer
protocol.  If an object's buffer isn't allocated for the object's life
when the object is created, it should not support the "safe" version
of the protocol (maybe a different name would be better), and users
should not release the GIL while holding on to the pointer.  (Exactly
which other API calls are safe while using the pointer is not clear;
probably nothing that could possibly invoke the Python interpreter
recursively, since that might release the GIL.  This would generally
mean that calls to Py_DECREF() are unsafe while holding on to a buffer
pointer!)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Mon Jul 29 18:08:11 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 19:08:11 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com>  <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <095701c23722$84e06770$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
>   If an object's buffer isn't allocated for the object's life
> when the object is created, it should not support the "safe" version
> of the protocol (maybe a different name would be better), and users
> should not release the GIL while holding on to the pointer.

'Persistent' buffer interface? Too long?

Thomas



From oren-py-d@hishome.net  Mon Jul 29 18:08:24 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 29 Jul 2002 20:08:24 +0300
Subject: [Python-Dev] patch: try/finally in generators
Message-ID: <20020729200824.A5391@hishome.net>

http://www.python.org/sf/584626

This patch removes the limitation of not allowing yield in the try part
of a try/finally. The dealloc function of a generator checks if the 
generator is still alive and resumes it one last time from the return 
instruction at the end of the code, causing any try/finally blocks to be 
triggered. Any exceptions raised are treated just like exceptions in a
__del__ finalizer (printed and ignored).
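
For readers who want to see the behaviour being asked for, here is a
minimal sketch that runs on current Python, where yield inside
try/finally is allowed.  Current Python gets there by raising
GeneratorExit at the suspended yield, which is a different mechanism
from the one this patch describes.  The names reader and events are
made up for the example.

```python
events = []


def reader():
    try:
        events.append("opened")
        yield 1
        yield 2
    finally:
        # Runs even if the consumer stops iterating early.
        events.append("closed")


gen = reader()
next(gen)        # run up to the first yield, inside the try block
gen.close()      # finalize the suspended generator explicitly
print(events)
```

In CPython, dropping the last reference to the suspended generator
triggers the same cleanup as the explicit close() call.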

	Oren


From guido@python.org  Mon Jul 29 18:10:44 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 13:10:44 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Mon, 29 Jul 2002 19:08:11 +0200."
 <095701c23722$84e06770$e000a8c0@thomasnotebook>
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>
 <095701c23722$84e06770$e000a8c0@thomasnotebook>
Message-ID: <200207291710.g6THAin30057@pcp02138704pcs.reston01.va.comcast.net>

> >   If an object's buffer isn't allocated for the object's life
> > when the object is created, it should not support the "safe" version
> > of the protocol (maybe a different name would be better), and users
> > should not release the GIL while holding on to the pointer.
> 
> 'Persistent' buffer interface? Too long?

No, persistent typically refers to things that survive longer than a
process.  Maybe 'static' buffer interface would work.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Mon Jul 29 18:14:51 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 19:14:51 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>              <095701c23722$84e06770$e000a8c0@thomasnotebook>  <200207291710.g6THAin30057@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <098e01c23723$73867590$e000a8c0@thomasnotebook>

> > >   If an object's buffer isn't allocated for the object's life
> > > when the object is created, it should not support the "safe" version
> > > of the protocol (maybe a different name would be better), and users
> > > should not release the GIL while holding on to the pointer.
> > 
> > 'Persistent' buffer interface? Too long?
> 
> No, persistent typically refers to things that survive longer than a
> process.  Maybe 'static' buffer interface would work.
> 
Ahem, right.
Maybe Barry can change it before committing this?

Thomas



From guido@python.org  Mon Jul 29 18:34:01 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 13:34:01 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 20:08:24 +0300."
 <20020729200824.A5391@hishome.net>
References: <20020729200824.A5391@hishome.net>
Message-ID: <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net>

> http://www.python.org/sf/584626
> 
> This patch removes the limitation of not allowing yield in the try part
> of a try/finally. The dealloc function of a generator checks if the 
> generator is still alive and resumes it one last time from the return 
> instruction at the end of the code, causing any try/finally blocks to be 
> triggered. Any exceptions raised are treated just like exceptions in a
> __del__ finalizer (printed and ignored).

I'm not sure I understand what it does.  The return instruction at the
end of the code, if I take this literally, isn't enclosed in any
try/finally blocks.  So how can this have the desired effect?

Have you verified that Jython can implement these semantics too?

Do you *really* need this?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From DavidA@ActiveState.com  Mon Jul 29 19:01:46 2002
From: DavidA@ActiveState.com (David Ascher)
Date: Mon, 29 Jul 2002 11:01:46 -0700
Subject: [Python-Dev] python.org/switch/
References: <782E9B13-A26D-11D6-83B1-003065517236@oratrix.com>
Message-ID: <3D45830A.6090207@ActiveState.com>

Jack Jansen wrote:

> They're all pretty good, but I think I liked David best, he actually 
> seemed to mean what he said:-)

I_do_,_it's_why_you_haven't_seen_me_much_around_these_parts_recently...
--david
(those_who_saw_the_ad_may_understand_my_typing_oddities).



From xscottg@yahoo.com  Mon Jul 29 19:13:38 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 11:13:38 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <098e01c23723$73867590$e000a8c0@thomasnotebook>
Message-ID: <20020729181338.59568.qmail@web40107.mail.yahoo.com>

--- Thomas Heller and Guido wrote:

> > > >   If an object's buffer isn't allocated for the object's life
> > > > when the object is created, it should not support the "safe"
> > > > version of the protocol (maybe a different name would be better),
> > > > and users should not release the GIL while holding on to the pointer.
> > > 
> > > 'Persistent' buffer interface? Too long?
> > 
> > No, persistent typically refers to things that survive longer than a
> > process.  Maybe 'static' buffer interface would work.
> > 

I'll just chime in with the name "Fixed" Buffer Interface.  They aren't
really static either, and fixed applies in at least two senses.  :-)









From guido@python.org  Mon Jul 29 19:24:41 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 14:24:41 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Mon, 29 Jul 2002 11:13:38 PDT."
 <20020729181338.59568.qmail@web40107.mail.yahoo.com>
References: <20020729181338.59568.qmail@web40107.mail.yahoo.com>
Message-ID: <200207291824.g6TIOfq30468@pcp02138704pcs.reston01.va.comcast.net>

> > > > 'Persistent' buffer interface? Too long?
> > > 
> > > No, persistent typically refers to things that survive longer than a
> > > process.  Maybe 'static' buffer interface would work.
> 
> I'll just chime in with the name "Fixed" Buffer Interface.  They aren't
> really static either, and fixed applies in at least two senses.  :-)

Nice!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Mon Jul 29 19:36:56 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Mon, 29 Jul 2002 20:36:56 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729181338.59568.qmail@web40107.mail.yahoo.com>
Message-ID: <0a9e01c2372e$ea80fb60$e000a8c0@thomasnotebook>

From: "Scott Gilbert" <xscottg@yahoo.com>
> --- Thomas Heller and Guido wrote:
> 
> > > > >   If an object's buffer isn't allocated for the object's life
> > > > > when the object is created, it should not support the "safe"
> > > > > version of the protocol (maybe a different name would be better),
> > > > > and users should not release the GIL while holding on to the pointer.
> > > > 
> > > > 'Persistent' buffer interface? Too long?
> > > 
> > > No, persistent typically refers to things that survive longer than a
> > > process.  Maybe 'static' buffer interface would work.
> > > 
> 
> I'll just chime in with the name "Fixed" Buffer Interface.  They aren't
> really static either, and fixed applies in at least two senses.  :-)
> 

Yup. I'll change it.

Thanks,

Thomas



From barry@zope.com  Mon Jul 29 19:38:06 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Mon, 29 Jul 2002 14:38:06 -0400
Subject: [Python-Dev] PEP 1, PEP Purpose and Guidelines
Message-ID: <15685.35726.678832.241665@anthem.wooz.org>

It has been a while since I posted a copy of PEP 1 to the mailing
lists and newsgroups.  I've recently done some updating of a few
sections, so in the interest of gaining wider community participation
in the Python development process, I'm posting the latest revision of
PEP 1 here.  A version of the PEP is always available on-line at

    http://www.python.org/peps/pep-0001.html

Enjoy,
-Barry

-------------------- snip snip --------------------
PEP: 1
Title: PEP Purpose and Guidelines
Version: $Revision: 1.36 $
Last-Modified: $Date: 2002/07/29 18:34:59 $
Author: Barry A. Warsaw, Jeremy Hylton
Status: Active
Type: Informational
Created: 13-Jun-2000
Post-History: 21-Mar-2001, 29-Jul-2002


What is a PEP?

    PEP stands for Python Enhancement Proposal.  A PEP is a design
    document providing information to the Python community, or
    describing a new feature for Python.  The PEP should provide a
    concise technical specification of the feature and a rationale for
    the feature.

    We intend PEPs to be the primary mechanisms for proposing new
    features, for collecting community input on an issue, and for
    documenting the design decisions that have gone into Python.  The
    PEP author is responsible for building consensus within the
    community and documenting dissenting opinions.

    Because the PEPs are maintained as plain text files under CVS
    control, their revision history is the historical record of the
    feature proposal[1].
    

Kinds of PEPs

    There are two kinds of PEPs.  A standards track PEP describes a
    new feature or implementation for Python.  An informational PEP
    describes a Python design issue, or provides general guidelines or
    information to the Python community, but does not propose a new
    feature.  Informational PEPs do not necessarily represent a Python
    community consensus or recommendation, so users and implementors
    are free to ignore informational PEPs or follow their advice.


PEP Work Flow

    The PEP editor, Barry Warsaw <peps@python.org>, assigns numbers
    for each PEP and changes its status.

    The PEP process begins with a new idea for Python.  It is highly
    recommended that a single PEP contain a single key proposal or new
    idea.  The more focussed the PEP, the more successful it tends to
    be.  The PEP editor reserves the right to reject PEP proposals
    if they appear too unfocussed or too broad.  If in doubt, split
    your PEP into several well-focussed ones.

    Each PEP must have a champion -- someone who writes the PEP using
    the style and format described below, shepherds the discussions in
    the appropriate forums, and attempts to build community consensus
    around the idea.  The PEP champion (a.k.a. Author) should first
    attempt to ascertain whether the idea is PEP-able.  Small
    enhancements or patches often don't need a PEP and can be injected
    into the Python development work flow with a patch submission to
    the SourceForge patch manager[2] or feature request tracker[3].

    The PEP champion then emails the PEP editor <peps@python.org> with
    a proposed title and a rough, but fleshed out, draft of the PEP.
    This draft must be written in PEP style as described below.

    If the PEP editor approves, he will assign the PEP a number, label
    it as standards track or informational, give it status 'draft',
    and create and check-in the initial draft of the PEP.  The PEP
    editor will not unreasonably deny a PEP.  Reasons for denying PEP
    status include duplication of effort, being technically unsound,
    not providing proper motivation or addressing backwards
    compatibility, or not being in keeping with the Python philosophy.  The
    BDFL (Benevolent Dictator for Life, Guido van Rossum) can be
    consulted during the approval phase, and is the final arbitrator
    of the draft's PEP-ability.

    If a pre-PEP is rejected, the author may elect to take the pre-PEP
    to the comp.lang.python newsgroup (a.k.a. python-list@python.org
    mailing list) to help flesh it out, gain feedback and consensus
    from the community at large, and improve the PEP for
    re-submission.

    The author of the PEP is then responsible for posting the PEP to
    the community forums, and marshaling community support for it.  As
    updates are necessary, the PEP author can check in new versions if
    they have CVS commit permissions, or can email new PEP versions to
    the PEP editor for committing.

    Standards track PEPs consist of two parts, a design document and
    a reference implementation.  The PEP should be reviewed and
    accepted before a reference implementation is begun, unless a
    reference implementation will aid people in studying the PEP.
    Standards Track PEPs must include an implementation - in the form
    of code, patch, or URL to same - before they can be considered
    Final.

    PEP authors are responsible for collecting community feedback on a
    PEP before submitting it for review.  A PEP that has not been
    discussed on python-list@python.org and/or python-dev@python.org
    will not be accepted.  However, wherever possible, long open-ended
    discussions on public mailing lists should be avoided.  Strategies
    to keep the discussions efficient include setting up a separate
    SIG mailing list for the topic, having the PEP author accept
    private comments in the early design phases, etc.  PEP authors
    should use their discretion here.

    Once the authors have completed a PEP, they must inform the PEP
    editor that it is ready for review.  PEPs are reviewed by the BDFL
    and his chosen consultants, who may accept or reject a PEP or send
    it back to the author(s) for revision.

    Once a PEP has been accepted, the reference implementation must be
    completed.  When the reference implementation is complete and
    accepted by the BDFL, the status will be changed to `Final.'

    A PEP can also be assigned status `Deferred.'  The PEP author or
    editor can assign the PEP this status when no progress is being
    made on the PEP.  Once a PEP is deferred, the PEP editor can
    re-assign it to draft status.

    A PEP can also be `Rejected'.  Perhaps after all is said and done
    it was not a good idea.  It is still important to have a record of
    this fact.

    PEPs can also be replaced by a different PEP, rendering the
    original obsolete.  This is intended for Informational PEPs, where
    version 2 of an API can replace version 1.

    PEP work flow is as follows:

        Draft -> Accepted -> Final -> Replaced
          ^
          +----> Rejected
          v
        Deferred

    Some informational PEPs may also have a status of `Active' if they
    are never meant to be completed.  E.g. PEP 1.
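
    The work flow diagram can be read as a small table of allowed
    status changes.  The sketch below is one possible reading, not an
    official tool; the names TRANSITIONS and can_move are hypothetical,
    and treating Draft and Deferred as moving in both directions is an
    assumption based on the text above.

```python
# Allowed status changes, as read off the work flow diagram.
TRANSITIONS = {
    "Draft": {"Accepted", "Rejected", "Deferred"},
    "Deferred": {"Draft"},
    "Accepted": {"Final"},
    "Final": {"Replaced"},
    "Rejected": set(),
    "Replaced": set(),
    "Active": set(),     # e.g. PEP 1, never meant to be completed
}


def can_move(current, new):
    """Return True if a PEP may go from status `current` to `new`."""
    return new in TRANSITIONS.get(current, set())


print(can_move("Draft", "Accepted"))
print(can_move("Rejected", "Final"))
```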


What belongs in a successful PEP?

    Each PEP should have the following parts:

    1. Preamble -- RFC822 style headers containing meta-data about the
       PEP, including the PEP number, a short descriptive title
       (limited to a maximum of 44 characters), the names, and
       optionally the contact info for each author, etc.

    2. Abstract -- a short (~200 word) description of the technical
       issue being addressed.

    3. Copyright/public domain -- Each PEP must either be explicitly
       labelled as placed in the public domain (see this PEP as an
       example) or licensed under the Open Publication License[4].

    4. Specification -- The technical specification should describe
       the syntax and semantics of any new language feature.  The
       specification should be detailed enough to allow competing,
       interoperable implementations for any of the current Python
       platforms (CPython, Jython, Python .NET).

    5. Motivation -- The motivation is critical for PEPs that want to
       change the Python language.  It should clearly explain why the
       existing language specification is inadequate to address the
       problem that the PEP solves.  PEP submissions without
       sufficient motivation may be rejected outright.

    6. Rationale -- The rationale fleshes out the specification by
       describing what motivated the design and why particular design
       decisions were made.  It should describe alternate designs that
       were considered and related work, e.g. how the feature is
       supported in other languages.

       The rationale should provide evidence of consensus within the
       community and discuss important objections or concerns raised
       during discussion.

    7. Backwards Compatibility -- All PEPs that introduce backwards
       incompatibilities must include a section describing these
       incompatibilities and their severity.  The PEP must explain how
       the author proposes to deal with these incompatibilities.  PEP
       submissions without a sufficient backwards compatibility
       treatise may be rejected outright.

    8. Reference Implementation -- The reference implementation must
       be completed before any PEP is given status 'Final,' but it
       need not be completed before the PEP is accepted.  It is better
       to finish the specification and rationale first and reach
       consensus on it before writing code.

       The final implementation must include test code and
       documentation appropriate for either the Python language
       reference or the standard library reference.


PEP Template

    PEPs are written in plain ASCII text, and should adhere to a
    rigid style.  There is a Python script that parses this style and
    converts the plain text PEP to HTML for viewing on the web[5].
    PEP 9 contains a boilerplate[7] template you can use to get
    started writing your PEP.

    Each PEP must begin with an RFC822 style header preamble.  The
    headers must appear in the following order.  Headers marked with
    `*' are optional and are described below.  All other headers are
    required.

        PEP: <pep number>
        Title: <pep title>
        Version: <cvs version string>
        Last-Modified: <cvs date string>
        Author: <list of authors' real names and optionally, email addrs>
      * Discussions-To: <email address>
        Status: <Draft | Active | Accepted | Deferred | Final | Replaced>
        Type: <Informational | Standards Track>
      * Requires: <pep numbers>
        Created: <date created on, in dd-mmm-yyyy format>
      * Python-Version: <version number>
        Post-History: <dates of postings to python-list and python-dev>
      * Replaces: <pep number>
      * Replaced-By: <pep number>
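
    Because the preamble is a set of RFC822 style headers, it can be
    read with the standard email package.  The sketch below checks the
    presence and order of the headers listed above; it is not the
    pep2html.py script, and the names HEADERS, check_preamble and SAMPLE
    are hypothetical.  Required versus optional follows the `*' markers
    only.

```python
from email.parser import HeaderParser

# Header names in the order listed in the PEP; True marks required ones.
HEADERS = [
    ("PEP", True), ("Title", True), ("Version", True),
    ("Last-Modified", True), ("Author", True), ("Discussions-To", False),
    ("Status", True), ("Type", True), ("Requires", False),
    ("Created", True), ("Python-Version", False), ("Post-History", True),
    ("Replaces", False), ("Replaced-By", False),
]


def check_preamble(text):
    """Return a list of problems found in a PEP's header preamble."""
    msg = HeaderParser().parsestr(text)
    found = list(msg.keys())
    order = [name for name, _ in HEADERS]
    problems = []
    for name in found:
        if name not in order:
            problems.append("unknown header: " + name)
    for name, required in HEADERS:
        if required and name not in found:
            problems.append("missing header: " + name)
    # Compare the headers as found with the same headers in listed order.
    known = [name for name in found if name in order]
    if known != sorted(known, key=order.index):
        problems.append("headers out of order")
    return problems


SAMPLE = """\
PEP: 1
Title: PEP Purpose and Guidelines
Version: $Revision: 1.36 $
Last-Modified: $Date: 2002/07/29 18:34:59 $
Author: Barry A. Warsaw, Jeremy Hylton
Status: Active
Type: Informational
Created: 13-Jun-2000
Post-History: 21-Mar-2001, 29-Jul-2002

"""

print(check_preamble(SAMPLE))
```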

    The Author: header lists the names and optionally, the email
    addresses of all the authors/owners of the PEP.  The format of the
    author entry should be

        address@dom.ain (Random J. User)

    if the email address is included, and just

        Random J. User

    if the address is not given.  If there are multiple authors, each
    should be on a separate line following RFC 822 continuation line
    conventions.  Note that personal email addresses in PEPs will be
    obscured as a defense against spam harvesters.

    Standards track PEPs must have a Python-Version: header which
    indicates the version of Python that the feature will be released
    with.  Informational PEPs do not need a Python-Version: header.

    While a PEP is in private discussions (usually during the initial
    Draft phase), a Discussions-To: header will indicate the mailing
    list or URL where the PEP is being discussed.  No Discussions-To:
    header is necessary if the PEP is being discussed privately with
    the author, or on the python-list or python-dev email mailing
    lists.  Note that email addresses in the Discussions-To: header
    will not be obscured.

    Created: records the date that the PEP was assigned a number,
    while Post-History: is used to record the dates of when new
    versions of the PEP are posted to python-list and/or python-dev.
    Both headers should be in dd-mmm-yyyy format, e.g. 14-Aug-2001.
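
    A date in this format can be checked with datetime.strptime.  This
    is a minimal sketch; the function name parse_pep_date is
    hypothetical, and it assumes an English (or C) locale, since %b
    matches the locale's abbreviated month names.

```python
from datetime import datetime


def parse_pep_date(value):
    """Parse a dd-mmm-yyyy date such as 14-Aug-2001."""
    # strptime raises ValueError if the text does not match the format.
    return datetime.strptime(value.strip(), "%d-%b-%Y").date()


post_history = "21-Mar-2001, 29-Jul-2002"
dates = [parse_pep_date(part) for part in post_history.split(",")]
print(dates)
```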

    PEPs may have a Requires: header, indicating the PEP numbers that
    this PEP depends on.

    PEPs may also have a Replaced-By: header indicating that a PEP has
    been rendered obsolete by a later document; the value is the
    number of the PEP that replaces the current document.  The newer
    PEP must have a Replaces: header containing the number of the PEP
    that it rendered obsolete.


PEP Formatting Requirements

    PEP headings must begin in column zero and the initial letter of
    each word must be capitalized as in book titles.  Acronyms should
    be in all capitals.  The body of each section must be indented 4
    spaces.  Code samples inside body sections should be indented a
    further 4 spaces, and other indentation can be used as required to
    make the text readable.  You must use two blank lines between the
    last line of a section's body and the next section heading.

    You must adhere to the Emacs convention of adding two spaces at
    the end of every sentence.  You should fill your paragraphs to
    column 70, but under no circumstances should your lines extend
    past column 79.  If your code samples spill over column 79, you
    should rewrite them.

    Tab characters must never appear in the document at all.  A PEP
    should include the standard Emacs stanza included by example at
    the bottom of this PEP.
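
    Two of these rules are easy to check mechanically.  The sketch
    below flags tab characters and lines that extend past column 79; it
    is an illustration only, not part of pep2html.py, and the name
    check_format is hypothetical.

```python
def check_format(text, limit=79):
    """Yield (line_number, problem) pairs for basic PEP layout rules."""
    for number, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            yield number, "tab character"
        if len(line) > limit:
            yield number, "line longer than %d columns" % limit


sample = "Abstract\n\n    Fine line.\n\tTabbed line.\n" + "x" * 80 + "\n"
for number, problem in check_format(sample):
    print(number, problem)
```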

    A PEP must contain a Copyright section, and it is strongly
    recommended to put the PEP in the public domain.

    When referencing an external web page in the body of a PEP, you
    should include the title of the page in the text, with a
    footnote reference to the URL.  Do not include the URL in the body
    text of the PEP.  E.g.

        Refer to the Python Language web site [1] for more details.
        ...
        [1] http://www.python.org

    When referring to another PEP, include the PEP number in the body
    text, such as "PEP 1".  The title may optionally appear.  Add a
    footnote reference that includes the PEP's title and author.  It
    may optionally include the explicit URL on a separate line, but
    only in the References section.  Note that the pep2html.py script
    will calculate URLs automatically, e.g.:

            ...
            Refer to PEP 1 [7] for more information about PEP style
            ...

        References

            [7] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton
                http://www.python.org/peps/pep-0001.html

    If you decide to provide an explicit URL for a PEP, please use
    this as the URL template:

        http://www.python.org/peps/pep-xxxx.html

    PEP numbers in URLs must be padded with zeros from the left, so as
    to be exactly 4 characters wide; however, PEP numbers in text are
    never padded.
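
    A minimal sketch of the padding rule, assuming a hypothetical
    helper named pep_url (it is not part of pep2html.py):

```python
def pep_url(number):
    """Build an explicit PEP URL from the template given above."""
    # %04d pads with zeros on the left to a minimum width of 4.
    return "http://www.python.org/peps/pep-%04d.html" % number

url = pep_url(9)
```

    In body text the same PEP would still be written unpadded, as
    "PEP 9".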


Reporting PEP Bugs, or Submitting PEP Updates

    How you report a bug or submit a PEP update depends on several
    factors, such as the maturity of the PEP, the preferences of the
    PEP author, and the nature of your comments.  For the early draft
    stages of the PEP, it's probably best to send your comments and
    changes directly to the PEP author.  For more mature or finished
    PEPs, you may want to submit corrections to the SourceForge bug
    manager[6] or better yet, the SourceForge patch manager[2] so that
    your changes don't get lost.  If the PEP author is a SF developer,
    assign the bug/patch to him, otherwise assign it to the PEP
    editor.

    When in doubt about where to send your changes, please check first
    with the PEP author and/or PEP editor.

    PEP authors who are also SF committers can update the PEPs
    themselves by using "cvs commit" to commit their changes.
    Remember to also push the formatted PEP text out to the web by
    doing the following:

    % python pep2html.py -i NUM

    where NUM is the number of the PEP you want to push out.  See

    % python pep2html.py --help

    for details.


Transferring PEP Ownership

    It occasionally becomes necessary to transfer ownership of PEPs to
    a new champion.  In general, we'd like to retain the original
    author as a co-author of the transferred PEP, but that's really up
    to the original author.  A good reason to transfer ownership is
    because the original author no longer has the time or interest in
    updating it or following through with the PEP process, or has
    fallen off the face of the 'net (i.e. is unreachable or not
    responding to email).  A bad reason to transfer ownership is
    because you don't agree with the direction of the PEP.  We try to
    build consensus around a PEP, but if that's not possible, you can
    always submit a competing PEP.

    If you are interested in assuming ownership of a PEP, send a
    message asking to take over, addressed to both the original author
    and the PEP editor <peps@python.org>.  If the original author
    doesn't respond to email in a timely manner, the PEP editor will
    make a unilateral decision (it's not like such decisions can be
    reversed. :).


References and Footnotes

    [1] This historical record is available by the normal CVS commands
    for retrieving older revisions.  For those without direct access
    to the CVS tree, you can browse the current and past PEP revisions
    via the SourceForge web site at

    http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/python/nondist/peps/?cvsroot=python

    [2] http://sourceforge.net/tracker/?group_id=5470&atid=305470

    [3] http://sourceforge.net/tracker/?atid=355470&group_id=5470&func=browse

    [4] http://www.opencontent.org/openpub/

    [5] The script referred to here is pep2html.py, which lives in
        the same directory in the CVS tree as the PEPs themselves.
        Try "pep2html.py --help" for details.

        The URL for viewing PEPs on the web is
        http://www.python.org/peps/

    [6] http://sourceforge.net/tracker/?group_id=5470&atid=305470

    [7] PEP 9, Sample PEP Template
        http://www.python.org/peps/pep-0009.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:


From oren-py-d@hishome.net  Mon Jul 29 20:09:44 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 29 Jul 2002 22:09:44 +0300
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 29, 2002 at 01:34:01PM -0400
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729220944.A6113@hishome.net>

On Mon, Jul 29, 2002 at 01:34:01PM -0400, Guido van Rossum wrote:
> > http://www.python.org/sf/584626
> > 
> > This patch removes the limitation of not allowing yield in the try part
> > of a try/finally. The dealloc function of a generator checks if the 
> > generator is still alive and resumes it one last time from the return 
> > instruction at the end of the code, causing any try/finally blocks to be 
> > triggered. Any exceptions raised are treated just like exceptions in a
> > __del__ finalizer (printed and ignored).
> 
> I'm not sure I understand what it does.  The return instruction at the
> end of the code, if I take this literally, isn't enclosed in any
> try/finally blocks.  So how can this have the desired effect?

They're on the block stack.  The stack unwind does the rest.
 
> Have you verified that Jython can implement these semantics too?

I don't see why not. The trick of jumping to the end was just my way to
avoid adding a flag or some magic value to signal to eval_frame that it 
needs to trigger the block stack unwind on ceval.c:2201.  There must be 
many other ways to implement this.
 
> Do you *really* need this?

I'm a plumber.  I make pipelines by chaining iterators and transformations.  
My favorite fittings are generator functions and closures so I rarely need 
to actually define a class.  One of my generator functions needed to clean 
up some stuff so I naturally used a try/finally block. When the compiler 
complained I recalled that when I first read with excitement about generator 
functions there was a comment there about some arbitrary limitation of yield 
statements in try/finally blocks...  

Anyway, I ended up creating a temporary local object just so I could take 
advantage of its __del__ method for cleanup but I really didn't like it. 
After a quick look at ceval.c I realized that it would be easy to fix this 
by having the dealloc function simulate a return statement just after the 
yield that was never resumed. So I wrote a little patch to remove something 
that I consider a wart.

	Oren


  Teaser: coming soon on the dataflow library! transparent two-way 
  interoperability between iterators and unix pipes! 



From guido@python.org  Mon Jul 29 20:30:34 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 15:30:34 -0400
Subject: [Python-Dev] HAVE_CONFIG_H
Message-ID: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>

I see no references to HAVE_CONFIG_H in the source code (except one
#undef in readline.c), yet we #define it on the command line.  Is that
still necessary?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jul 29 20:40:01 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 15:40:01 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 22:09:44 +0300."
 <20020729220944.A6113@hishome.net>
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net>
 <20020729220944.A6113@hishome.net>
Message-ID: <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>

> > > http://www.python.org/sf/584626
> > > 
> > > This patch removes the limitation of not allowing yield in the
> > > try part of a try/finally. The dealloc function of a generator
> > > checks if the generator is still alive and resumes it one last
> > > time from the return instruction at the end of the code, causing
> > > any try/finally blocks to be triggered. Any exceptions raised
> > > are treated just like exceptions in a __del__ finalizer (printed
> > > and ignored).
> > 
> > I'm not sure I understand what it does.  The return instruction at
> > the end of the code, if I take this literally, isn't enclosed in
> > any try/finally blocks.  So how can this have the desired effect?
> 
> They're on the block stack.  The stack unwind does the rest.

OK.  Your way to find the last return statement gives me the willies
though. :-(

> > Have you verified that Jython can implement these semantics too?
> 
> I don't see why not. The trick of jumping to the end was just my way
> to avoid adding a flag or some magic value to signal to eval_frame
> that it needs to trigger the block stack unwind on ceval.c:2201.
> There must be many other ways to implement this.

Please go to the Jython developers and ask their opinion.
Implementing yield in Java is a bit of a hack, and we've been careful
to make it possible at all.  I don't want to break it.

Of course, since Jython has garbage collection, your finally clause
may be executed later than you had expected it, or not at all!  Are
you sure you want this?  I don't recall all the reasons why this
restriction was added to the PEP, but I believe it wasn't just because
we couldn't figure out how to implement it -- it also had to do with
not being able to explain what exactly the semantics would be.

> > Do you *really* need this?
> 
> I'm a plumber.  I make pipelines by chaining iterators and
> transformations.  My favorite fittings are generator functions and
> closures so I rarely need to actually define a class.  One of my
> generator functions needed to clean up some stuff so I naturally
> used a try/finally block. When the compiler complained I recalled
> that when I first read with excitement about generator functions
> there was a comment there about some arbitrary limitation of yield
> statements in try/finally blocks...
> 
> Anyway, I ended up creating a temporary local object just so I could
> take advantage of its __del__ method for cleanup but I really didn't
> like it.  After a quick look at ceval.c I realized that it would be
> easy to fix this by having the dealloc function simulate a return
> statement just after the yield that was never resumed. So I wrote a
> little patch to remove something that I consider a wart.

There are a few other places that invoke Python code in a dealloc
handler (__del__ invocations in classobject.c and typeobject.c).  They
do a more complicated dance with the reference count.  Can you check
that you are doing the right thing?

I'd also like to get Neil Schemenauer's review of the code, since he
knows best how generators work under the covers.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Jul 29 20:59:06 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jul 2002 21:59:06 +0200
Subject: [Python-Dev] HAVE_CONFIG_H
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3D459E8A.1050602@lemburg.com>

Guido van Rossum wrote:
> I see no references to HAVE_CONFIG_H in the source code (except one
> #undef in readline.c), yet we #define it on the command line.  Is that
> still necessary?

What about these?

./Mac/mwerks/old/mwerks_nsgusi_config.h:
-- define HAVE_CONFIG_H
./Mac/mwerks/old/mwerks_tk_config.h:
-- define HAVE_CONFIG_H
./Mac/mwerks/old/mwerks_shgusi_config.h:
-- define HAVE_CONFIG_H
./Modules/expat/xmlparse.c:
-- #ifdef HAVE_CONFIG_H
./Modules/expat/xmltok.c:
-- #ifdef HAVE_CONFIG_H
./Modules/expat/xmlrole.c:
-- #ifdef HAVE_CONFIG_H

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From guido@python.org  Mon Jul 29 21:06:57 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:06:57 -0400
Subject: [Python-Dev] HAVE_CONFIG_H
In-Reply-To: Your message of "Mon, 29 Jul 2002 21:59:06 +0200."
 <3D459E8A.1050602@lemburg.com>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <3D459E8A.1050602@lemburg.com>
Message-ID: <200207292006.g6TK6wq06015@pcp02138704pcs.reston01.va.comcast.net>

> > I see no references to HAVE_CONFIG_H in the source code (except one
> > #undef in readline.c), yet we #define it on the command line.  Is that
> > still necessary?
> 
> What about these?

> ./Mac/mwerks/old/mwerks_nsgusi_config.h:
> -- define HAVE_CONFIG_H
> ./Mac/mwerks/old/mwerks_tk_config.h:
> -- define HAVE_CONFIG_H
> ./Mac/mwerks/old/mwerks_shgusi_config.h:
> -- define HAVE_CONFIG_H

I don't have a directory Mac/mwerks/old/.  Maybe you created this
yourself?

> ./Modules/expat/xmlparse.c:
> -- #ifdef HAVE_CONFIG_H
> ./Modules/expat/xmltok.c:
> -- #ifdef HAVE_CONFIG_H
> ./Modules/expat/xmlrole.c:
> -- #ifdef HAVE_CONFIG_H

We don't pass HAVE_CONFIG_H to extension modules, only to the core
(stuff built directly by the Makefile, not by setup.py).  That's a
good thing too, because these include <config.h>, not "pyconfig.h".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jul 29 21:09:05 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:09:05 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 20:08:24 +0300."
 <20020729200824.A5391@hishome.net>
References: <20020729200824.A5391@hishome.net>
Message-ID: <200207292009.g6TK95i06131@pcp02138704pcs.reston01.va.comcast.net>

> http://www.python.org/sf/584626
> 
> This patch removes the limitation of not allowing yield in the try part
> of a try/finally. The dealloc function of a generator checks if the 
> generator is still alive and resumes it one last time from the return 
> instruction at the end of the code, causing any try/finally blocks to be 
> triggered. Any exceptions raised are treated just like exceptions in a
> __del__ finalizer (printed and ignored).

Try building Python in debug mode, and then run the test suite.  I get
a fatal error in test_generators (but not when that test is run in
isolation):

Fatal Python error: ../Python/ceval.c:2256 object at 0x40b05654 has negative ref count -1

--Guido van Rossum (home page: http://www.python.org/~guido/)


From oren-py-d@hishome.net  Mon Jul 29 21:14:26 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 29 Jul 2002 23:14:26 +0300
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 29, 2002 at 03:40:01PM -0400
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729231426.A7209@hishome.net>

On Mon, Jul 29, 2002 at 03:40:01PM -0400, Guido van Rossum wrote:
> > > I'm not sure I understand what it does.  The return instruction at
> > > the end of the code, if I take this literally, isn't enclosed in
> > > any try/finally blocks.  So how can this have the desired effect?
> > 
> > They're on the block stack.  The stack unwind does the rest.
> 
> OK.  Your way to find the last return statement gives me the willies
> though. :-(

Yeah, I know. I'm not too proud of it but I was looking for instant 
gratification...

> Of course, since Jython has garbage collection, your finally clause
> may be executed later than you had expected it, or not at all!  Are
> you sure you want this?  

The same question applies to the __del__ method of any local variables 
inside the suspended generator.  I tend to rely on the reference counting
semantics of CPython in much of my code and I don't feel bad about it.

> There are a few other places that invoke Python code in a dealloc
> handler (__del__ invocations in classobject.c and typeobject.c).  They
> do a more complicated dance with the reference count.  Can you check
> that you are doing the right thing?

The __del__ method gets a reference to the object so it needs to be
revived.  Generators are much simpler because the generator function does 
not have any reference to the generator object.

	Oren



From nas@python.ca  Mon Jul 29 21:25:15 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 29 Jul 2002 13:25:15 -0700
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>; from guido@python.org on Mon, Jul 29, 2002 at 03:40:01PM -0400
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20020729132515.A31926@glacier.arctrix.com>

Guido van Rossum wrote:
> I'd also like to get Neil Schemenauer's review of the code, since he
> knows best how generators work under the covers.

I'm pretty sure it can be made to work (at least for CPython).  The
proposed patch is not correct since it doesn't handle "finally" code
that creates a new reference to the generator.  Also, setting the
instruction pointer to the return statement is really ugly, IMO.  There
could be valid code out there that does not end with LOAD_CONST+RETURN.

Those are minor details though.  We need to decide if we really want
this.  For example, what happens if 'yield' is inside the finally block?
With the proposed patch:

    >>> def f():
    ...   try:
    ...     assert 0
    ...   finally:
    ...     return 1
    ... 
    >>> f()
    1
    >>> def g():
    ...   try:
    ...     assert 0
    ...   finally:
    ...     yield 1
    ... 
    >>> list(g())
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 3, in g
    AssertionError

Maybe some people would expect [1] in the second case.

  Neil


From guido@python.org  Mon Jul 29 21:21:07 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:21:07 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 23:14:26 +0300."
 <20020729231426.A7209@hishome.net>
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>
 <20020729231426.A7209@hishome.net>
Message-ID: <200207292021.g6TKL7u06204@pcp02138704pcs.reston01.va.comcast.net>

> Yeah, I know. I'm not too proud of it but I was looking for instant 
> gratification...

The search for instant gratification probably ties a lot of the Python
community together...

> > Of course, since Jython has garbage collection, your finally clause
> > may be executed later than you had expected it, or not at all!  Are
> > you sure you want this?  
> 
> The same question applies to the __del__ method of any local variables 
> inside the suspended generator.  I tend to rely on the reference counting
> semantics of CPython in much of my code and I don't feel bad about it.

But __del__ is in essence asynchronous.  On the other hand,
try/finally is traditionally completely synchronous.  Adding a case
where a finally clause can execute asynchronously (or not at all,
if there is a global ref or cyclical garbage keeping the generator
alive) sounds like a breach of promise almost.

> > There are a few other places that invoke Python code in a dealloc
> > handler (__del__ invocations in classobject.c and typeobject.c).  They
> > do a more complicated dance with the reference count.  Can you check
> > that you are doing the right thing?
> 
> The __del__ method gets a reference to the object so it needs to be
> revived.  Generators are much simpler because the generator function does 
> not have any reference to the generator object.

But you still have to be careful with how you incref/decref -- see my
fatal error report in debug mode.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jul 29 21:30:36 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:30:36 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 13:25:15 PDT."
 <20020729132515.A31926@glacier.arctrix.com>
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net>
 <20020729132515.A31926@glacier.arctrix.com>
Message-ID: <200207292030.g6TKUaW06234@pcp02138704pcs.reston01.va.comcast.net>

> I'm pretty sure it can be made to work (at least for CPython).  The
> proposed patch is not correct since it doesn't handle "finally" code
> that creates a new reference to the generator.

As Oren pointed out, how can you create a reference to the generator
when its reference count was 0?  There can't be a global referencing
it, and (unlike __del__) you aren't getting a pointer to yourself.

> Also, setting the instruction pointer to the return statement is
> really ugly, IMO.

Agreed. ;-)

> There could be valid code out there that does not end with
> LOAD_CONST+RETURN.

The current code generator always generates that as the final
instruction.  But someone might add an optimizer that takes that out
if it is provably unreachable...

> Those are minor details though.  We need to decide if we really want
> this.  For example, what happens if 'yield' is inside the finally block?
> With the proposed patch:
> 
>     >>> def f():
>     ...   try:
>     ...     assert 0
>     ...   finally:
>     ...     return 1
>     ... 
>     >>> f()
>     1
>     >>> def g():
>     ...   try:
>     ...     assert 0
>     ...   finally:
>     ...     yield 1
>     ... 
>     >>> list(g())
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "<stdin>", line 3, in g
>     AssertionError
> 
> Maybe some people would expect [1] in the second case.

The latter is not new; that example has no yield in the try clause.
If you'd used a for loop or next() calls, you'd have noticed the yield
got executed normally, but the following next() call raises
AssertionError.
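
A small sketch of that point, stepping the generator by hand instead
of using list().  It is written with the next() builtin of later
Python versions (in 2.2 the spelling would be it.next()), and an
explicit raise stands in for "assert 0" so the result does not depend
on assertions being enabled:

```python
def g():
    try:
        # Stand-in for "assert 0" in the example above.
        raise AssertionError("failure inside the try clause")
    finally:
        # A yield in the finally clause is already legal.
        yield 1

it = g()
first = next(it)      # the yield in the finally clause runs normally

try:
    next(it)          # resuming lets the finally clause finish...
    reraised = False
except AssertionError:
    reraised = True   # ...and the pending exception then propagates
```

So list(g()) never gets to return [1]: the value is produced, but the
exception arrives before the list is complete.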

But this example behaves strangely:

>>> def f():
...  try:
...   yield 1
...   assert 0
...  finally:
...   yield 2
... 
>>> a = f()
>>> a.next()
1
>>> del a
>>>

What happens at the yield here?!?!  If I put prints before and after
it, the finally clause is entered, but not exited.  Bizarre!!!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@python.ca  Mon Jul 29 21:41:12 2002
From: nas@python.ca (Neil Schemenauer)
Date: Mon, 29 Jul 2002 13:41:12 -0700
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <20020729132515.A31926@glacier.arctrix.com>; from nas@python.ca on Mon, Jul 29, 2002 at 01:25:15PM -0700
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com>
Message-ID: <20020729134112.B31926@glacier.arctrix.com>

I wrote:
> The proposed patch is not correct since it doesn't handle "finally"
> code that creates a new reference to the generator.

It looks like that's not actually a problem since you can't get a hold
of a reference to the generator.  However, here's another bit of
nastiness:

    $ cat > bad.py
    import sys
    import gc

    def g():
        global gen
        self = gen
        try:
            yield 1
        finally:
            gen = self
            
    gen = g()
    gen.next()
    del gen
    gc.collect()
    print gen
    $ ./python bad.py
    Segmentation fault (core dumped)

Basically, the GC has to be taught that generators can have finalizers
and it may not be safe to collect them.  If we allow try/finally in
generators then they can cause uncollectible garbage.  It's not a show
stopper but something else to take into consideration.

  Neil


From guido@python.org  Mon Jul 29 21:38:12 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 16:38:12 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 13:41:12 PDT."
 <20020729134112.B31926@glacier.arctrix.com>
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com>
 <20020729134112.B31926@glacier.arctrix.com>
Message-ID: <200207292038.g6TKcC806273@pcp02138704pcs.reston01.va.comcast.net>

> Basically, the GC has to be taught that generators can have finalizers
> and it may not be safe to collect them.  If we allow try/finally in
> generators then they can cause uncollectible garbage.  It's not a show
> stopper but something else to take into consideration.

I leave this in your capable hands.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Mon Jul 29 22:42:50 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 29 Jul 2002 17:42:50 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: <20020729134112.B31926@glacier.arctrix.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHDAIAB.tim.one@comcast.net>

[Neil Schemenauer]
> ...
> Basically, the GC has to be taught that generators can have finalizers
> and it may not be safe to collect them.  If we allow try/finally in
> generators

Note that we already allow try/finally in generators.  The only prohibition
is against having a yield stmt in the try clause of a try/finally construct
(YINTCOATFC).

> then they can cause uncollectible garbage.  It's not a show
> stopper but something else to take into consideration.

I'm concerned about semantic clarity.  A "finally" block is supposed to get
executed upon leaving its associated "try" block.  A yield stmt doesn't
leave the try block in that sense, so there's no justification for executing
the finally block unless the generator is resumed, and the try block is
exited "for real" via some other means (a return, an exception, or falling
off the end of the try block).  We could have allowed YINTCOATFC under those
rules with clarity, but it would have been a great surprise then that the
finally clause may never get executed at all.  Better to outlaw it than that
(or, as the PEP says, that would be "too much a violation of finally's
purpose to bear").

Making up new control flow out of thin air upon destructing a generator
("OK, let's pretend that the generator was actually resumed in that case,
and also pretend that a return statement immediately followed the yield") is
plainly a hack; and because it's still possible then that the finally clause
may never get executed at all (because it's possible to create an
uncollectible generator), it's too much a violation of finally's purpose to
bear even so.

When I've needed resource-cleanup in a generator, I've made the generator a
method of a class, and put the resources in instance variables.  Then
they're easy to clean up at will (even via a __del__ method, if need be; but
the uncertainty about when and whether __del__ methods get called is already
well-known, and I don't want to extend that fuzziness to 'finally' clauses
too -- we left those reliable against anything short of a system crash, and
IMO it's important to keep them that bulletproof).
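
A rough sketch of that pattern, under assumptions: the class name
LineSource, its attributes, and the close() method are hypothetical,
and a plain list stands in for a real resource such as a file or
socket.  It uses the next() builtin of later Python versions.

```python
class LineSource(object):
    """Hypothetical example: the resource lives on the instance."""

    def __init__(self, lines):
        self.resource = list(lines)   # stand-in for a file, socket, ...
        self.closed = False

    def __iter__(self):
        # Generator method; no try/finally is wrapped around the yield.
        for line in self.resource:
            yield line.upper()

    def close(self):
        # Explicit cleanup that the caller can invoke at will.
        self.resource = None
        self.closed = True

src = LineSource(["a", "b", "c"])
it = iter(src)
first = next(it)   # consume only part of the stream
src.close()        # cleanup does not depend on the generator finishing
```

The point of the design is that cleanup is reachable from outside the
suspended generator, so it never depends on when, or whether, the
generator object is destroyed.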



From guido@python.org  Mon Jul 29 23:01:03 2002
From: guido@python.org (Guido van Rossum)
Date: Mon, 29 Jul 2002 18:01:03 -0400
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Your message of "Mon, 29 Jul 2002 17:42:50 EDT."
 <LNBBLJKPBEHFEDALKOLCKEHDAIAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKEHDAIAB.tim.one@comcast.net>
Message-ID: <200207292201.g6TM14G06652@pcp02138704pcs.reston01.va.comcast.net>

[Tim]
> Note that we already allow try/finally in generators.  The only
> prohibition is against having a yield stmt in the try clause of a
> try/finally construct (YINTCOATFC).
> 
[Neil]
> > then they can cause uncollectible garbage.  It's not a show
> > stopper but something else to take into consideration.
> 
> I'm concerned about semantic clarity.  A "finally" block is supposed
> to get executed upon leaving its associated "try" block.  A yield
> stmt doesn't leave the try block in that sense, so there's no
> justification for executing the finally block unless the generator
> is resumed, and the try block is exited "for real" via some other
> means (a return, an exception, or falling off the end of the try
> block).  We could have allowed YINTCOATFC under those rules with
> clarity, but it would have been a great surprise then that the
> finally clause may never get executed at all.  Better to outlaw it
> than that (or, as the PEP says, that would be "too much a violation
> of finally's purpose to bear").
> 
> Making up new control flow out of thin air upon destructing a
> generator ("OK, let's pretend that the generator was actually
> resumed in that case, and also pretend that a return statement
> immediately followed the yield") is plainly a hack; and because it's
> still possible then that the finally clause may never get executed
> at all (because it's possible to create an uncollectible generator),
> it's too much a violation of finally's purpose to bear even so.
> 
> When I've needed resource-cleanup in a generator, I've made the
> generator a method of a class, and put the resources in instance
> variables.  Then they're easy to clean up at will (even via a
> __del__ method, if need be; but the uncertainty about when and
> whether __del__ methods get called is already well-known, and I
> don't want to extend that fuzziness to 'finally' clauses too -- we
> left those reliable against anything short of a system crash, and
> IMO it's important to keep them that bulletproof).

I hope that Oren will withdraw his patch based upon this explanation.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Mon Jul 29 23:46:03 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 00:46:03 +0200
Subject: [Python-Dev] pickling of large arrays
In-Reply-To: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
References: <20020729130200.73932.qmail@web20201.mail.yahoo.com>
Message-ID: <m3k7nekv9w.fsf@mira.informatik.hu-berlin.de>

"Ralf W. Grosse-Kunstleve" <rwgk@yahoo.com> writes:

> We are using Boost.Python to expose reference-counted C++ container
> types (similar to std::vector<>) to Python. E.g.:
> 
> from arraytbx import shared
> d = shared.double(1000000) # double array with a million elements
> c = shared.complex_double(100) # std::complex<double> array
> # and many more types, incl. several custom C++ types

I recommend implementing pickling differently, e.g. by returning a
byte string with the underlying memory representation. If producing a
duplicate is still not acceptable, I recommend inheriting from the
Pickler class.

Regards,
Martin


From tim.one@comcast.net  Mon Jul 29 23:49:01 2002
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 29 Jul 2002 18:49:01 -0400
Subject: [Python-Dev] test_imaplib failing elsewhere?
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHJAIAB.tim.one@comcast.net>

On Windows:

> python ../lib/test/test_imaplib.py
incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
incorrect result when converting '"18-May-2033 13:33:20 +1000"'
>

IOW, it tries two things, and fails on both.

Beefing up its

    if t1 <> t2:
        print 'incorrect result when converting', `t`

by adding

        print '                          t1 was', `t1`
        print '                          t2 was', `t2`

yields

incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
                          t1 was '"18-May-2033 03:33:20 -0500"'
                          t2 was '"18-May-2033 04:33:20 -0400"'
incorrect result when converting '"18-May-2033 13:33:20 +1000"'
                          t1 was '"18-May-2033 13:33:20 +1000"'
                          t2 was '"17-May-2033 23:33:20 -0400"'

I'm not sure when it started failing, but within the last week ... OK, rev
1.3 of test_imaplib.py worked here, and rev 1.4 broke it, checked in 2-3
days ago.



From pinard@iro.umontreal.ca  Tue Jul 30 00:05:56 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 29 Jul 2002 19:05:56 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <15685.35726.678832.241665@anthem.wooz.org>
References: <15685.35726.678832.241665@anthem.wooz.org>
Message-ID: <oqznwaqgmj.fsf@titan.progiciels-bpi.ca>

[Barry A. Warsaw]

> It has been a while since I posted a copy of PEP 1 to the mailing
> lists and newsgroups.

Thanks for giving me this opportunity.  There is a tiny detail that
bothers me:

>     The format of the author entry should be
>         address@dom.ain (Random J. User)
>     if the email address is included, and just
>         Random J. User
>     if the address is not given.

This takes me back fifteen years or so (I do not remember exactly), to
the time of the great push for the Internet to prefer:

       Random J. User <address@dom.ain>

It is more reasonable to always give the real name, optionally followed by
an email, than to consider that the real name is a mere comment for the
email address.  Oh, I know some hackers who praise themselves as login
names or dream having positronic brains :-), but most of us are humans
before anything else!

Could the PEP be reformulated, at least, to leave the choice open?

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From martin@v.loewis.de  Mon Jul 29 23:52:57 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 00:52:57 +0200
Subject: [Python-Dev] HAVE_CONFIG_H
In-Reply-To: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I see no references to HAVE_CONFIG_H in the source code (except one
> #undef in readline.c), yet we #define it on the command line.  Is that
> still necessary?

It's autoconf tradition to use that; it would replace DEFS to either
many -D options, or -DHAVE_CONFIG_H (if AC_CONFIG_HEADER appears).

I don't think we need this, and it can safely be removed.

Regards,
Martin


From martin@v.loewis.de  Tue Jul 30 00:22:44 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 01:22:44 +0200
Subject: [Python-Dev] test_imaplib failing elsewhere?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEHJAIAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCGEHJAIAB.tim.one@comcast.net>
Message-ID: <m34rei5dbv.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> On Windows:
> 
> > python ../lib/test/test_imaplib.py
> incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
> incorrect result when converting '"18-May-2033 13:33:20 +1000"'
> >
> 
> IOW, it tries two things, and fails on both.

It fails on Linux and Solaris as well.

Regards,
Martin


From pinard@iro.umontreal.ca  Tue Jul 30 00:30:30 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 29 Jul 2002 19:30:30 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
Message-ID: <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>

[Martin v. Loewis]

> Guido van Rossum <guido@python.org> writes:

> > I see no references to HAVE_CONFIG_H in the source code (except one
> > #undef in readline.c), yet we #define it on the command line.  Is that
> > still necessary?

> It's autoconf tradition to use that; it would replace DEFS to either
> many -D options, or -DHAVE_CONFIG_H (if AC_CONFIG_HEADER appears).

> I don't think we need this, and it can safely be removed.

The many `-D' options which appear when `AC_CONFIG_HEADER' is not used
are rather inelegant, they create a lot, really a lot of clumsiness in
`make' output.  The idea, but you surely know it, was to regroup all
auto-configured definitions into a single header file, and limit the `-D'
to the sole `HAVE_CONFIG_H', or almost.  While the:

#if HAVE_CONFIG_H
# include <config.h>
#endif

idiom, for some widely used sources, was to cope with `AC_CONFIG_HEADER'
being defined in some projects, and not in others.  There is no need to
include `config.h', nor to create it, if all `#define's have been already
done through a litany of `-D' options.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From nhodgson@bigpond.net.au  Tue Jul 30 00:37:18 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Tue, 30 Jul 2002 09:37:18 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook>
Message-ID: <029801c23758$e13594b0$3da48490@neil>

Thomas Heller:

> ..., but I understand Neil's requirements.
> 
> Can they be fulfilled by adding some kind of UnlockObject()
> call to the 'safe buffer interface', which should mean 'I won't
> use the pointer received by getsaferead/writebufferproc any more'?

   Yes, that is exactly what I want.

   Neil




From nhodgson@bigpond.net.au  Tue Jul 30 00:50:43 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Tue, 30 Jul 2002 09:50:43 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729164532.48588.qmail@web40110.mail.yahoo.com>
Message-ID: <02a201c2375a$c1299f70$3da48490@neil>

Scott Gilbert:

> What happens when you've locked the buffer and passed a pointer to the I/O
> system for an asynchronous operation, but before that operation has
> completed, your main program wants to resize the buffer due to a user
> generated event?

   That is up to the application or class designer. There are three
reasonable responses I see: throw an exception, buffer the user event, or
ignore the user event. The only thing guaranteed by providing the safe
buffer interface is that the pointer will remain valid.

> >    I don't want counting mutexes. I'm not defining behavior that needs
> > them.
> >
>
> You said you wanted the locks to keep a count.  So that you could call
> acquire() multiple times and have the buffer not truly become unlocked
> until release() was called the same number of times.  I'm willing to adopt
> any terminology you want for the purpose of this discussion.  I think I
> understand the semantics of the counting operation, but I want to
> understand more what actually happens when the buffer is locked.

   When the buffer is locked, it returns a pointer and promises that the
pointer will remain valid until the buffer is unlocked.

   The buffer interface could be defined either to allow multiple (counted)
locks or to fail further lock attempts. Counted locks would be applicable in
more circumstances but require more implementation. I would prefer counted
but it is not that important as a counting layer can be implemented over a
single lock interface if needed.
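
   A rough sketch of such a counting layer, under the assumption of a
hypothetical SingleLockBuffer whose lock() fails if the buffer is
already locked. Both class names and all method names are invented for
illustration and are not part of the proposed interface:

```python
class SingleLockBuffer:
    """Hypothetical buffer that permits only one outstanding lock."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._locked = False

    def lock(self):
        if self._locked:
            raise RuntimeError("buffer is already locked")
        self._locked = True
        return self._data

    def unlock(self):
        if not self._locked:
            raise RuntimeError("buffer is not locked")
        self._locked = False


class CountedLock:
    """Counting layer built on top of the single-lock interface."""

    def __init__(self, buf):
        self._buf = buf
        self._count = 0
        self._view = None

    def acquire(self):
        # Only the first acquire takes the real lock.
        if self._count == 0:
            self._view = self._buf.lock()
        self._count += 1
        return self._view

    def release(self):
        if self._count == 0:
            raise RuntimeError("release() without matching acquire()")
        self._count -= 1
        # Only the last release gives the real lock back.
        if self._count == 0:
            self._view = None
            self._buf.unlock()
```

   All users of the buffer would have to go through the same
CountedLock instance for the count to be meaningful.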

   Neil




From nhodgson@bigpond.net.au  Tue Jul 30 01:02:53 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Tue, 30 Jul 2002 10:02:53 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com>
Message-ID: <02c001c2375c$74037de0$3da48490@neil>

Scott Gilbert:

> I assume this means any call to getsafereadpointer()/getsafewritepointer()
> will increment the lock count.  So the UnlockObject() calls will be
> mandatory.

   The UnlockObject call will be needed if you do want to permit resizing
(again). It will not be needed for statically sized objects, including all
the types that are included in the PEP currently, or where you have an
object that will no longer need to be resizable. For example: you construct
a sound buffer, fill it with noise, then lock it so that a pointer to its
data can be given to the asynch sound playing function. If you don't need to
write to the sound buffer again, it doesn't need to be unlocked.

> Either that, or you'll have an explicit LockObject() call as
> well.  What behavior should happen when a resize is attempted while the
> lock count is positive?

   The most common response will be some form of failure, probably throwing
an exception. Other responses, such as buffering the resize, may be sensible
in particular circumstances.

   Neil




From greg@cosc.canterbury.ac.nz  Tue Jul 30 01:21:42 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 Jul 2002 12:21:42 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <00c601c23707$35819a20$3da48490@neil>
Message-ID: <200207300021.g6U0LgOG018189@kuku.cosc.canterbury.ac.nz>

> This restricts the set of objects that can be buffers to statically
> sized objects. I'd prefer that dynamically resizable objects be able
> to be buffers.

That's what bothers me about the proposal -- I suspect that this
restriction will turn out to be too restrictive to make it useful.

But maybe locking could be built into the safe-buffer protocol?

Resizable objects wanting to support the safe buffer protocol would be
required to maintain a lock count which is incremented on each
getsafebufferptr call. There would also have to be a
releasesafebufferptr call to decrement the lock count. As long as the
lock count is nonzero, attempting to resize the object would raise an
exception.

That way, resizable objects could be used as asynchronous I/O buffers
as long as you didn't try to resize them while actually doing I/O.
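
The scheme above might look roughly like this in Python terms. This is
only a sketch: ResizableBuffer and resize() are hypothetical names, a
bytearray stands in for the raw memory, and the two method names are
borrowed from the paragraph above rather than from any real API:

```python
class ResizableBuffer:
    """Hypothetical resizable object with an implicit lock count."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._lock_count = 0

    def getsafebufferptr(self):
        # Handing out the "pointer" implicitly takes a lock.
        self._lock_count += 1
        return self._data

    def releasesafebufferptr(self):
        if self._lock_count == 0:
            raise RuntimeError("no outstanding buffer pointer to release")
        self._lock_count -= 1

    def resize(self, new_size):
        # Resizing is refused while any pointer is still held.
        if self._lock_count:
            raise BufferError("cannot resize while a buffer pointer is held")
        if new_size < len(self._data):
            del self._data[new_size:]
        else:
            self._data.extend(bytes(new_size - len(self._data)))
```

Python's later bytearray type behaves along similar lines: resizing it
while a memoryview of it is still live raises BufferError.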

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Tue Jul 30 02:12:19 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 Jul 2002 13:12:19 +1200 (NZST)
Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEHDAIAB.tim.one@comcast.net>
Message-ID: <200207300112.g6U1CJoO018210@kuku.cosc.canterbury.ac.nz>

> but it would have been a great surprise then that the finally clause
> may never get executed at all.  Better to outlaw it than that (or, as
> the PEP says, that would be "too much a violation of finally's purpose
> to bear").

I don't think you'd really be breaking any promises.
After all, if someone wrote

  def asdf():
    try:
      something_that_never_returns()
    finally:
      ...

they wouldn't have much ground for complaint that the
finally never got executed. The case we're talking about
seems much the same situation.

> When I've needed resource-cleanup in a generator, I've made the generator a
> method of a class, and put the resources in instance variables.  Then
> they're easy to clean up at will (even via a __del__ method, if need
> be;

I take it you usually provide a method for explicit cleanup.
How about giving generator-iterators one, then, called
maybe close() or abort(). The effect would be to raise
an appropriate exception at the point of the yield,
triggering any except or finally blocks.

This method could even be added to the general iterator
protocol (implementing it would be optional). It would
then provide a standard name for people to use for
cleanup methods in their own iterator classes.
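
For readers coming to this thread later: generators did eventually gain
a close() method along these lines (PEP 342, Python 2.5), which raises
GeneratorExit at the suspended yield. A small illustration, with
read_items and log as made-up names:

```python
def read_items(log):
    """Yield a few items, recording setup and cleanup in `log`."""
    log.append("open")
    try:
        for i in range(3):
            yield i
    finally:
        # Runs when the generator is closed or exhausted.
        log.append("cleanup")


log = []
gen = read_items(log)
first = next(gen)   # runs up to the first yield
gen.close()         # raises GeneratorExit at the yield, so finally runs
```

After close() returns, log holds both the "open" and the "cleanup"
entries even though the loop never finished.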

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Tue Jul 30 02:25:44 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 Jul 2002 13:25:44 +1200 (NZST)
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <oqznwaqgmj.fsf@titan.progiciels-bpi.ca>
Message-ID: <200207300125.g6U1PiGC018255@kuku.cosc.canterbury.ac.nz>

pinard@iro.umontreal.ca:

> It is more reasonable to always give the real name, optionally
> followed by an email, than to consider that the real name is a mere
> comment for the email address.

Not necessarily -- it depends on your point of view.

I've always thought of the "To:" line as an address,
not a salutation. In other words, an instruction to the
email system as to where to send the message, not the
name of the recipient. Putting a person's name in there
at all seems to me a sop to computer-illiterate wimps
who go all wobbly at the knees when they see anything
as esoteric-looking as an email address. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Tue Jul 30 02:42:38 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 Jul 2002 13:42:38 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz>

Guido:

> I don't like where this is going.  Let's not add locking to the buffer
> protocol.

Do you still object to it even in the form I proposed in
my last message? (I.e. no separate "lock" call, locking
is implicit in the getxxxbuffer calls.)

It does make the protocol slightly more complicated to
use (must remember to make a release call when you're
finished with the pointer) but it seems like a good
tradeoff to me for the flexibility gained.

Note that there can't be any problems with deadlock,
since no blocking is involved. Maybe "locking" is even
the wrong term -- it's more a form of reference counting.

> probably nothing that could possibly invoke the Python interpreter
> recursively, since that might release the GIL.  This would generally
> mean that calls to Py_DECREF() are unsafe while holding on to a buffer
> pointer!

That could be fixed by incrementing the Python refcount as
long as a pointer is held. That could be done even without
the rest of my locking proposal. Of course, if you do that you
need a matching release call, so you might as well implement
the locking while you're at it.

Mind you, if a release call is necessary, whoever holds the
pointer must also hold a reference to the Python object,
so that they can make the release call. So incrementing the
Python refcount might not be necessary after all!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From pinard@iro.umontreal.ca  Tue Jul 30 02:46:34 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 29 Jul 2002 21:46:34 -0400
Subject: [Python-Dev] Re: Priority queue (binary heap) python code
In-Reply-To: <20020721193057.A1891@arizona.localdomain>
References: <20020624213318.A5740@arizona.localdomain>
 <200207200606.g6K66Um28510@pcp02138704pcs.reston01.va.comcast.net>
 <20020721193057.A1891@arizona.localdomain>
Message-ID: <oqd6t6q96t.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> [...] I admire the compactness of his code.  I believe that this would make
> a good addition to the standard library, as a friend of the bisect module.
> [...]  The only change I would make would be to make heap[0] the lowest
> value rather than the highest.  I propose to call it heapq.py.

[Kevin O'Connor]

> Looks good to me.

In case you are going forward with `heapq': glancing through my notes, I see
that "Courageous" implemented a priority queue algorithm as a C extension,
and discussed it on python-list on 2000-05-29.

I'm not really expecting that you aim for something other than a pure Python
version, and I'm not pushing nor pulling for it, as I do not have an opinion.
In any case, I'll keep these messages a few more days: just ask, and I'll
send you a copy of what I saved at the time.


P.S. - I'm quickly losing interest in these bits of C code meant for
speed, as if I ever need C speed, the wonderful Pyrex tool (from Greg Ewing)
gives it to me while allowing the algorithm to be expressed in a language
close to Python.  I even wonder if Pyrex could not be a proper avenue for
the development of some parts of the Python distribution itself.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From pinard@iro.umontreal.ca  Tue Jul 30 03:33:14 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 29 Jul 2002 22:33:14 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <200207300125.g6U1PiGC018255@kuku.cosc.canterbury.ac.nz>
References: <200207300125.g6U1PiGC018255@kuku.cosc.canterbury.ac.nz>
Message-ID: <oqvg6yosgl.fsf@titan.progiciels-bpi.ca>

[Greg Ewing]

> pinard@iro.umontreal.ca:

> > It is more reasonable to always give the real name, optionally
> > followed by an email, than to consider that the real name is a mere
> > comment for the email address.

> Not necessarily -- it depends on your point of view.

An email address may change over time, but one's name does not change
often.  In a lifetime of maintenance, I saw email addresses of a lot of
correspondents fluctuate more or less over time.  Only two or three persons
asked me to correct their name after they got it legalistically modified.

The contact point for a PEP is really a given human, whatever his/her
email address may currently be.  The modern Internet usage is to write
the name first, and the email address after, between angular brackets.
So, I'm suggesting that the PEP documents the popular, modern usage.

> I've always thought of the "To:" line as an address, not a salutation.

It is dual.  The human reads the civil name, the machine reads the email
address.  Many MUA's have limited space for the message summaries, and
they favour the civil name over the email address in the listings.
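
As an aside, both spellings carry the same two pieces of information,
and a program can recover them either way. A small sketch using the
standard library's email.utils.parseaddr (the variable names are just
for illustration):

```python
from email.utils import parseaddr

# The two author formats under discussion.
old_style = "address@dom.ain (Random J. User)"
new_style = "Random J. User <address@dom.ain>"

# parseaddr() returns a (real name, email address) pair for each form.
old_name, old_addr = parseaddr(old_style)
new_name, new_addr = parseaddr(new_style)
```

Both calls yield the same name and the same address, so the choice
between the formats is about what humans read first, not about what
software can extract.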

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From sholden@holdenweb.com  Tue Jul 30 04:43:23 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Mon, 29 Jul 2002 23:43:23 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
References: <15685.35726.678832.241665@anthem.wooz.org> <oqznwaqgmj.fsf@titan.progiciels-bpi.ca>
Message-ID: <00eb01c2377b$41dd4340$6300000a@holdenweb.com>

----- Original Message -----
From: "François Pinard" <pinard@iro.umontreal.ca>
To: "Barry A. Warsaw" <barry@zope.com>
Cc: <python-list@python.org>; <python-dev@python.org>
Sent: Monday, July 29, 2002 7:05 PM
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines


> [Barry A. Warsaw]
>
> > It has been a while since I posted a copy of PEP 1 to the mailing
> > lists and newsgroups.
>
> Thanks for giving me this opportunity.  There is a tiny detail that
> bothers me:
>
> >     The format of the author entry should be
> >         address@dom.ain (Random J. User)
> >     if the email address is included, and just
> >         Random J. User
> >     if the address is not given.
>
> This takes me back fifteen years or so (I do not remember exactly), to
> the time of the great push for the Internet to prefer:
>
>        Random J. User <address@dom.ain>
>
> It is more reasonable to always give the real name, optionally followed by
> an email, than to consider that the real name is a mere comment for the
> email address.  Oh, I know some hackers who praise themselves as login
> names or dream having positronic brains :-), but most of us are humans
> before anything else!
>
> Could the PEP be reformulated, at least, to leave the choice open?
>

Should we instead say that any acceptable RFC822 address would be an
acceptable alternative for a simple name? If so you'd get naive mail users
complaining that they couldn't reach "@python.org:sholden@holdenweb.com"
(for example).

I don't really see why the address format has to agree with any particular
other format: if you're going to use it in a program then there's no reason
why you shouldn't mangle it into whatever form you (or your
possibly-crippled software) requires :-)

The major benefit of the present situation is that it's well-defined. I
don't feel additional alternatives would be helpful here, especially when
the existing format is RFC822-compliant.

though-i-admit-i'm-not-up-to-speed-on-rfc2822-ly y'rs  - steve
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From sholden@holdenweb.com  Tue Jul 30 04:51:22 2002
From: sholden@holdenweb.com (Steve Holden)
Date: Mon, 29 Jul 2002 23:51:22 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729165419.31643.qmail@web40111.mail.yahoo.com> <200207291703.g6TH3tk29997@pcp02138704pcs.reston01.va.comcast.net>              <095701c23722$84e06770$e000a8c0@thomasnotebook>  <200207291710.g6THAin30057@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <013e01c2377c$5f61c2f0$6300000a@holdenweb.com>

----- Original Message -----
From: "Guido van Rossum" <guido@python.org>
To: "Thomas Heller" <thomas.heller@ion-tof.com>
Cc: "Scott Gilbert" <xscottg@yahoo.com>; "Neil Hodgson"
<nhodgson@bigpond.net.au>; <python-dev@python.org>
Sent: Monday, July 29, 2002 1:10 PM
Subject: Re: [Python-Dev] pre-PEP: The Safe Buffer Interface


> > >   If an object's buffer isn't allocated for the object's life
> > > when the object is created, it should not support the "safe" version
> > > of the protocol (maybe a different name would be better), and users
> > > should not release the GIL while holding on to the pointer.
> >
> > 'Persistent' buffer interface? Too long?
>
> No, persistent typically refers to things that survive longer than a
> process.  Maybe 'static' buffer interface would work.
>

"cautious"?

regards
-----------------------------------------------------------------------
Steve Holden                                 http://www.holdenweb.com/
Python Web Programming                http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------






From just@letterror.com  Tue Jul 30 06:55:20 2002
From: just@letterror.com (Just van Rossum)
Date: Tue, 30 Jul 2002 07:55:20 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules posixmodule.c,2.247,2.248
In-Reply-To: <E17ZLjB-0008Fh-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <r01050300-1015-F09877D9A38011D6AEFA003065D5E7E4@[10.0.0.23]>

nnorwitz@users.sourceforge.net wrote:

> Update of /cvsroot/python/python/dist/src/Modules
> In directory usw-pr-cvs1:/tmp/cvs-serv31715/Modules
> 
> Modified Files:
>   posixmodule.c 
> Log Message:
> Use PyArg_ParseTuple() instead of PyArg_Parse() which is deprecated
> 
> Index: posixmodule.c
> ===================================================================
[ ... ]
> !     else if (!PyArg_Parse(arg, "(ll)", &atime, &mtime)) {
[ ... ]
> !     else if (!PyArg_ParseTuple(arg, "ll", &atime, &mtime)) {
[ ... ]

Probably no biggie here, but I'd like to point out that there is a significant
difference between the two calls: the former will allow any sequence for 'arg',
but the latter insists on a tuple. For that reason I always use PyArg_Parse() to
parse coordinate pairs and the like: it greatly enhanced the usability in those
cases. Examples of this usage can be found in the Mac subtree.

Just


From xscottg@yahoo.com  Tue Jul 30 07:10:16 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 23:10:16 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <02a201c2375a$c1299f70$3da48490@neil>
Message-ID: <20020730061016.32588.qmail@web40103.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> Scott Gilbert:
> 
> > What happens when you've locked the buffer and passed a pointer to the
> > I/O system for an asynchronous operation, but before that operation has
> > completed, your main program wants to resize the buffer due to a user
> > generated event?
> 
>    That is up to the application or class designer. There are three
> reasonable responses I see: throw an exception, buffer the user event, or
> ignore the user event. The only thing guaranteed by providing the safe
> buffer interface is that the pointer will remain valid.
> 

The guarantee about the pointer remaining valid while the acquire_count is
positive is clear.  I'm concerned about what the other thread (the one that
wants to resize it) is going to do while the lock count is positive.

You've listed three possibilities, but let's narrow it down to the strategy
that you intend to use in Scintilla (a real use case).  I believe all three
strategies lead to something undesirable (be it polling, deadlock, a
confused user, or ???), but I don't want to exhaustively scrutinize all
possibilities until we come up with one good example that you intend to use
(it would bore you to read them, and me to type them).

So what exactly would you do in Scintilla?  (Or pick another good use case
if you prefer.)


> 
>    The buffer interface could be defined either to allow multiple
> (counted) locks or to fail further lock attempts. Counted locks would be
> applicable in more circumstances but require more implementation. I would
> prefer counted but it is not that important as a counting layer can be
> implemented over a single lock interface if needed.
> 

A single lock interface can be implemented over an object without any
locking.  Have the lockable object return simple "fixed buffer objects"
with a limited lifespan.








From xscottg@yahoo.com  Tue Jul 30 07:10:26 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Mon, 29 Jul 2002 23:10:26 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz>
Message-ID: <20020730061026.33569.qmail@web40106.mail.yahoo.com>

--- Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
> Guido:
> 
> > I don't like where this is going.  Let's not add locking to the buffer
> > protocol.
> 
> Do you still object to it even in the form I proposed in
> my last message? (I.e. no separate "lock" call, locking
> is implicit in the getxxxbuffer calls.)
> 
> It does make the protocol slightly more complicated to
> use (must remember to make a release call when you're
> finished with the pointer) but it seems like a good
> tradeoff to me for the flexibility gained.
> 

I realize this wasn't addressed to me, and that I said I would butt out
when you were in favor of canning the proposal altogether, but I won't let
that get in the way.  :-)

We haven't seen a semi-thorough use case where the locking behavior is
beneficial yet.  While I appreciate and agree with the intent of trying to
get a more flexible object, I think there is at least one of several
problems buried down a little further than you and Neil are looking.

I'm concerned that this is very much like the segment count features of the
current PyBufferProcs.  It was apparently designed for more generality, and
while no one uses it, everyone has to check that the segment count is one
or raise an exception.  If there is no realizable benefit to the
acquire/release semantics of the new interface, then this is just extra
burden too.  Let's find a realizable benefit before we muck up Thomas's good
simple proposal with this stuff.

In the current Python core, I can think of the following objects that would
need a retrofit to this new interface (there may be more):

    string
    unicode
    mmap
    array

The string, unicode, and mmap objects do not resize or reallocate by
design.  So for them the extra acquire/release requirements are burden with
no benefit.

The array object does resize (via the extend method among others).  So let's
say that an array object gets passed to an extension that locks the buffer
and grabs the pointer.  The extension releases the GIL so that another
thread can work on the array object.  Another thread comes in and wants to
do a resize (via the extend method).  (We don't need to introduce threads
for this since the asynchronous I/O case is just the same.)

If extend() is called while thread 1 has the array locked, it can:

   A) raise an exception or return an error
   B) block until the lock count returns to zero
   C) ???
   .)
   .)

Case A is troublesome because depending on thread scheduling/disk
performance, you will or won't get the exception.  So you've got a weird
race condition where an operation might have been valid if it had only
executed a split second later, but due to misfortune it raised an
exception.  I think this non-determinism is ugly at the very least. 
However since it's recoverable, you could try again (polling), or ignore
the request completely (odd behavior).  I think this is what both you and
Neil are proposing, and I don't see how this is terribly useful.

While I don't think B is the strategy anyone is proposing, it means you
have two blocking objects in effect (the GIL and whatever the array uses to
implement blocking).  If we're not extremely careful, we can get deadlock
here.

I'm still looking for any good examples that fall into cases C and beyond. 
Neil offered a third example that might fit.  He says that he could buffer
the user event that led to the resize operation.  If that is his strategy,
I'd like to see it explained further.  It sounds like taking the event and
not processing it until the asynchronous I/O operation has completed.  At
which point I wonder what using asynchronous I/O achieved since the resize
operation had to wait synchronously for the I/O to complete.  This also
sounds suspiciously like blocking the resize thread, but I won't argue that
point.









From Jack.Jansen@cwi.nl  Tue Jul 30 10:07:56 2002
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Tue, 30 Jul 2002 11:07:56 +0200
Subject: [Python-Dev] HAVE_CONFIG_H
In-Reply-To: <3D459E8A.1050602@lemburg.com>
Message-ID: <D66F271A-A39B-11D6-A307-0030655234CE@cwi.nl>

On Monday, July 29, 2002, at 09:59 , M.-A. Lemburg wrote:

> Guido van Rossum wrote:
>> I see no references to HAVE_CONFIG_H in the source code (except one
>> #undef in readline.c), yet we #define it on the command line.  Is that
>> still necessary?
>
> What about these ?
>
> ./Mac/mwerks/old/mwerks_nsgusi_config.h:
> -- define HAVE_CONFIG_H
[...]

They're turds, they can go.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -



From mwh@python.net  Tue Jul 30 10:33:31 2002
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2002 10:33:31 +0100
Subject: [Python-Dev] patch: try/finally in generators
In-Reply-To: Guido van Rossum's message of "Mon, 29 Jul 2002 16:30:36 -0400"
References: <20020729200824.A5391@hishome.net> <200207291734.g6THY1k30119@pcp02138704pcs.reston01.va.comcast.net> <20020729220944.A6113@hishome.net> <200207291940.g6TJe1005489@pcp02138704pcs.reston01.va.comcast.net> <20020729132515.A31926@glacier.arctrix.com> <200207292030.g6TKUaW06234@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <2m1y9llfv8.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> > There could be valid code out there that does not end with
> > LOAD_CONST+RETURN.
> 
> The current code generator always generates that as the final
> instruction.  But someone might add an optimizer that takes that out
> if it is provably unreachable...

The bytecodehacks has one of them :) It would probably scream and run
away if presented with a generator, but that's just a matter of
bitrot.

Cheers,
M.

-- 
  All obscurity will buy you is time enough to contract venereal
  diseases.                                  -- Tim Peters, python-dev


From nhodgson@bigpond.net.au  Tue Jul 30 10:48:44 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Tue, 30 Jul 2002 19:48:44 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020730061016.32588.qmail@web40103.mail.yahoo.com>
Message-ID: <005d01c237ae$4b2f6670$3da48490@neil>

Scott Gilbert:

> You've listed three possibilities, but let's narrow it down to the strategy
> that you intend to use in Scintilla (a real use case).  I believe all three
> strategies lead to something undesirable (be it polling, deadlock, a
> confused user, or ???), but I don't want to exhaustively scrutinize all
> possibilities until we come up with one good example that you intend to use
> (it would bore you to read them, and me to type them).
>
> So what exactly would you do in Scintilla?  (Or pick another good use case
> if you prefer.)

   I'd prefer to ignore the input. Unfortunately users prefer a higher
degree of friendliness :-(

   Since Scintilla is a component within a user interface, it shares this
responsibility with the container application with the application being the
main determinant. If I was writing a Windows-specific application that used
Scintilla, and I wanted to use Asynchronous I/O then my preferred technique
would be to change the message processing loop to leave the UI input
messages in the queue until the I/O had completed.
   Once the I/O had completed then the message loop would change back to
processing all messages which would allow the banked up input to come
through.
   If I was feeling ambitious I may try to process some UI messages,
possibly detecting pressing Escape to abort a file load if it turned out the
read was taking too long.
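
   A minimal sketch of the queue-deferral idea described above, written in
Python rather than as a Windows message loop. The class name DeferringLoop,
the message kinds "input", "paint" and "io_done", and the post/pump methods
are all hypothetical names chosen for illustration; none of this is
Scintilla's or the Win32 API.

```python
from collections import deque


class DeferringLoop:
    """Toy message loop that banks up UI input while I/O is outstanding."""

    def __init__(self):
        self.pending = deque()   # messages not yet processed
        self.handled = []        # messages processed, in order
        self.io_busy = False     # True while an async read is in flight

    def post(self, kind, payload):
        self.pending.append((kind, payload))

    def pump(self):
        deferred = deque()
        while self.pending:
            kind, payload = self.pending.popleft()
            if self.io_busy and kind == "input":
                # Leave user input queued until the I/O completes.
                deferred.append((kind, payload))
                continue
            self.handled.append((kind, payload))
            if kind == "io_done":
                # I/O finished: replay banked-up input ahead of newer messages.
                self.io_busy = False
                self.pending.extendleft(reversed(deferred))
                deferred.clear()
        self.pending = deferred


loop = DeferringLoop()
loop.io_busy = True
loop.post("input", "a")
loop.post("paint", None)
loop.pump()
handled_during_io = list(loop.handled)

loop.post("io_done", None)
loop.post("input", "b")
loop.pump()
```

   Non-input messages such as repaints still get through while the read is
pending, which is the point made further down about doing other work while
waiting.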

> A single lock interface can be implemented over an object without any
> locking.  Have the lockable object return simple "fixed buffer objects"
> with a limited lifespan.

   This returns to the possibility of indeterminate lifespan as mentioned
earlier in the thread.

> At which point I wonder what using asynchronous I/O achieved since the
> resize operation had to wait synchronously for the I/O to complete.  This
> also sounds suspiciously like blocking the resize thread, but I won't argue
> that point.

   There may be other tasks that the application can perform while waiting
for the I/O to complete, such as displaying, styling or line-wrapping
whatever text has already arrived (assuming that there are some facilities
for discovering this) or performing similar tasks for other windows.

   Neil



From smurf@noris.de  Tue Jul 30 11:24:05 2002
From: smurf@noris.de (Matthias Urlichs)
Date: Tue, 30 Jul 2002 12:24:05 +0200
Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in
 generators)
Message-ID: <p05111704b96c158e85c7@[192.109.102.36]>

Greg:
>  I take it you usually provide a method for explicit cleanup.
>  How about giving generator-iterators one, then, called
>  maybe close() or abort(). The effect would be to raise
>  an appropriate exception at the point of the yield,
>  triggering any except or finally blocks.

Objects already have a perfectly valid cleanup method -- "__del__".

If your code is so complicated that it needs a try/yield/finally, it 
would make much more sense to convert the thing to an iterator 
object. It probably would make the code a whole lot more 
understandable, too. (It did happen with mine.)

Stated another way: functions which yield stuff are special. If that 
specialness gets buried in nested try/except/finally/whatever 
constructs, things tend to get messy. Better make that messiness 
explicit by packaging the code in an object with well-defined methods.

This is actually easy to do because of the existence of iterators, 
because this code

def some_iter(foo):
	prepare(foo)

	try:
		for i in foo:
			yield something(i)
	finally:
		cleanup(foo)

painlessly transmutes to this:

class some_iter(object):
	def __init__(self, foo):
		prepare(foo)

		self.foo = foo
		self.it = foo.__iter__()

	def next(self):
		i = self.it.next()
		return something(i)

	def __del__(self):
		cleanup(self.foo)

Personally I think the latter version is more readable because the 
important thing, i.e. how the next element is obtained, is clearly 
separated from the rest of the code (and one level dedented, compared 
to the first version).
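
For anyone trying the class-based version on a current Python 3 interpreter,
a sketch might look like the following. It is not the code above verbatim: it
adds __iter__ so the object works in a for loop, uses __next__ (the Python 3
spelling of next), and adds an explicit close() on the assumption that
waiting for __del__ is too late for some resources. The prepare, something
and cleanup callables are passed in as stand-ins for the undefined helpers in
the example.

```python
class SomeIter:
    """Class-based counterpart of a generator with try/finally cleanup."""

    def __init__(self, foo, prepare, something, cleanup):
        self._closed = True          # nothing to clean up until prepare() succeeds
        self._something = something
        self._cleanup = cleanup
        prepare(foo)
        self.foo = foo
        self.it = iter(foo)
        self._closed = False

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the underlying iterator propagates unchanged.
        return self._something(next(self.it))

    def close(self):
        # Idempotent, so calling it explicitly and again from __del__ is safe.
        if not self._closed:
            self._closed = True
            self._cleanup(self.foo)

    def __del__(self):
        self.close()


log = []
it = SomeIter(
    [1, 2, 3],
    prepare=lambda foo: log.append("prepare"),
    something=lambda i: i * 10,
    cleanup=lambda foo: log.append("cleanup"),
)
first = next(it)
rest = list(it)
it.close()
it.close()  # second call is a no-op
```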
-- 
Matthias Urlichs


From mwh@python.net  Tue Jul 30 11:27:11 2002
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2002 11:27:11 +0100
Subject: [Python-Dev] seeing off SET_LINENO
Message-ID: <2mvg6xjytc.fsf@starship.python.net>

I've submitted a(nother) patch to sf that removes SET_LINENO:

    http://www.python.org/sf/587993

It supports tracing by digging around in the co_lnotab[*] to see when
execution moves onto a different line.

I think it's more or less sound but any changes to the interpreter
main loop are going to be subtle, so I have a few points to raise
here.  In no particular order:

1) this is a change I'd like to see anyway:  
   the use of f->f_lasti in the main loop is confusing.  let's just set
   it at the start of opcode dispatch and leave it the hell alone.

   there's actually what is probably a very old bug in the
   implementation of SET_LINENO.  It does more or less this:

     f->f_lasti = INSTR_OFFSET();
     /* call the trace function */

   It should do this:

     f->f_lasti = INSTR_OFFSET() - 3;
     /* call the trace function */

   The field is called f_LASTi, after all...

2) As I say in the patch, I will buy anyone a beer who can explain
   (without using LLTRACE or reading a lot of dis.py output) why we
   don't call the trace function on POP_TOP opcodes.

3) The patch changes behaviour -- for the better!  You're now rather
   less likely to get the trace function called several times per
   line.

4) The patch installs a descriptor for f_lineno so that there is no
   incompatibility for Python code.  The question is what to do with
   the f_lineno field in the C struct?  Remove it?  That would
   (probably) mean bumping PY_API_VERSION.  Leave it in?  Then its
   contents would usually be meaningless (keeping it up to date would
   rather defeat the point of this patch).

5) We've already bumped the MAGIC for 2.3a0, so we probably don't need
   to do that again.

6) Someone should teach dis.py how to find line breaks from the
   co_lnotab.  I can do this, but not right now....

7) The changes tickle what may be a very old bug in freeze:

    http://www.python.org/sf/588452

8) I haven't measured the performance impact of the changes to code
   that is tracing or code that isn't.  There's a possible
   optimization mentioned in the patch for traced code.  For not
   traced code it MAY be worthwhile putting the tracing support code
   in a static function somewhere so there's less code to jump over in
   the main loop (for i-caches and such).

9) This patch stops LLTRACE telling you when execution moves onto a
   different line.  This could be restored, but 
 
   a) I expect I'm the only person to have used LLTRACE recently
      (debugging this patch).
   b) This will cause obfuscation, so I'd prefer to do it last.
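
As a rough illustration of point 6, finding line breaks from the line-number
table rather than from SET_LINENO opcodes: on a current Python 3 interpreter
the standard library's dis.findlinestarts() yields (bytecode offset, line
number) pairs for a code object. This is a sketch of that one call, not of
the patch itself; the sample function f is made up for the example, and the
exact offsets vary between interpreter versions.

```python
import dis


def f():
    x = 1
    y = 2
    return x + y


# Each pair marks the bytecode offset at which a new source line begins.
starts = list(dis.findlinestarts(f.__code__))
offsets = [offset for offset, _ in starts]

# Express line numbers relative to the "def" line; skip entries with no line.
first = f.__code__.co_firstlineno
relative_lines = [line - first for _, line in starts if line is not None]
```

Printing starts shows where a tracer driven by the table alone would see
execution move onto a different line.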

Comments welcome!

Cheers,
M.

[*] I've cheated with my sigmonster:
-- 
34. The string is a stark data structure and everywhere it is
    passed there is much duplication of process.  It is a perfect
    vehicle for hiding information.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html


From barry@python.org  Tue Jul 30 13:14:05 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 30 Jul 2002 08:14:05 -0400
Subject: [Python-Dev] seeing off SET_LINENO
References: <2mvg6xjytc.fsf@starship.python.net>
Message-ID: <15686.33549.262832.740505@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    MH> 3) The patch changes behaviour -- for the better!  You're now
    MH> rather less likely to get the trace function called several
    MH> times per line.

Does this change affect debugging?  Have you tested how this change
might interact with e.g. hotshot?

-Barry


From neal@metaslash.com  Tue Jul 30 13:19:20 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 30 Jul 2002 08:19:20 -0400
Subject: [Python-Dev] PyArg_ParseTuple vs. PyArg_Parse
References: <r01050300-1015-F09877D9A38011D6AEFA003065D5E7E4@[10.0.0.23]>
Message-ID: <3D468448.45C22891@metaslash.com>

Just van Rossum wrote:
> 
> nnorwitz@users.sourceforge.net wrote:
> 
> > Use PyArg_ParseTuple() instead of PyArg_Parse() which is deprecated
> >
> > Index: posixmodule.c
> > ===================================================================
> [ ... ]
> > !     else if (!PyArg_Parse(arg, "(ll)", &atime, &mtime)) {
> [ ... ]
> > !     else if (!PyArg_ParseTuple(arg, "ll", &atime, &mtime)) {
> [ ... ]
> 
> Probably no biggie here, but I'd like to point out that there is a significant
> difference between the two calls: the former will allow any sequence for 'arg',
> but the latter insists on a tuple. For that reason I always use PyArg_Parse() to
> parse coordinate pairs and the like: it greatly enhanced the usability in those
> cases. Examples of this usage can be found in the Mac subtree.

I'll back out this change.  But this raises the question should 
PyArg_Parse() be deprecated or should just METH_OLDARGS be deprecated?

Neal


From mwh@python.net  Tue Jul 30 13:31:53 2002
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2002 13:31:53 +0100
Subject: [Python-Dev] seeing off SET_LINENO
In-Reply-To: barry@python.org's message of "Tue, 30 Jul 2002 08:14:05 -0400"
References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org>
Message-ID: <2md6t5ieh2.fsf@starship.python.net>

barry@python.org (Barry A. Warsaw) writes:

> >>>>> "MH" == Michael Hudson <mwh@python.net> writes:
> 
>     MH> 3) The patch changes behaviour -- for the better!  You're now
>     MH> rather less likely to get the trace function called several
>     MH> times per line.
> 
> Does this change affect debugging?

Hmm, I hadn't actually dared to run pdb with my patch... have now, and
it seems OK.

There is a difference:

The bytecode for, say,

def f():
    print 1

begins with two SET_LINENO's.  One is for the line containing "def
f():", one is for "print 1".  My patch means the debugger doesn't stop
on the "def f():" line -- unsurprisingly, given that no execution ever
takes place on that line.

It would be possible to force a call to the trace function on entry to
the function.  In fact, there's a commented out block for this in my
patch.  Another approach would presumably be for pdb to stop on 'call'
trace events as well as 'line' ones.  I don't really understand, or
use all that often, pdb.

Also, you currently stop twice on the first line of a for loop, but
only once with my patch.  There are probably other situations of
excessive SET_LINENO emission.  I know Skip (think it was him) killed
a couple last week.  Bug compatibility is possible here too, but I
don't see the advantage.

> Have you tested how this change might interact with e.g. hotshot?

test_hotshot was very important to me as evidence I was making
progress!

It currently fails due to the not-calling-trace-on-def-line issue, but
as I said, I think this is a *good* thing...

Cheers,
M.

-- 
  The ability to quote is a serviceable substitute for wit.
                                                -- W. Somerset Maugham


From mal@lemburg.com  Tue Jul 30 13:42:19 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 30 Jul 2002 14:42:19 +0200
Subject: [Python-Dev] seeing off SET_LINENO
References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> <2md6t5ieh2.fsf@starship.python.net>
Message-ID: <3D4689AB.2020107@lemburg.com>

Michael Hudson wrote:
> barry@python.org (Barry A. Warsaw) writes:
> 
> 
>>>>>>>"MH" == Michael Hudson <mwh@python.net> writes:
>>>>>>
>>    MH> 3) The patch changes behaviour -- for the better!  You're now
>>    MH> rather less likely to get the trace function called several
>>    MH> times per line.
>>
>>Does this change affect debugging?
> 
> 
> Hmm, I hadn't actually dared to run pdb with my patch... have now, and
> it seems OK.
> 
> There is a difference:
> 
> The bytecode for, say,
> 
> def f():
>     print 1
> 
> begins with two SET_LINENO's.  One is for the line containing "def
> f():", one is for "print 1".  My patch means the debugger doesn't stop
> on the "def f():" line -- unsurprisingly, given that no execution ever
> takes place on that line.

This might be used in a debugging application to set up some
environment *before* diving into the function itself.

Note that many C debuggers stop at the declaration line of
a function as well (because they execute stack setup code),
so a sudden change in this would probably confuse users of
today's Python IDEs.

> It would be possible to force a call to the trace function on entry to
> the function.  In fact, there's a commented out block for this in my
> patch.  Another approach would presumably be for pdb to stop on 'call'
> trace events as well as 'line' ones.  I don't really understand, or
> use all that often, pdb.
> 
> Also, you currently stop twice on the first line of a for loop, but
> only once with my patch.  There are probably other situations of
> excessive SET_LINENO emission.  I know Skip (think it was him) killed
> a couple last week.  Bug compatibility is possible here too, but I
> don't see the advantage.
> 
> 
>>Have you tested how this change might interact with e.g. hotshot?
> 
> 
> test_hotshot was very important to me as evidence I was making
> progress!
> 
> It currently fails due to the not-calling-trace-on-def-line issue, but
> as I said, I think this is a *good* thing...

Have you also tested this with the commonly used Python IDEs
out there ? E.g. IDLE, IDLE-fork, PythonWorks, WingIDE, Emacs,
BlackAdder, BOA Constructor, etc. etc.



-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From mwh@python.net  Tue Jul 30 13:58:10 2002
From: mwh@python.net (Michael Hudson)
Date: 30 Jul 2002 13:58:10 +0100
Subject: [Python-Dev] seeing off SET_LINENO
In-Reply-To: "M.-A. Lemburg"'s message of "Tue, 30 Jul 2002 14:42:19 +0200"
References: <2mvg6xjytc.fsf@starship.python.net> <15686.33549.262832.740505@anthem.wooz.org> <2md6t5ieh2.fsf@starship.python.net> <3D4689AB.2020107@lemburg.com>
Message-ID: <2mado9id99.fsf@starship.python.net>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> > begins with two SET_LINENO's.  One is for the line containing "def
> > f():", one is for "print 1".  My patch means the debugger doesn't stop
> > on the "def f():" line -- unsurprisingly, given that no execution ever
> > takes place on that line.
> 
> This might be used in a debugging application to set up some
> environment *before* diving into the function itself.

So do that when you get the 'call' trace function call!  That's what
it's there for.
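
A small sketch of what that looks like from the trace function's side,
assuming a current Python 3 interpreter (where the behaviour matches what the
patch aims for: a 'call' event on entry, then 'line' events only for lines
that execute). The names tracer, events and f are invented for the example.

```python
import sys

events = []


def tracer(frame, event, arg):
    # Record only events for f, with line numbers relative to its "def" line.
    if frame.f_code.co_name == "f":
        events.append((event, frame.f_lineno - frame.f_code.co_firstlineno))
    return tracer


def f():
    return 1


sys.settrace(tracer)
try:
    f()
finally:
    sys.settrace(None)
```

Any per-function setup a debugger wants can hang off the 'call' event, which
is reported against the "def" line even though no 'line' event fires there.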

> Note that many C debuggers stop at the declaration line of
> a function as well (because they execute stack setup code),
> so a sudden change in this would probably confuse users of
> today's Python IDEs.

However, sudden changes here are *very* likely to confuse, I agree.
Perhaps bug-compatibility is something to aim for.

[...]
> >>Have you tested how this change might interact with e.g. hotshot?
> > 
> > 
> > test_hotshot was very important to me as evidence I was making
> > progress!
> > 
> > It currently fails due to the not-calling-trace-on-def-line issue, but
> > as I said, I think this is a *good* thing...
> 
> Have you also tested this with the commonly used Python IDEs
> out there ? E.g. IDLE, IDLE-fork, PythonWorks, WingIDE, Emacs,
> BlackAdder, BOA Constructor, etc. etc.

No.

Don't think it's relevant to IDLE (at least, I can't see any calls to
settrace in there that aren't commented out).  Python-mode's pdbtrack
should just carry on working.  Don't have easy access to the others.
I'd be amazed if other IDEs were severely adversely affected.
Anyway, isn't this what alphas are for?  I have no problem emailing a
relevant person for each of the above IDEs and pointing out that this
change may affect them.

Cheers,
M.

-- 
  If a train station is a place where a train stops, what's a
  workstation?                            -- unknown (to me, at least)


From barry@python.org  Tue Jul 30 16:16:05 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 30 Jul 2002 11:16:05 -0400
Subject: [Python-Dev] seeing off SET_LINENO
References: <2mvg6xjytc.fsf@starship.python.net>
 <15686.33549.262832.740505@anthem.wooz.org>
 <2md6t5ieh2.fsf@starship.python.net>
Message-ID: <15686.44469.22988.913649@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    MH> Hmm, I hadn't actually dared to run pdb with my patch... have
    MH> now, and it seems OK.

Cool.

    MH> There is a difference:

    MH> The bytecode for, say,

    | def f():
    |     print 1

    MH> begins with two SET_LINENO's.  One is for the line containing
    MH> "def f():", one is for "print 1".  My patch means the debugger
    MH> doesn't stop on the "def f():" line -- unsurprisingly, given
    MH> that no execution ever takes place on that line.

    MH> It would be possible to force a call to the trace function on
    MH> entry to the function.  In fact, there's a commented out block
    MH> for this in my patch.  Another approach would presumably be
    MH> for pdb to stop on 'call' trace events as well as 'line' ones.
    MH> I don't really understand, or use all that often, pdb.

I can't decide whether it would be good to stop on the def or not.
Not doing so makes pdb act more like gdb, which also only stops on the
first executable line, so maybe that's a good thing.

    MH> Also, you currently stop twice on the first line of a for
    MH> loop, but only once with my patch.

That /is/ a good thing!
    
    >> Have you tested how this change might interact with
    >> e.g. hotshot?

    MH> test_hotshot was very important to me as evidence I was making
    MH> progress!

:)

    MH> It currently fails due to the not-calling-trace-on-def-line
    MH> issue, but as I said, I think this is a *good* thing...

So maybe we need two different behaviors depending on whether we're
debugging or profiling.  That might get a bit kludgy if we're using
the same trace mechanism for both, but I'm sure it's tractable.

-Barry


From guido@python.org  Tue Jul 30 16:26:23 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 11:26:23 -0400
Subject: [Python-Dev] PyArg_ParseTuple vs. PyArg_Parse
In-Reply-To: Your message of "Tue, 30 Jul 2002 08:19:20 EDT."
 <3D468448.45C22891@metaslash.com>
References: <r01050300-1015-F09877D9A38011D6AEFA003065D5E7E4@[10.0.0.23]>
 <3D468448.45C22891@metaslash.com>
Message-ID: <200207301526.g6UFQNZ09835@odiug.zope.com>

> I'll back out this change.  But this raises the question should 
> PyArg_Parse() be deprecated or should just METH_OLDARGS be deprecated?

Only METH_OLDARGS.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@python.org  Tue Jul 30 16:27:46 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 30 Jul 2002 11:27:46 -0400
Subject: [Python-Dev] seeing off SET_LINENO
References: <2mvg6xjytc.fsf@starship.python.net>
 <15686.33549.262832.740505@anthem.wooz.org>
 <2md6t5ieh2.fsf@starship.python.net>
 <3D4689AB.2020107@lemburg.com>
 <2mado9id99.fsf@starship.python.net>
Message-ID: <15686.45170.12110.403625@anthem.wooz.org>

>>>>> "MH" == Michael Hudson <mwh@python.net> writes:

    MH> Python-mode's pdbtrack should just carry on working.

Yup, because it is basically just looking for the pdb prompt, so it
shouldn't care.

-Barry


From guido@python.org  Tue Jul 30 16:32:24 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 11:32:24 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: Your message of "Mon, 29 Jul 2002 23:43:23 EDT."
 <00eb01c2377b$41dd4340$6300000a@holdenweb.com>
References: <15685.35726.678832.241665@anthem.wooz.org> <oqznwaqgmj.fsf@titan.progiciels-bpi.ca>
 <00eb01c2377b$41dd4340$6300000a@holdenweb.com>
Message-ID: <200207301532.g6UFWOt09871@odiug.zope.com>

> > This makes me jump fifteen years behind (or so, I do not remember times),
> > at the time of the great push so the Internet prefers:
> >
> >        Random J. User <address@dom.ain>
> >
> > It is more reasonable to always give the real name, optionally followed by
> > an email, than to consider that the real name is a mere comment for the
> > email address.  Oh, I know some hackers who praise themselves as login
> > names or dream having positronic brains :-), but most of us are humans
> > before anything else!
> >
> > Could the PEP be reformulated, at least, for leaving the choice opened?

Yes.  The rule will be Name first, Email second.  We won't convert all
200 existing PEPs to that format yet, but if someone with commit
privileges wants to volunteer, be our guest.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@zope.com  Tue Jul 30 16:36:13 2002
From: barry@zope.com (Barry A. Warsaw)
Date: Tue, 30 Jul 2002 11:36:13 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
References: <15685.35726.678832.241665@anthem.wooz.org>
 <oqznwaqgmj.fsf@titan.progiciels-bpi.ca>
Message-ID: <15686.45677.421287.717866@anthem.wooz.org>

>>>>> "FP" == François Pinard <pinard@iro.umontreal.ca> writes:

    >> It has been a while since I posted a copy of PEP 1 to the
    >> mailing lists and newsgroups.

    FP> Thanks for giving me this opportunity.  There is a tiny detail
    FP> that bothers me:

    >> The format of the author entry should be address@dom.ain
    >> (Random J. User) if the email address is included, and just
    >> Random J. User if the address is not given.

    FP> This makes me jump fifteen years behind (or so, I do not
    FP> remember times), at the time of the great push so the Internet
    FP> prefers:

    FP>        Random J. User <address@dom.ain>

    FP> It is more reasonable to always give the real name, optionally
    FP> followed by an email, than to consider that the real name is a
    FP> mere comment for the email address.

This is a good point.  Originally we thought it was more important to
be able to contact the author, but there are quite a few reasons to
revise this intention.  As pointed out, email addresses change.  Also,
experience has shown that most of the discussions about PEPs are
conducted on the public forums (mailing lists / newsgroups), so that's
a fine way to contact the people working on the PEP.  And of course,
we allow the PEP authors to obfuscate or omit their email addresses
altogether.

    FP> Could the PEP be reformulated, at least, for leaving the
    FP> choice opened?

I'd rather have one preferred way of writing the header, so I'm going
to change PEP 1 to mandate "Random J. User <address@dom.ain>" with the
email address optional.  However, I'm going to let the old style
remain for historical purposes since I don't think it's worth changing
the existing PEPs.

Thanks,
-Barry


From guido@python.org  Tue Jul 30 16:37:36 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 11:37:36 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Tue, 30 Jul 2002 13:42:38 +1200."
 <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz>
References: <200207300142.g6U1gcSZ018273@kuku.cosc.canterbury.ac.nz>
Message-ID: <200207301537.g6UFbad09910@odiug.zope.com>

> > I don't like where this is going.  Let's not add locking to the buffer
> > protocol.
> 
> Do you still object to it even in the form I proposed in
> my last message? (I.e. no separate "lock" call, locking
> is implicit in the getxxxbuffer calls.)

Yes, I still object.  Having to make a call to release a resource with
a function call is extremely error-prone, as we've seen with reference
counting.  There are too many cases where some early exit from a piece
of code doesn't make the release call.

> It does make the protocol slightly more complicated to
> use (must remember to make a release call when you're
> finished with the pointer) but it seems like a good
> tradeoff to me for the flexibility gained.

I'm not sure I see the use case.  The main data types for which I
expect this will be used would be strings and the new 'bytes' type,
and both have fixed buffers that never move.

> > probably nothing that could possibly invoke the Python interpreter
> > recursively, since that might release the GIL.  This would generally
> > mean that calls to Py_DECREF() are unsafe while holding on to a buffer
> > pointer!
> 
> That could be fixed by incrementing the Python refcount as
> long as a pointer is held. That could be done even without
> the rest of my locking proposal. Of course, if you do that you
> need a matching release call, so you might as well implement
> the locking while you're at it.

I think you misunderstand what I wrote.  A Py_DECREF() for an
*unrelated* object can invoke Python code (if it ends up deleting a
class instance with a __del__ method).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jul 30 16:39:30 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 11:39:30 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: Your message of "Mon, 29 Jul 2002 19:30:30 EDT."
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
Message-ID: <200207301539.g6UFdUS09930@odiug.zope.com>

> > > I see no references to HAVE_CONFIG_H in the source code (except one
> > > #undef in readline.c), yet we #define it on the command line.  Is that
> > > still necessary?
> 
> > It's autoconf tradition to use that; it would replace DEFS to either
> > many -D options, or -DHAVE_CONFIG_H (if AC_CONFIG_HEADER appears).
> 
> > I don't think we need this, and it can safely be removed.
> 
> The many `-D' options which appear when `AC_CONFIG_HEADER' is not used
> are rather inelegant, they create a lot, really a lot of clumsiness in
> `make' output.  The idea, but you surely know it, was to regroup all
> auto-configured definitions into a single header file, and limit the `-D'
> to the sole `HAVE_CONFIG_H', or almost.  While the:
> 
> #if HAVE_CONFIG_H
> # include <config.h>
> #endif
> 
> idiom, for some widely used sources, was to cope with `AC_CONFIG_HEADER'
> being defined in some projects, and not in others.  There is no need to
> include `config.h', nor to create it, if all `#define's have been already
> done through a litany of `-D' options.

Since we don't use this idiom, we can safely remove the
-DHAVE_CONFIG_H (if we can find where it is set).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Tue Jul 30 17:09:40 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 30 Jul 2002 18:09:40 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020730061016.32588.qmail@web40103.mail.yahoo.com> <005d01c237ae$4b2f6670$3da48490@neil>
Message-ID: <025a01c237e3$82eb7c90$e000a8c0@thomasnotebook>

[Scott]
> > A single lock interface can be implemented over an object without any
> > locking.  Have the lockable object return simple "fixed buffer objects"
> > with a limited lifespan.
> 
[Neil]
>    This returns to the possibility of indeterminate lifespan as mentioned
> earlier in the thread.
> 

Can't you do something like this (maybe this is what Scott has in mind):

static void _unlock(void *ptr, void *desc)
{
    /* the description pointer carries the owning object */
    MyObject *self = (MyObject *)desc;
    /* do whatever needed to unlock the object */
    self->locked--;
    Py_DECREF(self);
}

static PyObject*
MyObject_GetBuffer(MyObject *self)
{
    /* Do whatever needed to lock the object */
    self->locked++;
    Py_INCREF(self);
    return PyCObject_FromVoidPtrAndDesc(self->ptr,
                                        self,
                                        _unlock);
}

In plain text:
Provide a method which returns a 'view' into your object's
buffer after locking the object. The view holds a reference
to the object; the object is unlocked and decref'd when the
view is destroyed.
In practice something better than a PyCObject will be used,
and this one can even implement the 'fixed buffer' interface.
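
For comparison, the same "view pins the object" behaviour can be observed
from Python on a current Python 3 interpreter, where bytearray and memoryview
happen to work this way. This is offered only as an illustration of the idea;
it is not the interface being proposed in this thread.

```python
buf = bytearray(b"hello")
view = memoryview(buf)          # taking a view pins the underlying buffer

try:
    buf.extend(b" world")       # a resize could move the memory under the view
    resized_while_viewed = True
except BufferError:
    resized_while_viewed = False

first_byte = view[0]            # the view is still valid here
view.release()                  # explicit release, like destroying the view
buf.extend(b" world")           # resizing is allowed again
```

The resize is refused while the view exists and succeeds once it has been
released, which is the lifetime rule sketched in the C code above.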

Thomas



From guido@python.org  Tue Jul 30 17:22:11 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 12:22:11 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: Your message of "Tue, 30 Jul 2002 11:39:30 EDT."
 <200207301539.g6UFdUS09930@odiug.zope.com>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de> <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
Message-ID: <200207301622.g6UGMBl17143@odiug.zope.com>

> Since we don't use this idiom, we can safely remove the
> -DHAVE_CONFIG_H (if we can find where it is set).

I looked.  It's generated by AC_OUTPUT.  I don't think I can get rid
of it.  So never mind. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jul 30 17:39:00 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 12:39:00 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Tue, 30 Jul 2002 09:37:18 +1000."
 <029801c23758$e13594b0$3da48490@neil>
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook>
 <029801c23758$e13594b0$3da48490@neil>
Message-ID: <200207301639.g6UGd1S17363@odiug.zope.com>

> > ..., but I understand Neil's requirements.
> > 
> > Can they be fulfilled by adding some kind of UnlockObject()
> > call to the 'safe buffer interface', which should mean 'I won't
> > use the pointer received by getsaferead/writebufferproc any more'?
> 
>    Yes, that is exactly what I want.

I guess I still don't understand Neil's requirements.  What can't be
done with the existing buffer interface (which requires you to hold
the GIL while using the pointer)?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From oren-py-d@hishome.net  Tue Jul 30 17:39:27 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 30 Jul 2002 12:39:27 -0400
Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators)
In-Reply-To: <p05111704b96c158e85c7@[192.109.102.36]>
References: <p05111704b96c158e85c7@[192.109.102.36]>
Message-ID: <20020730163927.GA63620@hishome.net>

On Tue, Jul 30, 2002 at 12:24:05PM +0200, Matthias Urlichs wrote:
> def some_iter(foo):
> 	prepare(foo)
> 
> 	try:
> 		for i in foo:
> 			yield something(i)
> 	finally:
> 		cleanup(foo)
> 
> painlessly transmutes to this:
> 
> class some_iter(object):
> 	def __init__(self, foo):
> 		prepare(foo)
> 
> 		self.foo = foo
> 		self.it = foo.__iter__()
> 
> 	def next(self):
> 		i = self.it.next()
> 		return something(i)
> 
> 	def __del__(self):
> 		cleanup(self.foo)

Bad example.  Generators are useful precisely because some types of code
are quite painful to change to this form.

Anyway, it appears that generators can create reference loops if someone
were perverted enough to keep a reference to the generator inside the
generator.  It doesn't seem to be worth the effort of making generators
into GC objects just for this.
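
For illustration, here is a minimal sketch of such a reference loop.  It
assumes a current CPython 3 interpreter rather than the 2.2-era one under
discussion, and the names make_cyclic_generator and cycle_is_collected are
invented for this example.  The generator's frame reaches a list that in
turn holds the generator, so reference counting alone cannot free it; the
sketch then checks whether the cyclic collector reclaims it.

```python
import gc
import weakref


def make_cyclic_generator():
    """Return a generator whose frame indirectly references itself."""
    holder = []

    def gen():
        while True:
            # The closure cell for `holder` lives in the generator's frame.
            yield len(holder)

    g = gen()
    holder.append(g)  # loop: g -> frame -> holder -> g
    return g


def cycle_is_collected():
    """Report whether the generator is alive before and after gc.collect()."""
    gc.disable()  # keep automatic collection from interfering
    try:
        g = make_cyclic_generator()
        next(g)
        probe = weakref.ref(g)
        del g
        alive_before = probe() is not None
        gc.collect()
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

The first value shows that dropping the last external reference is not
enough to free the generator; the second shows what the collector does
with the loop.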

	Oren


From pinard@iro.umontreal.ca  Tue Jul 30 17:44:06 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 30 Jul 2002 12:44:06 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <200207301539.g6UFdUS09930@odiug.zope.com>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
Message-ID: <oqfzy1i2sp.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> Since we don't use this idiom, we can safely remove the
> -DHAVE_CONFIG_H (if we can find where it is set).

I guess you will have to override some `m4' macro within `configure.in', or
related machinery.  If things did not change too much, this probably means
diving into `acgeneral.m4', to find out how and where this is best done.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From nas@python.ca  Tue Jul 30 17:56:58 2002
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 30 Jul 2002 09:56:58 -0700
Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators)
In-Reply-To: <20020730163927.GA63620@hishome.net>; from oren-py-d@hishome.net on Tue, Jul 30, 2002 at 12:39:27PM -0400
References: <p05111704b96c158e85c7@[192.109.102.36]> <20020730163927.GA63620@hishome.net>
Message-ID: <20020730095658.A3196@glacier.arctrix.com>

Oren Tirosh wrote:
> It doesn't seem to be worth the effort of making generators 
> into GC objects just for this.

What do you mean?  They are already GC objects.

  Neil


From thomas.heller@ion-tof.com  Tue Jul 30 17:51:41 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 30 Jul 2002 18:51:41 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook>              <029801c23758$e13594b0$3da48490@neil>  <200207301639.g6UGd1S17363@odiug.zope.com>
Message-ID: <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> > > ..., but I understand Neil's requirements.
> > > 
> > > Can they be fulfilled by adding some kind of UnlockObject()
> > > call to the 'safe buffer interface', which should mean 'I won't
> > > use the pointer received by getsaferead/writebufferproc any more'?
> > 
> >    Yes, that is exactly what I want.
> 
> I guess I still don't understand Neil's requirements.  What can't be
> done with the existing buffer interface (which requires you to hold
> the GIL while using the pointer)?

Processing in Python :-(.

Thomas



From pinard@iro.umontreal.ca  Tue Jul 30 17:53:38 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 30 Jul 2002 12:53:38 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <200207301622.g6UGMBl17143@odiug.zope.com>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
 <200207301622.g6UGMBl17143@odiug.zope.com>
Message-ID: <oqbs8pi2ct.fsf@titan.progiciels-bpi.ca>

[Guido van Rossum]

> > Since we don't use this idiom, we can safely remove the
> > -DHAVE_CONFIG_H (if we can find where it is set).

> I looked.  It's generated by AC_OUTPUT.  I don't think I can get rid
> of it.  So never mind. :-)

Maybe AC_OUTPUT, or macros called by AC_OUTPUT, can be overridden.  If this
is not easy to do, you might want to discuss the matter with Akim, Cc:ed.
Maybe he could tear down AC_OUTPUT in parts so the overriding gets easier?

I know my friend Akim to be a good, helpful and nice fellow!  Don't fear him! :-)

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From thomas.heller@ion-tof.com  Tue Jul 30 18:37:19 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 30 Jul 2002 19:37:19 +0200
Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface
Message-ID: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook>

Here is PEP 298 - the Fixed Buffer Interface, posted to
get feedback from the Python community.
Enjoy!

Thomas

PS: I'm going on a two-week vacation at the end of this week,
so don't hold your breath on replies from me if you post
after, let's say, Thursday.

-----
PEP: 298
Title: The Fixed Buffer Interface
Version: $Revision: 1.3 $
Last-Modified: $Date: 2002/07/30 16:52:53 $
Author: Thomas Heller <theller@python.net>
Status: Draft
Type: Standards Track
Created: 26-Jul-2002
Python-Version: 2.3
Post-History:


Abstract

    This PEP proposes an extension to the buffer interface called the
    'fixed buffer interface'.

    The fixed buffer interface fixes two flaws of the 'old' buffer
    interface [1] as defined in Python versions up to and including
    2.2.  With the new interface:

        The lifetime of the retrieved pointer is clearly defined.

        The buffer size is returned as a 'size_t' data type, which
        allows access to large buffers on platforms where sizeof(int)
        != sizeof(void *).
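
    The size argument can be made concrete with a small sketch.  It
    assumes the standard ctypes module of a current Python 3, which is
    not part of this proposal; length_limits is a name invented for
    the example.  It compares the largest length a C 'int' can report
    with the largest a 'size_t' can report on the running platform.

```python
import ctypes


def length_limits():
    """Compare the largest buffer length expressible as int vs. size_t."""
    int_bits = 8 * ctypes.sizeof(ctypes.c_int)
    size_t_bits = 8 * ctypes.sizeof(ctypes.c_size_t)
    return {
        # int is signed, so one bit is lost to the sign.
        "int_max": 2 ** (int_bits - 1) - 1,
        # size_t is unsigned.
        "size_t_max": 2 ** size_t_bits - 1,
        "pointer_bits": 8 * ctypes.sizeof(ctypes.c_void_p),
    }
```

    On a typical 64-bit platform the 'int' limit is far below what a
    pointer can address, which is the gap the 'size_t' return type is
    meant to close.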


Specification

    The fixed buffer interface exposes new functions which return the
    size and the pointer to the internal memory block of any python
    object which chooses to implement this interface.

    The size and pointer returned must be valid as long as the object
    is alive (has a positive reference count).  So, only objects which
    never reallocate or resize the memory block are allowed to
    implement this interface.

    The fixed buffer interface omits the memory segment model which is
    present in the old buffer interface - only a single memory block
    can be exposed.


Implementation

    Define a new flag in Include/object.h:

        /* PyBufferProcs contains bf_getfixedreadbuffer
           and bf_getfixedwritebuffer */
        #define Py_TPFLAGS_HAVE_GETFIXEDBUFFER (1L<<15)


    This flag would be included in Py_TPFLAGS_DEFAULT:

        #define Py_TPFLAGS_DEFAULT  ( \
                             ....
                             Py_TPFLAGS_HAVE_GETFIXEDBUFFER | \
                             ....
                            0)


    Extend the PyBufferProcs structure by new fields in
    Include/object.h:

        typedef size_t (*getfixedreadbufferproc)(PyObject *, void **);
        typedef size_t (*getfixedwritebufferproc)(PyObject *, void **);

        typedef struct {
                getreadbufferproc bf_getreadbuffer;
                getwritebufferproc bf_getwritebuffer;
                getsegcountproc bf_getsegcount;
                getcharbufferproc bf_getcharbuffer;
                /* fixed buffer interface functions */
                getfixedreadbufferproc bf_getfixedreadbuffer;
                getfixedwritebufferproc bf_getfixedwritebuffer;
        } PyBufferProcs;


    The new fields are present if the Py_TPFLAGS_HAVE_GETFIXEDBUFFER
    flag is set in the object's type.

    The Py_TPFLAGS_HAVE_GETFIXEDBUFFER flag implies the
    Py_TPFLAGS_HAVE_GETCHARBUFFER flag.

    On success, the getfixedreadbufferproc and getfixedwritebufferproc
    functions return the size in bytes of the memory block and fill in
    the passed void * pointer.  If these functions fail
    - either because an error occurs or no memory block is exposed -
    they must set the void * pointer to NULL and raise an exception.
    The return value is undefined in these cases and should not be
    used.

    Usually the getfixedwritebufferproc and getfixedreadbufferproc
    functions aren't called directly; they are called through
    convenience functions declared in Include/abstract.h:

        int PyObject_AsFixedReadBuffer(PyObject *obj,
                                      void **buffer,
                                      size_t *buffer_len);

        int PyObject_AsFixedWriteBuffer(PyObject *obj,
                                       void **buffer,
                                       size_t *buffer_len);

    These functions return 0 on success, set buffer to the memory
    location and buffer_len to the length of the memory block in
    bytes. On failure, or if the fixed buffer interface is not
    implemented by obj, they return -1 and set an exception.


Backward Compatibility

    The size of the PyBufferProcs structure changes if this proposal
    is implemented, but the type's tp_flags slot can be used to
    determine if the additional fields are present.


Reference Implementation

    Will be uploaded to the SourceForge patch manager by the author.


Additional Notes/Comments

    Python strings, Unicode strings, mmap objects, and maybe other
    types would expose the fixed buffer interface, but the array type
    would *not*, because its memory block may be reallocated during
    its lifetime.


Community Feedback

    Greg Ewing doubts the fixed buffer interface is needed at all; he
    thinks the normal buffer interface could be used if the pointer is
    (re)fetched each time it's used.  This seems to be dangerous,
    because even innocent-looking calls to the Python API like
    Py_DECREF() may trigger execution of arbitrary Python code.

    Neil Hodgson wants to expose pointers to memory blocks with
    limited lifetime: do some kind of lock operation on the object,
    retrieve the pointer, use it, and unlock the object again.  While
    the author sees the need for this, it cannot be addressed by this
    proposal.  Being required to call a function once the pointer
    received from the getfixedbufferprocs is no longer in use seems
    too error-prone.


Credits

    Scott Gilbert came up with the name 'fixed buffer interface'.


References

    [1] The buffer interface
        http://mail.python.org/pipermail/python-dev/2000-October/009974.html

    [2] The Buffer Problem
        http://www.python.org/peps/pep-0296.html


Copyright

    This document has been placed in the public domain.



Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:






From martin@v.loewis.de  Tue Jul 30 18:55:59 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: 30 Jul 2002 19:55:59 +0200
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <200207301622.g6UGMBl17143@odiug.zope.com>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
 <200207301622.g6UGMBl17143@odiug.zope.com>
Message-ID: <m3y9btrtg0.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I looked.  It's generated by AC_OUTPUT.  I don't think I can get rid
> of it.  So never mind. :-)

Just remove the @DEFS@ from Makefile.pre.in.

Regards,
Martin


From oren-py-d@hishome.net  Tue Jul 30 19:13:08 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 30 Jul 2002 21:13:08 +0300
Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in generators)
In-Reply-To: <20020730095658.A3196@glacier.arctrix.com>; from nas-dated-1028480220.f9673d@python.ca on Tue, Jul 30, 2002 at 09:56:58AM -0700
References: <p05111704b96c158e85c7@[192.109.102.36]> <20020730163927.GA63620@hishome.net> <20020730095658.A3196@glacier.arctrix.com>
Message-ID: <20020730211308.A27690@hishome.net>

On Tue, Jul 30, 2002 at 09:56:58AM -0700, Neil Schemenauer wrote:
> Oren Tirosh wrote:
> > It doesn't seem to be worth the effort of making generators 
> > into GC objects just for this.
> 
> What do you mean.  They are already GC objects.

Ooops. 

	Oren



From guido@python.org  Tue Jul 30 19:57:00 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 14:57:00 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: Your message of "Tue, 30 Jul 2002 12:44:06 EDT."
 <oqfzy1i2sp.fsf@titan.progiciels-bpi.ca>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de> <oqptx6qfhl.fsf@titan.progiciels-bpi.ca> <200207301539.g6UFdUS09930@odiug.zope.com>
 <oqfzy1i2sp.fsf@titan.progiciels-bpi.ca>
Message-ID: <200207301857.g6UIv0G17893@odiug.zope.com>

> > Since we don't use this idiom, we can safely remove the
> > -DHAVE_CONFIG_H (if we can find where it is set).
> 
> I guess you will have to override some `m4' macro within `configure.in', or
> related machinery.  If things did not change too much, this probably means
> diving into `acgeneral.m4', to find out how and where this is best done.

I haven't the guts.  Would you mind sending a patch?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jul 30 19:59:06 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 14:59:06 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Tue, 30 Jul 2002 18:51:41 +0200."
 <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook>
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com>
 <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook>
Message-ID: <200207301859.g6UIx6117906@odiug.zope.com>

> From: "Guido van Rossum" <guido@python.org>
> > > > ..., but I understand Neil's requirements.
> > > > 
> > > > Can they be fulfilled by adding some kind of UnlockObject()
> > > > call to the 'safe buffer interface', which should mean 'I won't
> > > > use the pointer received by getsaferead/writebufferproc any more'?
> > > 
> > >    Yes, that is exactly what I want.
> > 
> > I guess I still don't understand Neil's requirements.  What can't be
> > done with the existing buffer interface (which requires you to hold
> > the GIL while using the pointer)?
> 
> Processing in Python :-(.

Can you work out an example?  I don't understand what you can do in
Python, apart from passing it to something else that takes the buffer
API or converting the data to a string or a bytes buffer.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jul 30 20:06:47 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 15:06:47 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: Your message of "Tue, 30 Jul 2002 14:57:00 EDT."
 <200207301857.g6UIv0G17893@odiug.zope.com>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de> <oqptx6qfhl.fsf@titan.progiciels-bpi.ca> <200207301539.g6UFdUS09930@odiug.zope.com> <oqfzy1i2sp.fsf@titan.progiciels-bpi.ca>
 <200207301857.g6UIv0G17893@odiug.zope.com>
Message-ID: <200207301906.g6UJ6l619069@odiug.zope.com>

> > > Since we don't use this idiom, we can safely remove the
> > > -DHAVE_CONFIG_H (if we can find where it is set).
> > 
> > I guess you will have to override some `m4' macro within `configure.in', or
> > related machinery.  If things did not change too much, this probably means
> > diving into `acgeneral.m4', to find out how and where this is best done.
> 
> I haven't the guts.  Would you mind sending a patch?

Never mind.  Getting rid of DEFS from Makefile.pre.in did the trick.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Tue Jul 30 20:22:53 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Tue, 30 Jul 2002 21:22:53 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com>              <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook>  <200207301859.g6UIx6117906@odiug.zope.com>
Message-ID: <063301c237fe$80506b10$e000a8c0@thomasnotebook>

[Guido]
> > > I guess I still don't understand Neil's requirements.  What can't be
> > > done with the existing buffer interface (which requires you to hold
> > > the GIL while using the pointer)?
> > 
> > Processing in Python :-(.
> 
> Can you work out an example?
Not sure, maybe Neil could do it better.

However, you yourself pointed out to Greg that it may be unsafe
to even call Py_DECREF() on an unrelated object.

>  I don't understand what you can do in
> Python, apart from passing it to something else that takes the buffer
> API or converting the data to a string or a bytes buffer.

Or pack it into a buffer *object* and hand it to arbitrary
Python code. That's what we have now.

What does 'hold the GIL' mean in this context?
No other thread can execute: we have complete control
over what we do. But what are we *allowed* to do?

Thomas



From guido@python.org  Tue Jul 30 20:37:37 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 15:37:37 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Tue, 30 Jul 2002 21:22:53 +0200."
 <063301c237fe$80506b10$e000a8c0@thomasnotebook>
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com>
 <063301c237fe$80506b10$e000a8c0@thomasnotebook>
Message-ID: <200207301937.g6UJbb220763@odiug.zope.com>

> > > > I guess I still don't understand Neil's requirements.  What can't be
> > > > done with the existing buffer interface (which requires you to hold
> > > > the GIL while using the pointer)?
> > > 
> > > Processing in Python :-(.
> > 
> > Can you work out an example?
> Not sure, maybe Neil could do it better.
> 
> However, you yourself pointed out to Greg that it may be unsafe
> to even call Py_DECREF() on an unrelated object.

The safe rule is that you should grab the pointer and then do some I/O
on it and nothing else.

> >  I don't understand what you can do in
> > Python, apart from passing it to something else that takes the buffer
> > API or converting the data to a string or a bytes buffer.
> 
> Or pack it into a buffer *object* and hand it to arbitrary
> Python code. That's what we have now.

Since the object you're packing already supports the buffer API, I
don't see the point of packing it in a buffer object.

> What does 'hold the GIL' mean in this context?
> No other thread can execute: we have complete control
> over what we do. But what are we *allowed* to do?

When accessing a movable buffer, the safest rule is no Python API
calls.  There's a less restrictive safe rule, but it's messy because
the end goal is "don't do anything that could conceivably end up in
the Python interpreter main loop (ceval.c)" and there's no easy rule
for that -- anything that uses Py_DECREF can end up doing that.
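
A rough Python-level sketch of why a plain Py_DECREF is risky: dropping
the last reference to an object can run its __del__ method, and that
method may do anything, including resizing the very buffer a C caller is
pointing into.  The example assumes CPython's immediate refcount-driven
finalization; ResizeOnDelete and demo are names made up for illustration.

```python
from array import array


class ResizeOnDelete:
    """Resizes a target array when the last reference to this object dies."""

    def __init__(self, target):
        self.target = target

    def __del__(self):
        # Arbitrary Python code, run as a side effect of a decref.
        self.target.extend(range(100))


def demo():
    data = array("b", [1, 2, 3])
    _, length_before = data.buffer_info()
    victim = ResizeOnDelete(data)
    del victim  # in CPython this drops the refcount to zero right here
    _, length_after = data.buffer_info()
    return length_before, length_after
```

The array grows behind the caller's back, so a pointer fetched before
the `del` could no longer be trusted afterwards.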

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jul 30 20:46:41 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 15:46:41 -0400
Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface
In-Reply-To: Your message of "Tue, 30 Jul 2002 19:37:19 +0200."
 <04da01c237ef$c103ac30$e000a8c0@thomasnotebook>
References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook>
Message-ID: <200207301946.g6UJkf520799@odiug.zope.com>

> Here is PEP 298 - the Fixed Buffer Interface, posted to
> get feedback from the Python community.
> Enjoy!

+1 from me (but you already knew that).

> Thomas
> 
> PS: I'll going to a 2 weeks vacation at the end of this week,
> so don't hold your breath on replies from me if you post
> after, let's say, thursday.
> 
> -----
> PEP: 298
> Title: The Fixed Buffer Interface
> Version: $Revision: 1.3 $
> Last-Modified: $Date: 2002/07/30 16:52:53 $
> Author: Thomas Heller <theller@python.net>
> Status: Draft
> Type: Standards Track
> Created: 26-Jul-2002
> Python-Version: 2.3
> Post-History:
> 
> 
> Abstract
> 
>     This PEP proposes an extension to the buffer interface called the
>     'fixed buffer interface'.
> 
>     The fixed buffer interface fixes the flaws of the 'old' buffer
>     interface as defined in Python versions up to and including 2.2,
>     see [1]:

(I keep reading this backwards, thinking that the following two items
list the flaws in [1]. :-)

>         The lifetime of the retrieved pointer is clearly defined.
> 
>         The buffer size is returned as a 'size_t' data type, which
>         allows access to large buffers on platforms where sizeof(int)
>         != sizeof(void *).

This second sounds like a change we could also make to the "old"
buffer interface, if we introduce another flag bit that's *not* part
of the default flags.

> Specification
> 
>     The fixed buffer interface exposes new functions which return the
>     size and the pointer to the internal memory block of any python
>     object which chooses to implement this interface.
> 
>     The size and pointer returned must be valid as long as the object
>     is alive (has a positive reference count).  So, only objects which
>     never reallocate or resize the memory block are allowed to
>     implement this interface.
> 
>     The fixed buffer interface omits the memory segment model which is
>     present in the old buffer interface - only a single memory block
>     can be exposed.
> 
> 
> Implementation
> 
>     Define a new flag in Include/object.h:
> 
>         /* PyBufferProcs contains bf_getfixedreadbuffer
>            and bf_getfixedwritebuffer */
>         #define Py_TPFLAGS_HAVE_GETFIXEDBUFFER (1L<<15)
> 
> 
>     This flag would be included in Py_TPFLAGS_DEFAULT:
> 
>         #define Py_TPFLAGS_DEFAULT  ( \
>                              ....
>                              Py_TPFLAGS_HAVE_GETFIXEDBUFFER | \
>                              ....
>                             0)
> 
> 
>     Extend the PyBufferProcs structure by new fields in
>     Include/object.h:
> 
>         typedef size_t (*getfixedreadbufferproc)(PyObject *, void **);
>         typedef size_t (*getfixedwritebufferproc)(PyObject *, void **);
> 
>         typedef struct {
>                 getreadbufferproc bf_getreadbuffer;
>                 getwritebufferproc bf_getwritebuffer;
>                 getsegcountproc bf_getsegcount;
>                 getcharbufferproc bf_getcharbuffer;
>                 /* fixed buffer interface functions */
>                 getfixedreadbufferproc bf_getfixedreadbufferproc;
>                 getfixedwritebufferproc bf_getfixedwritebufferproc;
>         } PyBufferProcs;
> 
> 
>     The new fields are present if the Py_TPFLAGS_HAVE_GETFIXEDBUFFER
>     flag is set in the object's type.
> 
>     The Py_TPFLAGS_HAVE_GETFIXEDBUFFER flag implies the
>     Py_TPFLAGS_HAVE_GETCHARBUFFER flag.
> 
>     The getfixedreadbufferproc and getfixedwritebufferproc functions
>     return the size in bytes of the memory block on success, and fill
>     in the passed void * pointer on success.  If these functions fail
>     - either because an error occurs or no memory block is exposed -
>     they must set the void * pointer to NULL and raise an exception.
>     The return value is undefined in these cases and should not be
>     used.
> 
>     Usually the getfixedwritebufferproc and getfixedreadbufferproc
>     functions aren't called directly, they are called through
>     convenience functions declared in Include/abstract.h:
> 
>         int PyObject_AsFixedReadBuffer(PyObject *obj,
>                                       void **buffer,
>                                       size_t *buffer_len);
> 
>         int PyObject_AsFixedWriteBuffer(PyObject *obj,
>                                        void **buffer,
>                                        size_t *buffer_len);
> 
>     These functions return 0 on success, set buffer to the memory
>     location and buffer_len to the length of the memory block in
>     bytes. On failure, or if the fixed buffer interface is not
>     implemented by obj, they return -1 and set an exception.
> 
> 
> Backward Compatibility
> 
>     The size of the PyBufferProcs structure changes if this proposal
>     is implemented, but the type's tp_flags slot can be used to
>     determine if the additional fields are present.
> 
> 
> Reference Implementation
> 
>     Will be uploaded to the SourceForge patch manager by the author.

I'm holding my breath now...

> 
> Additional Notes/Comments
> 
>     Python strings, Unicode strings, mmap objects, and maybe other
>     types would expose the fixed buffer interface, but the array type
>     would *not*, because its memory block may be reallocated during
>     its lifetime.
> 
> 
> Community Feedback
> 
>     Greg Ewing doubts the fixed buffer interface is needed at all, he
>     thinks the normal buffer interface could be used if the pointer is
>     (re)fetched each time it's used.  This seems to be dangerous,
>     because even innocent looking calls to the Python API like
>     Py_DECREF() may trigger execution of arbitrary Python code.
> 
>     Neil Hodgson wants to expose pointers to memory blocks with
>     limited lifetime: do some kind of lock operation on the object,
>     retrieve the pointer, use it, and unlock the object again.  While
>     the author sees the need for this, it cannot be addressed by this
>     proposal.  Beeing required to call a function after not using the
                  x

>     pointer received by the getfixedbufferprocs any more seems too
>     error prone.
> 
> 
> Credits
> 
>     Scott Gilbert came up with the name 'fixed buffer interface'.
> 
> 
> References
> 
>     [1] The buffer interface
>         http://mail.python.org/pipermail/python-dev/2000-October/009974.html
> 
>     [2] The Buffer Problem
>         http://www.python.org/peps/pep-0296.html
> 
> 
> Copyright
> 
>     This document has been placed in the public domain.
> 
> 
> 
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> End:

--Guido van Rossum (home page: http://www.python.org/~guido/)


From oren-py-d@hishome.net  Tue Jul 30 21:15:11 2002
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Tue, 30 Jul 2002 23:15:11 +0300
Subject: [Python-Dev] Valgrinding Python
Message-ID: <20020730231511.A28762@hishome.net>

I ran some tests with Julian Seward's amazing Valgrind memory debugger.
Python is remarkably clean.  Much cleaner than any other program of 
non-trivial size that I tested.

Objects/obmalloc.c:
  
   The ADDRESS_IN_RANGE macro makes references to uninitialized memory.

This produced tons of warnings so I ran the rest of the tests without 
pymalloc.

The following tests produced invalid accesses inside the external
library:

test_anydbm.py
test_bsddb.py
test_dbm.py
test_gdbm.py
test_curses.py
test_pwd.py
test_socket_ssl.py

I also got some invalid accesses in Modules/arraymodule.c:array_ass_subscr
while running test_array and in Objects/listobject.c:list_ass_subscript
while running test_types.  For some reason I couldn't reproduce them later.

	Oren



From jacobs@penguin.theopalgroup.com  Tue Jul 30 21:21:36 2002
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 30 Jul 2002 16:21:36 -0400 (EDT)
Subject: [Python-Dev] Valgrinding Python
In-Reply-To: <20020730231511.A28762@hishome.net>
Message-ID: <Pine.LNX.4.44.0207301618060.13919-100000@penguin.theopalgroup.com>

On Tue, 30 Jul 2002, Oren Tirosh wrote:
> I ran some tests with Julian Seward's amazing Valgrind memory debugger.
> Python is remarkably clean.  Much cleaner than any other program of 
> non-trivial size that I tested.

I've been using Python with valgrind too, and with great success.  I've
caught several non-trivial problems in some of our extension modules, though
only a few very picky things in the Python core.  Valgrind has options to
attach gdb to running processes when problems occur.  Combine this with a
gdb patched to produce mixed C/Python tracebacks, and you get an awesome
memory debugger.

-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From nhodgson@bigpond.net.au  Tue Jul 30 21:55:39 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Wed, 31 Jul 2002 06:55:39 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com>              <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook>  <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook>
Message-ID: <004e01c2380b$762ef5e0$3da48490@neil>

Thomas Heller (Guido, Thomas, Guido):
> [Guido]
> > > > I guess I still don't understand Neil's requirements.  What can't be
> > > > done with the existing buffer interface (which requires you to hold
> > > > the GIL while using the pointer)?
> > >
> > > Processing in Python :-(.
> >
> > Can you work out an example?
> Not sure, maybe Neil could do it better.

   I see this interface as a bridge between objects offering generic buffer
oriented facilities (asynch or low level I/O for example) and objects that
want to make it possible to use these facilities on their data (text
buffers, multimedia buffers, numeric arrays) by yielding a pointer to their
otherwise internal data.

   The bridging code between the two objects is unrestricted Python code
that may cause memory to be moved around.
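
   As a hedged illustration of the lock/use/unlock pattern, the sketch
below uses memoryview from current Python 3, which is not the interface
proposed in this thread; resize_while_exported is an invented name.  While
a view is held, the bytearray refuses to be resized, and releasing the
view plays the role of the unlock call.

```python
def resize_while_exported():
    """Try to resize a bytearray while a view on it is outstanding."""
    buf = bytearray(b"abc")
    view = memoryview(buf)  # "lock": the buffer is now exported
    try:
        buf.extend(b"def")  # would move or resize the memory block
        blocked = False
    except BufferError:
        blocked = True
    view.release()  # "unlock": the pointer is no longer in use
    buf.extend(b"def")  # resizing is allowed again
    return blocked, bytes(buf)
```

   The bridging code can run arbitrary Python between the two calls, and
the object itself enforces that its memory stays put in the meantime.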

   Neil



From guido@python.org  Tue Jul 30 22:13:00 2002
From: guido@python.org (Guido van Rossum)
Date: Tue, 30 Jul 2002 17:13:00 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Wed, 31 Jul 2002 06:55:39 +1000."
 <004e01c2380b$762ef5e0$3da48490@neil>
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook>
 <004e01c2380b$762ef5e0$3da48490@neil>
Message-ID: <200207302113.g6ULD0N21213@odiug.zope.com>

>    I see this interface as a bridge between objects offering generic buffer
> oriented facilities (asynch or low level I/O for example) and objects that
> want to make it possible to use these facilities on their data (text
> buffers, multimedia buffers, numeric arrays) by yielding a pointer to their
> otherwise internal data.
> 
>    The bridging code between the two objects is unrestricted Python code
> that may cause memory to be moved around.

If the buffer is relatively small, copying the data an extra time
shouldn't be a problem, and you can use the old API.

If the buffer is huge, you probably shouldn't want to move the buffer
around in memory anyway.

So I don't think your case for needing a lockable interface is very
strong.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Tue Jul 30 22:56:35 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 30 Jul 2002 17:56:35 -0400
Subject: [Python-Dev] Generator cleanup idea (patch: try/finally in
 generators)
In-Reply-To: <200207300112.g6U1CJoO018210@kuku.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEMBAIAB.tim.one@comcast.net>

[Greg Ewing]
> I don't think you'd really be breaking any promises.
> After all, if someone wrote
>
>   def asdf():
>     try:
>       something_that_never_returns()
>     finally:
>       ...
>
> they wouldn't have much ground for complaint that the
> finally never got executed. The case we're talking about
> seems much the same situation.

Not to me -- you can't write something_that_never_returns() in Python unless
the program runs forever, you crash the system, you get the thread stuck in
deadlock or permanent starvation, or you're anti-social by calling
os._exit() (sys.exit() is fine:  it raises SystemExit, and pending finally
blocks get run then).  All of those are highly exceptional use cases;
everyone else is guaranteed their finally block will eventually run.
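
A minimal sketch of the sys.exit() point above (the function and variable
names are made up for illustration): SystemExit is an ordinary exception, so
a pending finally block runs while it propagates.

```python
import sys

def run_with_cleanup(log):
    """Call sys.exit() inside try/finally and record what actually runs."""
    try:
        log.append("working")
        sys.exit(3)              # raises SystemExit; the process is not killed outright
    finally:
        log.append("cleanup")    # still runs while SystemExit propagates

log = []
try:
    run_with_cleanup(log)
except SystemExit as exc:
    # Caught here only so the sketch can inspect the result.
    log.append("exit code %r" % exc.code)

print(log)
```

Swapping sys.exit(3) for os._exit(3) would end the process on the spot, with
no cleanup entry recorded.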

> I take it you usually provide a method for explicit cleanup.

Yup.

> How about giving generator-iterators one, then, called
> maybe close() or abort(). The effect would be to raise
> an appropriate exception at the point of the yield,
> triggering any except or finally blocks.

As before, I'm already happy; sharing state via instance variables is all
"the solution" I've felt a need for.  If consensus is that something needs
to be done here anyway, I'd rather think of generators more as threads of
control than as lumps of data with attributes.  From that view, I think it
would be easier to make a coherent case that generators should support a
termination protocol involving raising SystemExit.  But then that should
apply to all thread-like objects too, and there's no way now for one thread
to raise SystemExit in another (but it's arguable that there should be).
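
For reference, a close() method of roughly this shape did land on generators
in later Python versions (2.5 onward), raising GeneratorExit rather than
SystemExit at the paused yield. A minimal sketch, assuming such a version;
numbered_lines is a hypothetical helper:

```python
def numbered_lines(lines, log):
    """Yield (number, line) pairs and record cleanup in `log`."""
    try:
        for number, line in enumerate(lines, 1):
            yield number, line
    finally:
        # Runs when close() raises GeneratorExit at the suspended yield.
        log.append("cleaned up")

log = []
gen = numbered_lines(["a", "b", "c"], log)
first = next(gen)   # advance to the first yield so the try block is active
gen.close()         # triggers the finally block without exhausting the generator
```

Calling close() on a generator that has not been started yet skips the body
entirely, so the finally block only fires once the try has been entered.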

> This method could even be added to the general iterator
> protocol (implementing it would be optional). It would
> then provide a standard name for people to use for
> cleanup methods in their own iterator classes.

Generalizing from zero examples <wink>?



From tim.one@comcast.net  Tue Jul 30 23:53:04 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 30 Jul 2002 18:53:04 -0400
Subject: [Python-Dev] Valgrinding Python
In-Reply-To: <20020730231511.A28762@hishome.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMHAIAB.tim.one@comcast.net>

[Oren Tirosh]
> I ran some tests with Julian Seward's amazing Valgrind memory debugger.
> Python is remarkably clean.  Much cleaner than any other program of
> non-trivial size that I tested.

It's been thru Purify and Insure++, off and on, several times, and we
enjoyed many wasted hours squashing spurious complaints from those <wink>.

> Objects/obmalloc.c:
>
>    The ADDRESS_IN_RANGE macro makes references to uninitialized memory.
>
> This produced tons of warnings so I ran the rest of the tests without
> pymalloc.

Ouch.  That's not going to change, so it may be worth learning how to write
a Valgrind suppression file.  ADDRESS_IN_RANGE determines whether an address
was passed out by pymalloc.  It does this by (a) reading an index from an
address computed *from* the claimant address; then (b) using that to index
into its own data structures, which record the range of addresses pymalloc
controls; then (c) comparing the claimant address to that range.  Part #a
can easily end up reading uninitialized memory, but pymalloc doesn't care (a
junk value found there can't fool it).  This is needed to determine whether
to hand off an address to the platform free() or realloc(), and in such
cases part #a may well read up any kind of trash.
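
The three steps can be modelled in a few lines of Python. This is only a toy
simulation of the idea, not the C macro: the pool and arena sizes, the class
name, and the use of a random number to stand in for an uninitialized read
are all assumptions made for illustration.

```python
import random

POOL_SIZE = 4096            # assumed pool size for this toy model
ARENA_SIZE = 256 * 1024     # assumed arena size for this toy model

class ToyAllocator:
    def __init__(self, arena_bases, seed=0):
        self.arena_bases = list(arena_bases)   # base address of each arena we own
        self._rng = random.Random(seed)
        self._pool_headers = {}                # pool base address -> arena index
        for index, base in enumerate(self.arena_bases):
            for pool in range(base, base + ARENA_SIZE, POOL_SIZE):
                self._pool_headers[pool] = index

    def _read_index(self, pool_base):
        # (a) Read the index stored at an address computed from the claimant.
        # For foreign memory this models an uninitialized read: any junk value.
        if pool_base in self._pool_headers:
            return self._pool_headers[pool_base]
        return self._rng.randrange(-5, 5)

    def address_in_range(self, addr):
        pool_base = addr & ~(POOL_SIZE - 1)
        index = self._read_index(pool_base)
        # (b) Use it to index our own table, after bounds-checking it.
        if not 0 <= index < len(self.arena_bases):
            return False
        base = self.arena_bases[index]
        # (c) Compare the claimant address against the range we control.
        return base <= addr < base + ARENA_SIZE

alloc = ToyAllocator([0x100000, 0x200000])
owned = alloc.address_in_range(0x100000 + 1234)
foreign = [alloc.address_in_range(0x900000 + i * 64) for i in range(100)]
```

Even when the junk value happens to be a valid arena index, step (c) still
rejects the foreign address, which is why the uninitialized read is harmless.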

> The following tests produced invalid accesses inside the external
> library:
>
> test_anydbm.py
> test_bsddb.py
> test_dbm.py
> test_gdbm.py
> test_curses.py
> test_pwd.py
> test_socket_ssl.py

Figures <wink>.

> I also got some invalid accesses in
> Modules/arraymodule.c:array_ass_subscr
> while running test_array and in Objects/listobject.c:list_ass_subscript
> running test_types. For some reason I couldn't reproduce them later.

Another memory-debugging tool, another chance to debug a memory-debugging
tool.



From neal@metaslash.com  Wed Jul 31 00:15:34 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 30 Jul 2002 19:15:34 -0400
Subject: [Python-Dev] Valgrinding Python
References: <LNBBLJKPBEHFEDALKOLCCEMHAIAB.tim.one@comcast.net>
Message-ID: <3D471E16.A01B14D8@metaslash.com>

Tim Peters wrote:
> 
> [Oren Tirosh]
> 
> > I also got some invalid accesses in
> > Modules/arraymodule.c:array_ass_subscr
> > while running test_array and in Objects/listobject.c:list_ass_subscript
> > running test_types. For some reason I couldn't reproduce them later.
> 
> Another memory-debugging tool, another chance to debug a memory-debugging
> tool.

Naw, cvs update can explain this one. :-)

Michael Hudson fixed this (extended slice problem) based 
on a bug report I submitted.  I ran valgrind on RedHat 7.2.

I also had problems w/pymalloc originally so I disabled it.  
I may try again.  There's something I found very interesting, though.

I run purify on a sparc w/gcc 2.95.3 (maybe 3.0.x too, 
I can't remember).  The problems with pymalloc and some of the dbm
problems were also reported by purify.  I've reviewed the code
and can't find any problems.  But different tools on different
architectures with somewhat different compilers report similar errors.

Neal


From greg@cosc.canterbury.ac.nz  Wed Jul 31 00:34:29 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jul 2002 11:34:29 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <20020730061026.33569.qmail@web40106.mail.yahoo.com>
Message-ID: <200207302334.g6UNYTZ7018964@kuku.cosc.canterbury.ac.nz>

Scott Gilbert <xscottg@yahoo.com>:

> We haven't seen a semi-thorough use case where the locking behavior is
> beneficial yet. ... If there is no realizable benefit to the
> acquire/release semantics of the new interface, then this is just extra
> burden too.

The proposer of the original safe-buffer interface claimed to have a
use case where the existing buffer interface is not safe enough,
involving asynchronous I/O. I've been basing my comments on the
assumption that he does actually have a need for it.

The original proposal was restricted to non-resizable objects. I
suggested a small extension which would remove this restriction, at
what seems to me quite a small cost.

It may turn out that the restriction is easily lived with. On the
other hand, we might decide later that it's a nuisance. What worries
me is if we design a restricted safe-buffer interface now, and start
using it, and later decide that we want an unrestricted safe-buffer
interface, we'll then have two different safe-buffer interfaces
around, with lots of code that will only accept non-resizable objects
for no reason other than that it's using the old interface.

So I think it's worth putting in some thought and getting it as
right as we can from the beginning.

> I'm concerned that this is very much like the segment count features
> of the current PyBufferProcs.  It was apparently designed for more
> generality, and while no one uses it, everyone has to check that the
> segment count is one or raise an exception.

It's not as bad as that! My version of the proposal would impose *no*
burden on implementations that did not require locking, for the
following reasons:

1) Locking is an optional task performed by the getxxxbuffer
routines. Objects which do not require locking just don't
do it.

2) For objects not requiring locking, the releasebuffer
operation is a no-op. Such an object can simply not
implement this routine, and the type machinery can fill
it in with a stub.

It does place one extra burden on users of the interface, namely
calling the release routine. But I believe that this could even be
beneficial, in a way. The user is going to have to think about the
lifetime of the pointer, and be sure to keep a reference to the
underlying Python object as long as the pointer is needed.  Having to
keep it around so that you can call the release routine on it would
help to bring this into sharp focus.

> The extension releases the GIL so that another
> thread can work on the array object.

Hey, whoa right there! If you have two threads accessing this array
object simultaneously, you should be using a mutex or semaphore or
something to coordinate them. As I pointed out before, thread
synchronisation is outside the scope of my proposal.

The only purpose of the locking, in my proposal, is to ensure that an
exception occurs instead of a crash if the programmer screws up and
tries to resize an object whose internals are being messed with. It's
up to the programmer to do whatever is necessary to ensure that he
doesn't do that.
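
A rough Python model of that behaviour, assuming a simple lock count. The
class and method names are hypothetical and only loosely echo the
getxxxbuffer/releasebuffer routines discussed here; the choice of BufferError
is also an assumption.

```python
class LockableBuffer:
    """Toy resizable buffer that refuses to resize while it is locked."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._locks = 0                 # outstanding getbuffer() calls

    def getbuffer(self):
        self._locks += 1                # pin the storage
        return self._data

    def releasebuffer(self):
        if self._locks == 0:
            raise RuntimeError("releasebuffer() without matching getbuffer()")
        self._locks -= 1

    def extend(self, more):
        if self._locks:
            # An exception instead of a crash: never move pinned memory.
            raise BufferError("cannot resize while the buffer is locked")
        self._data.extend(more)

buf = LockableBuffer(b"abc")
raw = buf.getbuffer()
try:
    buf.extend(b"def")
    refused = False
except BufferError:
    refused = True
buf.releasebuffer()
buf.extend(b"def")                      # fine once the lock is released
```

Nothing here synchronises threads; the count only turns a programming error
into an exception.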

> If extend() is called while thread 1 has the array locked, it can:
> 
>    A) raise an exception or return an error

Yes. (Raise an exception.)

> Case A is troublesome because depending on thread scheduling/disk
> performance, you will or won't get the exception.

As I said before, you should be synchronising your threads
somehow *before* they operate on the object! If you don't,
you deserve whatever you get.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Jul 31 01:03:55 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jul 2002 12:03:55 +1200 (NZST)
Subject: [Python-Dev] seeing off SET_LINENO
In-Reply-To: <2md6t5ieh2.fsf@starship.python.net>
Message-ID: <200207310003.g6V03tjm018993@kuku.cosc.canterbury.ac.nz>

Michael Hudson <mwh@python.net>:

> My patch means the debugger doesn't stop
> on the "def f():" line -- unsurprisingly, given that no execution ever
> takes place on that line.

If there is no code there, there shouldn't be any
need to stop there, should there?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Jul 31 01:12:55 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jul 2002 12:12:55 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207301537.g6UFbad09910@odiug.zope.com>
Message-ID: <200207310012.g6V0Ctj5019001@kuku.cosc.canterbury.ac.nz>

> I think you misunderstand what I wrote.  A Py_DECREF() for an
> *unrelated* object can invoke Python code (if it ends up deleting a
> class instance with a __del__ method).

I don't see why that's a problem. If the unrelated object's
__del__ ends up messing with the object in question, that's
an issue for the programmer to sort out.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@comcast.net  Wed Jul 31 01:09:59 2002
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 30 Jul 2002 20:09:59 -0400
Subject: [Python-Dev] Valgrinding Python
In-Reply-To: <3D471E16.A01B14D8@metaslash.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEMOAIAB.tim.one@comcast.net>

[Neal Norwitz]
> ...
> I also had problems w/pymalloc originally so I disabled it.
> I may try again.  There's something I found very interesting, though.
>
> I run purify on a sparc w/gcc 2.95.3 (maybe 3.0.x too,
> I can't remember).  The problems with pymalloc and some of the dbm
> problems were also reported by purify.  I've reviewed the code
> and can't find any problems.  But different tools on different
> architectures with somewhat different compilers report similar errors.

pymalloc does read uninitialized memory, and routinely, as explained in the
msg you're replying to.  If that occurs outside code generated for the
ADDRESS_IN_RANGE macro, though, it may be a real problem (inside code
generated by that macro, reading uninitialized memory is-- curiously
enough! --necessary for proper operation).



From greg@cosc.canterbury.ac.nz  Wed Jul 31 01:14:56 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jul 2002 12:14:56 +1200 (NZST)
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <15686.45677.421287.717866@anthem.wooz.org>
Message-ID: <200207310014.g6V0EuUS019007@kuku.cosc.canterbury.ac.nz>

> Originally we thought it was more important to
> be able to contact the author, but there are quite a few reasons to
> revise this intention.  As pointed out, email addresses change.  Also,
> experience has shown that most of the discussions about PEPs are
> conducted on the public forums (mailing lists / newsgroups), so that's
> a fine way to contact the people working on the PEP.  And of course,
> we allow the PEP authors to obfuscate or omit their email addresses
> altogether.

Why not have *two* fields in the PEP, one for the real
name, and the other for an email address?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From barry@python.org  Wed Jul 31 01:49:06 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 30 Jul 2002 20:49:06 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
References: <15686.45677.421287.717866@anthem.wooz.org>
 <200207310014.g6V0EuUS019007@kuku.cosc.canterbury.ac.nz>
Message-ID: <15687.13314.271722.779762@anthem.wooz.org>

>>>>> "GE" == Greg Ewing <greg@cosc.canterbury.ac.nz> writes:

    GE> Why not have *two* fields in the PEP, one for the real
    GE> name, and the other for an email address?

I dunno, that seems like overkill.
-Barry


From greg@cosc.canterbury.ac.nz  Wed Jul 31 02:44:23 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jul 2002 13:44:23 +1200 (NZST)
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <15687.13314.271722.779762@anthem.wooz.org>
Message-ID: <200207310144.g6V1iNgZ019135@kuku.cosc.canterbury.ac.nz>

Barry:

>     GE> Why not have *two* fields in the PEP, one for the real
>     GE> name, and the other for an email address?
> 
> I dunno, that seems like overkill.

It would certainly put an end to this argument, though!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+



From nhodgson@bigpond.net.au  Wed Jul 31 03:15:28 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Wed, 31 Jul 2002 12:15:28 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020729002957.74716.qmail@web40101.mail.yahoo.com> <00c601c23707$35819a20$3da48490@neil> <06f301c2370d$16941060$e000a8c0@thomasnotebook> <029801c23758$e13594b0$3da48490@neil> <200207301639.g6UGd1S17363@odiug.zope.com> <03a101c237e9$60fb3a20$e000a8c0@thomasnotebook> <200207301859.g6UIx6117906@odiug.zope.com> <063301c237fe$80506b10$e000a8c0@thomasnotebook>              <004e01c2380b$762ef5e0$3da48490@neil>  <200207302113.g6ULD0N21213@odiug.zope.com>
Message-ID: <039701c23838$26dbfab0$3da48490@neil>

Guido van Rossum:

> If the buffer is relatively small, copying the data an extra time
> shouldn't be a problem, and you can use the old API.
>
> If the buffer is huge, you probably shouldn't want to move the buffer
> around in memory anyway.

   Even large (or huge) buffers may need extension (inserting text in
Scintilla, adding a frame to a movie), leading to a reallocation and thus a
move.

   Neil




From nhodgson@bigpond.net.au  Wed Jul 31 03:01:25 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Wed, 31 Jul 2002 12:01:25 +1000
Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface
References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook>
Message-ID: <039301c23838$24a21040$3da48490@neil>

Thomas Heller:

> Abstract
>
>     This PEP proposes an extension to the buffer interface called the
>     'fixed buffer interface'.

   I'd like to see the purpose of the interface defined here rather than
rely upon a reference to an email which talks about two buffer entities, the
API and the object. Reading the email produces a purpose that could be used
here:

[the Buffer API is] intended to allow efficient
binary I/O from and (in some cases) to large objects that have a
relatively well-understood underlying memory representation

   Neil



From nhodgson@bigpond.net.au  Wed Jul 31 03:12:31 2002
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Wed, 31 Jul 2002 12:12:31 +1000
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020730061016.32588.qmail@web40103.mail.yahoo.com> <005d01c237ae$4b2f6670$3da48490@neil> <025a01c237e3$82eb7c90$e000a8c0@thomasnotebook>
Message-ID: <039401c23838$25ad8cd0$3da48490@neil>

Thomas Heller:

> In plain text:
> Provide a method which returns a 'view' into your object's
> buffer after locking the object. The view holds a reference
> to the object; the object is unlocked and decref'd when the
> view is destroyed.

   Yes, this handles the situation. However I see some problems here:
1 Explicit resource release, such as closing files, is easier to understand
and debug than implicit ref-count exhaustion.
2 On platforms such as .NET and the JVM, the view object will live for an
indeterminate time, prohibiting resizes until the VM decides to garbage
collect. While the JVM can not return pointers, and so may seem to not be a
candidate for this interface, it can return array references.
3 More complex implementation requiring a secondary view object.

   Neil




From neal@metaslash.com  Wed Jul 31 03:19:08 2002
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 30 Jul 2002 22:19:08 -0400
Subject: [Python-Dev] Valgrinding Python
References: <LNBBLJKPBEHFEDALKOLCIEMOAIAB.tim.one@comcast.net>
Message-ID: <3D47491C.B0E9E165@metaslash.com>

Tim Peters wrote:

> pymalloc does read uninitialized memory, and routinely, as explained in the
> msg you're replying to.  If that occurs outside code generated for the
> ADDRESS_IN_RANGE macro, though, it may be a real problem (inside code
> generated by that macro, reading uninitialized memory is-- curiously
> enough! --necessary for proper operation).

This is good news.  I changed ADDRESS_IN_RANGE to a function, 
then suppressed it.  There were no other uninitialized memory reads.

Valgrind does report a bunch of problems with pthreads, but
these are likely valgrind's fault.  There are some complaints
about memory leaks, but these seem to occur only
when spawning/threading.  The leaks are small and short lived.

Neal


From barry@python.org  Wed Jul 31 03:22:12 2002
From: barry@python.org (Barry A. Warsaw)
Date: Tue, 30 Jul 2002 22:22:12 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
References: <15687.13314.271722.779762@anthem.wooz.org>
 <200207310144.g6V1iNgZ019135@kuku.cosc.canterbury.ac.nz>
Message-ID: <15687.18900.871205.963521@anthem.wooz.org>

>>>>> "GE" == Greg Ewing <greg@cosc.canterbury.ac.nz> writes:

    GE> Barry:

    >> GE> Why not have *two* fields in the PEP, one for the real
    >> GE> name, and the other for an email address?
    >> I dunno, that seems like overkill.

    GE> It would certainly put an end to this argument, though!

What argument? :)

-Barry


From aahz@pythoncraft.com  Wed Jul 31 04:36:39 2002
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 30 Jul 2002 23:36:39 -0400
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <15687.18900.871205.963521@anthem.wooz.org>
References: <15687.13314.271722.779762@anthem.wooz.org> <200207310144.g6V1iNgZ019135@kuku.cosc.canterbury.ac.nz> <15687.18900.871205.963521@anthem.wooz.org>
Message-ID: <20020731033639.GB14993@panix.com>

On Tue, Jul 30, 2002, Barry A. Warsaw wrote:
> 
> >>>>> "GE" == Greg Ewing <greg@cosc.canterbury.ac.nz> writes:
> 
>     GE> Barry:
> 
>     >> GE> Why not have *two* fields in the PEP, one for the real
>     >> GE> name, and the other for an email address?
>     >> I dunno, that seems like overkill.
> 
>     GE> It would certainly put an end to this argument, though!
> 
> What argument? :)

You blithering idiot, you ought to be smacked with a fish.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/


From greg@cosc.canterbury.ac.nz  Wed Jul 31 05:18:35 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jul 2002 16:18:35 +1200 (NZST)
Subject: [Python-Dev] Re: PEP 1, PEP Purpose and Guidelines
In-Reply-To: <20020731033639.GB14993@panix.com>
Message-ID: <200207310418.g6V4IZVf019187@kuku.cosc.canterbury.ac.nz>

Aahz <aahz@pythoncraft.com>:

> >     GE> It would certainly put an end to this argument, though!
> > 
> > What argument? :)
> 
> You blithering idiot, you ought to be smacked with a fish.

No, that's abuse. Arguments are next door...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From mhammond@skippinet.com.au  Wed Jul 31 06:28:44 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 31 Jul 2002 15:28:44 +1000
Subject: [Python-Dev] seeing off SET_LINENO
In-Reply-To: <200207310003.g6V03tjm018993@kuku.cosc.canterbury.ac.nz>
Message-ID: <LCEPIIGDJPKCOIHOBJEPCEHGGBAA.mhammond@skippinet.com.au>

> Michael Hudson <mwh@python.net>:
>
> > My patch means the debugger doesn't stop
> > on the "def f():" line -- unsurprisingly, given that no execution ever
> > takes place on that line.

[Greg]
> If there is no code there, there shouldn't be any
> need to stop there, should there?

[Barry in a different message]
> I can't decide whether it would be good to stop on the def or not.
> Not doing so makes pdb act more like gdb, which also only stops on the
> first executable line, so maybe that's a good thing.

IMO, the Python debugger "interface" should include function entry.  The
debugger UI (in this case pdb, but any other debugger) may choose not to
break there, but the debugger itself may be able to implement some useful
things by having the hook.
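
As a sketch of what a function-entry hook looks like from Python, the
'call' event delivered to a sys.settrace() trace function fires once per new
frame, before any line of the body runs. The helper names below are made up
for illustration.

```python
import sys

def record_calls(func, *args):
    """Run func(*args) and return the names of Python functions entered."""
    entered = []

    def tracer(frame, event, arg):
        if event == "call":               # function entry
            entered.append(frame.f_code.co_name)
        return None                       # no per-line tracing for this frame

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)                # always remove the hook
    return entered

def inner():
    return 1

def outer():
    return inner() + 1

calls = record_calls(outer)
```

A debugger front end is free to ignore these events, but a profiler or
call-graph tool built on the same interface would rely on them.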

Mark.



From xscottg@yahoo.com  Wed Jul 31 07:29:50 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 30 Jul 2002 23:29:50 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <005d01c237ae$4b2f6670$3da48490@neil>
Message-ID: <20020731062950.59376.qmail@web40105.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> 
>    Since Scintilla is a component within a user interface, it shares this
> responsibility with the container application with the application being
> the main determinant. If I was writing a Windows-specific application
that
> used Scintilla, and I wanted to use Asynchronous I/O then my preferred
> technique would be to change the message processing loop to leave the UI
> input messages in the queue until the I/O had completed.
>    Once the I/O had completed then the message loop would change back to
> processing all messages which would allow the banked up input to come
> through.
>

Cool.  This is what I was looking for.  It's a tad complicated, but it
makes a bit of sense.

Is there anything in here that can't be done if you only had the simple (no
locking) version of the fixed buffer interface?

> 
> > A single lock interface can be implemented over an object without any
> > locking.  Have the lockable object return simple "fixed buffer objects"
> > with a limited lifespan.
> 
>    This returns to the possibility of indeterminate lifespan as mentioned
> earlier in the thread.
> 

Not if you add an explicit release() method.  Just like the file object has
an explicit close() method.  Your object with the locking smarts could just
return "snapshot" views with an explicit release() method on them.
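
A minimal sketch of that arrangement, assuming the application object keeps
the locking smarts and hands out short-lived views. Snapshot, TextDocument,
and their methods are hypothetical names, not part of any proposed API.

```python
class Snapshot:
    """Short-lived view with an explicit release(), like file.close()."""

    def __init__(self, owner):
        self._owner = owner
        self._released = False
        owner._pinned += 1

    @property
    def data(self):
        if self._released:
            raise ValueError("snapshot already released")
        return self._owner._text

    def release(self):
        if not self._released:          # safe to call more than once
            self._released = True
            self._owner._pinned -= 1

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.release()


class TextDocument:
    """Application object that refuses edits while a snapshot is active."""

    def __init__(self, text=""):
        self._text = bytearray(text.encode("ascii"))
        self._pinned = 0

    def snapshot(self):
        return Snapshot(self)

    def insert(self, position, text):
        if self._pinned:
            raise RuntimeError("document is pinned by an active snapshot")
        self._text[position:position] = text.encode("ascii")

doc = TextDocument("hello")
with doc.snapshot() as view:
    seen = bytes(view.data)
    try:
        doc.insert(5, "!")
        blocked = False
    except RuntimeError:
        blocked = True
doc.insert(5, " world")                 # allowed again after release()
```

Because release() is explicit, the lifetime of the pin does not depend on
when a garbage collector gets around to destroying the view.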

>
> > At which point I wonder what using asynchronous I/O achieved since the
> > resize operation had to wait synchronously for the I/O to complete. 
> > This also sounds suspiciously like blocking the resize thread, but I
> > won't argue that point.
> 
>    There may be other tasks that the application can perform while
> waiting for the I/O to complete, such as displaying, styling or line-
> wrapping whatever text has already arrived (assuming that there are some
> facilities for discovering this) or performing similar tasks for other
> windows.
>

All good points.  Thank you for indulging me.  Sorry to be such a PITA.


Cheers,
    -Scott






__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com


From xscottg@yahoo.com  Wed Jul 31 07:29:59 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 30 Jul 2002 23:29:59 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <025a01c237e3$82eb7c90$e000a8c0@thomasnotebook>
Message-ID: <20020731062959.59382.qmail@web40105.mail.yahoo.com>

--- Thomas Heller <thomas.heller@ion-tof.com> wrote:
> 
> In plain text:
> Provide a method which returns a 'view' into your object's
> buffer after locking the object. The view holds a reference
> to the object; the object is unlocked and decref'd when the
> view is destroyed.
>

Exactly.  This is just like putting an explicit close() on the file object.


Cheers,
    -Scott








From xscottg@yahoo.com  Wed Jul 31 07:30:58 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 30 Jul 2002 23:30:58 -0700 (PDT)
Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface
In-Reply-To: <039301c23838$24a21040$3da48490@neil>
Message-ID: <20020731063058.76595.qmail@web40103.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> 
>    I'd like to see the purpose of the interface defined here rather than
> rely upon a reference to an email which talks about two buffer entities,
> the API and the object. Reading the email produces a purpose that could 
> be used here:
> 
> [the Buffer API is] intended to allow efficient
> binary I/O from and (in some cases) to large objects that have a
> relatively well-understood underlying memory representation
> 

It's not just for I/O.  In addition to I/O, I intend to use it for
numerical calculations that can be run independently of the GIL.


Cheers,
    -Scott










From xscottg@yahoo.com  Wed Jul 31 07:30:55 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 30 Jul 2002 23:30:55 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <039401c23838$25ad8cd0$3da48490@neil>
Message-ID: <20020731063055.74410.qmail@web40110.mail.yahoo.com>

--- Neil Hodgson <nhodgson@bigpond.net.au> wrote:
> Thomas Heller:
> 
> > In plain text:
> > Provide a method which returns a 'view' into your object's
> > buffer after locking the object. The view holds a reference
> > to the object; the object is unlocked and decref'd when the
> > view is destroyed.
> 
>    Yes, this handles the situation. However I see some problems here:
> 1 Explicit resource release, such as closing files, is easier to
> understand and debug than implicit ref-count exhaustion.
>

So add an explicit release() method to your object.  Just because it
supports the "Fixed Buffer API" doesn't mean you can't add other methods to
it.

>
> 2 On platforms such as .NET and the JVM, the view object will live for an
> indeterminate time, prohibiting resizes until the VM decides to garbage
> collect. While the JVM can not return pointers, and so may seem to not be
> a candidate for this interface, it can return array references.
>

This is solved with the explicit release() method above.  Just like files
solve this problem with an explicit close() method.

>
> 3 More complex implementation requiring a secondary view object.
>

It's also a more complex problem that you're trying to solve.  Putting the
complexity on the common, simple, cases may not be appropriate when the
complex cases are few and far between.


Cheers,
    -Scott








From xscottg@yahoo.com  Wed Jul 31 07:31:13 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Tue, 30 Jul 2002 23:31:13 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207302334.g6UNYTZ7018964@kuku.cosc.canterbury.ac.nz>
Message-ID: <20020731063113.74481.qmail@web40110.mail.yahoo.com>

--- Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
> Scott Gilbert <xscottg@yahoo.com>:
> 
> > We haven't seen a semi-thorough use case where the locking behavior is
> > beneficial yet. ... If there is no realizable benefit to the
> > acquire/release semantics of the new interface, then this is just extra
> > burden too.
> 
> The proposer of the original safe-buffer interface claimed to have a
> use case where the existing buffer interface is not safe enough,
> involving asynchronous I/O. I've been basing my comments on the
> assumption that he does actually have a need for it.
> 

I believe Thomas Heller's needs were met without making locking part of the
interface, but that he was willing to bend to please you and Neil.  His
original proposal did not include any notion of locking.  Nor does his
current one, since Guido has taken a stand on this issue.


> 
> So I think it's worth putting in some thought and getting it as
> right as we can from the beginning.
> 

Absolutely.  I just wanted to make sure that there is at least one sensible
use case before adding the complexity.  Moreover, if the sensible use cases
for locking are few and far between, then I'm still inclined to leave it
out since you can add the locking semantics at a different level.

It looks like Neil has sufficiently defined an example where it's useful. 
His use case is a bit complicated though, and I think he could get every
bit of that functionality by putting the locking in a smarter object
tailored for his application, and working with temporary "snapshot" objects
with an explicit release() method.  

What if Neil decides he needs Reader/Writer locks?  This is completely
justifiable too, since multiple threads can read an object without
interfering, but only one should be writing it.  We shouldn't arbitrarily
add complexity for the exceptional cases.


>
> > I'm concerned that this is very much like the segment count features
> > of the current PyBufferProcs.  It was apparently designed for more
> > generality, and while no one uses it, everyone has to check that the
> > segment count is one or raise an exception.
> 
> It's not as bad as that! My version of the proposal would impose *no*
> burden on implementations that did not require locking, for the
> following reasons:
>

Your use of the word *no* is different than mine.  :-)  I could similarly
claim that the segment count puts no burden on implementations that don't
need it.


> 
> 1) Locking is an optional task performed by the getxxxbuffer
> routines. Objects which do not require locking just don't
> do it.
> 
> 2) For objects not requiring locking, the releasebuffer
> operation is a no-op. Such an object can simply not
> implement this routine, and the type machinery can fill
> it in with a stub.
> 

I believe it will be a no-op in enough places that extension writers will
do it wrong without even knowing.


>
> > The extension releases the GIL so that another
> > thread can work on the array object.
> 
> Hey, whoa right there! If you have two threads accessing this array
> object simultaneously, you should be using a mutex or semaphore or
> something to coordinate them. As I pointed out before, thread
> synchronisation is outside the scope of my proposal.
> 

This is exactly Neil's use case.  He's got two threads reading it
simultaneously.  One thread (not really a thread, but the asynchronous I/O
operation) is writing to disk, and the other thread is keeping the user
interface updated.  There is no problem until the user tries to enter text
(which forces a resize) before the asynchronous I/O is complete.  Neil has
a solution for this, but I think it's less than typical.


>
> The only purpose of the locking, in my proposal, is to ensure that an
> exception occurs instead of a crash if the programmer screws up and
> tries to resize an object whose internals are being messed with. It's
> up to the programmer to do whatever is necessary to ensure that he
> doesn't do that.
> 
> > If extend() is called while thread 1 has the array locked, it can:
> > 
> >    A) raise an exception or return an error
> 
> Yes. (Raise an exception.)
> 

Which exception?  Would you introduce a standard exception that should be
raised when the user tries to do an operation that currently isn't allowed
because the buffer is locked?
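
For comparison, here is how this question looks in present-day Python 3,
where the buffer protocol adopted in later versions refuses a resize while
a view is held.  This is a sketch of current behaviour, not of the
interface discussed in this thread: bytearray, memoryview and BufferError
are the modern names, and none of them existed in 2002.

```python
# Present-day Python 3 behaviour; not the API proposed in this thread.
data = bytearray(b"hello")
view = memoryview(data)      # "acquire": the buffer is now exported

try:
    data.extend(b" world")   # a resize attempted while the view is held
except BufferError as exc:
    resize_error = exc       # the resize is refused instead of crashing
else:
    resize_error = None

view.release()               # "release": no exports remain
data.extend(b" world")       # the resize now succeeds
```

The point of the sketch is only that a dedicated exception type is one
workable answer; whether the 2002 proposal should have had one is left open
here.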



Truthfully, now that Neil has given his explanation, I'm beginning to bend
on this a bit.  You're right in that it's not that much burden (however,
it's more than *no* burden :-), and someone might find it useful.  I still
think it's going to be pretty uncommon, and I still believe the locking can
be added on top of the simpler interface as needed.


Cheers,
    -Scott









From mhammond@skippinet.com.au  Wed Jul 31 07:43:50 2002
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 31 Jul 2002 16:43:50 +1000
Subject: [Python-Dev] Get fame and fortune from mindless editing
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>

An offer too good to refuse ;)

We recently deprecated the DL_EXPORT and DL_IMPORT macros, replacing them
with purpose oriented macros.  In an effort to cleanup the source, it would
be good to remove all such macros from the Python source tree.

I have already made a start on this, and only mindless editing remains.
What needs to be done is:

* Modules/*.c - all 'DL_EXPORT(void)' references (which are all module init
functions) are to be replaced with 'PyMODINIT_FUNC' - note no parens, and
note no return type is specified.

Eg, the following patch would be most suitable <wink>:
Index: timemodule.c
...
@@ -621,5 +621,5 @@


-DL_EXPORT(void)
+PyMODINIT_FUNC
 inittime(void)
 {

* Include/*.h - all public declarations need to be changed. All
'DL_IMPORT(type)' references, *including* any leading 'extern' declaration,
should be changed to either PyAPI_FUNC (for functions) or PyAPI_DATA (for
data)

For example, the following 3 lines (from various .h files):
extern DL_IMPORT(PyTypeObject) PyUnicode_Type;
extern DL_IMPORT(PyObject*) PyUnicode_FromUnicode(...);
DL_IMPORT(void) PySys_SetArgv(int, char **);

would be changed to:
PyAPI_DATA(PyTypeObject) PyUnicode_Type;
PyAPI_FUNC(PyObject*) PyUnicode_FromUnicode(...);
PyAPI_FUNC(void) PySys_SetArgv(int, char **);

Note all 'extern' declarations were removed, and PyUnicode_Type is data (and
declared as such) while the other 2 are functions.
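
The two rules above are regular enough to script.  The following is a
minimal sketch in Python 3, assuming each declaration starts at the
beginning of a line and that the type inside DL_IMPORT(...) contains no
closing parenthesis (so function-pointer types would need hand editing).
The function names convert_header and convert_module are hypothetical.

```python
import re

# Optional leading 'extern', then DL_IMPORT(type), then the declared name.
# If the name is followed by "(", the declaration is a function; else data.
_DL_IMPORT = re.compile(
    r"^(?:extern\s+)?DL_IMPORT\(([^)]*)\)(\s*\w+\s*)(\()?", re.M)

def convert_header(text):
    """Rewrite DL_IMPORT declarations to PyAPI_FUNC / PyAPI_DATA."""
    def repl(m):
        macro = "PyAPI_FUNC" if m.group(3) else "PyAPI_DATA"
        return "%s(%s)%s%s" % (macro, m.group(1), m.group(2), m.group(3) or "")
    return _DL_IMPORT.sub(repl, text)

def convert_module(text):
    """Rewrite module init functions: DL_EXPORT(void) -> PyMODINIT_FUNC."""
    return re.sub(r"^DL_EXPORT\(void\)", "PyMODINIT_FUNC", text, flags=re.M)

header = (
    "extern DL_IMPORT(PyTypeObject) PyUnicode_Type;\n"
    "extern DL_IMPORT(PyObject*) PyUnicode_FromUnicode(...);\n"
    "DL_IMPORT(void) PySys_SetArgv(int, char **);\n"
)
converted = convert_header(header)
```

Any output from a script like this would still need review by eye before
being checked in.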

This is all mindless editing, suitable for a day when the brain doesn't
quite seem to be firing!  The fame comes from getting your name splashed all
over the CVS logs.  The fortune... well, not all valuable things can be
measured in dollars <wink>.

Thanks,

Mark.



From kalle@lysator.liu.se  Wed Jul 31 09:19:57 2002
From: kalle@lysator.liu.se (Kalle Svensson)
Date: Wed, 31 Jul 2002 10:19:57 +0200
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
Message-ID: <20020731081957.GB1161@i92.ryd.student.liu.se>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[Mark Hammond]
> An offer too good to refuse ;)

Right.  http://python.org/sf/588982

Since this is my first post here, I'll introduce myself.  I'm a first
year student in computer engineering at Linköping University, Sweden.
I've been lurking here for a few months.  My primary Python interest
at the moment is the Snake Farm project.  Otherwise, I like Unix, free
software and all that usual stuff.

Peace,
  Kalle
- -- 
Kalle Svensson, http://www.juckapan.org/~kalle/
Student, root and saint in the Church of Emacs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.6 <http://mailcrypt.sourceforge.net/>

iD8DBQE9R52GdNeA1787sd0RAjVwAJ9/c4y8Tq0lqf6tUfgGeaD2DZIV3QCfQAvh
tBwRn/mmh52sFncmo3shxhg=
=6Z3v
-----END PGP SIGNATURE-----


From mwh@python.net  Wed Jul 31 09:22:22 2002
From: mwh@python.net (Michael Hudson)
Date: 31 Jul 2002 09:22:22 +0100
Subject: [Python-Dev] seeing off SET_LINENO
In-Reply-To: "Mark Hammond"'s message of "Wed, 31 Jul 2002 15:28:44 +1000"
References: <LCEPIIGDJPKCOIHOBJEPCEHGGBAA.mhammond@skippinet.com.au>
Message-ID: <2mu1mgfgsh.fsf@starship.python.net>

"Mark Hammond" <mhammond@skippinet.com.au> writes:

> > Michael Hudson <mwh@python.net>:
> >
> > > My patch means the debugger doesn't stop
> > > on the "def f():" line -- unsurprisingly, given that no execution ever
> > > takes place on that line.
> 
> [Greg]
> > If there is no code there, there shouldn't be any
> > need to stop there, should there?
> 
> [Barry in a different message]
> > I can't decide whether it would be good to stop on the def or not.
> > Not doing so makes pdb act more like gdb, which also only stops on the
> > first executable line, so maybe that's a good thing.
> 
> IMO, the Python debugger "interface" should include function entry.  

There goes the time machine: it does.  I just think everyone ignores
'call' messages because they're a bit redundant today (because of the
matter under discussion).

> The debugger UI (in this case pdb, but any other debugger) may
> choose not to break there, but the debugger itself may be able to
> implement some useful things by having the hook.

bdb.Bdb.user_call(), I believe.
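
A minimal sketch of that hook, written for present-day Python 3 rather than
the 2.2-era module: a Bdb subclass that overrides user_call() to record
function entry and does nothing else.  The class name CallLogger and the
functions inner and outer are hypothetical.

```python
import bdb

class CallLogger(bdb.Bdb):
    """Hypothetical debugger that only records function entry."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def user_call(self, frame, argument_list):
        # Invoked by Bdb on a 'call' event where the debugger might stop.
        self.calls.append(frame.f_code.co_name)

def inner():
    return 1

def outer():
    return inner() + 1

logger = CallLogger()
result = logger.runcall(outer)   # runs outer() under the trace function
```

After the run, logger.calls holds the names of the functions entered
beneath the traced call.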

Cheers,
M.

-- 
  One of the great skills in using any language is knowing what not
  to use, what not to say.  ... There's that simplicity thing again.
                                                       -- Ron Jeffries


From akim@epita.fr  Wed Jul 31 10:11:11 2002
From: akim@epita.fr (Akim Demaille)
Date: 31 Jul 2002 11:11:11 +0200
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <oqbs8pi2ct.fsf@titan.progiciels-bpi.ca>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
 <200207301622.g6UGMBl17143@odiug.zope.com>
 <oqbs8pi2ct.fsf@titan.progiciels-bpi.ca>
Message-ID: <mv44regqn2o.fsf@nostromo.lrde.epita.fr>

>>>>> "François" == François Pinard <pinard@iro.umontreal.ca> writes:

François> [Guido van Rossum]

Hi Guido, Hi Francois !

>> Since we don't use this idiom, we can safely remove the
>> -DHAVE_CONFIG_H (if we can find where it is set).

>> I looked.  It's generated by AC_OUTPUT.  I don't think I can get
>> rid of it.  So never mind. :-)

François> Maybe AC_OUTPUT, or macros called by AC_OUTPUT, can be
François> overridden.  If this is not easy to do, you might want to
François> discuss the matter with Akim, Cc:ed.  Maybe he could tear
François> down AC_OUTPUT in parts so the overriding gets easier?

François> I know my friend Akim as good, helping and nice fellow!
François> Don't fear him! :-)

I'm not sure I completely understand the question here: if
HAVE_CONFIG_H is specified, it means config.h is created.  So if you
use a config.h, why does it matter not to define HAVE_CONFIG_H?


From barry@python.org  Wed Jul 31 13:15:49 2002
From: barry@python.org (Barry A. Warsaw)
Date: Wed, 31 Jul 2002 08:15:49 -0400
Subject: [Python-Dev] seeing off SET_LINENO
References: <200207310003.g6V03tjm018993@kuku.cosc.canterbury.ac.nz>
 <LCEPIIGDJPKCOIHOBJEPCEHGGBAA.mhammond@skippinet.com.au>
Message-ID: <15687.54517.580299.350054@anthem.wooz.org>

>>>>> "MH" == Mark Hammond <mhammond@skippinet.com.au> writes:

    MH> [Barry in a different message]
    >> I can't decide whether it would be good to stop on the def or
    >> not.  Not doing so makes pdb act more like gdb, which also only
    >> stops on the first executable line, so maybe that's a good
    >> thing.

    MH> IMO, the Python debugger "interface" should include function
    MH> entry.  The debugger UI (in this case pdb, but any other
    MH> debugger) may choose not to break there, but the debugger
    MH> itself may be able to implement some useful things by having
    MH> the hook.

Good point.
-Barry


From thomas.heller@ion-tof.com  Wed Jul 31 13:32:25 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 31 Jul 2002 14:32:25 +0200
Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface
References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook>
Message-ID: <0ac201c2388e$53c0f020$e000a8c0@thomasnotebook>

> Additional Notes/Comments
> 
>     Python strings, Unicode strings, mmap objects, and maybe other
>     types would expose the fixed buffer interface, but the array type
>     would *not*, because its memory block may be reallocated during
>     its lifetime.
> 
Unfortunately it's impossible to implement the fixed buffer interface
on mmap objects - the memory mapped file can be closed at any time.
This would leave the pointers unusable.

It seems this is another use case for locking - if we want it.

Thomas




From pinard@iro.umontreal.ca  Wed Jul 31 13:41:02 2002
From: pinard@iro.umontreal.ca (=?iso-8859-1?q?Fran=E7ois?= Pinard)
Date: 31 Jul 2002 08:41:02 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: <mv44regqn2o.fsf@nostromo.lrde.epita.fr>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net>
 <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de>
 <oqptx6qfhl.fsf@titan.progiciels-bpi.ca>
 <200207301539.g6UFdUS09930@odiug.zope.com>
 <200207301622.g6UGMBl17143@odiug.zope.com>
 <oqbs8pi2ct.fsf@titan.progiciels-bpi.ca>
 <mv44regqn2o.fsf@nostromo.lrde.epita.fr>
Message-ID: <oqvg6w5au9.fsf@titan.progiciels-bpi.ca>

[Akim Demaille]

> I'm not sure I completely understand the question here: if HAVE_CONFIG_H
> is specified, it means config.h is created.  So if you use a config.h,
> why does it matter not to define HAVE_CONFIG_H?

Hi, Akim.  I hope life is still good to you! :-)

In the beginnings of Autoconf, the `config.h' file did not exist.
David MacKenzie added it as a way to reduce the `make' output clutter.
Nowadays, I suspect almost all packages of at least moderate size use it.

Our traditional `lib/' modules have to work in many packages, whether
`config.h' has been created or not, this being decided on a per package
basis, and that is why there is a conditional inclusion of `config.h' in
each of these `lib/' modules.  It took a good while before we got stabilised
on the exact stanza of this inclusion (I especially remember the massive
unilateral changes by Roland McGrath introducing the BROKEN_BROKET define,
or something like that, and all the doing it later took to clean this out.)

Python (the distribution, which is what is in question here) does not
use any of our `lib/' things, it is not going to use them, and it is not
going to provide new such modules, so the distribution includes `config.h'
everywhere, by permanent choice, without any need to use `HAVE_CONFIG_H'
to decide if that inclusion is needed or not.  So, even `-DHAVE_CONFIG_H'
is useless `make' clutter in this case, and that's why the Python packagers
wanted to get rid of it.

In fact, in practice `-DHAVE_CONFIG_H' is only needed for packages using
those common `lib/' modules, but many packages do not.  Now that Autoconf
is used with projects that have a life outside GNU, this is less necessary.
Guido found, and got me to remember, that `@DEFS@' is the culprit: people
just do not have to use it in their hand-crafted Makefiles, which is the
case for Python.  For away-from-GNU packages using Automake, some Automake
option might exist so `@DEFS@' does not get generated?  The only goal here
is to get a cleaner `make' output.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard


From skip@pobox.com  Wed Jul 31 14:39:31 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 08:39:31 -0500
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
Message-ID: <15687.59539.842887.296794@localhost.localdomain>

    Mark> I have already made a start on this, and only mindless editing
    Mark> remains.

"mindless editing" ==> sed script or Emacs macros... ;-)

Skip


From skip@pobox.com  Wed Jul 31 14:52:52 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 08:52:52 -0500
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
Message-ID: <15687.60340.880139.545471@localhost.localdomain>

    Mark> We recently deprecated the DL_EXPORT and DL_IMPORT macros,
    Mark> replacing them with purpose oriented macros.  In an effort to
    Mark> cleanup the source, it would be good to remove all such macros
    Mark> from the Python source tree.

I modified the Modules/*.c and Include/*.h files.  Is there a patch/bug
number I should attach the context diffs to for review?

Skip




From skip@pobox.com  Wed Jul 31 14:59:09 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 08:59:09 -0500
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
Message-ID: <15687.60717.533154.63118@localhost.localdomain>

What about the references to DL_IMPORT/DL_EXPORT in Include/Python.h and
the two #ifndef DL_EXPORT lines in Modules/{cPickle.c,cStringIO.c}?

Skip


From guido@python.org  Wed Jul 31 15:18:27 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 10:18:27 -0400
Subject: [Python-Dev] Re: HAVE_CONFIG_H
In-Reply-To: Your message of "Wed, 31 Jul 2002 11:11:11 +0200."
 <mv44regqn2o.fsf@nostromo.lrde.epita.fr>
References: <200207291930.g6TJUYi05460@pcp02138704pcs.reston01.va.comcast.net> <m3fzy2kuye.fsf@mira.informatik.hu-berlin.de> <oqptx6qfhl.fsf@titan.progiciels-bpi.ca> <200207301539.g6UFdUS09930@odiug.zope.com> <200207301622.g6UGMBl17143@odiug.zope.com> <oqbs8pi2ct.fsf@titan.progiciels-bpi.ca>
 <mv44regqn2o.fsf@nostromo.lrde.epita.fr>
Message-ID: <200207311418.g6VEIRW32518@odiug.zope.com>

> I'm not sure I completely understand the question here: if
> HAVE_CONFIG_H is specified, it means config.h is created.  So if you
> use a config.h, why does it matter not to define HAVE_CONFIG_H?

It's just clutter on the command line that we don't need.

But never mind, I found a way to lose it already.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jul 31 15:36:37 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 10:36:37 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Tue, 30 Jul 2002 23:29:50 PDT."
 <20020731062950.59376.qmail@web40105.mail.yahoo.com>
References: <20020731062950.59376.qmail@web40105.mail.yahoo.com>
Message-ID: <200207311436.g6VEabH32668@odiug.zope.com>

Based on the example of mmap (which can be closed at any time) I
agree that the fixed buffer interface needs to have "get"
and "release" methods (please pick better names).  Maybe Thomas can
update PEP 298.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed Jul 31 16:16:20 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 10:16:20 -0500
Subject: [Python-Dev] imaplib test failure
Message-ID: <15687.65348.589402.540281@localhost.localdomain>

Anyone else seeing this?  I doubt it's related to the DL_EXPORT/DL_IMPORT
changes I was just testing, and my local copy of Lib/imaplib.py matches
what's in CVS.

Skip

test test_imaplib produced unexpected output:
**********************************************************************
*** lines 2-3 of actual output doesn't appear in expected output after line 1:
+ incorrect result when converting (2033, 5, 18, 3, 33, 20, 2, 138, 0)
+ incorrect result when converting '"18-May-2033 13:33:20 +1000"'
**********************************************************************


From jeremy@alum.mit.edu  Wed Jul 31 15:56:40 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 31 Jul 2002 10:56:40 -0400
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
Message-ID: <15687.64168.403730.225372@slothrop.zope.com>

>>>>> "MH" == Mark Hammond <mhammond@skippinet.com.au> writes:

  MH> An offer too good to refuse ;) We recently deprecated the
  MH> DL_EXPORT and DL_IMPORT macros, replacing them with purpose
  MH> oriented macros.  In an effort to cleanup the source, it would
  MH> be good to remove all such macros from the Python source tree.

Would it make any sense to backport the new macros to the 2.2 branch?
It might ease the life of extension writers who want their code to
work with either version.  The practical problem, however, is that
their code would only work with a to-be-released 2.2.2.

Jeremy



From barry@python.org  Wed Jul 31 16:23:29 2002
From: barry@python.org (Barry A. Warsaw)
Date: Wed, 31 Jul 2002 11:23:29 -0400
Subject: [Python-Dev] imaplib test failure
References: <15687.65348.589402.540281@localhost.localdomain>
Message-ID: <15688.241.352958.223156@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    SM> Anyone else seeing this?  I doubt it's related to the
    SM> DL_EXPORT/DL_IMPORT changes I was just testing, and my local
    SM> copy of Lib/imaplib.py matches what's in CVS.

Yes, everyone is:

http://mail.python.org/pipermail/python-dev/2002-July/027056.html

but no one's stepped up to the plate yet, including pierslauder <1.4
wink>.

-Barry


From jeremy@alum.mit.edu  Wed Jul 31 16:25:37 2002
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 31 Jul 2002 11:25:37 -0400
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <200207311525.g6VFPRf00831@odiug.zope.com>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
 <15687.64168.403730.225372@slothrop.zope.com>
 <200207311525.g6VFPRf00831@odiug.zope.com>
Message-ID: <15688.369.568227.177521@slothrop.zope.com>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  MH> An offer too good to refuse ;) We recently deprecated the
  MH> DL_EXPORT and DL_IMPORT macros, replacing them with purpose
  MH> oriented macros.  In an effort to cleanup the source, it would
  MH> be good to remove all such macros from the Python source tree.
  >>
  >> Would it make any sense to backport the new macros to the 2.2
  >> branch?  It might ease the life of extension writers who want
  >> their code to work with either version.  The practical problem,
  >> however, is that their code would only work with a
  >> to-be-released 2.2.2.

  GvR> Maybe both the old and the new macros could be supported by
  GvR> 2.2.2?

Yes.  That's my suggestion.

Jeremy




From xscottg@yahoo.com  Wed Jul 31 16:28:32 2002
From: xscottg@yahoo.com (Scott Gilbert)
Date: Wed, 31 Jul 2002 08:28:32 -0700 (PDT)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <200207311436.g6VEabH32668@odiug.zope.com>
Message-ID: <20020731152832.99003.qmail@web40106.mail.yahoo.com>

--- Guido van Rossum <guido@python.org> wrote:
>
> Based on the example of mmap (which can be closed at any time) I
> agree that the fixed buffer interface needs to have "get"
> and "release" methods (please pick better names).  Maybe Thomas can
> update PEP 298.
> 

Wow, the tides have turned.  Fair enough.

I think Neil put forth the names "acquire" and "release".  So how about

        typedef struct {
                getreadbufferproc bf_getreadbuffer;
                getwritebufferproc bf_getwritebuffer;
                getsegcountproc bf_getsegcount;
                getcharbufferproc bf_getcharbuffer;
                /* fixed buffer interface functions */
                acquirereadbufferproc bf_acquirereadbuffer;
                acquirewritebufferproc bf_acquirewritebuffer;
                releasebufferproc bf_releasebuffer;
        } PyBufferProcs;

Whatever the actual names, should there be a bf_releasereadbuffer and
bf_releasewritebuffer?  Or just the one bf_releasebuffer?  Could also just
have one acquire function that indicates whether it is read-write or
read-only via a return parameter.  Is write-only ever useful?




Cheers,
    -Scott




From guido@python.org  Wed Jul 31 16:25:27 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 11:25:27 -0400
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: Your message of "Wed, 31 Jul 2002 10:56:40 EDT."
 <15687.64168.403730.225372@slothrop.zope.com>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
 <15687.64168.403730.225372@slothrop.zope.com>
Message-ID: <200207311525.g6VFPRf00831@odiug.zope.com>

>   MH> An offer too good to refuse ;) We recently deprecated the
>   MH> DL_EXPORT and DL_IMPORT macros, replacing them with purpose
>   MH> oriented macros.  In an effort to cleanup the source, it would
>   MH> be good to remove all such macros from the Python source tree.
> 
> Would it make any sense to backport the new macros to the 2.2 branch?
> It might ease the life of extension writers who want their code to
> work with either version.  The practical problem, however, is that
> their code would only work with a to-be-released 2.2.2.

Maybe both the old and the new macros could be supported by 2.2.2?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jul 31 16:37:07 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 11:37:07 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Wed, 31 Jul 2002 08:28:32 PDT."
 <20020731152832.99003.qmail@web40106.mail.yahoo.com>
References: <20020731152832.99003.qmail@web40106.mail.yahoo.com>
Message-ID: <200207311537.g6VFb7r01081@odiug.zope.com>

> I think Neil put forth the names "acquire" and "release".  So how about
> 
>         typedef struct {
>                 getreadbufferproc bf_getreadbuffer;
>                 getwritebufferproc bf_getwritebuffer;
>                 getsegcountproc bf_getsegcount;
>                 getcharbufferproc bf_getcharbuffer;
>                 /* fixed buffer interface functions */
>                 acquirereadbufferproc bf_acquirereadbuffer;
>                 acquirewritebufferproc bf_acquirewritebuffer;
>                 releasebufferproc bf_releasebuffer;
>         } PyBufferProcs;
> 
> Whatever the actual names, should there be a bf_releasereadbuffer and
> bf_releasewritebuffer?  Or just the one bf_releasebuffer?

Just the one.

> Could also just have one acquire function that indicates whether it
> is read-write or read-only via a return parameter.

That loses the (weak) symmetry with the existing API.

> Is write-only ever useful?

No, write implies read.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jul 31 16:47:46 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 11:47:46 -0400
Subject: [Python-Dev] What to do about the Wiki?
Message-ID: <200207311547.g6VFlk601129@odiug.zope.com>

I don't know what to do about the Moinmoin Wiki on python.org.

Lots of useful information was recently moved to the Wiki, like the
editors list and Andrew Kuchling's bookstore.

But the Wiki brought the website down twice this weekend, by growing
without bounds.  To prevent this from happening again, we've disabled
the Wiki, but that's not a solution.

Juergen Hermann, Moinmoin's author, said he fixed a few things, but
also said that Moinmoin is essentially vulnerable to "recursive wget"
(e.g. someone trying to suck up the entire Wiki by following links).
Apparently this is what brought the site down this weekend -- if I
understand correctly, an in-memory log was growing too fast.  There
are a lot of links in the Wiki, e.g. for each Wiki page there's the
page itself, the edit form, the history, various other actions, etc.

I believe that Juergen has fixed the log-growing problem.  Should we
enable the Wiki again and hope for the best?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas.heller@ion-tof.com  Wed Jul 31 16:49:20 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 31 Jul 2002 17:49:20 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020731152832.99003.qmail@web40106.mail.yahoo.com>  <200207311537.g6VFb7r01081@odiug.zope.com>
Message-ID: <0cd301c238a9$d5e3a690$e000a8c0@thomasnotebook>

> > Could also just have one acquire function that indicates whether it
> > is read-write or read-only via a return parameter.
> 
> That loses the (weak) symmetry with the existing API.
> 

There's nothing a client expecting a read/write pointer could
do with a read only pointer IMO.

> > Is write-only ever useful?
> 
> No, write implies read.

Should it be named getfixedreadwritebuffer then?

Thomas




From guido@python.org  Wed Jul 31 16:54:41 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 11:54:41 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Wed, 31 Jul 2002 17:49:20 +0200."
 <0cd301c238a9$d5e3a690$e000a8c0@thomasnotebook>
References: <20020731152832.99003.qmail@web40106.mail.yahoo.com> <200207311537.g6VFb7r01081@odiug.zope.com>
 <0cd301c238a9$d5e3a690$e000a8c0@thomasnotebook>
Message-ID: <200207311554.g6VFsfO01268@odiug.zope.com>

> > > Could also just have one acquire function that indicates whether it
> > > is read-write or read-only via a return parameter.
> > 
> > That loses the (weak) symmetry with the existing API.
> 
> There's nothing a client expecting a read/write pointer could
> do with a read only pointer IMO.

So we agree that it's a bad idea to have one function. :-)

> > > Is write-only ever useful?
> > 
> > No, write implies read.
> 
> Should it be named getfixedreadwritebuffer then?

No, the existing API also uses getwritebuffer implying read/write.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed Jul 31 16:57:08 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 10:57:08 -0500
Subject: [Python-Dev] Get fame and fortune from mindless editing
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
References: <LCEPIIGDJPKCOIHOBJEPGEHIGBAA.mhammond@skippinet.com.au>
Message-ID: <15688.2260.68645.786641@localhost.localdomain>

    Mark> * Modules/*.c - all 'DL_EXPORT(void)' references ...

    Mark> * Include/*.h - all public declarations need to be changed ...

Context diff of these changes are attached to

    http://python.org/sf/566100

Regression tests pass on my Linux box.  See my note for a couple caveats.

Skip


From thomas.heller@ion-tof.com  Wed Jul 31 16:58:05 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 31 Jul 2002 17:58:05 +0200
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
References: <20020731062950.59376.qmail@web40105.mail.yahoo.com>  <200207311436.g6VEabH32668@odiug.zope.com>
Message-ID: <0d2b01c238ab$0e892ff0$e000a8c0@thomasnotebook>

From: "Guido van Rossum" <guido@python.org>
> Based on the example of mmap (which can be closed at any time) I
> agree that the fixed buffer interface needs to have "get"
> and "release" methods (please pick better names).  Maybe Thomas can
> update PEP 298.

The consequence: mmap objects need a 'buffer lock counter',
and cannot be closed while the count is >0. Which exception
is raised then?

Or do you have something different in mind?
The lock counter would not be needed for strings and unicode...

Thomas




From guido@python.org  Wed Jul 31 17:06:13 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 12:06:13 -0400
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: Your message of "Wed, 31 Jul 2002 17:58:05 +0200."
 <0d2b01c238ab$0e892ff0$e000a8c0@thomasnotebook>
References: <20020731062950.59376.qmail@web40105.mail.yahoo.com> <200207311436.g6VEabH32668@odiug.zope.com>
 <0d2b01c238ab$0e892ff0$e000a8c0@thomasnotebook>
Message-ID: <200207311606.g6VG6Ds01363@odiug.zope.com>

> The consequence: mmap objects need a 'buffer lock counter',
> and cannot be closed while the count is >0. Which exception
> is raised then?

Pick one -- mmap.error (== EnvironmentError) seems fine to me.

Alternately, close() could set a "please close me" flag which causes
the mmap file to be closed when the last release is called.

Of course, the acquire method should raise an exception when it's
already closed.

> Or do you have something different in mind?
> The lock counter would not be needed for strings and unicode...

And the array module could have one.
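
The two close() behaviours described above can be modelled in a few lines
of Python 3.  This is a sketch under stated assumptions, not the mmap
implementation: the class CountedBuffer, its method names and the defer
flag are hypothetical, OSError stands in for mmap.error, and rejecting
acquire() while a deferred close is pending is a choice made here rather
than something settled in the thread.

```python
class CountedBuffer:
    """Hypothetical model of an object with a buffer lock counter."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._locks = 0                 # outstanding acquire() calls
        self._close_requested = False   # the "please close me" flag
        self.closed = False

    def acquire(self):
        # Acquiring a closed (or closing) object is an error.
        if self.closed or self._close_requested:
            raise ValueError("buffer is closed")
        self._locks += 1
        return self._data

    def release(self):
        if self._locks == 0:
            raise RuntimeError("release without matching acquire")
        self._locks -= 1
        # The last release performs a close that was deferred earlier.
        if self._locks == 0 and self._close_requested:
            self._really_close()

    def close(self, defer=False):
        if self._locks > 0:
            if not defer:
                # First option: refuse to close while locked.
                raise OSError("cannot close: buffer is locked")
            # Second option: remember the request and close later.
            self._close_requested = True
            return
        self._really_close()

    def _really_close(self):
        self._data = None
        self.closed = True
```

With defer=False a locked object raises on close(); with defer=True the
close happens when the last holder calls release().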

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed Jul 31 17:09:13 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 11:09:13 -0500
Subject: [Python-Dev] Re: What to do about the Wiki?
In-Reply-To: <200207311547.g6VFlk601129@odiug.zope.com>
References: <200207311547.g6VFlk601129@odiug.zope.com>
Message-ID: <15688.2985.118330.48738@localhost.localdomain>

    Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things,
    Guido> but also said that Moinmoin is essentially vulnerable to
    Guido> "recursive wget" (e.g. someone trying to suck up the entire Wiki
    Guido> by following links).  Apparently this is what brought the site
    Guido> down this weekend -- if I understand correctly, an in-memory log
    Guido> was growing too fast.

I'm a bit confused by these statements.  MoinMoin is a CGI script.  I don't
understand where "recursive wget" and "in-memory log" would come into play.
I recently fired up two Wikis on the Mojam server.  I never see any
long-running process which would suggest there's an in-memory log which
could grow without bound.  The MoinMoin package does generate HTTP
redirects, but while they might coax wget into firing off another request,
it should be handled by a separate MoinMoin process on the server side.  You
should see the load grow significantly as the requests pour in, but
shouldn't see any one MoinMoin process gobbling up all sorts of resources.
Jürgen, can you elaborate on these themes a little more?

    Guido> I believe that Juergen has fixed the log-growing problem.  Should
    Guido> we enable the Wiki again and hope for the best?

With an XS4ALL person at the ready?  Perhaps someone can keep a window open
on creosote running something like

    while true ; do
        ps auxww | egrep python | sort -r -n -k 5,5 | head -1
        sleep 15
    done
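
The same check can be sketched in Python, which makes it easier to log or
alert on the result.  This assumes the usual "ps aux" column layout, where
the fifth field is VSZ; the function names are made up for illustration.

```python
import subprocess


def biggest_python_process(ps_output):
    """Return the `ps aux` line mentioning "python" with the largest
    fifth column (VSZ on the usual layout), or None if there is none."""
    best_line, best_vsz = None, -1
    for line in ps_output.splitlines():
        if "python" not in line:
            continue
        fields = line.split()
        # Skip lines that are too short or have a non-numeric fifth field.
        if len(fields) < 5 or not fields[4].isdigit():
            continue
        vsz = int(fields[4])
        if vsz > best_vsz:
            best_line, best_vsz = line, vsz
    return best_line


def sample_once():
    """Take one sample, like a single pass of the shell loop."""
    result = subprocess.run(["ps", "auxww"], capture_output=True, text=True)
    return biggest_python_process(result.stdout)
```

Calling sample_once() every 15 seconds reproduces the loop.  A short-lived
CGI process can still start and exit between two samples.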

I'm running out for the next few hours.  I'll be happy to run the while loop
when I return.

Skip


From webmaster@python.org  Wed Jul 31 17:21:47 2002
From: webmaster@python.org (webmaster@python.org)
Date: Wed, 31 Jul 2002 12:21:47 -0400
Subject: [Python-Dev] Re: What to do about the Wiki?
References: <200207311547.g6VFlk601129@odiug.zope.com>
 <15688.2985.118330.48738@localhost.localdomain>
Message-ID: <15688.3739.1719.207581@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@pobox.com> writes:

    Guido> I believe that Juergen has fixed the log-growing problem.
    Guido> Should we enable the Wiki again and hope for the best?

I just did, by twiddling the +x bits on moinmoin

    SM> With an XS4ALL person at the ready?  Perhaps someone can keep
    SM> a window open on creosote running something like

    |     while true ; do
    |         ps auxww | egrep python | sort -r -n -k 5,5 | head -1
    | 	sleep 15
    |     done

    SM> I'm running out for the next few hours.  I'll be happy to run
    SM> the while loop when I return.

I'm doing this now, but even hitting the wiki it doesn't show up.  I'm
just going to run top for a while, but it's a fairly old version of
top. :/

-Barry


From guido@python.org  Wed Jul 31 17:16:56 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 12:16:56 -0400
Subject: [Python-Dev] Re: What to do about the Wiki?
In-Reply-To: Your message of "Wed, 31 Jul 2002 11:09:13 CDT."
 <15688.2985.118330.48738@localhost.localdomain>
References: <200207311547.g6VFlk601129@odiug.zope.com>
 <15688.2985.118330.48738@localhost.localdomain>
Message-ID: <200207311616.g6VGGuF01886@odiug.zope.com>

>     Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things,
>     Guido> but also said that Moinmoin is essentially vulnerable to
>     Guido> "recursive wget" (e.g. someone trying to suck up the entire Wiki
>     Guido> by following links).  Apparently this is what brought the site
>     Guido> down this weekend -- if I understand correctly, an in-memory log
>     Guido> was growing too fast.
> 
> I'm a bit confused by these statements.  MoinMoin is a CGI script.  I don't
> understand where "recursive wget" and "in-memory log" would come into play.
> I recently fired up two Wikis on the Mojam server.  I never see any
> long-running process which would suggest there's an in-memory log which
> could grow without bound.  The MoinMoin package does generate HTTP
> redirects, but while they might coax wget into firing off another request,
> it should be handled by a separate MoinMoin process on the server side.  You
> should see the load grow significantly as the requests pour in, but
> shouldn't see any one MoinMoin process gobbling up all sorts of resources.
> Jürgen, can you elaborate on these themes a little more?

Juergen seems offline or too busy to respond.  Here's what he wrote on
the matter.  I guess he's reading the entire log into memory and
updating it there.

| Subject: [Pydotorg] wiki
| From: Juergen Hermann <jh@web.de>
| To: "pydotorg@python.org" <pydotorg@python.org>
| Date: Mon, 29 Jul 2002 20:32:31 +0200
| Hi!
| 
| I looked into the wiki, and two things killed us:
| 
| a) apart from google hits, some $!&%$""$% did a recursive wget. And the 
| wiki spans a rather wide uri space...
| 
| b) the event log grows much faster than I'm used to, thus some 
| "simple" algorithms don't hold for this size.
| 
| 
| Solutions: 
| 
| a) I just updated the wiki software, the current cvs contains a 
| robot/wget filter that forbids any access except to "view page" URIs 
| (i.e. we remain open to google, but no more open than absolutely 
| needed). If need be, we can forbid access altogether, or only allow 
| google.
| 
| b) I'll install a cron job that rotates the logs, to keep them short.
| 
| I shortened the logs manually for now. So if you all agree, we could 
| activate the wiki again.
| 
| 
| Ciao, Jürgen

Reading this again, I think we should give it a try again.

>     Guido> I believe that Juergen has fixed the log-growing problem.  Should
>     Guido> we enable the Wiki again and hope for the best?
> 
> With an XS4ALL person at the ready?  Perhaps someone can keep a window open
> on creosote running something like
> 
>     while true ; do
>         ps auxww | egrep python | sort -r -n -k 5,5 | head -1
> 	sleep 15
>     done
> 
> I'm running out for the next few hours.  I'll be happy to run the while loop
> when I return.

We'll watch it here.  I know who to write to have it rebooted.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Wed Jul 31 17:43:40 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 12:43:40 -0400
Subject: [Python-Dev] imaplib test failure
In-Reply-To: <15688.241.352958.223156@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPJAIAB.tim.one@comcast.net>

> Yes, everyone is:
>
> http://mail.python.org/pipermail/python-dev/2002-July/027056.html
> 
> but no one's stepped up to the plate yet, including pierslauder <1.4
> wink>.

I just reverted test_imaplib to rev 1.3, the last version that worked here.


From mal@lemburg.com  Wed Jul 31 18:02:51 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jul 2002 19:02:51 +0200
Subject: [Python-Dev] Re: What to do about the Wiki?
References: <200207311547.g6VFlk601129@odiug.zope.com>              <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com>
Message-ID: <3D48183B.7070306@lemburg.com>

Guido van Rossum wrote:
>>    Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things,
>>    Guido> but also said that Moinmoin is essentially vulnerable to
>>    Guido> "recursive wget" (e.g. someone trying to suck up the entire Wiki
>>    Guido> by following links).  Apparently this is what brought the site
>>    Guido> down this weekend -- if I understand correctly, an in-memory log
>>    Guido> was growing too fast.
>>
>>I'm a bit confused by these statements.  MoinMoin is a CGI script.  I don't
>>understand where "recursive wget" and "in-memory log" would come into play.
>>I recently fired up two Wikis on the Mojam server.  I never see any
>>long-running process which would suggest there's an in-memory log which
>>could grow without bound.  The MoinMoin package does generate HTTP
>>redirects, but while they might coax wget into firing off another request,
>>it should be handled by a separate MoinMoin process on the server side.  You
>>should see the load grow significantly as the requests pour in, but
>>shouldn't see any one MoinMoin process gobbling up all sorts of resources.
>>Jürgen, can you elaborate on these themes a little more?
>
>
> Juergen seems offline or too busy to respond.  Here's what he wrote on
> the matter.  I guess he's reading the entire log into memory and
> updating it there.

Jürgen is talking about the file event.log which MoinMoin writes.
This is not read into memory. New events are simply appended to
the file.

Now since the Wiki has recursive links such as the "LikePages"
links on all pages and history links like the per page
info screen, a recursive wget is likely to run for quite a
while (even more because the URL level doesn't change much
and thus probably doesn't trigger any depth restrictions on wget-
like crawlers) and generate lots of events...

What was the cause of the break down ? A full disk or a process
claiming all resources ?

> | Subject: [Pydotorg] wiki
> | From: Juergen Hermann <jh@web.de>
> | To: "pydotorg@python.org" <pydotorg@python.org>
> | Date: Mon, 29 Jul 2002 20:32:31 +0200
> | Hi!
> |
> | I looked into the wiki, and two things killed us:
> |
> | a) apart from google hits, some $!&%$""$% did a recursive wget. And the
> | wiki spans a rather wide uri space...
> |
> | b) the event log grows much faster than I'm used to, thus some
> | "simple" algorithms don't hold for this size.
> |
> |
> | Solutions:
> |
> | a) I just updated the wiki software, the current cvs contains a
> | robot/wget filter that forbids any access except to "view page" URIs
> | (i.e. we remain open to google, but no more open than absolutely
> | needed). If need be, we can forbid access altogether, or only allow
> | google.
> |
> | b) I'll install a cron job that rotates the logs, to keep them short.
> |
> | I shortened the logs manually for now. So if you all agree, we could
> | activate the wiki again.
> |
> |
> | Ciao, Jürgen
>
> Reading this again, I think we should give it a try again.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From tim.one@comcast.net  Wed Jul 31 18:07:46 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 13:07:46 -0400
Subject: [Pydotorg] Re: [Python-Dev] Re: What to do about the Wiki?
In-Reply-To: <3D48183B.7070306@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPMAIAB.tim.one@comcast.net>

[M.-A. Lemburg]
> What was the cause of the break down ? A full disk or a process
> claiming all resources ?

Thomas Wouters told me the process grew so large that it ran out of swapfile
space.

swapping-rumors-ly y'rs  - tim



From tim.one@comcast.net  Wed Jul 31 18:16:20 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 13:16:20 -0400
Subject: [Python-Dev] Valgrinding Python
In-Reply-To: <3D47491C.B0E9E165@metaslash.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPOAIAB.tim.one@comcast.net>

[Neal Norwitz]
> This is good news.  I changed ADDRESS_IN_RANGE to a function,
> then suppressed it.  There were no other uninitialized memory reads.

Cool!  In

	if (ADDRESS_IN_RANGE(p, pool->arenaindex)) {

it's actually only the pool->arenaindex subexpression that may read
uninitialized memory; the ADDRESS_IN_RANGE macro itself doesn't do anything
"bad".

> Valgrind does report a bunch of problems with pthreads, but
> these are likely valgrind's fault.  There are some complaints
> about memory leaks, but these seem to appear only to occur
> when spawning/threading.  The leaks are small and short lived.

A novel definition for "leak" <wink>.



From guido@python.org  Wed Jul 31 18:24:12 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 13:24:12 -0400
Subject: [Python-Dev] Re: What to do about the Wiki?
In-Reply-To: Your message of "Wed, 31 Jul 2002 19:02:51 +0200."
 <3D48183B.7070306@lemburg.com>
References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com>
 <3D48183B.7070306@lemburg.com>
Message-ID: <200207311724.g6VHOCZ02434@odiug.zope.com>

> > Juergen seems offline or too busy to respond.  Here's what he wrote on
> > the matter.  I guess he's reading the entire log into memory and
> > updating it there.
> 
> Jürgen is talking about the file event.log which MoinMoin writes.
> This is not read into memory. New events are simply appended to
> the file.
> 
> Now since the Wiki has recursive links such as the "LikePages"
> links on all pages and history links like the per page
> info screen, a recursive wget is likely to run for quite a
> while (even more because the URL level doesn't change much
> and thus probably doesn't trigger any depth restrictions on wget-
> like crawlers) and generate lots of events...
> 
> What was the cause of the break down ? A full disk or a process
> claiming all resources ?

A process running out of memory, AFAIK.

I just ran a recursive wget on the Wiki, and it completed without
bringing the site down, downloading about 1000 files (several views
for each Wiki page).  I didn't see the Wiki appear in the "top"
display.

So either Juergen fixed the problem (as he said he did) or there was a
different cause.

I do wish Juergen responded to his mail.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Wed Jul 31 18:26:12 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 13:26:12 -0400
Subject: [Python-Dev] Now test_socket fails
Message-ID: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>

What's socket.socket() supposed to do without any arguments?  Can't work on
Windows, because socket.py has

if (sys.platform.lower().startswith("win")
    or (hasattr(os, 'uname') and os.uname()[0] == "BeOS")
    or sys.platform=="riscos"):

    _realsocketcall = _socket.socket

    def socket(family, type, proto=0):
        return _socketobject(_realsocketcall(family, type, proto))


C:\Code\python\PCbuild>python ../lib/test/test_socket.py
Testing for mission critical constants. ... ok
Testing default timeout. ... ERROR
Testing getservbyname(). ... ok
Testing getsockopt(). ... ok
Testing hostname resolution mechanisms. ... ok
Making sure getnameinfo doesn't crash the interpreter. ... ok
testNtoH (__main__.GeneralModuleTests) ... ok
Testing reference count for getnameinfo. ... ok
testing send() after close() with timeout. ... ok
Testing setsockopt(). ... ok
Testing getsockname(). ... ok
Testing that socket module exceptions. ... ok
Testing fromfd(). ... ok
Testing receive in chunks over TCP. ... ok
Testing recvfrom() in chunks over TCP. ... ok
Testing large receive over TCP. ... ok
Testing large recvfrom() over TCP. ... ok
Testing sendall() with a 2048 byte string over TCP. ... ok
Testing shutdown(). ... ok
Testing recvfrom() over UDP. ... ok
Testing sendto() and Recv() over UDP. ... ok
Testing non-blocking accept. ... ok
Testing non-blocking connect. ... ok
Testing non-blocking recv. ... ok
Testing whether set blocking works. ... ok
Performing file readline test. ... ok
Performing small file read test. ... ok
Performing unbuffered file read test. ... ok

======================================================================
ERROR: Testing default timeout.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "../lib/test/test_socket.py", line 273, in testDefaultTimeout
    s = socket.socket()
TypeError: socket() takes at least 2 arguments (0 given)

----------------------------------------------------------------------
Ran 28 tests in 3.190s

FAILED (errors=1)
Traceback (most recent call last):
  File "../lib/test/test_socket.py", line 559, in ?
    test_main()
  File "../lib/test/test_socket.py", line 556, in test_main
    test_support.run_suite(suite)
  File "C:\CODE\PYTHON\lib\test\test_support.py", line 188, in run_suite
    raise TestFailed(err)
test.test_support.TestFailed: Traceback (most recent call last):
  File "../lib/test/test_socket.py", line 273, in testDefaultTimeout
    s = socket.socket()
TypeError: socket() takes at least 2 arguments (0 given)



From tim.one@comcast.net  Wed Jul 31 18:33:03 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 13:33:03 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEABAJAB.tim.one@comcast.net>

[me]
> What's socket.socket() supposed to do without any arguments?  
> Can't work on Windows, because socket.py has ...

Nevermind; I changed socket.py so this works as intended.


From mgilfix@eecs.tufts.edu  Wed Jul 31 18:37:11 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 31 Jul 2002 13:37:11 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>; from tim.one@comcast.net on Wed, Jul 31, 2002 at 01:26:12PM -0400
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>
Message-ID: <20020731133711.H26901@eecs.tufts.edu>

  I'm pretty sure that qualifies as a bug. The problem exists on linux
as well (as a fresh cvs update has shown). In general though, the
socket call should always take the two arguments.

  It seems at one point that the 2.3 version of the socket module accepted
erroneously just a socket() call, while 2.2 does not. It seems Guido
added these lines to integrate default timeout testing. If someone with
write privileges can just fix that to read:

  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

  that should fix the problem.

                      -- Mike

On Wed, Jul 31 @ 13:26, Tim Peters wrote:
> What's socket.socket() supposed to do without any arguments?  Can't work on
> Windows, because socket.py has
> 
> if (sys.platform.lower().startswith("win")
>     or (hasattr(os, 'uname') and os.uname()[0] == "BeOS")
>     or sys.platform=="riscos"):
> 
>     _realsocketcall = _socket.socket
> 
>     def socket(family, type, proto=0):
>         return _socketobject(_realsocketcall(family, type, proto))
> 
> 
> C:\Code\python\PCbuild>python ../lib/test/test_socket.py
> Testing for mission critical constants. ... ok
> Testing default timeout. ... ERROR
> Testing getservbyname(). ... ok
> Testing getsockopt(). ... ok
> Testing hostname resolution mechanisms. ... ok
> Making sure getnameinfo doesn't crash the interpreter. ... ok
> testNtoH (__main__.GeneralModuleTests) ... ok
> Testing reference count for getnameinfo. ... ok
> testing send() after close() with timeout. ... ok
> Testing setsockopt(). ... ok
> Testing getsockname(). ... ok
> Testing that socket module exceptions. ... ok
> Testing fromfd(). ... ok
> Testing receive in chunks over TCP. ... ok
> Testing recvfrom() in chunks over TCP. ... ok
> Testing large receive over TCP. ... ok
> Testing large recvfrom() over TCP. ... ok
> Testing sendall() with a 2048 byte string over TCP. ... ok
> Testing shutdown(). ... ok
> Testing recvfrom() over UDP. ... ok
> Testing sendto() and Recv() over UDP. ... ok
> Testing non-blocking accept. ... ok
> Testing non-blocking connect. ... ok
> Testing non-blocking recv. ... ok
> Testing whether set blocking works. ... ok
> Performing file readline test. ... ok
> Performing small file read test. ... ok
> Performing unbuffered file read test. ... ok
> 
> ======================================================================
> ERROR: Testing default timeout.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "../lib/test/test_socket.py", line 273, in testDefaultTimeout
>     s = socket.socket()
> TypeError: socket() takes at least 2 arguments (0 given)
> 
> ----------------------------------------------------------------------
> Ran 28 tests in 3.190s
> 
> FAILED (errors=1)
> Traceback (most recent call last):
>   File "../lib/test/test_socket.py", line 559, in ?
>     test_main()
>   File "../lib/test/test_socket.py", line 556, in test_main
>     test_support.run_suite(suite)
>   File "C:\CODE\PYTHON\lib\test\test_support.py", line 188, in run_suite
>     raise TestFailed(err)
> test.test_support.TestFailed: Traceback (most recent call last):
>   File "../lib/test/test_socket.py", line 273, in testDefaultTimeout
>     s = socket.socket()
> TypeError: socket() takes at least 2 arguments (0 given)
> 
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
`-> (tim.one)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From mgilfix@eecs.tufts.edu  Wed Jul 31 18:38:12 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 31 Jul 2002 13:38:12 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEABAJAB.tim.one@comcast.net>; from tim.one@comcast.net on Wed, Jul 31, 2002 at 01:33:03PM -0400
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net> <LNBBLJKPBEHFEDALKOLCEEABAJAB.tim.one@comcast.net>
Message-ID: <20020731133812.I26901@eecs.tufts.edu>

  Er, I'm not sure that was such a good idea. This doesn't work on
linux and shouldn't. It never worked that way in 2.2; I'm not sure what
happened to make it work in 2.3. That was prior to my adding the timeout
socket changes.

                 -- Mike

On Wed, Jul 31 @ 13:33, Tim Peters wrote:
> [me]
> > What's socket.socket() supposed to do without any arguments?  
> > Can't work on Windows, because socket.py has ...
> 
> Nevermind; I changed socket.py so this works as intended.
> 
`-> (tim.one)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From guido@python.org  Wed Jul 31 18:40:34 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 13:40:34 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: Your message of "Wed, 31 Jul 2002 13:26:12 EDT."
 <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>
Message-ID: <200207311740.g6VHeYS02538@odiug.zope.com>

> What's socket.socket() supposed to do without any arguments?  Can't work on
> Windows, because socket.py has
> 
> if (sys.platform.lower().startswith("win")
>     or (hasattr(os, 'uname') and os.uname()[0] == "BeOS")
>     or sys.platform=="riscos"):
> 
>     _realsocketcall = _socket.socket
> 
>     def socket(family, type, proto=0):
>         return _socketobject(_realsocketcall(family, type, proto))

Oops.  It's supposed to default to AF_INET, SOCK_STREAM now.  Can you
test this patch and check it in if it works?

*** socket.py	18 Jul 2002 17:08:34 -0000	1.22
--- socket.py	31 Jul 2002 17:35:25 -0000
***************
*** 62,68 ****
  
      _realsocketcall = _socket.socket
  
!     def socket(family, type, proto=0):
          return _socketobject(_realsocketcall(family, type, proto))
  
      if SSL_EXISTS:
--- 62,68 ----
  
      _realsocketcall = _socket.socket
  
!     def socket(family=AF_INET, type=SOCK_STREAM, proto=0):
          return _socketobject(_realsocketcall(family, type, proto))
  
      if SSL_EXISTS:

(There's another change we should really make -- instead of a socket
function, there should be a class socket whose constructor does the
work.  That's necessary so that isinstance(s, socket.socket) works on
Windows; this currently works on Unix but not on Windows.  But I don't
have time for that now; the above patch should do what you need.)
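
A rough sketch of the class-based approach mentioned in the parenthetical
above, written for present-day Python.  WrappedSocket is a made-up name
and the delegation via __getattr__ is an assumption; socket.py's eventual
fix may differ.  The point is only that a class, unlike a factory
function, gives isinstance() something to test against, and that the
constructor can carry the AF_INET / SOCK_STREAM defaults.

```python
import socket


class WrappedSocket:
    """Hypothetical wrapper class standing in for the socket() function."""

    def __init__(self, family=socket.AF_INET, type=socket.SOCK_STREAM,
                 proto=0):
        # The constructor does the work the factory function used to do.
        self._sock = socket.socket(family, type, proto)

    def __getattr__(self, name):
        # Only called for attributes not found on the wrapper itself.
        if name == "_sock":
            raise AttributeError(name)   # avoid recursion if __init__ failed
        return getattr(self._sock, name)


s = WrappedSocket()
is_wrapped = isinstance(s, WrappedSocket)   # works because this is a class
s.close()
```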

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jul 31 18:45:14 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 13:45:14 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: Your message of "Wed, 31 Jul 2002 13:37:11 EDT."
 <20020731133711.H26901@eecs.tufts.edu>
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net>
 <20020731133711.H26901@eecs.tufts.edu>
Message-ID: <200207311745.g6VHjEC02589@odiug.zope.com>

>   It seems at one point that the 2.3 version of the socket module accepted
> erroneously just a socket() call, while 2.2 does not.

I added this intentionally.  I am tired of typing
(AF_INET, SOCK_STREAM) where those are the 99% case.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Wed Jul 31 18:44:50 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 13:44:50 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <20020731133711.H26901@eecs.tufts.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEACAJAB.tim.one@comcast.net>

[Michael Gilfix]
>   I'm pretty sure that qualifies as a bug. The problem exists on linux
> as well (as a fresh cvs update has shown). In general though, the
> socket call should always take the two arguments.
>
>   It seems at one point that the 2.3 version of the socket module
> accepted erroneously just a socket() call, while 2.2 does not. It seems
> Guido added these lines to integrate default timeout testing. If someone
> with write privileges can just fix that to read:
>
>   s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>
>   that should fix the problem.

I'll leave this to you and Guido.  The test works fine on Windows now.  The
docstring for _socket.socket claims that all arguments are optional.  The
code matches the docs:

sock_initobj(PyObject *self, PyObject *args, PyObject *kwds)
{
	PySocketSockObject *s = (PySocketSockObject *)self;
	SOCKET_T fd;
	int family = AF_INET, type = SOCK_STREAM, proto = 0;
	static char *keywords[] = {"family", "type", "proto", 0};

ALL ARGS ARE OPTIONAL HERE
	if (!PyArg_ParseTupleAndKeywords(args, kwds,
					 "|iii:socket", keywords,
					 &family, &type, &proto))
		return -1;

	Py_BEGIN_ALLOW_THREADS
	fd = socket(family, type, proto);
	Py_END_ALLOW_THREADS
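
Seen from Python, the "|iii:socket" format and the keyword list above mean
every argument can be left out or passed by name.  A small check, written
for present-day Python, where the no-argument defaults are still AF_INET
and SOCK_STREAM:

```python
import socket

# No arguments: family and type fall back to AF_INET / SOCK_STREAM.
s = socket.socket()
try:
    default_family, default_type, default_proto = s.family, s.type, s.proto
finally:
    s.close()

# Any single argument can be given by keyword; the rest keep their defaults.
u = socket.socket(type=socket.SOCK_DGRAM)
try:
    udp_family, udp_type = u.family, u.type
finally:
    u.close()
```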



From guido@python.org  Wed Jul 31 18:47:34 2002
From: guido@python.org (Guido van Rossum)
Date: Wed, 31 Jul 2002 13:47:34 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: Your message of "Wed, 31 Jul 2002 13:38:12 EDT."
 <20020731133812.I26901@eecs.tufts.edu>
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net> <LNBBLJKPBEHFEDALKOLCEEABAJAB.tim.one@comcast.net>
 <20020731133812.I26901@eecs.tufts.edu>
Message-ID: <200207311747.g6VHlYr02626@odiug.zope.com>

>   Er, I'm not sure that was such a good idea. This doesn't work on
> linux and shouldn't. It never worked that way in 2.2 I'm not sure what
> happened to make it work in 2.3. Was prior to my adding the timeout
> socket changes.

What do you mean it doesn't work on Linux?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Wed Jul 31 18:52:06 2002
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 31 Jul 2002 13:52:06 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <200207311740.g6VHeYS02538@odiug.zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEAEAJAB.tim.one@comcast.net>

[Guido]
> (There's another change we should really make -- instead of a socket
> function, there should be a class socket whose constructor does the
> work.  That's necessary so that isinstance(s, socket.socket) works on
> Windows; this currently works on Unix but not on Windows.

http://www.python.org/sf/589262



From mgilfix@eecs.tufts.edu  Wed Jul 31 18:57:06 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 31 Jul 2002 13:57:06 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <200207311745.g6VHjEC02589@odiug.zope.com>; from guido@python.org on Wed, Jul 31, 2002 at 01:45:14PM -0400
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net> <20020731133711.H26901@eecs.tufts.edu> <200207311745.g6VHjEC02589@odiug.zope.com>
Message-ID: <20020731135705.J26901@eecs.tufts.edu>

  Sounds fair. Found it in the docs so I'm happy.

On Wed, Jul 31 @ 13:45, Guido van Rossum wrote:
> >   It seems at one point that the 2.3 version of the socket module accepted
> > erroneously just a socket() call, while 2.2 does not.
> 
> I added this intentionally.  I am tired of typing
> (AF_INET, SOCK_STREAM) where those are the 99% case.

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From mal@lemburg.com  Wed Jul 31 18:56:49 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jul 2002 19:56:49 +0200
Subject: [Python-Dev] Re: What to do about the Wiki?
References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com>              <3D48183B.7070306@lemburg.com> <200207311724.g6VHOCZ02434@odiug.zope.com>
Message-ID: <3D4824E1.1090304@lemburg.com>

Guido van Rossum wrote:
>>>Juergen seems offline or too busy to respond.  Here's what he wrote on
>>>the matter.  I guess he's reading the entire log into memory and
>>>updating it there.
>>
>>Jürgen is talking about the file event.log which MoinMoin writes.
>>This is not read into memory. New events are simply appended to
>>the file.
>>
>>Now since the Wiki has recursive links such as the "LikePages"
>>links on all pages and history links like the per page
>>info screen, a recursive wget is likely to run for quite a
>>while (even more because the URL level doesn't change much
>>and thus probably doesn't trigger any depth restrictions on wget-
>>like crawlers) and generate lots of events...
>>
>>What was the cause of the break down ? A full disk or a process
>>claiming all resources ?
>
>
> A process running out of memory, AFAIK.

In that case, wouldn't it be better to impose a memoryuse limit
on the user which Apache uses for dealing with CGI
scripts ? That wouldn't solve any specific Wiki related
problem, but prevents the server from going offline because
of memory problems.

> I just ran a recursive wget on the Wiki, and it completed without
> bringing the site down, downloading about 1000 files (several views
> for each Wiki page).  I didn't see the Wiki appear in the "top"
> display.
>
> So either Juergen fixed the problem (as he said he did) or there was a
> different cause.
>
> I do wish Juergen responded to his mail.

It's vacation time in Germany, so he may well be offline for
a while.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From mgilfix@eecs.tufts.edu  Wed Jul 31 18:58:03 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 31 Jul 2002 13:58:03 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <200207311747.g6VHlYr02626@odiug.zope.com>; from guido@python.org on Wed, Jul 31, 2002 at 01:47:34PM -0400
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net> <LNBBLJKPBEHFEDALKOLCEEABAJAB.tim.one@comcast.net> <20020731133812.I26901@eecs.tufts.edu> <200207311747.g6VHlYr02626@odiug.zope.com>
Message-ID: <20020731135803.K26901@eecs.tufts.edu>

On Wed, Jul 31 @ 13:47, Guido van Rossum wrote:
> >   Er, I'm not sure that was such a good idea. This doesn't work on
> > linux and shouldn't. It never worked that way in 2.2 I'm not sure what
> > happened to make it work in 2.3. Was prior to my adding the timeout
> > socket changes.
> 
> What do you mean it doesn't work on Linux?

  My fault. It works. I, uh, didn't set my path correctly :)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From mgilfix@eecs.tufts.edu  Wed Jul 31 19:00:58 2002
From: mgilfix@eecs.tufts.edu (Michael Gilfix)
Date: Wed, 31 Jul 2002 14:00:58 -0400
Subject: [Python-Dev] Now test_socket fails
In-Reply-To: <200207311740.g6VHeYS02538@odiug.zope.com>; from guido@python.org on Wed, Jul 31, 2002 at 01:40:34PM -0400
References: <LNBBLJKPBEHFEDALKOLCEEAAAJAB.tim.one@comcast.net> <200207311740.g6VHeYS02538@odiug.zope.com>
Message-ID: <20020731140057.L26901@eecs.tufts.edu>

  Would a little trick like this do?

  class socket:
    pass

  class unix_socket(socket):
    pass

  class windows_socket(socket):
    # Old windows stuff
    pass

  And then just do the namespace shuffling that's kinda already done
in socket.py.

                    -- Mike

On Wed, Jul 31 @ 13:40, Guido van Rossum wrote:
> (There's another change we should really make -- instead of a socket
> function, there should be a class socket whose constructor does the
> work.  That's necessary so that isinstance(s, socket.socket) works on
> Windows; this currently works on Unix but not on Windows.  But I don't
> have time for that now; the above patch should do what you need.)

-- 
Michael Gilfix
mgilfix@eecs.tufts.edu

For my gpg public key:
http://www.eecs.tufts.edu/~mgilfix/contact.html


From mal@lemburg.com  Wed Jul 31 19:04:36 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jul 2002 20:04:36 +0200
Subject: [Python-Dev] Re: What to do about the Wiki?
References: <200207311547.g6VFlk601129@odiug.zope.com> <15688.2985.118330.48738@localhost.localdomain> <200207311616.g6VGGuF01886@odiug.zope.com>              <3D48183B.7070306@lemburg.com> <200207311724.g6VHOCZ02434@odiug.zope.com> <3D4824E1.1090304@lemburg.com>
Message-ID: <3D4826B4.4060606@lemburg.com>

M.-A. Lemburg wrote:
> Guido van Rossum wrote:
>>> What was the cause of the break down ? A full disk or a process
>>> claiming all resources ?
>> A process running out of memory, AFAIK.
> 
> 
> In that case, wouldn't it be better to impose a memoryuse limit
> on the user which Apache uses for dealing with CGI
> scripts ? That wouldn't solve any specific Wiki related
> problem, but prevents the server from going offline because
> of memory problems.

Here's how Apache can be configured for this (without having
to fiddle with the Apache user account):

http://httpd.apache.org/docs/mod/core.html#rlimitmem

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/



From thomas.heller@ion-tof.com  Wed Jul 31 19:53:23 2002
From: thomas.heller@ion-tof.com (Thomas Heller)
Date: Wed, 31 Jul 2002 20:53:23 +0200
Subject: [Python-Dev] PEP 298 - the Fixed Buffer Interface
References: <04da01c237ef$c103ac30$e000a8c0@thomasnotebook>  <200207301946.g6UJkf520799@odiug.zope.com>
Message-ID: <0fe601c238c3$8bab1b20$e000a8c0@thomasnotebook>

I've changed PEP 298 to incorporate the latest changes.
Barry has not yet run pep2html (and I don't want to bother
him too much with this), and I don't know if it makes sense
to post it again at full length.
So here is the link to view it online in text format:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/peps/pep-0298.txt?rev=1.4

and this is the checkin message:
-----
The model exposed by the fixed buffer interface was changed:
Retrieving a buffer from an object puts this in a locked state, and a
releasebuffer function must be called to unlock the object again.

Added releasefixedbuffer function slot, and renamed the
get...fixedbuffer functions to acquire...fixedbuffer functions.

Renamed the flag from Py_TPFLAG_HAVE_GETFIXEDBUFFER to
Py_TPFLAG_HAVE_FIXEDBUFFER. (Is the 'fixed buffer' name still useful,
or should we use 'static buffer' instead?)

Added posting date (was posted to c.l.p and python-dev).
-----

Thomas




From skip@pobox.com  Wed Jul 31 22:06:26 2002
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 31 Jul 2002 16:06:26 -0500
Subject: [Python-Dev] Re: What to do about the Wiki?
In-Reply-To: <15688.3739.1719.207581@anthem.wooz.org>
References: <200207311547.g6VFlk601129@odiug.zope.com>
 <15688.2985.118330.48738@localhost.localdomain>
 <15688.3739.1719.207581@anthem.wooz.org>
Message-ID: <15688.20818.999604.113193@localhost.localdomain>

    BAW> I'm doing this now, but even hitting the wiki it doesn't show up.

This is good. ;-)

Skip


From greg@cosc.canterbury.ac.nz  Wed Jul 31 23:31:37 2002
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Aug 2002 10:31:37 +1200 (NZST)
Subject: [Python-Dev] pre-PEP: The Safe Buffer Interface
In-Reply-To: <20020731063113.74481.qmail@web40110.mail.yahoo.com>
Message-ID: <200207312231.g6VMVbt2019712@kuku.cosc.canterbury.ac.nz>

> Moreover, if the sensible use cases for locking are few and far
> between, then I'm still inclined to leave it out since you can add the
> locking semantics at a different level.

Are you sure about that? Without the locking, only non-resizable
objects would be able to implement the protocol. So any higher
level locking would have to be implemented on top of the old,
non-safe version. Then you'd have to make sure that all parts
of your application accessed the object through the extra
layer. The "safe" part would be lost.

> Your use of the word *no* is different than mine.  :-) I could
> similarly claim that the segment count puts no burden on
> implementations that don't need it.

I think I may have been replying to something other than what was
said. But what I said is still true -- it imposes no extra burden on
*implementers* of the interface which don't use the extra feature. I
acknowledge that it complicates things slightly for *users* of the
interface, but not as much as the seg count stuff does (there's no
need for any testing or exception raising).

> I believe it will be a no-op in enough places that extension writers
> will do it wrong without even knowing.

Well, there's not much that can be done about extension
writers who fail to read the documentation, or wilfully
ignore it.

> Which exception?  Would you introduce a standard exception that should
> be raised when the user tries to do an operation that currently isn't
> allowed because the buffer is locked?

Maybe. It doesn't matter. The important thing is that the
interpreter does not crash.

> I still believe the locking can be added on top of the simpler
> interface as needed.

But it can't, since as I pointed out above, resizable objects
won't be able to provide the simpler interface!
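
To make the idea concrete, here is a minimal Python sketch of a
resizable object that refuses to resize while its buffer is locked.
It only illustrates the semantics and is not the proposed C-level
interface; the class and method names (ResizableBuffer, acquire,
release, resize) are made up for this example, and the choice of
BufferError is an assumption.

```python
class ResizableBuffer:
    """Illustrative resizable object with a lock count on its buffer."""

    def __init__(self, data=b""):
        self._data = bytearray(data)
        self._locks = 0

    def acquire(self):
        # Hand out the buffer and remember that someone is using it.
        self._locks += 1
        return self._data

    def release(self):
        # Each acquire() must be paired with exactly one release().
        if self._locks == 0:
            raise RuntimeError("release without matching acquire")
        self._locks -= 1

    def resize(self, size):
        # While locked, raise an exception instead of moving the memory.
        if self._locks:
            raise BufferError("cannot resize while buffer is locked")
        if size < len(self._data):
            del self._data[size:]
        else:
            self._data.extend(bytes(size - len(self._data)))
```

A resize attempted between acquire() and release() fails with an
exception rather than invalidating memory someone else is using.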

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From ark@research.att.com  Wed Jul 31 23:35:21 2002
From: ark@research.att.com (Andrew Koenig)
Date: Wed, 31 Jul 2002 18:35:21 -0400 (EDT)
Subject: [Python-Dev] split('') revisited
Message-ID: <200207312235.g6VMZL218546@europa.research.att.com>

Back in February, there was a thread in comp.lang.python (and, I
think, also on Python-Dev) that asked whether the following behavior:

        >>> 'abcde'.split('')
        Traceback (most recent call last):
          File "<stdin>", line 1, in ?
        ValueError: empty separator

was a bug or a feature.  The prevailing opinion at the time seemed
to be that there was not a sensible, unique way of defining this
operation, so rejecting it was a feature.

That answer didn't bother me particularly at the time, but since then
I have learned a new fact (or perhaps an old fact that I didn't notice
at the time) that has changed my mind: Section 4.2.4 of the library
reference says that the 'split' method of a regular expression object
is defined as

        Identical to the split() function, using the compiled pattern.

This claim does not appear to be correct:

        >>> import re
        >>> re.compile('').split('abcde')
        ['abcde']

This result differs from the result of using the string split method.

In other words, the documentation doesn't match the actual behavior,
so the status quo is broken.

It seems to me that there are four reasonable courses of action:

   1) Do nothing -- the problem is too trivial to worry about.

   2) Change string split (and its documentation) to match regexp split.

   3) Change regexp split (and its documentation) to match string split.

   4) Change both string split and regexp split to do something else :-)

My first impulse was to argue that (4) is right, and that the behavior
should be as follows

        >>> 'abcde'.split('')
        ['a', 'b', 'c', 'd', 'e']
        >>> import re
        >>> re.compile('').split('abcde')
        ['a', 'b', 'c', 'd', 'e']

When this discussion came up last time, I think there was an objection
that s.split('') was ambiguous: What argument is there in favor of
'abcde'.split('') being ['a', 'b', 'c', 'd', 'e'] instead of, say,
['', 'a', 'b', 'c', 'd', 'e', ''] or, for that matter, ['', 'a', '',
'b', '', 'c', '', 'd', '', 'e', '']?

I made the counterargument that one could disambiguate by adding the
rule that no element of the result could be equal to the delimiter.
Therefore, if s is a string, s.split('') cannot contain any empty
strings.
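
As a small sketch of that rule (split_allowing_empty is a hypothetical
helper written for this message, not an existing or proposed library
function): with an empty separator, the only split that yields no
empty strings is the list of individual characters.

```python
def split_allowing_empty(s, sep):
    """Like s.split(sep), but treat an empty separator as splitting
    between every pair of characters, so no result element is ''."""
    if sep == '':
        return list(s)
    return s.split(sep)


print(split_allowing_empty('abcde', ''))
print(split_allowing_empty('a,b,c', ','))
```

Note that under this sketch an empty string splits into an empty list,
which is one more corner case that would need to be decided.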

However, looking at the behavior of regular expression splitting more
closely, I become more confused.  Can someone explain the following
behavior to me?

        >>> re.compile('a|(x?)').split('abracadabra') 
        ['', None, 'br', None, 'c', None, 'd', None, 'br', None, '']