From brett at python.org  Mon May  1 02:31:49 2006
From: brett at python.org (Brett Cannon)
Date: Sun, 30 Apr 2006 17:31:49 -0700
Subject: [Python-Dev] Adding functools.decorator
In-Reply-To: <ca471dc20604300936s2a1bdf0fof8395428230963d9@mail.gmail.com>
References: <4454C203.2060707@iinet.net.au> <e32kqa$bq6$1@sea.gmane.org>
	<ca471dc20604300936s2a1bdf0fof8395428230963d9@mail.gmail.com>
Message-ID: <bbaeab100604301731l2adbc588x7e7510a1b193b9a0@mail.gmail.com>

On 4/30/06, Guido van Rossum <guido at python.org> wrote:
> On 4/30/06, Georg Brandl <g.brandl at gmx.net> wrote:
> > Nick Coghlan wrote:
> > > Collin Winter has done the work necessary to rename PEP 309's functional
> > > module to functools and posted the details to SF [1].
> > >
> > > I'd like to take that patch, tweak it so the C module is built as _functools
> > > rather than functools, and then add a functools.py consisting of:
> >
> > I'm all for it. (You could integrate the C version of "decorator" from my SF
> > patch, but I think Python-only is enough).
>
> Stronger -- this should *not* be implemented in C. There's no
> performance need, and the C code is much harder to understand, check,
> and modify.
>
> I expect that at some point people will want to tweak what gets copied
> by _update_wrapper() -- e.g. some attributes may need to be
> deep-copied, or personalized, or skipped, etc. (Doesn't this already
> apply to __decorator__ and __decorates__? I can't prove to myself that
> these get set to the right things when several decorators are stacked
> on top of each other.)
>
> I'm curious if @decorator is the right name and the right API for this
> though? The name is overly wide (many things are decorators but should
> not be decorated with @decorator) and I wonder if a manual call to
> _update_wrapper() wouldn't be just as useful.

I am +0 on the manual call.  It just seems like something I would like to
happen explicitly instead of through a decorator.
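
A minimal sketch of what the manual call could look like.  It assumes the
helper is exposed as functools.update_wrapper(wrapper, wrapped), which is the
spelling current Python uses; logged() and greet() are hypothetical names
used only for illustration.

```python
import functools


def logged(func):
    """Hypothetical decorator that copies metadata with an explicit call."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    # Manual call: copy __name__, __doc__, etc. from func onto wrapper.
    functools.update_wrapper(wrapper, func)
    return wrapper


@logged
def greet(name):
    """Return a greeting."""
    return "hello " + name
```

The copy step is visible at the point where the wrapper is built, and nothing
else in the decorator depends on it.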

> (Perhaps with a simpler
> API -- I'm tempted to call YAGNI on the __decorator__ and
> __decorates__ attributes.)
>

This should also be redundant once I get the signature PEP cleaned up
(and probably email py3k or python-dev on the subject) since I plan to
store a reference to the function the object represents.  That would
mean if update_wrapper() copies the signature object over then you
have a reference to the original function that way.

-Brett

> I think there are too many design options here to check this in
> without more discussion.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org
>

From brett at python.org  Mon May  1 02:36:35 2006
From: brett at python.org (Brett Cannon)
Date: Sun, 30 Apr 2006 17:36:35 -0700
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
Message-ID: <bbaeab100604301736m682bdef8rfa4d8baaeb98dd81@mail.gmail.com>

On 4/30/06, Guido van Rossum <guido at python.org> wrote:
> On 4/30/06, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> > A few things from the pre-alpha2 context management terminology review have
> > had a chance to run around in the back of my head for a while now, and I'd
> > like to return to a topic Paul Moore brought up during that discussion.
>
> I believe the context API design has gotten totally out of hand.
> Regardless of the merits of the "with" approach to HTML generation
> (which I personally believe to be an abomination), I don't see why the
> standard library should support every possible use case with a
> custom-made decorator. Let the author of that tag library provide the
> decorator.
>
> I have a counter-proposal: let's drop __context__. Nearly all use
> cases have __context__ return self. In the remaining cases, would it
> really be such a big deal to let the user make an explicit call to
> some appropriately named method? The only example that I know of where
> __context__ doesn't return self is the decimal module. So the decimal
> users would have to type
>

+1.

-Brett

>   with mycontext.some_method() as ctx:    # ctx is a clone of mycontext
>       ctx.prec += 2
>       <BODY>
>
> The implementation of some_method() could be exactly what we currently
> have as the __context__ method on the decimal.Context object. Its
> return value is a decimal.WithStatementContext() instance, whose
> __enter__() method returns a clone of the original context object
> which is assigned to the variable in the with-statement (here 'ctx').
>
> This even has an additional advantage -- some_method() could have
> keyword parameters to set the precision and various other context
> parameters, so we could write this:
>
>   with mycontext.some_method(prec=mycontext.prec+2):
>       <BODY>
>
> Note that we can drop the variable too now (unless we have another
> need to reference it). An API tweak for certain attributes that are
> often incremented or decremented could reduce writing:
>
>   with mycontext.some_method(prec_incr=2):
>       <BODY>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From tjreedy at udel.edu  Mon May  1 02:45:24 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Apr 2006 20:45:24 -0400
Subject: [Python-Dev] PEP 3101: Advanced String Formatting
References: <4453AF51.6050605@acm.org>
Message-ID: <e33ln5$40b$1@sea.gmane.org>


"Talin" <talin at acm.org> wrote in message news:4453AF51.6050605 at acm.org...

>     Compound names are a sequence of simple names seperated by

separated

>     The string and unicode classes will have a class method called
>     'cformat' that does all the actual work of formatting; The
>     format() method is just a wrapper that calls cformat.
>
>     The parameters to the cformat function are:
>
>         -- The format string (or unicode; the same function handles
>            both.)

A class method gets the class as its first argument.  If the format string 
is the first argument, then it is a normal instance method.
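
A small sketch of the distinction, assuming a hypothetical str subclass named
Text.  The method names are made up, and %-formatting stands in for the PEP's
formatting machinery.

```python
class Text(str):
    """Hypothetical str subclass, used only to show the two call shapes."""

    @classmethod
    def cformat_cls(cls, template, *args):
        # A class method receives the class first, then the template.
        return cls, template % args

    def cformat_inst(self, *args):
        # An instance method receives the template itself as 'self'.
        return self % args
```

With the class method the template has to be passed explicitly; with the
instance method the template is the object the method is called on.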

tjr




From tjreedy at udel.edu  Mon May  1 03:08:57 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 30 Apr 2006 21:08:57 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<44549922.7020109@gmail.com>
Message-ID: <e33n39$6pi$1@sea.gmane.org>


"Nick Coghlan" <ncoghlan at gmail.com> wrote in message 
news:44549922.7020109 at gmail.com...
> Terry Reedy wrote:
>> "Talin" <talin at acm.org> wrote in message 
>> news:4453B025.3080100 at acm.org...
>>>     Now, suppose you wanted to have 'key' be a keyword-only argument.
>>
>> Why?  Why not let the user type the additional argument(s) without the
>> parameter name?

Like Martin, you clipped most of the essential context of my question: 
Talin's second proposal.
>>>    The second syntactical change is to allow the argument name to
>>>    be omitted for a varargs argument:
>>>        def compare(a, b, *, key=None):
>>>    The reasoning behind this change is as follows.  Imagine for a
>>>    moment a function which takes several positional arguments, as
>>>    well as a keyword argument:
>>>        def compare(a, b, key=None):

Again I ask, why would one want that?  And, is the need for that so strong 
as to justify introducing '*' as a pseudoparameter?

> Because for some functions (e.g. min()/max()) you want to use *args, but
> support some additional keyword arguments to tweak a few aspects of the
> operation (like providing a "key=x" option).

This and the rest of your 'explanation' is about Talin's first proposal, to 
which I already had said "The rationale for this is pretty obvious".
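
For reference, a sketch of the second proposal under discussion.  It assumes
an interpreter that accepts the bare '*' syntax (Python 3 does); the body of
compare() is hypothetical and only there to make the example runnable.

```python
def compare(a, b, *, key=None):
    """Three-way compare; 'key' can only be supplied by name."""
    if key is not None:
        a, b = key(a), key(b)
    return (a > b) - (a < b)


def call_positionally():
    """Show that a third positional argument is rejected."""
    try:
        compare(1, 2, abs)
    except TypeError:
        return "rejected"
    return "accepted"
```

The function still takes exactly two positional arguments; the '*' only
marks where keyword-only parameters begin.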

Terry Jan Reedy




From john at integralsource.com  Mon May  1 03:59:32 2006
From: john at integralsource.com (John Keyes)
Date: Mon, 1 May 2006 02:59:32 +0100
Subject: [Python-Dev] unittest argv
Message-ID: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>

Hi,

main() in unittest has an optional parameter called argv.  If it is not
present in the invocation, it defaults to None.  Later in the function
a check is made to see if argv is None and if so sets it to sys.argv.
I think the default should be changed to sys.argv[1:] (i.e. the
command line arguments minus the name of the python file
being executed).

The parseArgs() function then uses getopt to parse argv.  It currently
ignores the first item in the argv list, but this causes a problem when
it is called from another python function and not from the command
line.  So using the current code if I call:

python mytest.py -v

then argv in parseArgs is ['mytest.py', '-v']

But, if I call:

unittest.main(module=None, argv=['-v','mytest'])

then getopt in parseArgs only sees argv[1:], i.e. ['mytest'].  As you can see,
the verbosity option is now gone and cannot be used.
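
The effect can be reproduced with getopt alone.  This is a sketch, not the
unittest source: the two helper functions are hypothetical, and the long
option list is an assumption, since the diff only shows the short options.

```python
import getopt

LONG_OPTS = ['help', 'verbose', 'quiet']  # assumed, not taken from the diff


def parse_current(argv):
    """Mimic the current behaviour: skip argv[0] before calling getopt."""
    return getopt.getopt(argv[1:], 'hHvq', LONG_OPTS)


def parse_proposed(argv):
    """Mimic the proposed behaviour: argv holds only the arguments."""
    return getopt.getopt(argv, 'hHvq', LONG_OPTS)
```

parse_current(['-v', 'mytest']) treats '-v' as the program name and drops it,
while parse_proposed(['-v', 'mytest']) keeps it as an option.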

Here's a diff to show the code changes I have made:

744c744
<                  argv=None, testRunner=None, testLoader=defaultTestLoader):
---
>                  argv=sys.argv[1:], testRunner=None, testLoader=defaultTestLoader):
751,752d750
<         if argv is None:
<             argv = sys.argv
757c755
<         self.progName = os.path.basename(argv[0])
---
> #        self.progName = os.path.basename(argv[0])
769c767
<             options, args = getopt.getopt(argv[1:], 'hHvq',
---
>             options, args = getopt.getopt(argv, 'hHvq',

You may notice I have commented out the self.progName line.  This variable
is not used anywhere in the module so I guess it could be removed.  To
keep it, the conditional check on argv would have to remain and be moved after
the self.progName line.

I hope this makes sense, and it's my first post so go easy on me ;)

Thanks,
-John K

From ncoghlan at gmail.com  Mon May  1 04:05:15 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 12:05:15 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
Message-ID: <44556CDB.1070604@gmail.com>

Guido van Rossum wrote:
> On 4/30/06, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
>> A few things from the pre-alpha2 context management terminology review 
>> have
>> had a chance to run around in the back of my head for a while now, and 
>> I'd
>> like to return to a topic Paul Moore brought up during that discussion.
> 
> I believe the context API design has gotten totally out of hand.
> Regardless of the merits of the "with" approach to HTML generation
> (which I personally believe to be an abomination),

The example is tempting because it's easy to follow. I agree actually doing it 
in real code would almost certainly be nuts :)

> I don't see why the
> standard library should support every possible use case with a
> custom-made decorator. Let the author of that tag library provide the
> decorator.

The HTML tag was just an example. The underlying idea is being able to easily 
create a re-usable object that can be passed to multiple with statements 
(potentially nested within each other or within distinct threads). Without the 
__context__ method, the naive version of such an object looks like:

class reusable(object):
     def __init__(self, factory):
         self.factory = factory
         factory() # Check the factory works at definition time
     def __enter__(self):
         current = self.current = self.factory()
         return current.__enter__()
     def __exit__(self, *exc_info):
         return self.current.__exit__(*exc_info)

The downside of this over the __context__ method is that it is neither nesting 
nor thread-safe.  Because the storage is on the object rather than in the 
execution frame, sharing such objects between threads or using one for nested 
with statements will break (as self.current gets overwritten).
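
A sketch of how the overwrite shows up when one such object is nested inside
itself.  Tracked and run_nested() are hypothetical helpers; Reusable repeats
the class above without the definition-time check.

```python
class Tracked(object):
    """Hypothetical context manager that logs enter/exit under its name."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    def __enter__(self):
        self.log.append("enter " + self.name)
        return self

    def __exit__(self, *exc_info):
        self.log.append("exit " + self.name)
        return False


class Reusable(object):
    """The naive wrapper: state lives on the shared object."""
    def __init__(self, factory):
        self.factory = factory

    def __enter__(self):
        current = self.current = self.factory()
        return current.__enter__()

    def __exit__(self, *exc_info):
        return self.current.__exit__(*exc_info)


def run_nested():
    log = []
    names = iter(["outer", "inner"])
    shared = Reusable(lambda: Tracked(next(names), log))
    with shared:
        with shared:   # overwrites shared.current
            pass
    return log
```

The inner manager is exited twice and the outer one is never exited, because
both __exit__ calls read the same self.current.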

> I have a counter-proposal: let's drop __context__. Nearly all use
> cases have __context__ return self. In the remaining cases, would it
> really be such a big deal to let the user make an explicit call to
> some appropriately named method? The only example that I know of where
> __context__ doesn't return self is the decimal module.

It would also prevent threading.Condition from using its underlying lock 
object as the managed context.

The real problem I have with removing __context__() is that it pushes the 
burden of handling thread-safety and nesting-safety issues onto the developers 
of context managers without giving them any additional tools beyond 
threading.local(). This was the problem Jason brought up for decimal.Context 
that led to the introduction of __context__ in the first place.

Without the __context__() method, *users* of the with statement will be forced 
to create a new object with __enter__()/__exit__() methods every time, either 
by invoking a method (whose name will vary from object to object, depending on 
the whim of the designer) or by calling a factory function (which is likely to 
be created either as a zero-argument lambda returning an object with 
enter/exit methods, or else by using PEP 309's partial function).

So if you see a with statement with a bare variable name as the context 
expression, it will probably be wrong, unless:
  a) the implementor of that type provided thread-safety and nesting-safety; or
  b) the object is known to be neither thread-safe nor nesting-safe

The synchronisation objects in threading being examples of category a, file 
objects being examples of category b. In this scenario, generator contexts 
defined using @contextfactory should always be invoked directly in the context 
expression, as attempting to cache them in order to be reused won't work (you 
would need to put them in a zero-argument lambda and call it in the context 
expression, so that you get a new generator object each time).
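
A sketch of the zero-argument factory workaround.  It assumes the decorator
is available as contextlib.contextmanager (the name released Python uses for
what this thread calls @contextfactory); tag() and render() are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def tag(name, out):
    """Hypothetical generator context that brackets output with a tag."""
    out.append("<%s>" % name)
    try:
        yield
    finally:
        out.append("</%s>" % name)


def render():
    out = []
    make_p = lambda: tag("p", out)  # cache the factory, not the generator
    with make_p():                  # fresh generator object each time
        out.append("one")
    with make_p():
        out.append("two")
    return out
```

Caching make_p is safe because every call in the context expression builds a
new generator; caching the result of tag("p", out) itself would not be.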

Documenting all of the thread-safety and nesting-safety issues and how to deal 
with them would be a serious pain. I consider it much easier to provide the 
__context__() method and explain how to use that as the one obvious way to 
deal with such problems. Then only implementors need to care about it - from a 
user's point of view, you just provide a context expression that resolves to a 
context manager, and everything works as intended, including being able to 
cache that expression in a local variable and use it multiple times. (That 
last point obviously not applying to context managers like files that leave 
themselves in an unusable state after __exit__, and don't restore themselves 
to a usable state in __enter__).

Essentially, I don't think dropping __context__ would gain us anything - the 
complexity associated with it is real, and including that method in the API 
lets us deal with that complexity in one place, once and for all.

Removing the method from the statement definition just pushes the 
documentation burden out to all of the context managers where it matters (like 
decimal.Context, the documentation for which would get stuck with trying to 
explain why you have to call a method in order to get a usable context manager).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Mon May  1 04:10:46 2006
From: guido at python.org (Guido van Rossum)
Date: Sun, 30 Apr 2006 19:10:46 -0700
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <5.1.1.6.0.20060430144919.01e6bb18@mail.telecommunity.com>
References: <4454BA62.5080704@iinet.net.au>
	<5.1.1.6.0.20060430144919.01e6bb18@mail.telecommunity.com>
Message-ID: <ca471dc20604301910w3570bf3an630ba2afcbe3253d@mail.gmail.com>

> At 09:53 AM 4/30/2006 -0700, Guido van Rossum wrote:
> >I have a counter-proposal: let's drop __context__.
[...]
> >   with mycontext.some_method(prec_incr=2):
> >       <BODY>

On 4/30/06, Phillip J. Eby <pje at telecommunity.com> wrote:
> But what's an appropriate name for some_method?

Let's leave that up to the decimal lovers. You could call it push() or
stack() or establish() or some other non-descript word. Or manage() or
manager(). The method name can't involve the word "context" since the
object is already called a context. But that still leaves an infinite
number of possible names. :-)

> If you can solve the naming issue for these use cases (and I notice you
> punted on that issue by calling it "some_method"), then +1 on removing
> __context__.  Otherwise, I'm -0; we're just fixing one
> documentation/explanation problem (that only people writing contexts will
> care about) by creating others (that will affect the people *using*
> contexts too).

To the contrary. Even if we never come up with a perfect name for
decimal.Context.some_method(), then we've still solved a documentation
problem for 9 out of 10 cases where __context__ is just empty ballast.

But I'm sure that if we require that folks come up with a name, they will.

Things should be as simple as possible but no simpler. It's pretty
clear to me that dropping __context__ approaches this ideal. I'm sorry
I didn't push back harder when __context__ was first proposed -- in
retrospect, the first 5 months of PEP 343's life, before __context__
(or __with__, as it was originally called) was invented, were by far
its happiest times.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From edloper at gradient.cis.upenn.edu  Mon May  1 04:50:49 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Sun, 30 Apr 2006 22:50:49 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e33mj1$5nd$1@sea.gmane.org>
References: <4453B025.3080100@acm.org>
	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org>
Message-ID: <44557789.5020107@gradient.cis.upenn.edu>

Terry Reedy wrote:
> There are two subproposals: first, keyword-only args after a variable 
> number of positional args, which requires allowing keyword parameter 
> specifications after the *args parameter, and second, keyword-only args 
> after a fixed number of positional args, implemented with a naked 
> '*'.  To the first, I said "The rationale for this is pretty obvious.".  To 
> the second, I asked, and still ask, "Why?".

I see two possible reasons:

   - A function's author believes that calls to the function will be
     easier to read if certain parameters are passed by name, rather
     than positionally; and they want to enforce that calling
     convention on their users.  This seems to me to go against the
     "consenting adults" principle.

   - A function's author believes they might change the signature in the
     future to accept new positional arguments, and they will want to put
     them before the args that they declare keyword-only.

Both of these motivations seem fairly weak.  Certainly, neither seems to 
warrant a significant change to function definition syntax.

But perhaps there are other use cases that I'm failing to consider. 
Anyone know of any?

-Edward


From guido at python.org  Mon May  1 05:08:17 2006
From: guido at python.org (Guido van Rossum)
Date: Sun, 30 Apr 2006 20:08:17 -0700
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <44556CDB.1070604@gmail.com>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<44556CDB.1070604@gmail.com>
Message-ID: <ca471dc20604302008s730da442q59b49e4dc4b60d82@mail.gmail.com>

[I'm cutting straight to the chase here]
On 4/30/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The downside of this over the __context__ method is that it is neither nesting
> nor thread-safe.

This argument is bogus.

We currently have two types of objects involved in with-statements:
those whose __context__ returns self and those whose __context__
returns some other object. In all of the latter cases, the original
object doesn't have __enter__ or __exit__ methods, so using it in the
"reduced-with-statement" will be an immediate run-time error. Writing
a method that returns the correct type of object with the correct
behavior (thread-safe, or nesting, or whatever is required by that
specific object) is no harder whether the method name is __context__
or not.

> The real problem I have with removing __context__() is that it pushes the
> burden of handling thread-safety and nesting-safety issues onto the developers
> of context managers without giving them any additional tools beyond
> threading.local(). This was the problem Jason brought up for decimal.Context
> that led to the introduction of __context__ in the first place.

Again, I don't see how writing the thread-safe version is easier when
the method is called __context__.

> Without the __context__() method, *users* of the with statement will be forced
> to create a new object with __enter__()/__exit__() methods every time,

You seem to be missing the evidence that 9 out of 10 objects currently
have a __context__ that returns self. In all those cases the user of
the with-statement won't have to make any changes at all compared to
code that works with 2.5a2.

> either
> by invoking a method (whose name will vary from object to object, depending on
> the whim of the designer) or by calling a factory function (which is likely to
> be created either as a zero-argument lambda returning an object with
> enter/exit methods, or else by using PEP 309's partial function).

Having the name being different in each situation may actually be an
advantage -- it will give an additional clue as to what is happening,
and it will let us design different APIs for use in a with-statement
(just like dicts have iterkeys() and iteritems()).

> So if you see a with statement with a bare variable name as the context
> expression, it will probably be wrong, unless:
>   a) the implementor of that type provided thread-safety and nesting-safety; or
>   b) the object is known to be neither thread-safe nor nesting-safe
>
> The synchronisation objects in threading being examples of category a, file
> objects being examples of category b. In this scenario, generator contexts
> defined using @contextfactory should always be invoked directly in the context
> expression, as attempting to cache them in order to be reused won't work (you
> would need to put them in a zero-argument lambda and call it in the context
> expression, so that you get a new generator object each time).

Now we get to the crux of the matter. When I recommend that people write

  with foo.some_method():
      <BLOCK>

you are worried that if they need two separate blocks like that, they
are tempted to write

  x = foo.some_method()
  with x:
      <BLOCK1>
  with x:
      <BLOCK2>

But there's an obvious solution for that: make sure that the object
returned by foo.some_method() can only be used once. The second "with
x" will raise an exception explaining what went wrong. (We could even
rig it so that nesting the second "with x" inside the first one will
produce a different error message.)
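
A minimal sketch of such a single-use object, under the assumption that a
small state flag is enough.  OneShot is a hypothetical name, and the error
messages are illustrative only.

```python
class OneShot(object):
    """Hypothetical context manager that refuses to be entered twice."""
    def __init__(self):
        self._state = "new"

    def __enter__(self):
        if self._state == "active":
            # Nested reuse: the first with-block has not finished yet.
            raise RuntimeError("already in use by an enclosing with statement")
        if self._state == "done":
            # Sequential reuse: an earlier with-block consumed the object.
            raise RuntimeError("already used by an earlier with statement")
        self._state = "active"
        return self

    def __exit__(self, *exc_info):
        self._state = "done"
        return False


def reuse_error(nested):
    """Return the message raised when the same object is entered twice."""
    x = OneShot()
    try:
        if nested:
            with x:
                with x:
                    pass
        else:
            with x:
                pass
            with x:
                pass
    except RuntimeError as exc:
        return str(exc)
    return None
```

reuse_error(False) reports the sequential case and reuse_error(True) the
nested one, so the two mistakes get different messages.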

So the "with foo.some_method()" idiom is the only one that works, and
that is just as thread-safe and nesting-resilient as is writing
__context__() today.

If you object against the extra typing, we'll first laugh at you
(proposals that *only* shave a few characters off a common idiom aren't
all that popular in these parts), and then suggest that you can spell
foo.some_method() as foo().

> Essentially, I don't think dropping __context__ would gain us anything - the
> complexity associated with it is real, and including that method in the API
> lets us deal with that complexity in one place, once and for all.

It's not real in 9 out of 10 current cases.

(And the Condition variable could easily be restored to its previous
case where it had its own __enter__ and __exit__ methods that called
the corresponding methods of the underlying lock. The code currently
in svn is merely an optimization.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon May  1 06:17:18 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 01 May 2006 00:17:18 -0400
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <ca471dc20604302008s730da442q59b49e4dc4b60d82@mail.gmail.co
 m>
References: <44556CDB.1070604@gmail.com> <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<44556CDB.1070604@gmail.com>
Message-ID: <5.1.1.6.0.20060501001028.01e6a460@mail.telecommunity.com>

At 08:08 PM 4/30/2006 -0700, Guido van Rossum wrote:
>If you object against the extra typing, we'll first laugh at you
>(proposals that *only* shave a few characters off a common idiom aren't
>all that popular in these parts), and then suggest that you can spell
>foo.some_method() as foo().

Okay, you've moved me to at least +0 for dropping __context__.  I have only 
one object myself that has a non-self __context__, and it doesn't have a 
__call__, so none of my code breaks beyond the need to add parentheses in a 
few places.  ;)

As for decimal contexts, I'm thinking maybe we should have a 
decimal.using(ctx=None, **kw) function, where ctx defaults to the current 
decimal context, and the keyword arguments are used to make a modified 
copy.  That seems like a reasonable way to implement the behavior that 
__context__ was added for.  And then all of the existing special machinery 
can go away and be replaced with a single @contextfactory.
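
A rough sketch of such a function, written as a module-level helper rather
than as part of decimal.  It assumes contextlib.contextmanager stands in for
@contextfactory, and that setting attributes on a copied Context is an
acceptable way to apply the keyword arguments.

```python
import decimal
from contextlib import contextmanager


@contextmanager
def using(ctx=None, **kw):
    """Hypothetical helper: run a block under a modified copy of ctx."""
    if ctx is None:
        ctx = decimal.getcontext()
    new = ctx.copy()
    for name, value in kw.items():
        setattr(new, name, value)   # e.g. prec=30
    old = decimal.getcontext()
    decimal.setcontext(new)
    try:
        yield new
    finally:
        decimal.setcontext(old)     # restore even if the block raises
```

Usage would look like "with using(prec=30) as ctx: ...", and each call makes
its own copy, so nesting and reuse need no extra care.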

(I think we should stick with @contextfactory as the decorator name, btw, 
even if we go back to calling __enter__/__exit__ things context managers.)


From jcarlson at uci.edu  Mon May  1 06:19:04 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 30 Apr 2006 21:19:04 -0700
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <44550752.1050907@v.loewis.de>
References: <20060430111344.673E.JCARLSON@uci.edu>
	<44550752.1050907@v.loewis.de>
Message-ID: <20060430191152.6741.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> >> I think what you are missing is that algorithms that currently operate
> >> on byte strings should be reformulated to operate on character strings,
> >> not reformulated to operate on bytes objects.
> > 
> > By "character strings" can I assume you mean unicode strings which
> > contain data, and not some new "character string" type?
> 
> I mean unicode strings, period. I can't imagine what "unicode strings
> which do not contain data" could be.

Binary data as opposed to text.  Input to array.fromstring(),
struct.unpack(), etc.


> > I know I must
> > have missed some conversation. I was under the impression that in Py3k:
> > 
> > Python 1.x and 2.x str -> mutable bytes object
> 
> No. Python 1.x and 2.x str -> str, Python 2.x unicode -> str
> In addition, a bytes type is added, so that
> Python 1.x and 2.x str -> bytes
> 
> The problem is that the current string type is used both to represent
> bytes and characters. Current applications of str need to be studied,
> and converted appropriately, depending on whether they use
> "str-as-bytes" or "str-as-characters". The "default", in some
> sense of that word, is that str applications are assumed to operate
> on character strings; this is achieved by making string literals
> objects of the character string type.

Certainly it is the case that right now strings are used to contain
'text' and 'bytes' (binary data, encodings of text, etc.).  The problem
is in the ambiguity of Python 2.x str containing text where it should
only contain bytes. But in 3.x, there will continue to be an ambiguity,
as strings will still contain bytes and text (parsing literals, see the
somewhat recent argument over bytes.encode('base64'), etc.). We've not
removed the problem, only changed it from being contained in non-unicode
strings to be contained in unicode strings (which are 2 or 4 times larger
than their non-unicode counterparts).


Within the remainder of this email, there are two things I'm trying to
accomplish:
1. preserve the Python 2.x string type
2. make the bytes object more palatable regardless of #1


The current plan (from what I understand) is to make all string literals
equivalent to their Python 2.x u-prefixed equivalents, and to leave
u-prefixed literals alone (unless the u prefix is being removed?). I
won't argue because I think it is a great idea.

I do, however, believe that the Python 2.x string type is very useful
from a data parsing/processing perspective.  Look how successful and
effective it has been so far in the history of Python.  In order to make
the bytes object be as effective in 3.x, one would need to add basically
all of the Python 2.x string methods to it (having some mechanism to use
slices of bytes objects as dictionary keys (if data[:4] in handler: ...
-> if tuple(data[:4]) in handler: ... ?) would also be nice).  Of course,
these implementations, ultimately, already exist with Python 2.x
immutable strings.
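
A sketch of the dictionary-key point.  It assumes Python 3 as eventually
released, where bytes is immutable and hashable and bytearray is the mutable
variant; the handlers table and sniff() are hypothetical.

```python
# Hypothetical dispatch table keyed on the first four bytes of the data.
handlers = {b"RIFF": "riff", b"\x89PNG": "png"}


def sniff(data):
    """Look up a handler by the first four bytes of data."""
    # bytes(...) leaves a bytes slice unchanged in value and turns a
    # bytearray slice into something hashable.
    return handlers.get(bytes(data[:4]), "unknown")


def bytearray_is_hashable():
    try:
        hash(bytearray(b"RIFF"))
    except TypeError:
        return False
    return True
```

With an immutable type the slice can be used as a key directly; with the
mutable one a conversion step is needed, much like the tuple(data[:4]) idea
above.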

So, what to do?  Rename Python 2.x str to bytes.  The name of the type
now confers the idea that it should contain bytes, not strings.  If
bytes literals are deemed necessary (I think they would be nice, but not
required), have b"..." as the bytes literal.  Not having a literal, I
think, will generally reduce the number of people who try to put text
into bytes.

Ahh, but what about the originally thought-about bytes object?  That
mutable, file-like, string-like thing which is essentially array.array
('B', ...) with some other useful stuff?  Those are certainly still
useful, but not so much from a data parsing/processing perspective, as
much as a mutable in-memory buffer (not the Python built-in buffer
object, but a C-equivalent char* = (char*)malloc(...); ).  I currently
use mmaps and array objects for that (to limited success), but a new
type in the collections module (perhaps mutablebytes?) which offers such
functionality would be perfectly reasonable (as would moving the
immutable bytes object if it lacked a literal; or even switch to
bytes/frozenbytes).

If we were to go to the mutable/immutable bytes object pair, we could
still give mutable bytes .read()/.write(), slice assignment, etc., and
even offer an integer view mechanism (for iteration, assignment, etc.). 
Heck, we could do the same thing for the immutable type (except for
.write(), assignment, etc.), and essentially replace
cStringIO(initializer) (of course mutable bytes effectively replace
cStringIO()).

> > and that there would be some magical argument
> > to pass to the file or open open(fn, 'rb', magical_parameter).read() ->
> > bytes.
> 
> I think the precise details of that are still unclear. But yes,
> the plan is to have two file modes: one that returns character
> strings (type 'str') and one that returns type 'bytes'.

Here's a thought; require 'b' or 't' as arguments to open/file, the 't'
also having an optional encoding argument (which defaults to the current
default encoding).  If one attempts to write bytes to a text file or if
one attempts to write text to a bytes file; IOError, "Cannot write bytes
to a text file" or "Cannot write text to a bytes file".  Passing an
encoding to the 'b' file could either raise an exception, or provide an
encoding for text writing (removing the "Cannot write text to a bytes
file"), though I wouldn't want to do any encoding by default for this
case.
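
A rough sketch of that type check, for illustration only.  ModeCheckedFile
is a hypothetical wrapper, not an existing or proposed API, and it is
written against a str/bytes split in which str is text.  For a 'b' file
that is given an encoding it takes the "provide an encoding for text
writing" option rather than raising:

```python
import io
import sys


class ModeCheckedFile:
    """Hypothetical wrapper that enforces 'b' (bytes) or 't' (text) writes."""

    def __init__(self, raw, mode, encoding=None):
        if mode not in ("b", "t"):
            raise ValueError("mode must be 'b' or 't'")
        if mode == "t" and encoding is None:
            # 't' falls back to the current default encoding.
            encoding = sys.getdefaultencoding()
        self.raw = raw
        self.mode = mode
        self.encoding = encoding

    def write(self, data):
        if isinstance(data, str):
            # Text is only accepted when an encoding is available.
            if self.encoding is None:
                raise IOError("Cannot write text to a bytes file")
            data = data.encode(self.encoding)
        elif self.mode == "t":
            raise IOError("Cannot write bytes to a text file")
        self.raw.write(data)


buf = io.BytesIO()
binary = ModeCheckedFile(buf, "b")
binary.write(b"\x00\x01")          # bytes into a bytes file: accepted
try:
    binary.write("text")           # text into a bytes file, no encoding
except IOError as exc:
    error = str(exc)
```

Whether the mismatch should be an IOError or a TypeError is a separate
design choice; the sketch simply follows the wording above.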

If there are mutable/immutable bytes as I describe above, reads on such
could produce either, but only one of the two (immutable seems
reasonable, at least from a consistency perspective), but writes could
take either (or even buffer()s).


> > I mention this because I do binary data handling, some ''.join(...) for
> > IO buffers as Guido mentioned (because it is the fastest string
> > concatenation available in Python 2.x), and from this particular
> > conversation, it seems as though Python 3.x is going to lose
> > some expressiveness and power.
> 
> You certainly need a "concatenate list of bytes into a single
> bytes". Apparently, Guido assumes that this can be done through
> bytes().join(...); I personally feel that this is over-generalization:
> if the only practical application of .join is the empty bytes
> object as separator, I think the method should be omitted.
> 
>   bytes(...)
>   bytes.join(...)

I don't know if the only use-case for bytes would be ''.join() (all of
mine happen to be; non-''.join() cases are text), but I don't see the
motivator for _only_ allowing that particular use.  The difference is an
increment in the implementation; type checking and data copying should
be more significant.

 - Josiah


From martin at v.loewis.de  Mon May  1 07:38:59 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 07:38:59 +0200
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <20060430191152.6741.JCARLSON@uci.edu>
References: <20060430111344.673E.JCARLSON@uci.edu>
	<44550752.1050907@v.loewis.de>
	<20060430191152.6741.JCARLSON@uci.edu>
Message-ID: <44559EF3.3050605@v.loewis.de>

Josiah Carlson wrote:
>> I mean unicode strings, period. I can't imagine what "unicode strings
>> which do not contain data" could be.
> 
> Binary data as opposed to text.  Input to a array.fromstring(),
> struct.unpack(), etc.

You can't/shouldn't put such data into character strings: you need
an encoding first. Neither array.fromstring nor struct.unpack will
produce/consume type 'str' in Python 3; both will operate on the
bytes type. So fromstring should probably be renamed frombytes.

> Certainly it is the case that right now strings are used to contain
> 'text' and 'bytes' (binary data, encodings of text, etc.).  The problem
> is in the ambiguity of Python 2.x str containing text where it should
> only contain bytes. But in 3.x, there will continue to be an ambiguity,
> as strings will still contain bytes and text (parsing literals, see the
> somewhat recent argument over bytes.encode('base64'), etc.).

No. In Python 3, type 'str' cannot be interpreted to contain bytes.
Operations that expect bytes and are given type 'str', and no encoding,
should raise TypeError.

> We've not removed the problem, only changed it from being contained 
> in non-unicode
> strings to be contained in unicode strings (which are 2 or 4 times larger
> than their non-unicode counterparts).

We have removed the problem.

> Within the remainder of this email, there are two things I'm trying to
> accomplish:
> 1. preserve the Python 2.x string type

I would expect that people try that. I'm -1.

> 2. make the bytes object more palatable regardless of #1

This might be good, but we have to be careful to not create a type
that people would casually use to represent text.

> I do, however, believe that the Python 2.x string type is very useful
> from a data parsing/processing perspective.

You have to explain your terminology somewhat better here: What
applications do you have in mind when you are talking about
"parsing/processing"? To me, "parsing" always means "text", never
"raw bytes". I'm thinking of the Chomsky classification of grammars,
EBNF, etc. when I hear "parsing".

> Look how successful and
> effective it has been so far in the history of Python.  In order to make
> the bytes object be as effective in 3.x, one would need to add basically
> all of the Python 2.x string methods to it

The precondition of this clause is misguided: the bytes type doesn't
need to be as effective, since the string type is as effective in 2.3,
so you can do all parsing based on strings.

> (having some mechanism to use
> slices of bytes objects as dictionary keys (if data[:4] in handler: ...
> -> if tuple(data[:4]) in handler: ... ?) would also be nice).

You can't use the bytes type as a dictionary key because it is
mutable. Use the string type instead.

> So, what to do?  Rename Python 2.x str to bytes.  The name of the type
> now confers the idea that it should contain bytes, not strings.

It seems that you want an immutable version of the bytes type. As I
don't understand what "parsing" is, I cannot see the need for it;
I think having two different bytes types is confusing.

Regards,
Martin


From martin at v.loewis.de  Mon May  1 07:46:55 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 07:46:55 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e33mj1$5nd$1@sea.gmane.org>
References: <4453B025.3080100@acm.org>
	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org>
Message-ID: <4455A0CF.3080008@v.loewis.de>

Terry Reedy wrote:
>> Are you asking why that feature (keyword-only arguments) is desirable?
>> That's the whole point of the PEP. Or are you asking why the user
>> shouldn't be allowed to pass keyword-only arguments by omitting the
>> keyword? Because they wouldn't be keyword-only arguments then, anymore.
> 
> There are two subproposals: first, keyword-only args after a variable 
> number of positional args, which requires allowing keyword parameter 
> specifications after the *args parameter, and second, keyword-only args 
> after a fixed number of positional args, implemented with a naked 
> '*'.  To the first, I said "The rationale for this is pretty obvious.".  To 
> the second, I asked, and still ask, "Why?".

One reason I see is to have keyword-only functions, i.e. with no
positional arguments at all:

def make_person(*, name, age, phone, location):
    pass

which also works for methods:

    def make_person(self, *, name, age, phone, location):
        pass

In these cases, you don't *want* name, age to be passed in a positional
way. How else would you formulate that if this syntax wasn't available?
(I know it is possible to formulate it elsehow, I'm asking what notation
you would use)

> Again: if a function has a fixed number n of params, why say that the first 
> k can be passed by position, while the remaining n-k *must* be passed by 
> name?

I see an important use case for k=0 functions, and k=1 methods (where
the only positional argument is self).
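
A minimal sketch of the k=0 case, assuming the naked '*' syntax from the
PEP is accepted as written.  The argument values and the returned dict
are made up for illustration:

```python
def make_person(*, name, age, phone, location):
    # Everything after the bare '*' can only be supplied by keyword.
    return {"name": name, "age": age, "phone": phone, "location": location}


person = make_person(name="Ada", age=36, phone="555-0100", location="London")

try:
    # Passing the same values positionally is rejected at call time.
    make_person("Ada", 36, "555-0100", "London")
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
```

The caller cannot get the order of name, age, phone and location wrong,
because there is no order to get wrong.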

Regards,
Martin

From martin at v.loewis.de  Mon May  1 07:50:09 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 07:50:09 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e33n39$6pi$1@sea.gmane.org>
References: <4453B025.3080100@acm.org>
	<e31b0t$7i4$1@sea.gmane.org>	<44549922.7020109@gmail.com>
	<e33n39$6pi$1@sea.gmane.org>
Message-ID: <4455A191.5000404@v.loewis.de>

Terry Reedy wrote:
>>>>     Now, suppose you wanted to have 'key' be a keyword-only argument.
>>> Why?  Why not let the user type the additional argument(s) without the
>>> parameter name?
> 
> Like Martin, you clipped most of the essential context of my question: 
> Talin's second proposal.

I clipped it because I couldn't understand your question: "Why" what?
(the second question only gives "Why not") I then assumed that the
question must have applied to the text that immediately preceded the
question - hence that's the text that I left.

Now I understand, though.

Regards,
Martin

From martin at v.loewis.de  Mon May  1 08:31:51 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 01 May 2006 08:31:51 +0200
Subject: [Python-Dev] Tkinter lockups.
In-Reply-To: <9e804ac0604231436q491d7289m97bb165dc42a5de@mail.gmail.com>
References: <9e804ac0604231436q491d7289m97bb165dc42a5de@mail.gmail.com>
Message-ID: <4455AB57.4020200@v.loewis.de>

Thomas Wouters wrote:
> It seems that, on my platform at least, Tk_Init() doesn't like being
> called twice even when the first call resulted in an error. That's Tcl
> and Tk 8.4.12. Tkapp_Init() (which is the Tkinter part that calls
> Tk_Init()) does its best to guard against calling Tk_Init() twice when
> the first call was successful, but it doesn't remember failure cases. I
> don't know enough about Tcl/Tk or Tkinter how this is best handled, but
> it would be mightily convenient if it were. ;-) I've created a bugreport
> on it, and I hope someone with Tkinter knowledge can step in and fix it.
> (It looks like SF auto-assigned it to Martin already, hmm.)

I have now reported the underlying Tk bug at

http://sourceforge.net/tracker/index.php?func=detail&aid=1479587&group_id=12997&atid=112997

and worked around it in _tkinter.c.

Regards,
Martin

From nnorwitz at gmail.com  Mon May  1 09:04:51 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 1 May 2006 00:04:51 -0700
Subject: [Python-Dev] speeding up function calls
Message-ID: <ee2a432c0605010004r41cbf62dwebf960c665c72be@mail.gmail.com>

Results: 2.86% for 1 arg (len), 11.8% for 2 args (min), and 1.6% for pybench.

./python.exe -m timeit 'for x in xrange(10000): len([])'
./python.exe -m timeit 'for x in xrange(10000): min(1,2)'

One part of it is a little dangerous though.

http://python.org/sf/1479611

The general idea is to preallocate arg tuples and never dealloc.  This
saves a fair amount of work.  I'm not sure it's entirely  safe though.

I noticed in doing this patch that PyTuple_Pack() calls _New() which
initializes each item to NULL, then in _Pack() each item is set to the
appropriate value.  If we could get rid of duplicate work like that
(or checking values in both callers and callees), we could get more
speed.  In order to try and find functions where this is more
important, you can use Walter's coverage results:

http://coverage.livinglogic.de

n

From jcarlson at uci.edu  Mon May  1 10:07:55 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 01 May 2006 01:07:55 -0700
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <44559EF3.3050605@v.loewis.de>
References: <20060430191152.6741.JCARLSON@uci.edu>
	<44559EF3.3050605@v.loewis.de>
Message-ID: <20060430231024.674A.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> >> I mean unicode strings, period. I can't imagine what "unicode strings
> >> which do not contain data" could be.
> > 
> > Binary data as opposed to text.  Input to a array.fromstring(),
> > struct.unpack(), etc.
> 
> You can't/shouldn't put such data into character strings: you need
> an encoding first.

Certainly that is the case.  But how would you propose embedded bytes
data be represented? (I talk more extensively about this particular
issue later).

> Neither array.fromstring nor struct.unpack will
> produce/consume type 'str' in Python 3; both will operate on the
> bytes type. So fromstring should probably be renamed frombytes.

Um...struct.unpack() already works on unicode...
    >>> struct.unpack('>L', u'work')
    (2003792491L,)
As does array.fromstring...
    >>> a = array.array('B')
    >>> a.fromstring(u'work')
    >>> a
    array('B', [119, 111, 114, 107])

... assuming that all characters are in the 0...127 range.  But that's a
different discussion.


> > Certainly it is the case that right now strings are used to contain
> > 'text' and 'bytes' (binary data, encodings of text, etc.).  The problem
> > is in the ambiguity of Python 2.x str containing text where it should
> > only contain bytes. But in 3.x, there will continue to be an ambiguity,
> > as strings will still contain bytes and text (parsing literals, see the
> > somewhat recent argument over bytes.encode('base64'), etc.).
> 
> No. In Python 3, type 'str' cannot be interpreted to contain bytes.
> Operations that expect bytes and are given type 'str', and no encoding,
> should raise TypeError.

I am apparently not communicating this particular idea effectively
enough.  How would you propose that I store parsing literals for
non-textual data, and how would you propose that I set up a dictionary
to hold some non-trivial number of these parsing literals?  I don't want
a vague "you can't do X", I want a "here's the code you would use".

From what I understand, it would seem that you would suggest that I use
something like the following...

handler = {bytes('...', encoding=...).encode('latin-1'): ...,
           #or
           '\uXXXX\uXXXX...': ...,
           #or even without bytes/str
           (0xXX, 0xXX, ...): ..., }

Note how two of those examples have non-textual data inside of a Python
3.x string?  Yeah.


> > We've not removed the problem, only changed it from being contained 
> > in non-unicode
> > strings to be contained in unicode strings (which are 2 or 4 times larger
> > than their non-unicode counterparts).
> 
> We have removed the problem.

Excuse me?  People are going to use '...' to represent literals of all
different kinds.  Whether these are text literals, binary data literals,
encoded binary data blobs (see the output of img2py.py from wxPython),
whatever.  We haven't removed the problem, we've only forced all
string literals to be unicode; foolish consistency and all that.


> > Within the remainder of this email, there are two things I'm trying to
> > accomplish:
> > 1. preserve the Python 2.x string type
> 
> I would expect that people try that. I'm -1.

I also expect that people will try to make it happen; I am (and I'm
certainly not a visionary when it comes to programming language features). 
I would also hope that others are able to see that immutable unicode and
mutable bytes aren't necessarily sufficient, especially when the
standard line will be something like "if you are putting binary data
inside of a unicode string, you are doing it wrong".  Especially
considering that unless one jumps through hoops of defining their bytes
data as a bytes(list/tuple), and not bytes('...', encoding=...), that
technically, they are still going to be storing bytes data as unicode
strings.


> > 2. make the bytes object more palatable regardless of #1
> 
> This might be good, but we have to be careful to not create a type
> that people would casually use to represent text.

Certainly.  But by lacking #1, we will run into a situation where Python
3.x strings will be used to represent bytes.  Understand that I'm also
trying to differentiate the two cases (and thinking further, a bytes
literal would allow users to differentiate them without needing to use
bytes('...', ...) ).

In the realm of palatability, giving bytes objects most of the current
string methods, (with perhaps .read(), .seek(), (and .write() for
mutable bytes) ), I think, would go a long ways (if not all the way)
towards being more than satisfactory.


> > I do, however, believe that the Python 2.x string type is very useful
> > from a data parsing/processing perspective.
> 
> You have to explain your terminology somewhat better here: What
> applications do you have in mind when you are talking about
> "parsing/processing"? To me, "parsing" always means "text", never
> "raw bytes". I'm thinking of the Chomsky classification of grammars,
> EBNF, etc. when I hear "parsing".

What does pickle.load(...) do to the files that are passed into it?  It
reads the (possibly binary) data it reads in from a file (or file-like
object), performing a particular operation based on a dictionary
of expected tokens in the file, producing a Python object. I would say
that pickle 'parses' the content of a file, which I presume isn't
necessarily text.

Replace pickle with a structured data storage format of your choice.
It's still parsing (at least according to my grasp of English, which
could certainly be flawed (my wife says as much on a daily basis)).


> > Look how successful and
> > effective it has been so far in the history of Python.  In order to make
> > the bytes object be as effective in 3.x, one would need to add basically
> > all of the Python 2.x string methods to it
> 
> The precondition of this clause is misguided: the bytes type doesn't
> need to be as effective, since the string type is as effective in 2.3,
> so you can do all parsing based on strings.

Not if those strings contain binary data.  I thought we were
disambiguating what Python 3.x strings are supposed to contain?  If they
contain binary data, then there isn't a disambiguation as to their
content, and we end up with the same situation we have now (only worse
because now your data takes up twice as much memory and takes twice as
much time to process).


> > (having some mechanism to use
> > slices of bytes objects as dictionary keys (if data[:4] in handler: ...
> > -> if tuple(data[:4]) in handler: ... ?) would also be nice).
> 
> You can't use the bytes type as a dictionary key because it is
> mutable. Use the string type instead.

I meant that currently, if I have data as a Python 2.x string, and I
were to perhaps handle the current portion of the string via...
    if data[:4] in handler:
...that when my data becomes bytes in 3.x (because it isn't text, and
non-text shouldn't be in the 3.x string, if I understand our discussions
about it correctly), then I would need to use...
    if tuple(data[:4]) in handler:
or even
    if data[:4].decode('latin-1') in handler:
...because data is mutable.

I was expressing that being able to leave out the tuple() (or .decode())
would be convenient, though not necessary.  If given an immutable bytes
type, data to be parsed would likely be immutable, so data[:4] could be
hashable (I personally rarely mutate input data that is being parsed,
and I would suggest, if there is a choice between mutable/immutable file
reads, etc., that it be immutable).
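
To make the difference concrete, here is a small sketch.  It assumes an
immutable type spelled bytes and a mutable one spelled bytearray, which
is not the naming used elsewhere in this thread; the 4-byte tags in the
handler table are hypothetical:

```python
handler = {b"\x89PNG": "png", b"GIF8": "gif"}   # hypothetical 4-byte tags

data = b"GIF89a"
# A slice of an immutable bytes object is hashable, so the lookup is direct.
kind = handler.get(data[:4])

mutable = bytearray(data)
needs_conversion = False
try:
    # A slice of the mutable type is itself mutable and therefore unhashable.
    mutable[:4] in handler
except TypeError:
    needs_conversion = True

# With mutable input, an explicit conversion is needed before the lookup.
kind_from_mutable = handler.get(bytes(mutable[:4]))
```

The conversion is the same kind of boilerplate as the tuple() and
.decode() calls above, just spelled differently.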


> > So, what to do?  Rename Python 2.x str to bytes.  The name of the type
> > now confers the idea that it should contain bytes, not strings.
> 
> It seems that you want an immutable version of the bytes type. As I
> don't understand what "parsing" is, I cannot see the need for it;

I hope I've made what I consider parsing sufficiently clear.


> I think having two different bytes types is confusing.

I think that the difference between immutable and mutable bytes types
will be clear, especially because they are given different names, and
because attempting to assign to read-only bytes would raise an
AttributeError, "'[immutable]bytes' object has no attribute '...'" (if
one wanted to be clever, one could even have it raise TypeError,
"'[immutable]bytes' object is not writable").

There are at least two other situations in which there are mutable and
immutable simple variants of the same structure: set/frozenset and
list/tuple (though the latter instance doesn't support the same API in
both objects, due to their significantly different use-cases).  One of
the reasons I think that a bytes/mutablebytes should have a very similar
API, is because their use-cases may very well overlap, in a similar
fashion to how set/frozenset and list/tuple sometimes overlap (I
remember a discussion about giving tuples a list.index-like method, and
even another about giving one or the other a str.find-like method).

I would also point out that duplication of very similar functionality in
Python types is not that uncommon (beyond set/frozenset and list/tuple).
See the 4 (soon 3) different kinds of numbers in Python 2.4, and the 5
different ways of representing dates and times.

 - Josiah


From fredrik at pythonware.com  Mon May  1 10:48:44 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 10:48:44 +0200
Subject: [Python-Dev] More on contextlib - adding back a
	contextmanagerdecorator
References: <4454BA62.5080704@iinet.net.au><5.1.1.6.0.20060430144919.01e6bb18@mail.telecommunity.com>
	<ca471dc20604301910w3570bf3an630ba2afcbe3253d@mail.gmail.com>
Message-ID: <e34i1f$tl$1@sea.gmane.org>

Guido van Rossum wrote:

> Things should be as simple as possible but no simpler. It's pretty
> clear to me that dropping __context__ approaches this ideal. I'm sorry
> I didn't push back harder when __context__ was first proposed -- in
> retrospect, the first 5 months of PEP 343's life, before __context__
> (or __with__, as it was originally called) was invented, were by far
> its happiest times.

I've posted two versions of the "with" page from the language reference:

    http://pyref.infogami.com/with (current)
    http://pyref.infogami.com/with-alt (simplified)

(the original is a slightly tweaked version of the current development docs)

</F>




From martin at v.loewis.de  Mon May  1 10:55:52 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 10:55:52 +0200
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <20060430231024.674A.JCARLSON@uci.edu>
References: <20060430191152.6741.JCARLSON@uci.edu>
	<44559EF3.3050605@v.loewis.de>
	<20060430231024.674A.JCARLSON@uci.edu>
Message-ID: <4455CD18.8030200@v.loewis.de>

Josiah Carlson wrote:
> Certainly that is the case.  But how would you propose embedded bytes
> data be represented? (I talk more extensively about this particular
> issue later).

Can't answer: I don't know what "embedded bytes data" are.

> Um...struct.unpack() already works on unicode...
>     >>> struct.unpack('>L', u'work')
>     (2003792491L,)
> As does array.fromstring...
>     >>> a = array.array('B')
>     >>> a.fromstring(u'work')
>     >>> a
>     array('B', [119, 111, 114, 107])
> 
> ... assuming that all characters are in the 0...127 range.  But that's a
> different discussion.

Yes, it applies the default encoding. This is unfortunate: it shouldn't
have worked in the first place. I hope this gives a type error in Python
3, with the default encoding gone.

> I am apparently not communicating this particular idea effectively
> enough.  How would you propose that I store parsing literals for
> non-textual data, and how would you propose that I set up a dictionary
> to hold some non-trivial number of these parsing literals?

I can't answer that question: I don't know what a "parsing literal
for non-textual data" is. If you are asking how you represent bytes
object in source code: I would encode them as a list of integers,
then use, say,

  parsing_literal = bytes([3,5,30,99])

> From what I understand, it would seem that you would suggest that I use
> something like the following...
> 
> handler = {bytes('...', encoding=...).encode('latin-1'): ...,
>            #or
>            '\uXXXX\uXXXX...': ...,
>            #or even without bytes/str
>            (0xXX, 0xXX, ...): ..., }
> 
> Note how two of those examples have non-textual data inside of a Python
> 3.x string?  Yeah.

Unfortunately, I don't notice. I assume you don't mean a literal '...';
if this is what you represent, I would write

handler = { '...': "some text" }

But I cannot guess what you want to put into '...' instead.

>>> We've not removed the problem, only changed it from being contained 
>>> in non-unicode
>>> strings to be contained in unicode strings (which are 2 or 4 times larger
>>> than their non-unicode counterparts).
>> We have removed the problem.
> 
> Excuse me?  People are going to use '...' to represent literals of all
> different kinds.

In Python 3, '...' will be a character string. You can't use it to
represent anything else but characters.

Can you give examples of actual source code where people use '...'
for something other than text?

> What does pickle.load(...) do to the files that are passed into it?  It
> reads the (possibly binary) data it reads in from a file (or file-like
> object), performing a particular operation based on a dictionary
> of expected tokens in the file, producing a Python object. I would say
> that pickle 'parses' the content of a file, which I presume isn't
> necessarily text.

Right. I think pickle can be implemented with just the bytes type.
There is a textual and a binary version of the pickle format. The
textual should be read as text; the binary version using bytes.
(I *think* the encoding of the textual version is ASCII, but one
would have to check).

> Replace pickle with a structured data storage format of your choice.
> It's still parsing (at least according to my grasp of English, which
> could certainly be flawed (my wife says as much on a daily basis)).

Well, I have no "native language" intuition with respect to these
words - I only use them in the way I see them used elsewhere. I see
"parsing" typically associated with textual data, and "unmarshalling"
with binary data.

>>> Look how successful and
>>> effective it has been so far in the history of Python.  In order to make
>>> the bytes object be as effective in 3.x, one would need to add basically
>>> all of the Python 2.x string methods to it
>> The precondition of this clause is misguided: the bytes type doesn't
>> need to be as effective, since the string type is as effective in 2.3,
>> so you can do all parsing based on strings.
> 
> Not if those strings contain binary data.

I couldn't (until right now) understand what you mean by "parsing binary
data". I still doubt that you typically need the same operations for
unmarshalling that you need for parsing.

In parsing, you need to look (scan) for separators, follow a possibly
recursive grammar, and so on. For unmarshalling, you typically have
a TLV (tag-length-value) structure, where you read a tag (of fixed
size), then the length, then the value (of the size indicated in the
length). There are variations, of course, but you typically don't
need .find, .startswith, etc.
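
A minimal sketch of such a TLV reader, to show how little string
machinery it needs.  The layout (1-byte tag, 2-byte big-endian length)
and the name iter_tlv are assumptions made for the example, not any
particular format:

```python
import struct

# Assumed header layout: 1-byte tag followed by a 2-byte big-endian length.
HEADER = struct.Struct(">BH")


def iter_tlv(data):
    """Yield (tag, value) pairs from a buffer of concatenated TLV records."""
    offset = 0
    while offset < len(data):
        if len(data) - offset < HEADER.size:
            raise ValueError("truncated header")
        tag, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated value")
        offset += length
        yield tag, value


records = list(iter_tlv(b"\x01\x00\x02hi\x02\x00\x00"))
```

Only slicing and fixed-size unpacking are used here; there is no
scanning for separators.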

Regards,
Martin


From fredrik at pythonware.com  Mon May  1 11:33:03 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 11:33:03 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>
	<e31b0t$7i4$1@sea.gmane.org><445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org>
Message-ID: <e34kkh$7tc$1@sea.gmane.org>

Terry Reedy wrote:

> My "Why?" was and is exactly a request for that further discussion.
>
> Again: if a function has a fixed number n of params, why say that the first
> k can be passed by position, while the remaining n-k *must* be passed by
> name?

have you designed API:s for others than yourself, and followed up how
they are used ?

</F>




From greg.ewing at canterbury.ac.nz  Mon May  1 11:34:17 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 May 2006 21:34:17 +1200
Subject: [Python-Dev] elimination of scope bleeding of
	iteration	variables
In-Reply-To: <4454C87A.7020701@gmail.com>
References: <44532F0B.5050804@666.com> <4454C87A.7020701@gmail.com>
Message-ID: <4455D619.9030002@canterbury.ac.nz>

Nick Coghlan wrote:
> However, the scoping of for loop 
> variables won't change, as the current behaviour is essential for search loops 
> that use a break statement to terminate the loop when the item is found. 

It occurs to me that there's a middle ground here:
leave the loop variable scope alone, but make it
an error to use the same variable in two different
loops at the same time.

e.g.

   for x in stuff:
     if its_what_were_looking_for(x):
       break
   snarfle(x)
   for x in otherstuff:
     dosomethingelse(x)

would be fine, but

   for x in stuff:
     for x in otherstuff:
       dosomethingelse(x)

would be a SyntaxError because the inner loop
is trying to use x while it's still in use by the
outer loop.
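
For comparison, a small sketch of what the nested case does under the
current rules: the inner loop silently rebinds x, so the rest of the
outer body sees the inner loop's last value.  The lists are
illustrative only:

```python
seen = []
for x in [1, 2]:
    for x in ["a", "b"]:
        pass
    # x is now the inner loop's last item, not the outer loop's 1 or 2.
    seen.append(x)
```

The outer loop still runs twice, because its iterator is unaffected;
only the name x has been clobbered.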

--
Greg

From ncoghlan at gmail.com  Mon May  1 11:49:46 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 19:49:46 +1000
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e33n39$6pi$1@sea.gmane.org>
References: <4453B025.3080100@acm.org>
	<e31b0t$7i4$1@sea.gmane.org>	<44549922.7020109@gmail.com>
	<e33n39$6pi$1@sea.gmane.org>
Message-ID: <4455D9BA.9070804@gmail.com>

Terry Reedy wrote:
> "Nick Coghlan" <ncoghlan at gmail.com> wrote in message 
>> Because for some functions (e.g. min()/max()) you want to use *args, but
>> support some additional keyword arguments to tweak a few aspects of the
>> operation (like providing a "key=x" option).
> 
> This and the rest of your 'explanation' is about Talin's first proposal, to 
> which I already had said "The rationale for this is pretty obvious".

Actually, I misread Talin's PEP moreso than your question - I thought the 
first syntax change was about the earlier Py3k discussion of permitting 
'*args' before keyword arguments in a function call. It seems that one is 
actually non-controversial enough to not really need a PEP at all :)

Reading the PEP again, I realise what you were actually asking, and have to 
say I agree the only use case that has been identified for keyword-only 
arguments is functions which accept an arbitrary number of positional arguments.

So +1 for the first change, -1 for the second.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Mon May  1 12:29:12 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 20:29:12 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <5.1.1.6.0.20060501001028.01e6a460@mail.telecommunity.com>
References: <44556CDB.1070604@gmail.com> <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<44556CDB.1070604@gmail.com>
	<5.1.1.6.0.20060501001028.01e6a460@mail.telecommunity.com>
Message-ID: <4455E2F8.5020206@gmail.com>

Phillip J. Eby wrote:
> At 08:08 PM 4/30/2006 -0700, Guido van Rossum wrote:
>> If you object against the extra typing, we'll first laugh at you
>> (proposals that *only* shave a few characters off a common idiom aren't
>> all that popular in these parts), and then suggest that you can spell
>> foo.some_method() as foo().
> 
> Okay, you've moved me to at least +0 for dropping __context__.  I have 
> only one object myself that has a non-self __context__, and it doesn't 
> have a __call__, so none of my code breaks beyond the need to add 
> parentheses in a few places.  ;)

At least +0 here, too. I've just been so deep in this lately that it is taking 
a while to wind my thinking back a year or so. Still, far better to be having 
this discussion now than in 6 months time :)

It sure has been a long and winding road back to Guido's original version of 
PEP 343, though!

> As for decimal contexts, I'm thinking a decimal.using(ctx=None, **kw) 
> function, where ctx defaults to the current decimal context and the 
> keyword arguments are used to make a modified copy, seems like a 
> reasonable way to implement the behavior that __context__ was added 
> for.  And then all of the existing special machinery can go away and 
> be replaced with a single @contextfactory.

'localcontext' would probably work as at least an interim name for such a 
function.

   with decimal.localcontext() as ctx:
       # use the new context here

This is really an all-round improvement over the current SVN approach, where 
the fact that a new decimal context object is being created by the existing 
decimal context object is thoroughly implicit and unobvious.
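
A slightly fuller sketch of the intended usage, assuming the function is
spelled localcontext as suggested above, takes no required arguments,
and hands back a copy of the current context that is active only inside
the block:

```python
from decimal import Decimal, getcontext, localcontext

outer_prec = getcontext().prec

with localcontext() as ctx:
    ctx.prec = 5                       # only affects the copied context
    inner = Decimal(1) / Decimal(3)    # computed with 5 significant digits

after_prec = getcontext().prec         # the original context is restored
```

The copy is explicit at the call site, which is the point of the change.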

> (I think we should stick with @contextfactory as the decorator name, 
> btw, even if we go back to calling __enter__/__exit__ things context 
> managers.)

Agreed. And that decorator will still be useful for defining methods as well 
as functions.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From greg.ewing at canterbury.ac.nz  Mon May  1 13:03:48 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 May 2006 23:03:48 +1200
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
Message-ID: <4455EB14.6000709@canterbury.ac.nz>

Guido van Rossum wrote:

> I believe the context API design has gotten totally out of hand.

My thoughts exactly!

> I have a counter-proposal: let's drop __context__... would it
> really be such a big deal to let the user make an explicit call to
> some appropriately named method?

Another possibility I thought of, for the case where
an object is needed to keep track of the state of each
invocation of a context, is to have the __enter__ method
return the state object, which the with-statement tucks
away and later passes to the __exit__ method.
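
A rough sketch of that idea, emulating the statement with an ordinary 
function. The names run_with_state and SharedGuard are made up for 
illustration, and the four-argument __exit__ is the hypothetical protocol 
described above, not what PEP 343 specifies:

```python
import sys

def run_with_state(guard, body):
    """Emulate a with-statement that hands __enter__'s result to __exit__."""
    state = guard.__enter__()            # per-invocation state object
    try:
        body(state)
    except BaseException:
        guard.__exit__(state, *sys.exc_info())
        raise
    else:
        guard.__exit__(state, None, None, None)

class SharedGuard:
    """Keeps nothing on itself, so one instance can serve many invocations."""
    def __enter__(self):
        return {"entered": True, "exited": False}

    def __exit__(self, state, exc_type, exc_value, traceback):
        state["exited"] = True
```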

Also a thought on terminology. Even though it seems I
may have been the person who thought it up originally,
I'm not sure I like the term "manager". It seems rather
wooly, and it's not clear whether a "context manager"
is supposed to manage just one context or multiple
contexts.

I've been thinking about the terms "guarded context"
and "context guard". We could say that the with-statement
executes its body in a guarded context (an abstract
notion, not a concrete object). To do this, it creates
a context guard (a concrete object) with __enter__
and __exit__ methods that set up and tear down the
guarded context. This seems clearer to me, since I
can more readily visualise a "guard" object being
specially commissioned to deal with one particular
job (guarding a particular invocation of a context).

With only one object, there wouldn't be a need for any more
terms. But if another term is needed for an object with a
__context__ method or equivalent, I rather liked Nick's
"context specifier".

--
Greg

From ben at 666.com  Mon May  1 05:47:07 2006
From: ben at 666.com (Ben Wing)
Date: Sun, 30 Apr 2006 22:47:07 -0500
Subject: [Python-Dev] global variable modification in functions [Re:
 elimination of scope bleeding of iteration variables]
In-Reply-To: <4454C87A.7020701@gmail.com>
References: <44532F0B.5050804@666.com> <4454C87A.7020701@gmail.com>
Message-ID: <445584BB.8090708@666.com>



Nick Coghlan wrote:

> Ben Wing wrote:
>
>> apologies if this has been brought up on python-dev already.
>>
>> a suggestion i have, perhaps for python 3.0 since it may break some 
>> code (but imo it could go into 2.6 or 2.7 because the likely breakage 
>> would be very small, see below), is the elimination of the misfeature 
>> whereby the iteration variable used in for-loops, list 
>> comprehensions, etc. bleeds out into the surrounding scope.
>>
>> [i'm aware that there is a similar proposal for python 3.0 for list 
>> comprehensions specifically, but that's not enough.]
>
>
> List comprehensions will be fixed in Py3k. However, the scoping of for 
> loop variables won't change, as the current behaviour is essential for 
> search loops that use a break statement to terminate the loop when the 
> item is found. Accordingly, there is plenty of code in the wild that 
> *would* break if the for loop variables were constrained to the for 
> loop, even if your own code wouldn't have such a problem.
>
> Outside pure scripts, significant control flow logic (like for loops) 
> should be avoided at module level. You are typically much better off 
> moving the logic inside a _main() function and invoking it at the end 
> of the module. This avoids the 'accidental global' problem for all of 
> the script-only variables, not only the ones that happen to be used as 
> for loop variables.

i did in fact end up doing that.  however, in the process i ran into 
another python annoyance i've tripped over repeatedly: you can't assign 
to a global variable without explicitly declaring it as `global'.  
instead, you "magically" get a shadowing local variable.  this behavior 
is extremely hostile to newcomers: e.g.

foo = 1

def set_foo():
  foo = 2

set_foo()
print foo

--> 1

the worst part is, not a single warning from Python about this.  in a 
large program, such a bug can be very tricky to track down.

now i can see how an argument against changing this behavior might hinge 
upon global names like `hash' and `list'; you certainly wouldn't want an 
intended local variable called `hash' or `list' to trounce upon these.  
but this argument confuses lexical and dynamic scope: global variables 
declared inside a module are (or can be viewed as) globally lexically 
scoped in the module, whereas `hash' and `list' are dynamically scoped.

so i'd suggest:

[1] ideally, change this behavior, either for 2.6 or 3.0.  maybe have a 
`local' keyword if you really want a new scope.
[2] until this change, python should always print a warning in this 
situation.
[3] the current 'UnboundLocalError' exception should probably be more 
helpful, e.g. suggesting that you might need to use a `global foo' 
declaration.
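
for reference, a small sketch of the three cases involved (the function 
names are made up for illustration): a plain assignment creates a local, a 
`global' declaration rebinds the module-level name, and reading a name 
before assigning to it in the same function raises UnboundLocalError.

```python
foo = 1

def set_foo_local():
    foo = 2          # binds a new local name; the module-level foo is untouched

def set_foo_global():
    global foo
    foo = 2          # rebinds the module-level name

def bump_foo():
    foo = foo + 1    # foo is local here (it is assigned), so the read fails

set_foo_local()
after_local = foo    # still 1

set_foo_global()
after_global = foo   # now 2

try:
    bump_foo()
except UnboundLocalError as exc:
    error_name = type(exc).__name__
```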

ben


From ben at 666.com  Mon May  1 06:42:13 2006
From: ben at 666.com (Ben Wing)
Date: Sun, 30 Apr 2006 23:42:13 -0500
Subject: [Python-Dev] python syntax additions to support indentation
 insensitivity/generated code
Message-ID: <445591A5.8050808@666.com>

recently i've been writing code that generates a python program from a 
source file containing intermixed python code and non-python constructs, 
which get converted into python.

similar things have been done in many other languages -- consider, for 
example, the way php is embedded into web pages.

unfortunately, doing this is really hard in python because of its 
whitespace sensitivity.

i suggest the following simple change: in addition to constructs like this:

block:
  within-block statement
  within-block statement
  ...
out-of-block statement

python should support

block::
  within-block statement
  within-block statement
  ...
end
out-of-block statement

the syntax is similar, but has an extra colon, and is terminated by an 
`end'.  indentation of immediate-level statements within the block is 
unimportant.

mixed-mode code should be possible, e.g.:

block-1::
  within-block-1 statement
  within-block-1 statement
  block-2:
    within-block-2 statement
    within-block-2 statement
  within-block-1 statement
  ...
end

in other words, code within block-2 is indentation-sensitive; block-2 is 
terminated by the first immediate-level statement at or below the 
indentation of the `block-2' statement. 

similarly, in this:

[A] block-1::
[B]   within-block-1 statement
[C]   within-block-1 statement
[D]   block-2:
[E]     within-block-2 statement
[F]     within-block-2 statement
[G]     block-3::
[H]       within-block-3 statement
[I]       within-block-3 statement
[J]     end
[K]     within-block-2 statement
[L]   within-block-1 statement
[M]   ...
[N] end

the indentation of lines [D], [E], [F], [G], [K] and [L]  is 
significant, but not any others.  that is, [E], [F], [G], and [K] must 
be at the same level, which is greater than the level of line [D], and 
line [L] must be at a level less than or equal to line [D].  all other 
lines, including [H], [I] and [J], can be at any indentation level.  
also, line [D] can be at any level with respect to line [C].

the idea is that a python code generator could easily mix generated and 
hand-written code.  hand-written code written in normal python style 
could be wrapped by generated code using the indentation-insensitive 
style; if the generated code had no indentation, everything would work 
as expected without the generator having to worry about indentation.

i don't see too many problems with backward-compatibility here.  the 
double-colon shouldn't cause any problems, since that syntax isn't legal 
currently.  `end' could be recognized as a keyword only following a 
double-colon block; elsewhere, it could still be a variable.

ben


From fredrik at pythonware.com  Mon May  1 13:39:26 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 13:39:26 +0200
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
References: <4454BA62.5080704@iinet.net.au><ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz>
Message-ID: <e34s1g$t4m$1@sea.gmane.org>

Greg Ewing wrote:

> I've been thinking about the terms "guarded context"
> and "context guard". We could say that the with-statement
> executes its body in a guarded context (an abstract
> notion, not a concrete object). To do this, it creates
> a context guard (a concrete object) with __enter__
> and __exit__ methods that set up and tear down the
> guarded context. This seems clearer to me, since I
> can more readily visualise a "guard" object being
> specially commissioned to deal with one particular
> job (guarding a particular invocation of a context).
>
> With only one object, there wouldn't be a need for any more
> terms.

contrast and compare:

    http://pyref.infogami.com/with
    http://pyref.infogami.com/with-alt
    http://pyref.infogami.com/with-guard

a distinct term for "whatever the __enter__ method returns" (i.e.
the thing assigned to the target list) would still be nice.

</F>




From ncoghlan at gmail.com  Mon May  1 13:42:13 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 21:42:13 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <4455EB14.6000709@canterbury.ac.nz>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz>
Message-ID: <4455F415.3040104@gmail.com>

Greg Ewing wrote:
> Also a thought on terminology. Even though it seems I
> may have been the person who thought it up originally,
> I'm not sure I like the term "manager". It seems rather
> wooly, and it's not clear whether a "context manager"
> is supposed to manage just one context or multiple
> contexts.

I think getting rid of __context__ should clear up most of this confusion 
(which is further evidence that Guido is making the right call). Once that 
change is made, the context expression in the with statement produces a 
context manager with __enter__ and __exit__ methods which set up and tear down 
a managed context for the body of the with statement. This is very similar to 
your later suggestion of context guard and guarded context.

I believe this is actually going back to using the terminology as you 
originally suggested it (one concrete object, one abstract concept). We really 
only got into trouble once we tried to add a second kind of concrete object 
into the mix (objects with only a __context__ method).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From greg.ewing at canterbury.ac.nz  Mon May  1 14:02:45 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 02 May 2006 00:02:45 +1200
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <4455F415.3040104@gmail.com>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz> <4455F415.3040104@gmail.com>
Message-ID: <4455F8E5.7070203@canterbury.ac.nz>

Nick Coghlan wrote:
> the context expression in the with 
> statement produces a context manager with __enter__ and __exit__ methods 
> which set up and tear down a managed context for the body of the with 
> statement. This is very similar to your later suggestion of context 
> guard and guarded context.

Currently I think I still prefer the term "guard",
since it does a better job of conjuring up the same
sort of idea as a try-finally.

There's also one other issue, what to call the
decorator. I don't like @contextfactory, because it
sounds like something that produces contexts,
yet we have no such object.

With only one object, it should probably be named
after that object, i.e. @contextmanager or
@contextguard. That's if we think it's really
worth making it easy to abuse a generator in this
way -- which I'm not convinced about. It's not
as if people are going to be implementing context
managers/guards every day.

--
Greg

From ncoghlan at gmail.com  Mon May  1 14:15:59 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 22:15:59 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <4455F415.3040104@gmail.com>
References: <4454BA62.5080704@iinet.net.au>	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>	<4455EB14.6000709@canterbury.ac.nz>
	<4455F415.3040104@gmail.com>
Message-ID: <4455FBFF.3080301@gmail.com>

Nick Coghlan wrote:
> Greg Ewing wrote:
>> Also a thought on terminology. Even though it seems I
>> may have been the person who thought it up originally,
>> I'm not sure I like the term "manager". It seems rather
>> wooly, and it's not clear whether a "context manager"
>> is supposed to manage just one context or multiple
>> contexts.
> 
> I think getting rid of __context__ should clear up most of this confusion 
> (which is further evidence that Guido is making the right call). Once that 
> change is made, the context expression in the with statement produces a 
> context manager with __enter__ and __exit__ methods which set up and tear down 
> a managed context for the body of the with statement. This is very similar to 
> your later suggestion of context guard and guarded context.

Thinking about it a bit further. . .

1. PEP 343, 2.5 alpha 1, 2.5 alpha 2 and the discussions here have no doubt 
seriously confused the meaning of the term 'context manager' for a lot of 
people (you can certainly put me down as one such person). Anyone not already 
confused is likely to *become* confused if we subtly change the meaning in 
alpha 3.

2. The phrase "managed context" is unfortunately close to .NET's term "managed 
code", and would likely lead to confusion for IronPython folks (and other 
programmers with .NET experience)

3. "manager" is an extremely generic term that is already used in a lot of 
different ways in various programming contexts

Switching to Greg's suggestion of "context guard" and "guarded context" as the 
terms would allow us to hit the reset button and start the documentation 
afresh without terminology confusion resulting from the evolution of PEP 343 
and its implementation and documentation.

I think context guard also works better in terms of guarding entry to and exit 
from the guarded context, whereas I always wanted to call those operations 
"set up" and "tear down" for context managers.

The current @contextfactory decorator could be renamed to @guardfactory to 
make it explicit that it results in a factory function for context guards.

Cheers,
Nick.

P.S. I think I can hear anguished howls coming from the offices of various 
book publishers around the world ;)


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From jepler at unpythonic.net  Mon May  1 14:16:16 2006
From: jepler at unpythonic.net (Jeff Epler)
Date: Mon, 1 May 2006 07:16:16 -0500
Subject: [Python-Dev] Tkinter lockups.
In-Reply-To: <4455AB57.4020200@v.loewis.de>
References: <9e804ac0604231436q491d7289m97bb165dc42a5de@mail.gmail.com>
	<4455AB57.4020200@v.loewis.de>
Message-ID: <20060501121616.GD2603@unpythonic.net>

Thanks Martin!

Jeff

From ncoghlan at gmail.com  Mon May  1 14:24:58 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 22:24:58 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <4455F8E5.7070203@canterbury.ac.nz>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz> <4455F415.3040104@gmail.com>
	<4455F8E5.7070203@canterbury.ac.nz>
Message-ID: <4455FE1A.2010506@gmail.com>

Greg Ewing wrote:
> Nick Coghlan wrote:
>> the context expression in the with statement produces a context 
>> manager with __enter__ and __exit__ methods which set up and tear down 
>> a managed context for the body of the with statement. This is very 
>> similar to your later suggestion of context guard and guarded context.
> 
> Currently I think I still prefer the term "guard",
> since it does a better job of conjuring up the same
> sort of idea as a try-finally.

See the other message I wrote while you were writing this one for the various 
reasons I now agree with you :)

> There's also one other issue, what to call the
> decorator. I don't like @contextfactory, because it
> sounds like something that produces contexts,
> yet we have no such object.

Agreed.

> With only one object, it should probably be named
> after that object, i.e. @contextmanager or
> @contextguard. That's if we think it's really
> worth making it easy to abuse a generator in this
> way -- which I'm not convinced about. It's not
> as if people are going to be implementing context
> managers/guards every day.

I suggested renaming it to "guardfactory" in my other message. Keeping the 
term 'factory' in the name emphasises that the decorator results in a callable 
that returns a context guard, rather than producing a context guard directly.

As for whether or not we should provide this ability, I think we definitely 
should. It allows try/finally boilerplate code to be replaced with a guarded 
context almost mechanically, whereas converting the same code to a manually 
written context guard could involve significant effort in changing from local 
variable based storage to instance attribute based storage. IOW, the feature 
is provided for the same reason that generator functions are provided: it is 
typically *much* easier to write a generator function than it is to write the 
same iterator manually.
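
A small sketch of the difference, using the decorator under the name that 
today's contextlib gives it (contextmanager). The helper names temporary_item 
and TemporaryItem are invented for this example:

```python
from contextlib import contextmanager

@contextmanager
def temporary_item(items, value):
    """Generator form: the try/finally boilerplate is kept almost verbatim."""
    items.append(value)          # set up
    try:
        yield value
    finally:
        items.remove(value)      # tear down, using plain local variables

class TemporaryItem:
    """Hand-written form: the same state has to move onto the instance."""
    def __init__(self, items, value):
        self.items = items
        self.value = value

    def __enter__(self):
        self.items.append(self.value)
        return self.value

    def __exit__(self, exc_type, exc_value, traceback):
        self.items.remove(self.value)
        return False             # do not swallow exceptions
```

Both can be used as "with temporary_item(names, 'x'):"; only the second 
required rewriting the locals as attributes.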

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Mon May  1 14:32:06 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 01 May 2006 22:32:06 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <e34s1g$t4m$1@sea.gmane.org>
References: <4454BA62.5080704@iinet.net.au><ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>	<4455EB14.6000709@canterbury.ac.nz>
	<e34s1g$t4m$1@sea.gmane.org>
Message-ID: <4455FFC6.8010901@gmail.com>

Fredrik Lundh wrote:
> a distinct term for "whatever the __enter__ method returns" (i.e.
> the thing assigned to the target list) would still be nice.

I've called that the "context entry value" in a few places (I don't think any 
of them were in the actual documentation though).

A sample modification to the reference page:
------------------------------
Here's a more detailed description:

    1. The context expression is evaluated, to obtain a context guard.
    2. The guard object's __enter__ method is invoked to obtain the context
       entry value.
    3. If a target list was included in the with statement, the context entry
       value is assigned to it.
    4. The suite is executed.
    5. The guard object's __exit__ method is invoked. If an exception caused
       the suite to be exited, its type, value, and traceback are passed as
       arguments to __exit__. Otherwise, three None arguments are supplied.
------------------------------
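
The steps above can be sketched as an ordinary function. This is only an 
approximation: run_with and Recorder are illustrative names, the real 
statement looks the methods up on the type, and the rule that a true result 
from __exit__ suppresses the exception comes from PEP 343 as finally 
accepted rather than from the list above.

```python
import sys

def run_with(guard, suite):
    """Roughly what `with guard as target: suite(target)` does."""
    entry_value = guard.__enter__()          # step 2
    try:
        suite(entry_value)                   # steps 3 and 4
    except BaseException:
        # step 5, exceptional exit: pass type, value and traceback
        if not guard.__exit__(*sys.exc_info()):
            raise
    else:
        guard.__exit__(None, None, None)     # step 5, normal exit

class Recorder:
    """Toy guard that records the order of the calls it receives."""
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append(("exit", exc_type))
        return False                         # let exceptions propagate
```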

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From phd at mail2.phd.pp.ru  Mon May  1 14:32:35 2006
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Mon, 1 May 2006 16:32:35 +0400
Subject: [Python-Dev] global variable modification in functions [Re:
	elimination of scope bleeding of iteration variables]
In-Reply-To: <445584BB.8090708@666.com>
References: <44532F0B.5050804@666.com> <4454C87A.7020701@gmail.com>
	<445584BB.8090708@666.com>
Message-ID: <20060501123235.GF6556@phd.pp.ru>

On Sun, Apr 30, 2006 at 10:47:07PM -0500, Ben Wing wrote:
> foo = 1
> 
> def set_foo():
>   foo = 2

   PyLint gives a warning here "local foo shadows global variable".

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From edloper at gradient.cis.upenn.edu  Mon May  1 14:36:03 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Mon, 01 May 2006 08:36:03 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <4455A0CF.3080008@v.loewis.de>
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de>
Message-ID: <445600B3.7040903@gradient.cis.upenn.edu>

Martin v. Löwis wrote:
> One reason I see is to have keyword-only functions, i.e. with no
> positional arguments at all:
> 
> def make_person(*, name, age, phone, location):
>     pass
> 
> which also works for methods:
> 
>     def make_person(self, *, name, age, phone, location):
>         pass
> 
> In these cases, you don't *want* name, age to be passed in a positional
> way. How else would you formulate that if this syntax wasn't available?

But is it necessary to syntactically *enforce* that the arguments be 
used as keywords?  I.e., why not just document that the arguments should 
be used as keyword arguments, and leave it at that.  If users insist on 
using them positionally, then their code will be less readable, and 
might break if you decide to change the order of the parameters, but 
we're all consenting adults.  (And if you *do* believe that the 
parameters should all be passed as keywords, then I don't see any reason 
why you'd ever be motivated to change their order.)

-Edward


From fredrik at pythonware.com  Mon May  1 15:04:01 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 15:04:01 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de>
	<445600B3.7040903@gradient.cis.upenn.edu>
Message-ID: <e35102$d53$1@sea.gmane.org>

Edward Loper wrote:

> > One reason I see is to have keyword-only functions, i.e. with no
> > positional arguments at all:
> >
> > def make_person(*, name, age, phone, location):
> >     pass
> >
> > which also works for methods:
> >
> >     def make_person(self, *, name, age, phone, location):
> >         pass
> >
> > In these cases, you don't *want* name, age to be passed in a positional
> > way. How else would you formulate that if this syntax wasn't available?
>
> But is it necessary to syntactically *enforce* that the arguments be
> used as keywords?  I.e., why not just document that the arguments should
> be used as keyword arguments, and leave it at that.

and how do you best do that, in a way that automatic introspection
tools understand, unless you invent some kind of standard syntax for
it?

and if you have a standard syntax for it, why not support it at the
interpreter level ?

</F>




From fredrik at pythonware.com  Mon May  1 15:05:24 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 15:05:24 +0200
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
References: <4454BA62.5080704@iinet.net.au><ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>	<4455EB14.6000709@canterbury.ac.nz><e34s1g$t4m$1@sea.gmane.org>
	<4455FFC6.8010901@gmail.com>
Message-ID: <e3512l$ddp$1@sea.gmane.org>

Nick Coghlan wrote:

> I've called that the "context entry value" in a few places (I don't think any
> of them were in the actual documentation though).

that doesn't really give me the right associations (I want something
that makes it clear that this is an "ephemeral" object).

> A sample modification to the reference page:

except for the exact term, that's a definite improvement.  thanks!

</F>




From jason.orendorff at gmail.com  Mon May  1 15:14:38 2006
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 1 May 2006 09:14:38 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <44557789.5020107@gradient.cis.upenn.edu>
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<44557789.5020107@gradient.cis.upenn.edu>
Message-ID: <bb8868b90605010614s3739756aj254b0d9477edceb@mail.gmail.com>

On 4/30/06, Edward Loper <edloper at gradient.cis.upenn.edu> wrote
(referring to keyword-only arguments):
> I see two possible reasons:
>
>    - A function's author believes that calls to the function will be
>      easier to read if certain parameters are passed by name, rather
>      than positionally; and they want to enforce that calling
>      convention on their users.  This seems to me to go against the
>      "consenting adults" principle.
>
>    - A function's author believes they might change the signature in the
>      future to accept new positional arguments, and they will want to put
>      them before the args that they declare keyword-only.
>
> Both of these motivations seem fairly weak.  Certainly, neither seems to
> warrant a significant change to function definition syntax.

I disagree.  I think the use cases are more significant than you
suggest, and the proposed change less significant.

Readability and future-compatibility are key factors in API design. 
How well a language supports them determines how sweet its libraries
can be.

Even relatively simple high-level functions often have lots of clearly
inessential "options". When I design this kind of function, I often
wish for keyword-only arguments.  path.py's write_lines() is an
example.

In fact... it feels as though I've seen "keyword-only" arguments in a
few places in the stdlib.  Am I imagining this?

Btw, I don't think the term "consenting adults" applies.  To me, that
refers to the agreeable state of affairs where you, the programmer
about to do something dangerous, know it's dangerous and indicate your
consent somehow in your source code, e.g. by typing an underscore. 
That underscore sends a warning.  It tells you to think twice.  It
tells you the blame is all yours if this doesn't work.  It makes
consent explicit (both mentally and syntactically).

I'm +1 on the use cases but -0 on the PEP.  The proposed syntax isn't
clear; I think I want a new 'explicit' keyword or something.  (Like
that'll happen.  Pfft.)

-j

From fdrake at acm.org  Mon May  1 15:30:22 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 1 May 2006 09:30:22 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <44557789.5020107@gradient.cis.upenn.edu>
References: <4453B025.3080100@acm.org> <e33mj1$5nd$1@sea.gmane.org>
	<44557789.5020107@gradient.cis.upenn.edu>
Message-ID: <200605010930.23198.fdrake@acm.org>

On Sunday 30 April 2006 22:50, Edward Loper wrote:
 > I see two possible reasons:

Another use case, observed in the wild:

   - A library function is written to take an arbitrary number of
     positional arguments using *args syntax.  The library is released,
     presumably creating dependencies on the specific signature of the
     function.  In a subsequent version of the function, the function is
     determined to need additional information.  The only way to add an
     argument is to use a keyword for which there is no positional
     equivalent.
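
     A minimal sketch of that situation, using the syntax PEP 3102
     proposes (and which Python 3 later adopted).  The function concat
     and its sep option are hypothetical:

```python
def concat(*parts, sep=""):
    """Hypothetical library function; sep was added in a later release."""
    # Every positional argument is swallowed by *parts, so the new option
    # can only ever be supplied by keyword.
    return sep.join(str(p) for p in parts)

old_style = concat("a", "b", "c")          # existing callers are unaffected
new_style = concat("a", "b", sep="-")      # new option, by keyword only
positional = concat("a", "b", "-")         # "-" is just another part
```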


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From edloper at gradient.cis.upenn.edu  Mon May  1 15:39:00 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Mon, 01 May 2006 09:39:00 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <200605010930.23198.fdrake@acm.org>
References: <4453B025.3080100@acm.org> <e33mj1$5nd$1@sea.gmane.org>
	<44557789.5020107@gradient.cis.upenn.edu>
	<200605010930.23198.fdrake@acm.org>
Message-ID: <44560F74.5000600@gradient.cis.upenn.edu>

Fred L. Drake, Jr. wrote:
> On Sunday 30 April 2006 22:50, Edward Loper wrote:
>  > I see two possible reasons:
> 
> Another use case, observed in the wild:
> 
>    - A library function is written to take an arbitrary number of
>      positional arguments using *args syntax.  The library is released,
>      presumably creating dependencies on the specific signature of the
>      function.  In a subsequent version of the function, the function is
>      determined to need additional information.  The only way to add an
>      argument is to use a keyword for which there is no positional
>      equivalent.

This falls under the "first subproposal" from Terry's email:

> There are two subproposals: first, keyword-only args after a variable 
> number of positional args, which requires allowing keyword parameter 
> specifications after the *args parameter, and second, keyword-only args 
> after a fixed number of positional args, implemented with a naked 
> '*'.  To the first, I said "The rationale for this is pretty obvious.".  To 
> the second, I asked, and still ask, "Why?".

I was trying to come up with use cases for the "second subproposal."

-Edward


From foom at fuhm.net  Mon May  1 15:55:23 2006
From: foom at fuhm.net (James Y Knight)
Date: Mon, 1 May 2006 09:55:23 -0400
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <4455FBFF.3080301@gmail.com>
References: <4454BA62.5080704@iinet.net.au>	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>	<4455EB14.6000709@canterbury.ac.nz>
	<4455F415.3040104@gmail.com> <4455FBFF.3080301@gmail.com>
Message-ID: <8D9580FC-95C5-4A15-994C-6F91E31E3F45@fuhm.net>


On May 1, 2006, at 8:15 AM, Nick Coghlan wrote:

> 1. PEP 343, 2.5 alpha 1, 2.5 alpha 2 and the discussions here have  
> no doubt
> seriously confused the meaning of the term 'context manager' for a  
> lot of
> people (you can certainly put me down as one such person). Anyone  
> not already
> confused is likely to *become* confused if we subtly change the  
> meaning in
> alpha 3.

Don't forget that the majority of users will never have heard any of  
these discussions nor have used 2.5a1 or 2.5a2. Choose the best term  
for them, not for the readers of python-dev.

James

From guido at python.org  Mon May  1 16:18:54 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 07:18:54 -0700
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <8D9580FC-95C5-4A15-994C-6F91E31E3F45@fuhm.net>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz> <4455F415.3040104@gmail.com>
	<4455FBFF.3080301@gmail.com>
	<8D9580FC-95C5-4A15-994C-6F91E31E3F45@fuhm.net>
Message-ID: <ca471dc20605010718v769d7215l81a83029e70ec5a3@mail.gmail.com>

On 5/1/06, James Y Knight <foom at fuhm.net> wrote:
> Don't forget that the majority of users will never have heard any of
> these discussions nor have used 2.5a1 or 2.5a2. Choose the best term
> for them, not for the readers of python-dev.

I couldn't agree more! (Another thought, occasionally useful, is to
consider that the number of Python programs yet to be written,
and the number of Python programmers who yet have to learn the
language, must surely exceed the current count. At least, one would
hope so -- if that's not true, we might as well stop now. :-)

Nick, do you have it in you to fix PEP 343? Or at least come up with a
draft patch? We can take this off-line. With all the +0's and +1's
coming in, I'm pretty comfortable with this change now, although we
should probably wait until later today to commit.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 16:28:59 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 07:28:59 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <44560F74.5000600@gradient.cis.upenn.edu>
References: <4453B025.3080100@acm.org> <e33mj1$5nd$1@sea.gmane.org>
	<44557789.5020107@gradient.cis.upenn.edu>
	<200605010930.23198.fdrake@acm.org>
	<44560F74.5000600@gradient.cis.upenn.edu>
Message-ID: <ca471dc20605010728g2742bd01m4b4b284c15eabb18@mail.gmail.com>

On 5/1/06, Edward Loper <edloper at gradient.cis.upenn.edu> wrote:
> > There are two subproposals: first, keyword-only args after a variable
> > number of positional args, which requires allowing keyword parameter
> > specifications after the *args parameter, and second, keyword-only args
> > after a fixed number of positional args, implemented with a naked
> > '*'.  To the first, I said "The rationale for this is pretty obvious.".  To
> > the second, I asked, and still ask, "Why?".
>
> I was trying to come up with use cases for the "second subproposal."

A function/method could have one argument that is obviously needed and
a whole slew of options that few people care about. For most people,
the signature they know is foo(arg). It would be nice if all the
options were required to be written as keyword arguments, so the
reader will not have to guess what foo(arg, True) means.

This signature style could perhaps be used for the join() builtin that
some folks are demanding: join(iterable, sep=" ", auto_str=False).

For the record, I'm +1 on Talin's PEP, -1 on join.
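
A minimal sketch of the signature style described above, written in the
bare-'*' syntax that the PEP proposes. The join() function here is
hypothetical (it is the builtin some folks are asking for, not an
existing one), and the auto_str behaviour is an assumption made purely
for illustration:

```python
def join(iterable, *, sep=" ", auto_str=False):
    """Hypothetical join(); everything after the bare '*' is keyword-only."""
    # Assumed meaning of auto_str: call str() on each item before joining.
    items = [str(item) for item in iterable] if auto_str else list(iterable)
    return sep.join(items)


# The one obviously needed argument stays positional.
print(join(["a", "b", "c"]))

# Options must be spelled out, so the reader never has to guess.
print(join([1, 2, 3], sep=", ", auto_str=True))

# The hard-to-read positional spelling is rejected outright.
try:
    join([1, 2, 3], ", ", True)
except TypeError as exc:
    print("rejected:", exc)
```

The point is only that join(seq, ", ", True) cannot be written at all,
so every call site documents itself.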

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 16:32:40 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 07:32:40 -0700
Subject: [Python-Dev] global variable modification in functions [Re:
	elimination of scope bleeding of iteration variables]
In-Reply-To: <445584BB.8090708@666.com>
References: <44532F0B.5050804@666.com> <4454C87A.7020701@gmail.com>
	<445584BB.8090708@666.com>
Message-ID: <ca471dc20605010732n1e50f222p4db710e963a3f7f4@mail.gmail.com>

On 4/30/06, Ben Wing <ben at 666.com> wrote:
> [1] ideally, change this behavior, either for 2.6 or 3.0.  maybe have a
> `local' keyword if you really want a new scope.
> [2] until this change, python should always print a warning in this
> situation.
> [3] the current 'UnboundLocal' exception should probably be more
> helpful, e.g. suggesting that you might need to use a `global foo'
> declaration.

You're joking right?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 16:48:04 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 07:48:04 -0700
Subject: [Python-Dev] unittest argv
In-Reply-To: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
References: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
Message-ID: <ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>

Wouldn't this be an incompatible change? That would make it a no-no.
Providing a dummy argv[0] isn't so hard is it?
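
To make the dummy argv[0] workaround concrete, here is a small sketch
that mimics only the getopt call from parseArgs() as it appears in the
diff quoted below. The helper name parse_like_unittest is made up for
illustration and is not part of unittest; the long-option list is an
assumption, since the diff shows only the short options:

```python
import getopt


def parse_like_unittest(argv):
    """Mimic the current behaviour: argv[0] is skipped as the program name."""
    # Long option names are assumed; the quoted diff only shows 'hHvq'.
    options, args = getopt.getopt(argv[1:], "hHvq", ["help", "verbose", "quiet"])
    return options, args


# Without a dummy argv[0], '-v' is swallowed as if it were the program name.
print(parse_like_unittest(["-v", "mytest"]))

# With a dummy argv[0], the option survives.
print(parse_like_unittest(["dummy-prog", "-v", "mytest"]))
```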

On 4/30/06, John Keyes <john at integralsource.com> wrote:
> Hi,
>
> main() in unittest has an optional parameter called argv.  If it is not
> present in the invocation, it defaults to None.  Later in the function
> a check is made to see if argv is None and if so sets it to sys.argv.
> I think the default should be changed to sys.argv[1:] (i.e. the
> command line arguments minus the name of the python file
> being executed).
>
> The parseArgs() function then uses getopt to parse argv.  It currently
> ignores the first item in the argv list, but this causes a problem when
> it is called from another python function and not from the command
> line.  So using the current code if I call:
>
> python mytest.py -v
>
> then argv in parseArgs is ['mytest.py', '-v']
>
> But, if I call:
>
> unittest.main(module=None, argv=['-v','mytest'])
>
> then argv in parseArgs is ['mytest'], as you can see the verbosity option is
> now gone and cannot be used.
>
> Here's a diff to show the code changes I have made:
>
> 744c744
> <                  argv=None, testRunner=None, testLoader=defaultTestLoader):
> ---
> >                  argv=sys.argv[1:], testRunner=None, testLoader=defaultTestLoader):
> 751,752d750
> <         if argv is None:
> <             argv = sys.argv
> 757c755
> <         self.progName = os.path.basename(argv[0])
> ---
> > #        self.progName = os.path.basename(argv[0])
> 769c767
> <             options, args = getopt.getopt(argv[1:], 'hHvq',
> ---
> >             options, args = getopt.getopt(argv, 'hHvq',
>
> You may notice I have commented out the self.progName line.  This variable
> is not used anywhere in the module so I guess it could be removed.  To
> keep it, the conditional check on argv would have to remain and be moved after
> the self.progName line.
>
> I hope this makes sense, and it's my first post so go easy on me ;)
>
> Thanks,
> -John K
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 16:51:35 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 07:51:35 -0700
Subject: [Python-Dev] Adding functools.decorator
In-Reply-To: <e32r0a$s69$1@sea.gmane.org>
References: <4454C203.2060707@iinet.net.au> <e32kqa$bq6$1@sea.gmane.org>
	<ca471dc20604300936s2a1bdf0fof8395428230963d9@mail.gmail.com>
	<e32r0a$s69$1@sea.gmane.org>
Message-ID: <ca471dc20605010751s50394141lf2a966beb15882b9@mail.gmail.com>

On 4/30/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Guido van Rossum wrote:
> > I expect that at some point people will want to tweak what gets copied
> > by _update_wrapper() -- e.g. some attributes may need to be
> > deep-copied, or personalized, or skipped, etc.
>
> What exactly do you have in mind there? If someone wants to achieve this,
> she can write her own version of @decorator.

I meant that the provided version should make writing your own easier
than copying the source and editing it. Some form of subclassing might
make sense, or a bunch of smaller functions that can be called for
various actions. You'll probably have to discover some real use cases
before you'll be able to design the right API for this.
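
One possible shape for such an API, sketched with
functools.update_wrapper() as it exists in current Python, where the
attributes to copy and the attributes to merge are plain parameters.
This only illustrates the "smaller functions that can be called" idea;
it is not the design being proposed in this thread, and the decorator
name and choice of attributes are assumptions:

```python
import functools


def keep_own_doc(func):
    """Hypothetical decorator: copy the wrapped name but not its docstring."""
    def wrapper(*args, **kwargs):
        """Wrapper docstring, deliberately left in place."""
        return func(*args, **kwargs)

    # 'assigned' lists attributes to copy from func onto wrapper;
    # 'updated' lists dict attributes to merge rather than overwrite.
    return functools.update_wrapper(
        wrapper, func, assigned=("__name__", "__module__"), updated=("__dict__",)
    )


@keep_own_doc
def greet(name):
    """Return a greeting."""
    return "hello " + name


print(greet.__name__)
print(greet.__doc__)
```

Skipping or personalizing an attribute then becomes a matter of passing
a different tuple instead of copying and editing the source.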

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 16:54:55 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 07:54:55 -0700
Subject: [Python-Dev] socket module recvmsg/sendmsg
In-Reply-To: <200604302133.15205.me+python-dev@modelnine.org>
References: <200604302133.15205.me+python-dev@modelnine.org>
Message-ID: <ca471dc20605010754u7d0c726cy694239e56cc2bfcc@mail.gmail.com>

Is there a question or a request in here somewhere? If not, c.l.py.ann
would be more appropriate.

If you want that code integrated into core Python, read python.org/dev
and prepare a patch for SF!
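
For anyone unfamiliar with the mechanism described in the quoted
message, here is a sketch of descriptor passing with SCM_RIGHTS. It
uses socket.sendmsg() and socket.recvmsg() as they exist in current
Python on Unix, not the patch in Heiko's tree, whose API may well
differ. The helper names are made up, and for brevity both ends live in
one process rather than in two forked ones:

```python
import array
import os
import socket


def send_fds(sock, msg, fds):
    """Send msg plus open file descriptors as SCM_RIGHTS ancillary data."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    return sock.sendmsg([msg], ancillary)


def recv_fds(sock, msglen, maxfds):
    """Receive a message and up to maxfds passed file descriptors."""
    fds = array.array("i")
    bufsize = socket.CMSG_LEN(maxfds * fds.itemsize)
    msg, ancdata, _flags, _addr = sock.recvmsg(msglen, bufsize)
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore a truncated trailing descriptor, if any.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return msg, list(fds)


def demo():
    """Pass the write end of a pipe over a Unix socket pair and use it."""
    read_end, write_end = os.pipe()
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        send_fds(left, b"fd", [write_end])
        msg, received = recv_fds(right, 16, 1)
        os.write(received[0], b"hello")   # write through the received copy
        data = os.read(read_end, 5)
        os.close(received[0])
        return msg, data
    finally:
        left.close()
        right.close()
        os.close(read_end)
        os.close(write_end)


print(demo())
```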

--Guido

On 4/30/06, Heiko Wundram <me+python-dev at modelnine.org> wrote:
> Hi all!
>
> I've implemented recvmsg and sendmsg for the socket module in my private
> Python tree for communication between two forked processes, which are
> essentially wrappers for proper handling of SCM_RIGHTS and SCM_CREDENTIALS
> Unix-Domain-Socket messages (which are the two types of messages that are
> defined on Linux).
>
> The main reason I need these two primitives is that I require (more or less
> transparent) file/socket descriptor exchange between two forked processes,
> where one process accepts a socket, and delegates processing of the socket
> connection to another process of a set of processes; this is much like a
> ForkingTCPServer, but with the Handler-process prestarted.
>
> As connection to the Unix-Domain-Socket is openly available, the receiving
> process needs to check the identity of the first process; this is done using
> a getsockopt(SO_PEERCRED) call, which is also handled more specifically by my
> socket extension to return a socket._ucred-type structure, which wraps the
> pid/uid/gid-structure returned by the corresponding getsockopt call, and also
> the socket message (SCM_CREDENTIALS) which passes or sets this information
> for the remote process.
>
> I'd love to see these two socket message primitives (of which the first,
> SCM_RIGHTS, is available on pretty much any Unix derivative) included in a
> Python distribution at some point in time, and as I've not got the time to
> push for an inclusion in the tree (and even less time to work on other Python
> patches myself) at the moment, I thought that I might just post here so that
> someone interested might pick up the work I've done so far and check the
> implementation for bugs, and at some stage these two functions might actually
> find their way into the Python core.
>
> Anyway, my private Python tree (which has some other patches which aren't of
> general interest, I'd think) is available at:
>
> http://dev.modelnine.org/hg/python
>
> and I can, if anyone is interested, post a current diff of socketmodule.*
> against 2.4.3 to the Python bug tracker at sourceforge. I did that some time
> ago (about half a year) when socket-passing code wasn't completely
> functioning yet, but at least at that point there didn't seem much interest
> in the patch. The patch should apply pretty cleanly against the current HEAD
> too, at least it did the last time I checked.
>
> I'll add a small testsuite for both functions to my tree some time tomorrow.
>
> --- Heiko.


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 17:05:46 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 08:05:46 -0700
Subject: [Python-Dev] methods on the bytes object (was: Crazy idea for
	str.join)
In-Reply-To: <20060430023808.673B.JCARLSON@uci.edu>
References: <20060429055551.672C.JCARLSON@uci.edu>
	<ca471dc20604291952s2d0050bcl8c446d37f577b7ef@mail.gmail.com>
	<20060430023808.673B.JCARLSON@uci.edu>
Message-ID: <ca471dc20605010805r3bc9a8b4ye4fc08c5e580f649@mail.gmail.com>

Please take this to the py3k list.

It's still open which methods to add; it'll depend on the needs we
discover while using bytes to write the I/O library.

I don't believe we should add everything we can; rather, I'd like to
keep the API small until we have a clear need for a particular method.
For the record, I'm holding off adding join() for now; I'd rather
speed up the += operation.
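
For comparison, a sketch of the two spellings in question. It uses the
mutable bytearray type from released Python 3 as a stand-in for the
bytes object under discussion; that substitution is an assumption, since
the API was still open when this was written:

```python
chunks = [b"GET ", b"/index.html", b" HTTP/1.0\r\n"]

# Accumulate in place with +=.
buf = bytearray()
for chunk in chunks:
    buf += chunk

# The join() spelling, shown only for comparison.
joined = b"".join(chunks)

print(bytes(buf) == joined)
```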

--Guido

On 4/30/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "Guido van Rossum" <guido at python.org> wrote:
> > On 4/29/06, Josiah Carlson <jcarlson at uci.edu> wrote:
> > > I understand the underlying implementation of str.join can be a bit
> > > convoluted (with the auto-promotion to unicode and all), but I don't
> > > suppose there is any chance to get str.join to support objects which
> > > implement the buffer interface as one of the items in the sequence?
> >
> > In Py3k, buffers won't be compatible with strings -- buffers will be
> > about bytes, while strings will be about characters. Given that future
> > I don't think we should mess with the semantics in 2.x; one change in
> > the near(ish) future is enough of a transition.
>
> This brings up something I hadn't thought of previously.  While unicode
> will obviously keep its .join() method when it becomes str in 3.x, will
> bytes objects get a .join() method?  Checking the bytes PEP, very little
> is described about the type other than it basically being an array of 8
> bit integers.  That's fine and all, but it kills many of the parsing
> and modification use-cases that are performed on strings via the non
> __xxx__ methods.
>
>
> Specifically in the case of bytes.join(), the current common use-case of
> <literal>.join(...) would become something similar to
> bytes(<literal>).join(...), unless bytes objects got a syntax... Or
> maybe I'm missing something?
>
>
> Anyways, when the bytes type was first being discussed, I had hoped that
> it would basically become array.array("B", ...) + non-unicode str.
> Allowing for bytes to do everything that str was doing before, plus a
> few new tricks (almost like an mmap...), minus those operations which
> require immutability.
>
>
>  - Josiah
>
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Mon May  1 17:19:22 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 1 May 2006 08:19:22 -0700
Subject: [Python-Dev] python syntax additions to support indentation
	insensitivity/generated code
In-Reply-To: <445591A5.8050808@666.com>
References: <445591A5.8050808@666.com>
Message-ID: <20060501151922.GA6886@panix.com>

On Sun, Apr 30, 2006, Ben Wing wrote:
>
> recently i've been writing code that generates a python program from a 
> source file containing intermixed python code and non-python constructs, 
> which get converted into python.

Please take this to comp.lang.python (this is not in-and-of-itself
inappropriate for python-dev, but experience indicates that this
discussion will probably not be productive and therefore belongs on the
general newsgroup).
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From pje at telecommunity.com  Mon May  1 17:34:28 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 01 May 2006 11:34:28 -0400
Subject: [Python-Dev] global variable modification in functions [Re:
 elimination of scope bleeding of iteration variables]
In-Reply-To: <ca471dc20605010732n1e50f222p4db710e963a3f7f4@mail.gmail.com>
References: <445584BB.8090708@666.com> <44532F0B.5050804@666.com>
	<4454C87A.7020701@gmail.com> <445584BB.8090708@666.com>
Message-ID: <5.1.1.6.0.20060501112424.03760840@mail.telecommunity.com>

At 07:32 AM 5/1/2006 -0700, Guido van Rossum wrote:
>On 4/30/06, Ben Wing <ben at 666.com> wrote:
> > [1] ideally, change this behavior, either for 2.6 or 3.0.  maybe have a
> > `local' keyword if you really want a new scope.
> > [2] until this change, python should always print a warning in this
> > situation.
> > [3] the current 'UnboundLocal' exception should probably be more
> > helpful, e.g. suggesting that you might need to use a `global foo'
> > declaration.
>
>You're joking right?

While I agree that item #1 is a non-starter, it seems to me that in the 
case where the compiler statically knows a name is being bound in the 
module's globals, and there is a *non-argument* local variable being bound 
in a function body, the odds are quite high that the programmer forgot to 
use "global".  I could almost see issuing a warning, or having a way to 
enable such a warning.

And for the case where the compiler can tell the variable is accessed 
before it's defined, there's definitely something wrong.  This code, for 
example, is definitely missing a "global" and the compiler could in 
principle tell:

     foo = 1

     def bar():
         foo+=1

So I see no problem (in principle, as opposed to implementation) with 
issuing a warning or even a compilation error for that code.  (And it's 
wrong even if the snippet I showed is in a nested function definition, 
although the error would be different.)

If I recall correctly, the new compiler uses a control-flow graph that 
could possibly be used to determine whether there is a path on which a 
local could be read before it's stored.
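
A rough sketch of the simpler, purely name-based check described in the
first paragraph above, using the standard symtable module rather than
the compiler's control-flow graph (so it cannot tell whether a read
actually precedes the store). The function name suspicious_locals and
the sample source are made up for illustration, and only top-level
functions are inspected:

```python
import symtable

SOURCE = """
foo = 1

def bar():
    foo += 1
"""


def suspicious_locals(source):
    """Return (function, name) pairs where a local shadows a module global."""
    top = symtable.symtable(source, "<example>", "exec")
    module_names = {s.get_name() for s in top.get_symbols() if s.is_assigned()}
    findings = []
    for child in top.get_children():
        if not isinstance(child, symtable.Function):
            continue
        for sym in child.get_symbols():
            # A non-argument local that is also bound at module level.
            if (sym.is_local() and not sym.is_parameter()
                    and sym.get_name() in module_names):
                findings.append((child.get_name(), sym.get_name()))
    return findings


print(suspicious_locals(SOURCE))
```

Whether such a finding should be a warning, an opt-in warning, or
nothing at all is exactly the open question.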


From pje at telecommunity.com  Mon May  1 17:28:58 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 01 May 2006 11:28:58 -0400
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <4455E2F8.5020206@gmail.com>
References: <5.1.1.6.0.20060501001028.01e6a460@mail.telecommunity.com>
	<44556CDB.1070604@gmail.com> <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<44556CDB.1070604@gmail.com>
	<5.1.1.6.0.20060501001028.01e6a460@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20060501111752.01e6d258@mail.telecommunity.com>

At 08:29 PM 5/1/2006 +1000, Nick Coghlan wrote:
>'localcontext' would probably work as at least an interim name for such a 
>function.
>
>   with decimal.localcontext() as ctx:
>       # use the new context here

And the "as ctx" should be unnecessary for most use cases, if localcontext 
has an appropriately designed API.
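
A small sketch of both spellings, assuming localcontext() installs a
copy of the current context for the duration of the block and restores
the old one on exit (which is how decimal.localcontext() behaves in
released Python):

```python
import decimal

# With "as ctx": adjust the temporary context through the bound name.
with decimal.localcontext() as ctx:
    ctx.prec = 5
    with_name = decimal.Decimal(1) / decimal.Decimal(3)

# Without "as ctx": the temporary context is already the current one.
with decimal.localcontext():
    decimal.getcontext().prec = 5
    without_name = decimal.Decimal(1) / decimal.Decimal(3)

# Outside the blocks the original precision is back in force.
outside = decimal.Decimal(1) / decimal.Decimal(3)

print(with_name, without_name, outside)
```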


From martin at v.loewis.de  Mon May  1 17:33:33 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 17:33:33 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <445600B3.7040903@gradient.cis.upenn.edu>
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de>
	<445600B3.7040903@gradient.cis.upenn.edu>
Message-ID: <44562A4D.30102@v.loewis.de>

Edward Loper wrote:
> Martin v. Löwis wrote:
>> One reason I see is to have keyword-only functions, i.e. with no
>> positional arguments at all:
>>
>> def make_person(*, name, age, phone, location):
>>     pass
> 
> But is it necessary to syntactically *enforce* that the arguments be
> used as keywords?

This really challenges the whole point of the PEP: keyword-only
arguments (at least, it challenges the title of the PEP, although
probably not the specified rationale).

> I.e., why not just document that the arguments should
> be used as keyword arguments, and leave it at that.

Because they wouldn't be keyword-only arguments, then.

Regards,
Martin

From guido at python.org  Mon May  1 17:38:13 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 08:38:13 -0700
Subject: [Python-Dev] global variable modification in functions [Re:
	elimination of scope bleeding of iteration variables]
In-Reply-To: <5.1.1.6.0.20060501112424.03760840@mail.telecommunity.com>
References: <44532F0B.5050804@666.com> <4454C87A.7020701@gmail.com>
	<445584BB.8090708@666.com>
	<5.1.1.6.0.20060501112424.03760840@mail.telecommunity.com>
Message-ID: <ca471dc20605010838j54beb7d9y8f5aff485053eb03@mail.gmail.com>

On 5/1/06, Phillip J. Eby <pje at telecommunity.com> wrote:
> While I agree that item #1 is a non-starter, it seems to me that in the
> case where the compiler statically knows a name is being bound in the
> module's globals, and there is a *non-argument* local variable being bound
> in a function body, the odds are quite high that the programmer forgot to
> use "global".  I could almost see issuing a warning, or having a way to
> enable such a warning.
>
> And for the case where the compiler can tell the variable is accessed
> before it's defined, there's definitely something wrong.  This code, for
> example, is definitely missing a "global" and the compiler could in
> principle tell:
>
>      foo = 1
>
>      def bar():
>          foo+=1
>
> So I see no problem (in principle, as opposed to implementation) with
> issuing a warning or even a compilation error for that code.  (And it's
> wrong even if the snippet I showed is in a nested function definition,
> although the error would be different.)
>
> If I recall correctly, the new compiler uses a control-flow graph that
> could possibly be used to determine whether there is a path on which a
> local could be read before it's stored.

Sure. This is a quality of implementation issue.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com  Mon May  1 17:40:08 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 17:40:08 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de><445600B3.7040903@gradient.cis.upenn.edu>
	<44562A4D.30102@v.loewis.de>
Message-ID: <e35a4q$c23$1@sea.gmane.org>

Martin v. Löwis wrote:

> > I.e., why not just document that the arguments should
> > be used as keyword arguments, and leave it at that.
>
> Because they wouldn't be keyword-only arguments, then.

which reminds me of the following little absurdity gem from the language
reference:

    The following identifiers are used as keywords of the language, and
    cannot be used as ordinary identifiers. They must be spelled exactly
    as written here: /.../

(maybe it's just me).

btw, talking about idioms used in the language reference, can any of the
native speakers on this list explain if "A is a nicer way of spelling B" means
that "A is preferred over B", "B is preferred over A", "A and B are the same
word and whoever wrote this is just being absurd", or anything else ?

</F>




From p.f.moore at gmail.com  Mon May  1 17:56:49 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 1 May 2006 16:56:49 +0100
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e35a4q$c23$1@sea.gmane.org>
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de>
	<445600B3.7040903@gradient.cis.upenn.edu> <44562A4D.30102@v.loewis.de>
	<e35a4q$c23$1@sea.gmane.org>
Message-ID: <79990c6b0605010856p12d2083cl2fe2858a1cf2c7d8@mail.gmail.com>

On 5/1/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> btw, talking about idioms used in the language reference, can any of the
> native speakers on this list explain if "A is a nicer way of spelling B" means
> that "A is preferred over B", "B is preferred over A", "A and B are the same
> word and whoever wrote this is just being absurd", or anything else ?

Without any context, I'd read it as "A is preferred over B".

Paul

From tjreedy at udel.edu  Mon May  1 18:43:48 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2006 12:43:48 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de><e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de>
Message-ID: <e35ds3$p77$1@sea.gmane.org>


""Martin v. Löwis"" <martin at v.loewis.de> wrote in message 
news:4455A0CF.3080008 at v.loewis.de...
> Terry Reedy wrote:
>> There are two subproposals: first, keyword-only args after a variable
>> number of positional args, which requires allowing keyword parameter
>> specifications after the *args parameter, and second, keyword-only args
>> after a fixed number of positional args, implemented with a naked
>> '*'.  To the first, I said "The rationale for this is pretty obvious.".  To
>> the second, I asked, and still ask, "Why?".
>
> One reason I see is to have keyword-only functions, i.e. with no
> positional arguments at all:

This is not a reason for subproposal two, but a special case, as you 
yourself note below, and hence does not say why you want to have such.

> def make_person(*, name, age, phone, location):
>    pass

And again, why would you *make* me, the user-programmer, type

make_person(name=namex, age=agex, phone=phonex, location = locationx)
#instead of
make_person(namex,agex,phonex,locationx)
?

Ditto for methods.

> In these cases, you don't *want* name, age to be passed in a positional
> way.

I am sure you know what I am going to ask, that you did not answer ;-)

Terry Jan Reedy

PS.  I see that Guido finally gave a (different) use case for bare * that 
does make sense to me.




From aahz at pythoncraft.com  Mon May  1 18:45:40 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 1 May 2006 09:45:40 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <445600B3.7040903@gradient.cis.upenn.edu>
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de>
	<445600B3.7040903@gradient.cis.upenn.edu>
Message-ID: <20060501164540.GA4961@panix.com>

On Mon, May 01, 2006, Edward Loper wrote:
>
> But is it necessary to syntactically *enforce* that the arguments be 
> used as keywords?  I.e., why not just document that the arguments should 
> be used as keyword arguments, and leave it at that.  If users insist on 
> using them positionally, then their code will be less readable, and 
> might break if you decide to change the order of the parameters, but 
> we're all consenting adults.  (And if you *do* believe that the 
> parameters should all be passed as keywords, then I don't see any reason 
> why you'd ever be motivated to change their order.)

IIRC, part of the motivation for this is to make it easier for super() to
work correctly.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From fredrik at pythonware.com  Mon May  1 18:49:00 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 18:49:00 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de><e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org>
Message-ID: <e35e5u$q8l$1@sea.gmane.org>

Terry Reedy wrote:

> And again, why would you *make* me, the user-programmer, type
>
> make_person(name=namex, age=agex, phone=phonex, location = locationx)
> #instead of
> make_person(namex,agex,phonex,locationx)
> ?

because a good API designer needs to consider more than just the current
release.

I repeat my question: have you done API design for others, and have you
studied how your API:s are used (and how they evolve) over a longer period
of time ?

</F>




From tjreedy at udel.edu  Mon May  1 18:47:34 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2006 12:47:34 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<44549922.7020109@gmail.com><e33n39$6pi$1@sea.gmane.org>
	<4455A191.5000404@v.loewis.de>
Message-ID: <e35e36$q0i$1@sea.gmane.org>


""Martin v. Löwis"" <martin at v.loewis.de> wrote in message
> I clipped it because I couldn't understand your question: "Why" what?
> (the second question only gives "Why not") I then assumed that the
> question must have applied to the text that immediately preceded the
> question - hence that's the text that I left.
>
> Now I understand, though.

For future reference, when I respond to a post, I usually try to help 
readers by snipping away inessentials, leaving only the essential context 
of my responses.

Terry






From john at integralsource.com  Mon May  1 19:11:50 2006
From: john at integralsource.com (John Keyes)
Date: Mon, 1 May 2006 18:11:50 +0100
Subject: [Python-Dev] unittest argv
In-Reply-To: <ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
References: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
	<ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
Message-ID: <c0141a030605011011vceeae32n4b07b00be1dedf91@mail.gmail.com>

On 5/1/06, Guido van Rossum <guido at python.org> wrote:
> Wouldn't this be an incompatible change? That would make it a no-no.
> Providing a dummy argv[0] isn't so hard is it?

It would be incompatible with existing code, but that code is
already broken (IMO) by passing a dummy argv[0].  I don't
think fixing it would affect much code, because normally
people don't specify the '-q' or '-v' in code, it is almost
exclusively used on the command line.

The only reason I came across it was that I was modifying
an ant task (py-test) so it could handle all of the named
arguments that TestProgram.__init__ supports.

If the list index code can't change, at a minimum the default value
for argv should change from None to sys.argv.

Are there tests for unittest.py?

Thanks,
-John K

>
> On 4/30/06, John Keyes <john at integralsource.com> wrote:
> > Hi,
> >
> > main() in unittest has an optional parameter called argv.  If it is not
> > present in the invocation, it defaults to None.  Later in the function
> > a check is made to see if argv is None and if so sets it to sys.argv.
> > I think the default should be changed to sys.argv[1:] (i.e. the
> > command line arguments minus the name of the python file
> > being executed).
> >
> > The parseArgs() function then uses getopt to parse argv.  It currently
> > ignores the first item in the argv list, but this causes a problem when
> > it is called from another python function and not from the command
> > line.  So using the current code if I call:
> >
> > python mytest.py -v
> >
> > then argv in parseArgs is ['mytest.py', '-v']
> >
> > But, if I call:
> >
> > unittest.main(module=None, argv=['-v','mytest'])
> >
> > then argv in parseArgs is ['mytest'], as you can see the verbosity option is
> > now gone and cannot be used.
> >
> > Here's a diff to show the code changes I have made:
> >
> > 744c744
> > <                  argv=None, testRunner=None, testLoader=defaultTestLoader):
> > ---
> > >                  argv=sys.argv[1:], testRunner=None, testLoader=defaultTestLoader):
> > 751,752d750
> > <         if argv is None:
> > <             argv = sys.argv
> > 757c755
> > <         self.progName = os.path.basename(argv[0])
> > ---
> > > #        self.progName = os.path.basename(argv[0])
> > 769c767
> > <             options, args = getopt.getopt(argv[1:], 'hHvq',
> > ---
> > >             options, args = getopt.getopt(argv, 'hHvq',
> >
> > You may notice I have commented out the self.progName line.  This variable
> > is not used anywhere in the module so I guess it could be removed.  To
> > keep it, the conditional check on argv would have to remain and be moved after
> > the self.progName line.
> >
> > I hope this makes sense, and it's my first post so go easy on me ;)
> >
> > Thanks,
> > -John K
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>

From edloper at gradient.cis.upenn.edu  Mon May  1 19:11:54 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Mon, 01 May 2006 13:11:54 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e35e5u$q8l$1@sea.gmane.org>
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de><e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de>	<e35ds3$p77$1@sea.gmane.org>
	<e35e5u$q8l$1@sea.gmane.org>
Message-ID: <4456415A.6040406@gradient.cis.upenn.edu>

Fredrik Lundh wrote:
>> And again, why would you *make* me, the user-programmer, type
>>
>> make_person(name=namex, age=agex, phone=phonex, location = locationx)
>> #instead of
>> make_person(namex,agex,phonex,locationx)
>> ?
> 
> because a good API designer needs to consider more than just the current
> release.

I believe that it's quite possible that you're right, but I think more 
concrete answers would be helpful.  I.e., how does having parameters 
that are syntactically keyword-only (as opposed to being simply 
documented as keyword-only) help you develop an API over time?

I gave some thought to it, and can come up with a few answers.  In all 
cases, assume a user BadUser, who decided to use positional arguments 
for arguments that you documented as keyword-only.

- You would like to deprecate, and eventually remove, an argument to
   a function.  For people who read your documentation, and use
   keyword args for the keyword-only arguments, their code will break
   in a clean, easy-to-understand way.  But BadUser's code may break in
   strange hard-to-understand ways, since their positional arguments will
   get mapped to the wrong function arguments.

- You would like to add a new parameter to a function, and would like
   to make that new parameter available for positional argument use.
   So you'd like to add it before all the keyword arguments.  But this
   will break BadUser's code.

- You have a function that takes one argument and a bunch of keyword
   options, and would like to change it to accept *varargs instead of
   just one argument.  But this will break BadUser's code.
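
The first case above can be sketched as follows.  This is only an
illustration, assuming the bare '*' syntax proposed in PEP 3102; the
make_person_v1/v2/v3 functions are hypothetical and not taken from any
real API.

```python
# Hypothetical API, version 1: everything after 'name' is documented as
# keyword-only, but nothing stops callers from passing it positionally.
def make_person_v1(name, age=None, phone=None, location=None):
    return {"name": name, "age": age, "phone": phone, "location": location}

# Version 2 removes the deprecated 'age' parameter.
def make_person_v2(name, phone=None, location=None):
    return {"name": name, "phone": phone, "location": location}

# BadUser's positional call still runs against v2, but the values are
# silently bound to the wrong parameters.
bad = make_person_v2("Ann", 42, "555-0100")
print(bad)

# With syntactically keyword-only parameters (a bare '*'), the same
# mistake fails immediately with a TypeError instead.
def make_person_v3(name, *, phone=None, location=None):
    return {"name": name, "phone": phone, "location": location}

try:
    make_person_v3("Ann", 42, "555-0100")
except TypeError as exc:
    print("rejected:", exc)
```

A caller who wrote make_person_v1(name="Ann", age=42) would instead get
a clear "unexpected keyword argument" error once 'age' is removed.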

I think that this kind of *concrete* use-case provides a better 
justification for the feature than just saying "it will help API 
design."  As someone who has had a fair amount of experience designing & 
maintaining APIs over time, perhaps you'd care to contribute some more 
use cases where you think having syntactically keyword-only arguments 
would help?

-Edward

From guido at python.org  Mon May  1 19:23:43 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 10:23:43 -0700
Subject: [Python-Dev] unittest argv
In-Reply-To: <c0141a030605011011vceeae32n4b07b00be1dedf91@mail.gmail.com>
References: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
	<ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
	<c0141a030605011011vceeae32n4b07b00be1dedf91@mail.gmail.com>
Message-ID: <ca471dc20605011023y1a2a2279l10decee091118c32@mail.gmail.com>

On 5/1/06, John Keyes <john at integralsource.com> wrote:
> On 5/1/06, Guido van Rossum <guido at python.org> wrote:
> > Wouldn't this be an incompatible change? That would make it a no-no.
> > Providing a dummy argv[0] isn't so hard is it?
>
> It would be incompatible with existing code, but that code is
> already broken (IMO) by passing a dummy argv[0].

That's a new meaning of "broken", one that I haven't heard before.
It's broken because it follows the API?!?!

> I don't
> think fixing it would affect much code, because normally
> people don't specify the '-q' or '-v' in code, it is almost
> exclusively used on the command line.

Famous last words.

> The only reason I came across it was that I was modifying
> an ant task (py-test) so it could handle all of the named
> arguments that TestProgram.__init__ supports.
>
> If the list index code can't change, at a minimum the default value
> for argv should change from None to sys.argv.

No. Late binding of sys.argv is very important. There are plenty of
uses where sys.argv is dynamically modified.

> Are the tests for unittest.py?

Assuming you meant "Are there tests", yes: test_unittest.py. But it needs work.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon May  1 19:26:42 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 01 May 2006 13:26:42 -0400
Subject: [Python-Dev] unittest argv
In-Reply-To: <c0141a030605011011vceeae32n4b07b00be1dedf91@mail.gmail.com>
References: <ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
	<c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
	<ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060501132427.01f27d58@mail.telecommunity.com>

At 06:11 PM 5/1/2006 +0100, John Keyes wrote:
>On 5/1/06, Guido van Rossum <guido at python.org> wrote:
> > Wouldn't this be an incompatible change? That would make it a no-no.
> > Providing a dummy argv[0] isn't so hard is it?
>
>It would be incompatible with existing code, but that code is
>already broken (IMO) by passing a dummy argv[0].  I don't
>think fixing it would affect much code, because normally
>people don't specify the '-q' or '-v' in code, it is almost
>exclusively used on the command line.

Speak for yourself - I have at least two tools that would have to change 
for this, at least one of which would have to grow version testing code, 
since it's distributed for Python 2.3 and up.  That's far more wasteful 
than providing an argv[0], which is already a common requirement for main 
program functions in Python.


From tjreedy at udel.edu  Mon May  1 19:25:16 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2006 13:25:16 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de><445600B3.7040903@gradient.cis.upenn.edu><44562A4D.30102@v.loewis.de>
	<e35a4q$c23$1@sea.gmane.org>
Message-ID: <e35g9r$275$1@sea.gmane.org>


"Fredrik Lundh" <fredrik at pythonware.com> wrote in message 
news:e35a4q$c23$1 at sea.gmane.org...
> which reminds me of the following little absurdity gem from the language
> reference:
>
>    The following identifiers are used as keywords of the language, and
>    cannot be used as ordinary identifiers. They must be spelled exactly
>    as written here: /.../
>
> (maybe it's just me).

I am not sure of what you see as absurdity, as opposed to clumsiness.
Keywords are syntactically identifiers, but are reserved for predefined
uses, and thus semantically are not identifiers.  Perhaps 'ordinary
identifiers' should be replaced by 'names'.  The second sentence is
referring to case variations, and that could be more explicit.  Before the
elevation of None to reserved word status, one could have just added ", in
lower case letters".

> btw, talking about idioms used in the language reference, can any of the
> native speakers on this list explain if "A is a nicer way of spelling B" 
> means
> that "A is preferred over B",  [alternatives snipped]

Yes.

Terry Jan Reedy




From martin at v.loewis.de  Mon May  1 19:26:41 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 19:26:41 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e35ds3$p77$1@sea.gmane.org>
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de><e33mj1$5nd$1@sea.gmane.org>	<4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org>
Message-ID: <445644D1.3050200@v.loewis.de>

Terry Reedy wrote:
> This is not a reason for subproposal two, but a special case, as you 
> yourself note below, and hence does not say why you want to have such.
> 
>> def make_person(*, name, age, phone, location):
>>    pass

You weren't asking for a reason, you were asking for an example:
this is one.

> And again, why would you *make* me, the user-programmer, type
> 
> make_person(name=namex, age=agex, phone=phonex, location = locationx)
> #instead of
> make_person(namex,agex,phonex,locationx)
> ?

Because there should be preferably only one obvious way to call that
function. Readers of the code should not need to remember the
order of parameters, instead, the meaning of the parameters should
be obvious in the call. This is the only sane way of doing functions
with many arguments.

> PS.  I see that Guido finally gave a (different) use case for bare * that 
> does make sense to me.

It's actually the same use case: I don't *want* callers to pass these
parameters positionally, to improve readability.

Regards,
Martin

From brett at python.org  Mon May  1 19:42:44 2006
From: brett at python.org (Brett Cannon)
Date: Mon, 1 May 2006 10:42:44 -0700
Subject: [Python-Dev] signature object issues (to discuss while I am out of
	contact)
Message-ID: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>

Signature objects (which have been lightly discussed on python-3000,
but which I realize should be retargeted to 2.6 since there are no
incompatibility problems) are the idea of having an object that
represents the parameters of a function for easy introspection.  But
there are two things that I can't quite decide upon.

One is whether a signature object should be automatically created for
every function.  As of right now the PEP I am drafting creates it on a
per-need basis and has it assigned to __signature__ through a
built-in function or one put in 'inspect'.  Now automatically creating
the object would possibly make it more useful, but it could also be
considered overkill.  Also not doing it automatically allows signature
objects to possibly make more sense for classes (to represent
__init__) and instances (to represent __call__).  But having that same
support automatically feels off for some reason to me.

The second question is whether it is worth providing a function that
will figure out whether a tuple and dict representing arguments
would work in calling the function.  Some have even suggested a
function that returns the actual bindings if the call were to occur. 
Personally I don't see a huge use for either, but even less for the
latter version.  If people have a legit use case for either please
speak up, otherwise I am tempted to keep the object simple.
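
For concreteness, here is a rough sketch of what the two helper
functions could look like.  It leans on inspect.signature() and
Signature.bind(), which only appeared in much later Python releases,
so it should be read as an illustration of the idea rather than as the
API being drafted here; call_would_work and foo are hypothetical names.

```python
import inspect

def call_would_work(func, args, kwargs):
    """Return True if func(*args, **kwargs) would bind without error."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True

def foo(x, y, z=0):
    return x + y + z

ok = call_would_work(foo, (1, 2), {})            # enough arguments
too_few = call_would_work(foo, (1,), {})         # 'y' is missing
unknown = call_would_work(foo, (1, 2), {"w": 3}) # no parameter named 'w'

# The second variant: the actual bindings the call would produce.
bound = inspect.signature(foo).bind(1, y=2)
bound.apply_defaults()
bindings = dict(bound.arguments)
```

Neither helper calls foo itself; both only match arguments against the
parameter list.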

Now, I probably won't be participating in this discussion for the rest
of the week.  I am driving down to the Bay Area from Seattle for the
next few days and have no idea what my Internet access will be like. 
But I wanted to get this discussion going since it kept me up last
night thinking about it and I would like to sleep knowing
python-dev, in its infinite wisdom <grin>, is considering the issues. 
=)

-Brett

From fredrik at pythonware.com  Mon May  1 19:43:03 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 19:43:03 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de><e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de>	<e35ds3$p77$1@sea.gmane.org><e35e5u$q8l$1@sea.gmane.org>
	<4456415A.6040406@gradient.cis.upenn.edu>
Message-ID: <e35hb9$5pt$1@sea.gmane.org>

Edward Loper wrote:

> >> And again, why would you *make* me, the user-programmer, type
> >>
> >> make_person(name=namex, age=agex, phone=phonex, location = locationx)
> >> #instead of
> >> make_person(namex,agex,phonex,locationx)
> >> ?
> >
> > because a good API designer needs to consider more than just the current
> > release.
>
> I believe that it's quite possible that you're right, but I think more
> concrete answers would be helpful.  I.e., how does having parameters
> that are syntactically keyword-only (as opposed to being simply
> documented as keyword-only) help you develop an API over time?

As you've figured out, the main advantage is that by forcing people to do what
you intended, you can change things around without breaking existing code.

And others can implement your APIs without having to emulate every possible
implementation artifact.  (I guess this is something that distinguishes me from most
other Pythoneers; I don't consider an API design good enough until it has at least
two semi-independent implementations...)

> I gave some thought to it, and can come up with a few answers.  In all
> cases, assume a user BadUser, who decided to use positional arguments
> for arguments that you documented as keyword-only.
>
> - You would like to deprecate, and eventually remove, an argument to
>    a function.  For people who read your documentation, and use
>    keyword args for the keyword-only arguments, their code will break
>    in a clean, easy-to-understand way.  But BadUser's code may break in
>    strange hard-to-understand ways, since their positional arguments will
>    get mapped to the wrong function arguments.
>
> - You would like to add a new parameter to a function, and would like
>    to make that new parameter available for positional argument use.
>    So you'd like to add it before all the keyword arguments.  But this
>    will break BadUser's code.
>
> - You have a function that takes one argument and a bunch of keyword
>    options, and would like to change it to accept *varargs instead of
>    just one argument.  But this will break BadUser's code.
>
> I think that this kind of *concrete* use-case provides a better
> justification for the feature than just saying "it will help API
> design."  As someone who has had a fair amount of experience designing &
> maintaining APIs over time, perhaps you'd care to contribute some more
> use cases where you think having syntactically keyword-only arguments
> would help?

In addition to your cases, the major remaining use case is the "one or more
standard arguments, and one or more options"-style interface.  ET's write
method is a typical example.  The documented interface is

    tree.write(out, **options)

where 'out' is either a filename or something that has a 'write' method, and
the available options are implementation dependent.

An implementer that wants to support an "encoding" option could implement
this as

    def write(self, out, encoding="us-ascii"):

even if he only documents the write(out, **options) form, but developers
using e.g. introspection-based "intellisense" tools or pydoc or help() will
invariably end up calling this as

    tree.write(out, "utf-8")

which causes problems for reimplementors, and limits how the interface can
be modified in the future (once enough people do this, you cannot even go
back to a stricter "def write(self, out, **options)" approach without breaking
stuff).
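
As a small sketch of how the proposed syntax would address this,
assuming the bare '*' form from PEP 3102: the Tree class below is a
hypothetical stand-in, not the real ElementTree implementation.

```python
import io

class Tree:
    """Hypothetical stand-in for an ElementTree-like class."""

    def __init__(self, text):
        self.text = text

    # The bare '*' makes 'encoding' keyword-only, so the signature
    # enforces the documented write(out, **options) form.
    def write(self, out, *, encoding="us-ascii"):
        out.write(self.text.encode(encoding))

tree = Tree("hello")
buf = io.BytesIO()
tree.write(buf, encoding="utf-8")      # the documented spelling works

try:
    tree.write(buf, "utf-8")           # the positional spelling is refused
except TypeError as exc:
    print("rejected:", exc)
```

Introspection tools would still show the 'encoding' option, but a
reimplementation is free to reorder or regroup options without
breaking callers.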

I've suffered from enough "specification by reverse engineering" incidents
during my career to ignore things like this...

</F>




From amk at amk.ca  Mon May  1 19:47:11 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Mon, 1 May 2006 13:47:11 -0400
Subject: [Python-Dev] introducing the experimental pyref wiki
In-Reply-To: <e30coa$u43$1@sea.gmane.org>
References: <e30coa$u43$1@sea.gmane.org>
Message-ID: <20060501174711.GA12953@localhost.localdomain>

On Sat, Apr 29, 2006 at 08:54:00PM +0200, Fredrik Lundh wrote:
>     http://pyref.infogami.com/

I find this work very exciting.  Time hasn't been kind to the
reference guide -- as language features were added to 2.x, not
everything has been applied to the RefGuide, and users will probably
have been forced to read a mixture of the RefGuide and various PEPs.

The Reference Guide tries to provide a formal specification of the
language.  A while ago I wondered if we needed a "User's Guide" that
explains all the keywords, lists special methods, and that sort of
thing, in a style that isn't as formal and as complete as the
Reference Guide.  Now maybe we don't -- maybe the RefGuide can be
tidied bit by bit into something more readable.  

(Or are the two goals -- completeness and readability --
incompossible, unable to be met at the same time by one document?)

--amk

From john at integralsource.com  Mon May  1 19:51:37 2006
From: john at integralsource.com (John Keyes)
Date: Mon, 1 May 2006 18:51:37 +0100
Subject: [Python-Dev] unittest argv
In-Reply-To: <ca471dc20605011023y1a2a2279l10decee091118c32@mail.gmail.com>
References: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
	<ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
	<c0141a030605011011vceeae32n4b07b00be1dedf91@mail.gmail.com>
	<ca471dc20605011023y1a2a2279l10decee091118c32@mail.gmail.com>
Message-ID: <c0141a030605011051m17051a98yeca74abd321a2105@mail.gmail.com>

On 5/1/06, Guido van Rossum <guido at python.org> wrote:
> On 5/1/06, John Keyes <john at integralsource.com> wrote:
> > On 5/1/06, Guido van Rossum <guido at python.org> wrote:
> > > Wouldn't this be an incompatible change? That would make it a no-no.
> > > Providing a dummy argv[0] isn't so hard is it?
> >
> > It would be incompatible with existing code, but that code is
> > already broken (IMO) by passing a dummy argv[0].
>
> That's a new meaning of "broken", one that I haven't heard before.
> It's broken because it follows the API?!?!

Fair enough, a bad use of language on my part.

> >I don't
> > think fixing it would affect much code, because normally
> > people don't specify the '-q' or '-v' in code, it is almost
> > exclusively used on the command line.
>
> Famous last words.

Probably ;)

> > The only reason I came across it was that I was modifying
> > an ant task (py-test) so it could handle all of the named
> > arguments that TestProgram.__init__ supports.
> >
> > If the list index code can't change, at a minimum the default value
> > for argv should change from None to sys.argv.
>
> No. Late binding of sys.argv is very important. There are plenty of
> uses where sys.argv is dynamically modified.

Can you explain this some more?  If it all happens in the same
function call, how can it be late binding?

> > Are the tests for unittest.py?
>
> Assuming you meant "Are there tests", yes: test_unittest.py. But it needs work.

Ok thanks,
-John K

From aahz at pythoncraft.com  Mon May  1 20:00:49 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 1 May 2006 11:00:49 -0700
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <4455F8E5.7070203@canterbury.ac.nz>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz> <4455F415.3040104@gmail.com>
	<4455F8E5.7070203@canterbury.ac.nz>
Message-ID: <20060501180049.GA16007@panix.com>

On Tue, May 02, 2006, Greg Ewing wrote:
> Nick Coghlan wrote:
>>
>> the context expression in the with statement produces a context
>> manager with __enter__ and __exit__ methods which set up and tear
>> down a managed context for the body of the with statement. This is
>> very similar to your later suggestion of context guard and guarded
>> context.
>
> Currently I think I still prefer the term "guard", since it does a
> better job of conjuring up the same sort of idea as a try-finally.

"Guard" really doesn't work for me.  It seems clear (especially in light
of Fredrik's docs) that "wrapper" comes much closer to what's going on.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From dinov at exchange.microsoft.com  Mon May  1 17:31:40 2006
From: dinov at exchange.microsoft.com (Dino Viehland)
Date: Mon, 1 May 2006 08:31:40 -0700
Subject: [Python-Dev] __getslice__ usage in sre_parse
In-Reply-To: <ca471dc20604300830i7cc814f9o183107b6e71329b2@mail.gmail.com>
Message-ID: <4039D552ADAB094BB1EA670F3E96214E02ADE24F@df-foxhound-msg.exchange.corp.microsoft.com>

I've also opened a bug for supporting __getslice__ in IronPython.

Do you want to help develop Dynamic languages on CLR? (http://members.microsoft.com/careers/search/details.aspx?JobID=6D4754DE-11F0-45DF-8B78-DC1B43134038)

-----Original Message-----
From: python-dev-bounces+dinov=microsoft.com at python.org [mailto:python-dev-bounces+dinov=microsoft.com at python.org] On Behalf Of Guido van Rossum
Sent: Sunday, April 30, 2006 8:31 AM
To: Sanghyeon Seo
Cc: Discussion of IronPython; python-dev at python.org
Subject: Re: [Python-Dev] __getslice__ usage in sre_parse

On 4/28/06, Sanghyeon Seo <sanxiyn at gmail.com> wrote:
> Hello,
>
> Python language reference 3.3.6 deprecates __getslice__. I think it's
> okay that UserList.py has it, but sre_parse shouldn't use it, no?

Well, as long as the deprecated code isn't removed, there's no reason
why other library code shouldn't use it. So I disagree that
technically there's a reason why sre_parse shouldn't use it.

> __getslice__ is not implemented in IronPython and this breaks usage of
> _sre.py, a pure-Python implementation of _sre, on IronPython:
> http://ubique.ch/code/_sre/
>
> _sre.py is needed for me because IronPython's own regex implementation
> using underlying .NET implementation is not compatible enough for my
> applications. I will write a separate bug report for this.
>
> It should be a matter of removing __getslice__ and adding
> isinstance(index, slice) check in __getitem__. I would very much
> appreciate it if this is fixed before Python 2.5.
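
The change described above might look roughly like this.  It is a
minimal sketch with a hypothetical Tokens class, not the actual
sre_parse code:

```python
class Tokens:
    """Hypothetical sequence wrapper, standing in for a class that
    previously defined __getslice__."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices arrive here as slice objects; return the same type.
            return Tokens(self.data[index])
        return self.data[index]

t = Tokens("abcdef")
part = t[1:4]       # handled by the slice branch
first = t[0]        # handled by the plain-index branch
```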

You can influence the fix yourself -- please write a patch (relative
to Python 2.5a2 that was just released), submit it to Python's patch
tracker on SourceForge (read python.org/dev first), and then send
an email here to alert the developers. This ought to be done well
before the planned 2.5b1 release (see PEP 356 for the 2.5 release
timeline). You should make sure that the patched Python 2.5 passes all
unit tests before submitting your patch.

Good luck!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tjreedy at udel.edu  Mon May  1 20:07:16 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2006 14:07:16 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>
	<e33mj1$5nd$1@sea.gmane.org><44557789.5020107@gradient.cis.upenn.edu><200605010930.23198.fdrake@acm.org><44560F74.5000600@gradient.cis.upenn.edu>
	<ca471dc20605010728g2742bd01m4b4b284c15eabb18@mail.gmail.com>
Message-ID: <e35ioj$al9$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc20605010728g2742bd01m4b4b284c15eabb18 at mail.gmail.com...

> A function/method could have one argument that is obviously needed and
> a whole slew of options that few people care about. For most people,
> the signature they know is foo(arg). It would be nice if all the
> options were required to be written as keyword arguments, so the
> reader will not have to guess what foo(arg, True) means.

Ok, this stimulates an old memory of writing IBM JCL statements something 
like

DD FILENAME,,,,,2,,HOP

where you had to carefully count commas to correctly write and later read 
which defaulted options were being overridden.

dd(filename, unit=2, meth = 'hop') is much nicer.

So it seems to me now that '*, name1 = def1, name2=def2, ...' in the 
signature is really a substitute for and usually an improvement upon 
(easier to write, read, and programmatically extract) '**names' in the 
signature followed by
  name1 = names.get('name1', def1)
  name2 = names.get('name2', def2)
  ...
  (with the semantic difference being when defs are calculated)
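
The semantic difference can be made visible with a small sketch.  It
assumes the '*' syntax from the PEP; next_default, with_star and
with_kwargs are hypothetical names used only for illustration.

```python
import itertools

counter = itertools.count(1)

def next_default():
    # Hypothetical helper: returns 1, 2, 3, ... on successive calls,
    # which makes it visible when a default expression is evaluated.
    return next(counter)

# Keyword-only form: the default is computed once, when 'def' runs.
def with_star(arg, *, name1=next_default()):
    return name1

# '**names' form: the default expression is evaluated on every call
# (dict.get evaluates its second argument even if the key is present).
def with_kwargs(arg, **names):
    name1 = names.get("name1", next_default())
    return name1

star_results = [with_star(0), with_star(0)]
kwargs_results = [with_kwargs(0), with_kwargs(0)]
print(star_results, kwargs_results)
```

The first list repeats a single value because the default was fixed at
definition time; the second list changes from call to call.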

(But I still don't quite see why one would require that args for required, 
non-defaulted param be passed by name instead of position ;-).

As something of an aside, the use of 'keyword' to describe arguments passed 
by name instead of position conflicts with and even contradicts the Ref Man 
definition of keyword as an identifier, with a reserved use, that cannot be 
used as a normal identifier.  The parameter names used to pass an argument 
by name are normal identifiers and cannot be keywords in the other sense of 
the word.  (Yes, I understand that this is usual CS usage, but Python docs 
can be more careful.)  So I would like to see the name of this PEP changed 
to 'Name-only arguments'

Terry Jan Reedy




From tjreedy at udel.edu  Mon May  1 20:09:26 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2006 14:09:26 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de><445600B3.7040903@gradient.cis.upenn.edu><44562A4D.30102@v.loewis.de><e35a4q$c23$1@sea.gmane.org>
	<e35g9r$275$1@sea.gmane.org>
Message-ID: <e35isl$b4h$1@sea.gmane.org>


"Terry Reedy" <tjreedy at udel.edu> wrote in message 
news:e35g9r$275$1 at sea.gmane.org...
>
> "Fredrik Lundh" <fredrik at pythonware.com> wrote in message 
> news:e35a4q$c23$1 at sea.gmane.org...
>> which reminds me of the following little absurdity gem from the language
>> reference:

> I am not sure of what you see as absurdity,

Perhaps I do.  Were you referring to what I wrote in the last paragraph of 
my response to Guido?

tjr




From fredrik at pythonware.com  Mon May  1 20:11:08 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 20:11:08 +0200
Subject: [Python-Dev] more pyref: continue in finally statements
Message-ID: <e35ivt$biu$1@sea.gmane.org>

the language reference says:

    continue may only occur syntactically nested in a for or while loop,
    but not nested in a function or class definition or finally statement
    within that loop. /.../

    It may occur within an except or else clause. The restriction on occurring
    in the try clause is implementor's laziness and will eventually be lifted.

and it looks like the new compiler still has the same issue:

    $ python test.py
        File "test.py", line 5:
            continue
    SyntaxError: 'continue' not supported inside 'finally' clause

how hard would it be to fix this ?

(shouldn't the "try clause" in the note read "finally clause", btw?  "continue"
within the "try" suite seems to work just fine...)

</F>




From guido at python.org  Mon May  1 20:24:19 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 11:24:19 -0700
Subject: [Python-Dev] more pyref: continue in finally statements
In-Reply-To: <e35ivt$biu$1@sea.gmane.org>
References: <e35ivt$biu$1@sea.gmane.org>
Message-ID: <ca471dc20605011124p573c5cc1of812f08ddbfca6a4@mail.gmail.com>

Strange. I thought this was supposed to be fixed? (But I can confirm
that it isn't.)

BTW there's another bug in the compiler: it doesn't diagnose this
inside "while 0".

--Guido

On 5/1/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> the language reference says:
>
>     continue may only occur syntactically nested in a for or while loop,
>     but not nested in a function or class definition or finally statement
>     within that loop. /.../
>
>     It may occur within an except or else clause. The restriction on occurring
>     in the try clause is implementor's laziness and will eventually be lifted.
>
> and it looks like the new compiler still has the same issue:
>
>     $ python test.py
>         File "test.py", line 5:
>             continue
>     SyntaxError: 'continue' not supported inside 'finally' clause
>
> how hard would it be to fix this ?
>
> (shouldn't the "try clause" in the note read "finally clause", btw?  "continue"
> within the "try" suite seems to work just fine...)
>
> </F>
>
>
>
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 20:31:00 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 11:31:00 -0700
Subject: [Python-Dev] unittest argv
In-Reply-To: <c0141a030605011051m17051a98yeca74abd321a2105@mail.gmail.com>
References: <c0141a030604301859y3937484cqe40fec2b04192296@mail.gmail.com>
	<ca471dc20605010748p36e32c49y5d29f7bc0277c216@mail.gmail.com>
	<c0141a030605011011vceeae32n4b07b00be1dedf91@mail.gmail.com>
	<ca471dc20605011023y1a2a2279l10decee091118c32@mail.gmail.com>
	<c0141a030605011051m17051a98yeca74abd321a2105@mail.gmail.com>
Message-ID: <ca471dc20605011131n38736fd3k1a7d6db89e81af8d@mail.gmail.com>

On 5/1/06, John Keyes <john at integralsource.com> wrote:
> > No. Late binding of sys.argv is very important. There are plenty of
> > uses where sys.argv is dynamically modified.
>
> Can you explain this some more?  If it all happens in the same
> function call, how can it be late binding?

You seem to be unaware of the fact that defaults are computed once,
when the 'def' is executed (typically when the module is imported).

Consider module A containing this code:

  import sys
  def foo(argv=sys.argv):
      print argv

and module B doing

  import sys
  import A
  sys.argv = ["a", "b", "c"]
  A.foo()

This will print the initial value for sys.argv, not ["a", "b", "c"].
With the late binding version it will print ["a", "b", "c"]:

  def foo(argv=None):
    if argv is None:
      argv = sys.argv
    print argv
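
The same contrast can be reproduced in a single file.  This is a
sketch only; the function names early and late are made up for the
demonstration.

```python
import sys

def early(argv=sys.argv):      # default evaluated once, at 'def' time
    return argv

def late(argv=None):           # default looked up on every call
    if argv is None:
        argv = sys.argv
    return argv

original = sys.argv
sys.argv = ["a", "b", "c"]     # rebind the name, as module B does above
try:
    early_result = early()
    late_result = late()
finally:
    sys.argv = original        # restore so the demo has no side effects

print(early_result is original)
print(late_result)
```

Note that the difference only shows up when sys.argv is rebound to a
new list; an in-place change such as sys.argv[:] = [...] would be seen
by both versions, since they would still refer to the same list object.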

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon May  1 20:37:37 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 11:37:37 -0700
Subject: [Python-Dev] introducing the experimental pyref wiki
In-Reply-To: <20060501174711.GA12953@localhost.localdomain>
References: <e30coa$u43$1@sea.gmane.org>
	<20060501174711.GA12953@localhost.localdomain>
Message-ID: <ca471dc20605011137q1e69dc97l9ba34c4f4bb071b4@mail.gmail.com>

Agreed. Is it too late to also attempt to bring Doc/ref/*.tex
completely up to date and remove confusing language from it? Ideally
that's the authoritative Language Reference -- admittedly it's been
horribly out of date but needn't stay so forever.

--Guido

On 5/1/06, A.M. Kuchling <amk at amk.ca> wrote:
> On Sat, Apr 29, 2006 at 08:54:00PM +0200, Fredrik Lundh wrote:
> >     http://pyref.infogami.com/
>
> I find this work very exciting.  Time hasn't been kind to the
> reference guide -- as language features were added to 2.x, not
> everything has been applied to the RefGuide, and users will probably
> have been forced to read a mixture of the RefGuide and various PEPs.
>
> The Reference Guide tries to provide a formal specification of the
> language.  A while ago I wondered if we needed a "User's Guide" that
> explains all the keywords, lists special methods, and that sort of
> thing, in a style that isn't as formal and as complete as the
> Reference Guide.  Now maybe we don't -- maybe the RefGuide can be
> tidied bit by bit into something more readable.
>
> (Or are the two goals -- completeness and readability --
> incompossible, unable to be met at the same time by one document?)
>
> --amk
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon May  1 20:44:52 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 01 May 2006 14:44:52 -0400
Subject: [Python-Dev] introducing the experimental pyref wiki
In-Reply-To: <ca471dc20605011137q1e69dc97l9ba34c4f4bb071b4@mail.gmail.com>
References: <20060501174711.GA12953@localhost.localdomain>
	<e30coa$u43$1@sea.gmane.org>
	<20060501174711.GA12953@localhost.localdomain>
Message-ID: <5.1.1.6.0.20060501144339.01e6f6c8@mail.telecommunity.com>

At 11:37 AM 5/1/2006 -0700, Guido van Rossum wrote:
>Agreed. Is it too late to also attempt to bring Doc/ref/*.tex
>completely up to date and remove confusing language from it? Ideally
>that's the authoritative Language Reference -- admittedly it's been
>horribly out of date but needn't stay so forever.

Well, I added stuff for PEP 343, but PEP 342 (yield expression plus 
generator-iterator methods) hasn't really been added yet, mostly because I 
was unsure of how to fit it in without one of those "first let's explain it 
how it was, then how we changed it" sort of things.  :(


From edloper at gradient.cis.upenn.edu  Mon May  1 20:40:31 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Mon, 01 May 2006 14:40:31 -0400
Subject: [Python-Dev] signature object issues (to discuss while I am out
 of	contact)
In-Reply-To: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
Message-ID: <4456561F.5090804@gradient.cis.upenn.edu>

Brett Cannon wrote:
> The second question is whether it is worth providing a function that
> will figure out whether a tuple and dict representing arguments
> would work in calling the function.  Some have even suggested a
> function that returns the actual bindings if the call were to occur. 
> Personally I don't see a huge use for either, but even less for the
> latter version.  If people have a legit use case for either please
> speak up, otherwise I am tempted to keep the object simple.

One use case that comes to mind is a type-checking decorator (or 
precondition-checking decorator, etc):

@precondition(lambda x,y: x>y)
@precondition(lambda y,z: y>z)
def foo(x, y, z): ...

where precondition is something like:

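def precondition(test):
     def add_precondition(func):
         def f(*args, **kwargs):
             # bindings() is the proposed signature-object method that
             # maps the call's arguments to parameter names
             bindings = func.__signature__.bindings(args, kwargs)
             if not test(**bindings):
                 raise ValueError('Precondition not met')
             # pass the wrapped function's result back to the caller
             return func(*args, **kwargs)
         return f
     return add_precondition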
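
For readers who want to try the idea, here is a minimal sketch using
inspect.signature() and Signature.bind() from later Python versions in
place of the proposed __signature__.bindings() call. It is an
assumption that the proposed method would behave like bind(). Unlike
the code above, the sketch passes each test only the arguments it
names, which is a choice made for this illustration.

```python
import functools
import inspect


def precondition(test):
    # Names of the parameters the test wants, e.g. ('x', 'y').
    wanted = list(inspect.signature(test).parameters)

    def add_precondition(func):
        # signature() follows __wrapped__, so stacked decorators still
        # see the parameters of the innermost function.
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            relevant = {name: bound.arguments[name] for name in wanted}
            if not test(**relevant):
                raise ValueError('Precondition not met')
            return func(*args, **kwargs)
        return wrapper
    return add_precondition


@precondition(lambda x, y: x > y)
@precondition(lambda y, z: y > z)
def foo(x, y, z):
    return (x, y, z)
```

Calling foo(3, 2, 1) passes both checks, while foo(1, 2, 3) fails the
first one and raises ValueError.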

-Edward

From fuzzyman at voidspace.org.uk  Mon May  1 20:47:39 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 May 2006 19:47:39 +0100
Subject: [Python-Dev] introducing the experimental pyref wiki
In-Reply-To: <20060501174711.GA12953@localhost.localdomain>
References: <e30coa$u43$1@sea.gmane.org>
	<20060501174711.GA12953@localhost.localdomain>
Message-ID: <445657CB.2040603@voidspace.org.uk>

A.M. Kuchling wrote:
> On Sat, Apr 29, 2006 at 08:54:00PM +0200, Fredrik Lundh wrote:
>   
>>     http://pyref.infogami.com/
>>     
>
> I find this work very exciting.  Time hasn't been kind to the
> reference guide -- as language features were added to 2.x, not
> everything has been applied to the RefGuide, and users will probably
> have been forced to read a mixture of the RefGuide and various PEPs.
>
> The Reference Guide tries to provide a formal specification of the
> language.  A while ago I wondered if we needed a "User's Guide" that
> explains all the keywords, lists special methods, and that sort of
> thing, in a style that isn't as formal and as complete as the
> Reference Guide.  Now maybe we don't -- maybe the RefGuide can be
> tidied bit by bit into something more readable.  
>   
At my company we recently got badly bitten because the language syntax 
as defined in the 'language reference' varies wildly from the grammar in 
SVN.

We had to implement it twice!

When we tried to check syntax usage (as distinct from the BNF type 
specification in the grammar) we found that things like keyword 
arguments are not documented anywhere, except possibly in the 
tutorial.

Currently the language reference seems to neither reflect the language 
definition in the grammar, nor be a good reference for users (except 
parts that are excellent).

A user's guide that straddles the middle would be very useful and, with 
some shepherding, could probably be written mainly through community input.

I also find that when trying to implement objects with 'the magic 
methods', I have to search in several places in the documentation.

For example, to implement a mapping type I will probably need to refer 
to the following pages:

    http://docs.python.org/ref/sequence-types.html
    http://docs.python.org/lib/typesmapping.html
    http://docs.python.org/ref/customization.html
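
As a rough illustration of how many special methods even a small
mapping type touches, here is a sketch of a dictionary-like class. The
class name and its key-folding behaviour are hypothetical, chosen only
for this example; it is not taken from any of the pages listed above.

```python
class LowerCaseDict:
    """Toy mapping that folds string keys to lower case (hypothetical)."""

    def __init__(self):
        self._data = {}

    # Item access: d[key], d[key] = value, del d[key]
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    # Membership, iteration and size: key in d, iter(d), len(d)
    def __contains__(self, key):
        return key.lower() in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    # Ordinary (non-special) mapping methods
    def keys(self):
        return list(self._data)

    def get(self, key, default=None):
        return self._data.get(key.lower(), default)
```

The special methods and the ordinary mapping methods are described in
different parts of the documentation, which is the scattering
described above.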

Michael Foord 

> (Or are the two goals -- completeness and readability --
> incompossible, unable to be met at the same time by one document?)
>
> --amk


From martin at v.loewis.de  Mon May  1 20:52:44 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 01 May 2006 20:52:44 +0200
Subject: [Python-Dev] introducing the experimental pyref wiki
In-Reply-To: <ca471dc20605011137q1e69dc97l9ba34c4f4bb071b4@mail.gmail.com>
References: <e30coa$u43$1@sea.gmane.org>	<20060501174711.GA12953@localhost.localdomain>
	<ca471dc20605011137q1e69dc97l9ba34c4f4bb071b4@mail.gmail.com>
Message-ID: <445658FC.70806@v.loewis.de>

Guido van Rossum wrote:
> Agreed. Is it too late to also attempt to bring Doc/ref/*.tex
> completely up to date and remove confusing language from it? Ideally
> that's the authoritative Language Reference -- admittedly it's been
> horribly out of date but needn't stay so forever.

It's never too late to update the specification. I really think there
should be a specification, and I really think it should be as precise
as possible - where "possible" takes both of these into account:
- it may get out of date due to lack of contributors. This is free
  software, and you don't always get what you want unless you do
  it yourself (and even then, sometimes not).
- it might be deliberately vague to allow for different implementation
  strategies. Ideally, it would be precise in pointing out where it
  is deliberately vague.

So I think the PEPs all should be merged into the documentation,
at least their specification parts (rationale, history, examples
might stay in the PEPs).

To some degree, delivery of documentation can be enforced by making
acceptance of the PEP conditional upon creation of documentation
patches. Once the feature is committed, we can only hope for (other)
volunteers to provide the documentation, or keep nagging the author
of the code to produce it.

Regards,
Martin

From jcarlson at uci.edu  Mon May  1 21:02:21 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 01 May 2006 12:02:21 -0700
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <4455CD18.8030200@v.loewis.de>
References: <20060430231024.674A.JCARLSON@uci.edu>
	<4455CD18.8030200@v.loewis.de>
Message-ID: <20060501092208.6756.JCARLSON@uci.edu>


Before I get into my reply, I'm going to start out by defining a new
term:

operationX - the operation of interpreting information differently than
how it is presented, generally by constructing a data structure based on
the input information.
    e.g. programming language source file -> parse tree,
        natural language -> parse tree or otherwise,
        structured data file -> data structure (tree, dictionary, etc.),
        etc.
synonyms: parsing, unmarshalling, interpreting, ...

Any time I would previously describe something as some variant of 'parse',
replace that with 'operationX'.  I will do that in all of my further
replies.

"Martin v. Löwis" <martin at v.loewis.de> wrote:
> Josiah Carlson wrote:
> > Certainly that is the case.  But how would you propose embedded bytes
> > data be represented? (I talk more extensively about this particular
> > issue later).
> 
> Can't answer: I don't know what "embedded bytes data" are.

I described this before as the output of img2py from wxPython.  Here's a
sample which includes the py.ico from Python 2.3.

#----------------------------------------------------------------------
# This file was generated by C:\Python23\Scripts\img2py
#
from wx import ImageFromStream, BitmapFromImage
import cStringIO, zlib


def getData():
    return zlib.decompress(
'x\xda\x01\x14\x02\xeb\xfd\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00 \
\x00\x00\x00 \x08\x06\x00\x00\x00szz\xf4\x00\x00\x00\x04sBIT\x08\x08\x08\x08\
|\x08d\x88\x00\x00\x01\xcbIDATX\x85\xb5W\xd1\xb6\x84 \x08\x1ct\xff;\xf6\xc3\
\x93\xfb`\x18\x92f\xb6^:\x9e\xd4\x94\x19\x04\xd4\x88BDO$\xed\x02\x00\x14"i]\
\xdb\xddI\x93B=\x02\x92va\xceu\xe6\\T\x98\xd7\x91h\x12\xb0\xd6z\xb1\xa4V\x90\
\xf8\xf4>\xc8A\x81\xe8\xack\xdb\xae\xc6\xbf\x11\xc8`\x02\x80L\x1d\xa5\xbdJ\
\xc2Rm\xab\t\x88PU\xb7m\xe0>V^\x13(\xa9G\xa7\xbf\xb5n\xfd\xafoI\xbbhyC\xa0\
\xc4\x80*h\x05\xd8]\xd0\xd5\xe9y\xee\x1bO\t\x10\x85X\xe5\xfc\x8c\xf0\x06\xa0\
\x91\x153)\x1af\xc1y\xab\xdf\x906\x81\xa7.)1\xe0w\xba\x1e\xb0\xaf\xff*C\x02\
\x17`\xc2\xa3\xad\xe0\xe9*\x04U\xec\x97\xb6i\xb1\x02\x0b\xc0_\xd3\xf7C2\xe6\
\xe9#\x05\xc6bfG\xc8\xcc-\xa4\xcc\xd8Q0\x06\n\x91\x14@\x15To;\xdd\x125s<\xf0\
\x8c\x94\xd3\xd0\xfa\xab\xb5\xeb{\xcb\xcb\x1d\xe1\xd7\x15\xf0\x1d\x1e\x9c9wz\
p\x0f\xfa\x06\x1cp\xa7a\x01?\x82\x8c7\x80\xf5\xe3\xa1J\x95\xaa\xf5\xdc\x00\
\x9f\x91\xe2\x82\xa4g\x80\x0f\xc8\x06p9\x0f\xb66\xf8\xccNH\x14\xe2\t\xde\x1a\
`\x14\x8d|>\x0b\x0e\x00\x9f\x94v\t!RJ\xbb\xf4&VV/\x04\x97\xb4K\xe5\x82\xe0&\
\x97\xcc\x18X\xfd\x16\x1cxx+\x06\xfa\xfeVp+\x17\xb7\xb9~\xd5\xcd<\xb8\x13V\
\xdb\xf1\r\xf8\xf54\xcc\xee\xbc\x18\xc1\xd7;G\x93\x80\x0f\xb6.\xc1\x06\xf8\
\xd9\x7f=\xe6[c\xbb\xff\x05O\x97\xff\xadh\xcct]\xb0\xf2\xcc/\xc6\x98mV\xe3\
\xe1\xf1\xb5\xbcGhDT--\x87\x9e\xdb\xca\xa7\xb2\xe0"\xe6~\xd0\xfb6LM\n\xb1[\
\x90\xef\n\xe5a>j\x19R\xaaq\xae\xdc\xe9\xad\xca\xdd\xef\xb9\xaeD\x83\xf4\xb2\
\xff\xb3?\x1c\xcd1U-7%\x96\x00\x00\x00\x00IEND\xaeB`\x82\xdf\x98\xf1\x8f' )

def getBitmap():
    return BitmapFromImage(getImage())

def getImage():
    stream = cStringIO.StringIO(getData())
    return ImageFromStream(stream)


That data is non-textual.  It is bytes within a string literal.  And it
is embedded (within a .py file).


> > I am apparently not communicating this particular idea effectively
> > enough.  How would you propose that I store parsing literals for
> > non-textual data, and how would you propose that I set up a dictionary
> > to hold some non-trivial number of these parsing literals?
> 
> I can't answer that question: I don't know what a "parsing literal
> for non-textual data" is. If you are asking how you represent bytes
> object in source code: I would encode them as a list of integers,
> then use, say,
> 
>   parsing_literal = bytes([3,5,30,99])

An operationX literal is a symbol that describes how to interpret the
subsequent or previous data.  For an example of this, see the pickle
module (portions of which I include below).


> > From what I understand, it would seem that you would suggest that I use
> > something like the following...
> > 
> > handler = {bytes('...', encoding=...).encode('latin-1'): ...,
> >            #or
> >            '\uXXXX\uXXXX...': ...,
> >            #or even without bytes/str
> >            (0xXX, 0xXX, ...): ..., }
> > 
> > Note how two of those examples have non-textual data inside of a Python
> > 3.x string?  Yeah.
> 
> Unfortunately, I don't notice. I assume you don't mean a literal '...';
> if this is what you represent, I would write
> 
> handler = { '...': "some text" }
> 
> But I cannot guess what you want to put into '...' instead.

I described before how you would use this kind of thing to perform
operationX on structured information.  It turns out that pickle (in
Python) uses a dictionary of operationX symbols/literals -> unbound
instance methods to perform operationX on the pickled representation of
Python objects (literals where XXXX = '...' are defined, and symbols
using the XXXX names). The relevant code for unpickling is the while 1:
section of the following.

    def load(self):
        """Read a pickled object representation from the open file.

        Return the reconstituted object hierarchy specified in the file.
        """
        self.mark = object() # any new unique object
        self.stack = []
        self.append = self.stack.append
        read = self.read
        dispatch = self.dispatch
        try:
            while 1:
                key = read(1)
                dispatch[key](self)
        except _Stop, stopinst:
            return stopinst.value
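
The same dispatch pattern can be sketched on a toy format. This is
not pickle's format: the three one-byte tags and the class name are
made up for the illustration, and it is written for a Python where
reading a binary stream returns bytes objects.

```python
import io


class _Stop(Exception):
    """Carries the finished value out of the dispatch loop."""

    def __init__(self, value):
        self.value = value


class ToyLoader:
    dispatch = {}  # one-byte tag -> handler (plain function)

    def __init__(self, stream):
        self.read = stream.read
        self.stack = []

    def load(self):
        try:
            while True:
                key = self.read(1)
                self.dispatch[key](self)
        except _Stop as stopinst:
            return stopinst.value

    def load_byte(self):
        # 'B' is followed by one raw byte, pushed as an integer.
        self.stack.append(self.read(1)[0])
    dispatch[b'B'] = load_byte

    def load_pair(self):
        # 'P' replaces the two topmost stack items with a 2-tuple.
        second = self.stack.pop()
        first = self.stack.pop()
        self.stack.append((first, second))
    dispatch[b'P'] = load_pair

    def load_stop(self):
        # '.' ends the stream; the stack top is the result.
        raise _Stop(self.stack.pop())
    dispatch[b'.'] = load_stop


result = ToyLoader(io.BytesIO(b'B\x03B\x05P.')).load()
```

Here the tags b'B', b'P' and b'.' play the role of the operationX
literals, and none of the data being interpreted is text.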


> >>> We've not removed the problem, only changed it from being contained 
> >>> in non-unicode
> >>> strings to be contained in unicode strings (which are 2 or 4 times larger
> >>> than their non-unicode counterparts).
> >> We have removed the problem.
> > 
> > Excuse me?  People are going to use '...' to represent literals of all
> > different kinds.
> 
> In Python 3, '...' will be a character string. You can't use it to
> represent anything else but characters.
> 
> Can you give examples of actual source code where people use '...'
> something other than text?


For an example of where people use '...' to represent non-textual
information in a literal, see the '# Protocol 2' section of pickle.py ...


# Protocol 2

PROTO           = '\x80'  # identify pickle protocol
NEWOBJ          = '\x81'  # build object by applying cls.__new__ to argtuple
EXT1            = '\x82'  # push object from extension registry; 1-byte index
EXT2            = '\x83'  # ditto, but 2-byte index
EXT4            = '\x84'  # ditto, but 4-byte index
TUPLE1          = '\x85'  # build 1-tuple from stack top
TUPLE2          = '\x86'  # build 2-tuple from two topmost stack items
TUPLE3          = '\x87'  # build 3-tuple from three topmost stack items
NEWTRUE         = '\x88'  # push True
NEWFALSE        = '\x89'  # push False
LONG1           = '\x8a'  # push long from < 256 bytes
LONG4           = '\x8b'  # push really big long


Also look at the getData() function I defined earlier in this post for
non-text string literals.


> > What does pickle.load(...) do to the files that are passed into it?  It
> > reads the (possibly binary) data it reads in from a file (or file-like
> > object), performing a particular operation based on a dictionary
> > of expected tokens in the file, producing a Python object. I would say
> > that pickle 'parses' the content of a file, which I presume isn't
> > necessarily text.
> 
> Right. I think pickle can be implemented with just the bytes type.
> There is a textual and a binary version of the pickle format. The
> textual should be read as text; the binary version using bytes.
> (I *think* the encoding of the textual version is ASCII, but one
> would have to check).

The point of this example was to show that operationX isn't necessarily
the processing of text, but may in fact be the interpretation of binary
data. It was also supposed to show how one may need to define symbols
for such interpretation via literals of some kind.  In the pickle module,
this is done in two parts: XXX = <literal>; dispatch[XXX] = fcn.  I've
also seen it as dispatch = {<literal>: fcn}

In regards to the text pickles using text as input, and binary pickles
using bytes as input, I would remind you that bytes are mutable.  This
isn't quite as big a deal with pickle, but one would need to either add
both the '...' and int literals to the dictionary:
    dispatch[SYMBOL] = fcn
    dispatch[SYMBOL.encode('latin-1')[0]] = fcn

Or always use text _or_ integers in the decoding:

#decoding bytes when only str in dispatch...

    dispatch[bytesX.decode('latin-1')](self)

#encoding str when only int in dispatch...

    dispatch[strX.encode('latin-1')[0]](self)
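
A small sketch of those two conversions, run under a present-day
Python 3. Note the assumption: in that Python, bytes objects ended up
immutable and indexing one yields an int, which differs from the
mutable bytes type under discussion here. The handler name is made up.

```python
SYMBOL = '\x80'  # str form of a one-byte tag, like PROTO in pickle.py

# str tag -> integer tag, and back again, via latin-1
as_int = SYMBOL.encode('latin-1')[0]
as_str = bytes([as_int]).decode('latin-1')


def handle_proto(where):
    # Hypothetical handler; it only reports which table found it.
    return 'PROTO via ' + where


# Option 1: table keyed by str, so decode the byte that was read.
str_dispatch = {SYMBOL: handle_proto}
# Option 2: table keyed by int, so index the byte that was read.
int_dispatch = {as_int: handle_proto}

key = b'\x80'  # stands in for read(1) on a binary stream
via_str = str_dispatch[key.decode('latin-1')]('str table')
via_int = int_dispatch[key[0]]('int table')
```

Latin-1 maps each of the 256 byte values to the code point with the
same number, which is why the round trip is lossless.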


> > Replace pickle with a structured data storage format of your choice.
> > It's still parsing (at least according to my grasp of English, which
> > could certainly be flawed (my wife says as much on a daily basis)).
> 
> Well, I have no "native language" intuition with respect to these
> words - I only use them in the way I see them used elsewhere. I see
> "parsing" typically associated with textual data, and "unmarshalling"
> with binary data.

Before Python, I hadn't heard of 'marshal' in relation to the storage or
retrieval of data structures as data, and at least in my discussion
about such topics with my friends and colleagues, unmarshalling is an
example of parsing.

In any case, you can replace many of my uses of 'parsing' with
'unmarshalling', or even 'operationX', which I defined at the beginning
of this email.


> >>> Look how successful and
> >>> effective it has been so far in the history of Python.  In order to make
> >>> the bytes object be as effective in 3.x, one would need to add basically
> >>> all of the Python 2.x string methods to it
> >> The precondition of this clause is misguided: the bytes type doesn't
> >> need to be as effective, since the string type is as effective in 2.3,
> >> so you can do all parsing based on strings.
> > 
> > Not if those strings contain binary data.
> 
> I couldn't (until right now) understand what you mean by "parsing binary
> data". I still doubt that you typically need the same operations for
> unmarshalling that you need for parsing.
> 
> In parsing, you need to look (scan) for separators, follow a possibly
> recursive grammar, and so on. For unmarshalling, you typically have
> a TLV (tag-length-value) structure, where you read a tag (of fixed
> size), then the length, then the value (of the size indicated in the
> length). There are variations, of course, but you typically don't
> need .find, .startswith, etc.

See any line-based socket protocol for where .find() is useful.  We'll
take telnet, for example.  A telnet line is terminated by a CRLF pair,
and being able to interpret incoming information on a line-by-line basis
is sufficient to support all of the required portions of the RFC.  In
addition, telnet lines may contain non-text information as escapes
(ascii 0-8, 11-12, 14-31, 128-255), regardless of line buffering.
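
As a sketch of that use of .find(), here is a line splitter for a
receive buffer, written against the bytes type of a present-day
Python 3. The function name and the sample buffer are made up for the
illustration; this is not a telnet implementation.

```python
def split_lines(buffer):
    """Split complete CRLF-terminated lines off the front of buffer.

    Returns (lines, rest), where rest is the trailing partial line.
    """
    lines = []
    start = 0
    while True:
        end = buffer.find(b'\r\n', start)
        if end == -1:
            break
        lines.append(buffer[start:end])
        start = end + 2  # skip past the CRLF pair
    return lines, buffer[start:]


# The first line carries three non-text bytes ahead of its terminator.
lines, rest = split_lines(b'login\xff\xfb\x01\r\nuser\r\npart')
```

The lines come back as raw bytes, so any non-text escapes in them are
left untouched for a later stage to interpret.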


 - Josiah


From martin at v.loewis.de  Mon May  1 21:01:32 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 01 May 2006 21:01:32 +0200
Subject: [Python-Dev] more pyref: continue in finally statements
In-Reply-To: <ca471dc20605011124p573c5cc1of812f08ddbfca6a4@mail.gmail.com>
References: <e35ivt$biu$1@sea.gmane.org>
	<ca471dc20605011124p573c5cc1of812f08ddbfca6a4@mail.gmail.com>
Message-ID: <44565B0C.6010207@v.loewis.de>

Guido van Rossum wrote:
> Strange. I thought this was supposed to be fixed? (But I can confirm
> that it isn't.)

Perhaps you were confusing it with this HISTORY entry?

- A 'continue' statement can now appear in a try block within the body
  of a loop.  It is still not possible to use continue in a finally
  clause.

This was added as

------------------------------------------------------------------------
r19261 | jhylton | 2001-02-01 23:53:15 +0100 (Thu, 01 Feb 2001) | 2 lines
Changed paths:
   M /python/trunk/Misc/NEWS

continue now allowed in try block

------------------------------------------------------------------------
r19260 | jhylton | 2001-02-01 23:48:12 +0100 (Thu, 01 Feb 2001) | 3 lines
Changed paths:
   M /python/trunk/Doc/ref/ref7.tex
   M /python/trunk/Include/opcode.h
   M /python/trunk/Lib/dis.py
   M /python/trunk/Lib/test/output/test_exceptions
   M /python/trunk/Lib/test/output/test_grammar
   M /python/trunk/Lib/test/test_exceptions.py
   M /python/trunk/Lib/test/test_grammar.py
   M /python/trunk/Python/ceval.c
   M /python/trunk/Python/compile.c

Allow 'continue' inside 'try' clause
SF patch 102989 by Thomas Wouters

------------------------------------------------------------------------

Regards,
Martin

From fredrik at pythonware.com  Mon May  1 21:02:38 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 21:02:38 +0200
Subject: [Python-Dev] introducing the experimental pyref wiki
References: <e30coa$u43$1@sea.gmane.org>
	<20060501174711.GA12953@localhost.localdomain>
Message-ID: <e35m0g$lj1$1@sea.gmane.org>

A.M. Kuchling wrote:

> I find this work very exciting.  Time hasn't been kind to the
> reference guide -- as language features were added to 2.x, not
> everything has been applied to the RefGuide, and users will probably
> have been forced to read a mixture of the RefGuide and various PEPs.

or as likely, mailing list archives.

> The Reference Guide tries to provide a formal specification of the
> language.  A while ago I wondered if we needed a "User's Guide" that
> explains all the keywords, lists special methods, and that sort of
> thing, in a style that isn't as formal and as complete as the
> Reference Guide.  Now maybe we don't -- maybe the RefGuide can be
> tidied bit by bit into something more readable.
>
> (Or are the two goals -- completeness and readability --
> incompossible, unable to be met at the same time by one document?)

well, I'm biased, but I'm convinced that the pyref material (which consists
of the entire language reference plus portions of the library reference) can
be both complete and readable.  I don't think it can be complete, readable,
and quite as concise as before, though ;-)

(see the "a bit more inviting" part on the front-page for the guidelines I've
been using this far).

</F>




From fredrik at pythonware.com  Mon May  1 21:08:07 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 21:08:07 +0200
Subject: [Python-Dev] introducing the experimental pyref wiki
References: <e30coa$u43$1@sea.gmane.org><20060501174711.GA12953@localhost.localdomain>
	<ca471dc20605011137q1e69dc97l9ba34c4f4bb071b4@mail.gmail.com>
Message-ID: <e35mao$ml2$1@sea.gmane.org>

Guido van Rossum wrote:

> Agreed. Is it too late to also attempt to bring Doc/ref/*.tex
> completely up to date and remove confusing language from it? Ideally
> that's the authoritative Language Reference -- admittedly it's been
> horribly out of date but needn't stay so forever.

it's perfectly possible to generate a list of changes from the wiki which
some volunteer could apply to the existing document.

or we could generate a complete new set of Latex documents from the
wiki contents.  it's just a small matter of programming...

I'm not sure I would bother, though -- despite Fred's heroic efforts, the
toolchain is nearly as dated as the language reference itself.  XHTML and
ODT should be good enough, really.

</F>




From martin at v.loewis.de  Mon May  1 21:09:36 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 01 May 2006 21:09:36 +0200
Subject: [Python-Dev] more pyref: continue in finally statements
In-Reply-To: <e35ivt$biu$1@sea.gmane.org>
References: <e35ivt$biu$1@sea.gmane.org>
Message-ID: <44565CF0.4000207@v.loewis.de>

Fredrik Lundh wrote:
> the language reference says:
> 
>     continue may only occur syntactically nested in a for or while loop,
>     but not nested in a function or class definition or finally statement
>     within that loop. /.../
> 
>     It may occur within an except or else clause. The restriction on occurring
>     in the try clause is implementor's laziness and will eventually be lifted.
> 
> and it looks like the new compiler still has the same issue:
> 
>     $ python test.py
>         File "test.py", line 5:
>             continue
>     SyntaxError: 'continue' not supported inside 'finally' clause
> 
> how hard would it be to fix this ?
> 
> (shouldn't the "try clause" in the note read "finally clause", btw?  "continue"
> within the "try" suite seems to work just fine...)

For the latter: the documentation apparently wasn't fully updated in
r19260: it only changed ref7.tex, but not ref6.tex. IOW, it really
means to say "in the try clause", and it is out-of-date in saying so.

As for "continue" in the 'finally' clause: What would that mean?
Given

def f():
  raise Exception

while 1:
  try:
    f()
  finally:
    g()
    continue

then what should be the meaning of "continue" here? The finally
block *eventually* needs to re-raise the exception. When should
that happen?
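
The re-raise behaviour behind that question can be seen in a small
runnable sketch, with the loop and the 'continue' left out. The
events list and the body of g() are made up so the order of events
can be observed.

```python
events = []


def f():
    raise Exception("raised by f")


def g():
    events.append("g ran")


try:
    try:
        f()
    finally:
        # Runs while the exception from f() is still pending ...
        g()
    # ... and once the finally block ends, the exception is re-raised.
except Exception as exc:
    events.append("re-raised: %s" % exc)
```

A 'continue' at the end of the finally block would jump back to the
top of the loop at the point where the pending exception is about to
be re-raised, which is the conflict described above.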

So I would say: It's very easy to fix, just change the message to

     SyntaxError: 'continue' not allowed inside 'finally' clause

:-)

Regards,
Martin

From martin at v.loewis.de  Mon May  1 21:24:30 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 01 May 2006 21:24:30 +0200
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <20060501092208.6756.JCARLSON@uci.edu>
References: <20060430231024.674A.JCARLSON@uci.edu>	<4455CD18.8030200@v.loewis.de>
	<20060501092208.6756.JCARLSON@uci.edu>
Message-ID: <4456606E.2030100@v.loewis.de>

Josiah Carlson wrote:
>>> Certainly that is the case.  But how would you propose embedded bytes
>>> data be represented? (I talk more extensively about this particular
>>> issue later).
>> Can't answer: I don't know what "embedded bytes data" are.

Ok. I think I would use base64, of possibly compressed content. It's
more compact than your representation, as it only uses about 1.3
characters per byte, instead of the up to four characters per byte
that img2py uses.

If ease-of-porting is an issue, img2py should just put an
.encode("latin-1") at the end of the string.
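
A minimal sketch of both options, run under a present-day Python 3.
The payload is made-up sample data, not the icon from the earlier
message.

```python
import base64
import zlib

raw = bytes(range(256)) * 4          # stand-in for image data
packed = zlib.compress(raw)

# Option 1: embed as base64 text, 4 characters per 3 bytes.
text = base64.b64encode(packed).decode('ascii')
restored = zlib.decompress(base64.b64decode(text))
chars_per_byte = len(text) / len(packed)

# Option 2: keep an escaped string literal and turn it back into
# bytes with latin-1, which maps code points 0-255 to the same bytes.
legacy_literal = 'x\xda\x01\x14'
legacy_bytes = legacy_literal.encode('latin-1')
```

With base64, chars_per_byte stays close to 4/3, whereas a \xNN escape
costs four source characters for a single byte.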

>     return zlib.decompress(
> 'x\xda\x01\x14\x02\xeb\xfd\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00 \
[...]

> That data is non-textual.  It is bytes within a string literal.  And it
> is embedded (within a .py file).

In Python 2.x, it is that, yes. In Python 3, it is (meaningless)
text.

>>> I am apparently not communicating this particular idea effectively
>>> enough.  How would you propose that I store parsing literals for
>>> non-textual data, and how would you propose that I set up a dictionary
>>> to hold some non-trivial number of these parsing literals?
> 
> An operationX literal is a symbol that describes how to interpret the
> subsequent or previous data.  For an example of this, see the pickle
> module (portions of which I include below).

I don't think there can be, or should be, a general solution for
all operationX literals, because the different applications of
operationX all have different requirements wrt. their literals.

In binary data, integers are the most obvious choice for
operationX literals. In text data, string literals are.

> I described before how you would use this kind of thing to perform
> operationX on structured information.  It turns out that pickle (in
> Python) uses a dictionary of operationX symbols/literals -> unbound
> instance methods to perform operationX on the pickled representation of
> Python objects (literals where XXXX = '...' are defined, and symbols
> using the XXXX names). The relevant code for unpickling is the while 1:
> section of the following.

Right. I would convert the top of pickle.py to read

MARK            = ord('(')
STOP            = ord('.')
...

> 
>     def load(self):
>         """Read a pickled object representation from the open file.
> 
>         Return the reconstituted object hierarchy specified in the file.
>         """
>         self.mark = object() # any new unique object
>         self.stack = []
>         self.append = self.stack.append
>         read = self.read
>         dispatch = self.dispatch
>         try:
>             while 1:
>                 key = read(1)

and then this to
                  key = ord(read(1))

>                 dispatch[key](self)
>         except _Stop, stopinst:
>             return stopinst.value

> For an example of where people use '...' to represent non-textual
> information in a literal, see the '# Protocol 2' section of pickle.py ...

Right.

> # Protocol 2
> 
> PROTO           = '\x80'  # identify pickle protocol

This should be changed to

PROTO           = 0x80  # identify pickle protocol
etc.

> The point of this example was to show that operationX isn't necessarily
> the processing of text, but may in fact be the interpretation of binary
> data. It was also supposed to show how one may need to define symbols
> for such interpretation via literals of some kind.  In the pickle module,
> this is done in two parts: XXX = <literal>; dispatch[XXX] = fcn.  I've
> also seen it as dispatch = {<literal>: fcn}

Yes. For pickle, the ordinals of the type code make good operationX
literals.
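
A short sketch of integer opcodes in use, under a present-day
Python 3 where iterating over a bytes object yields integers. The
name table and the helper function are made up; real pickle streams
also carry operands, which this sketch ignores.

```python
MARK = ord('(')
STOP = ord('.')
PROTO = 0x80

# Integer opcode -> readable name (a stand-in for a dispatch table).
OPCODE_NAMES = {MARK: 'MARK', STOP: 'STOP', PROTO: 'PROTO'}


def opcode_names(data):
    # Each element of a bytes object is already an int, so no ord()
    # call is needed before the table lookup.
    return [OPCODE_NAMES.get(byte, 'UNKNOWN') for byte in data]


names = opcode_names(b'\x80(.')
```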

> See any line-based socket protocol for where .find() is useful.

Any line-based protocol is textual, usually based on ASCII.

Regards,
Martin


From aahz at pythoncraft.com  Mon May  1 21:25:27 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 1 May 2006 12:25:27 -0700
Subject: [Python-Dev] signature object issues (to discuss while I am out
	of contact)
In-Reply-To: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
Message-ID: <20060501192527.GA11018@panix.com>

On Mon, May 01, 2006, Brett Cannon wrote:
>
> But there are two things that I can't quite decide upon.
>
> One is whether a signature object should be automatically created
> for every function.  As of right now the PEP I am drafting creates it
> on a per-need basis and assigns it to __signature__ through a
> built-in function or a function in 'inspect'.  Now automatically
> creating the object would possibly make it more useful, but it
> could also be considered overkill.  Also not doing it automatically
> allows signature objects to possibly make more sense for classes (to
> represent __init__) and instances (to represent __call__).  But having
> that same support automatically feels off for some reason to me.

My take is that we should do it automatically and provide a helper
function that does additional work.  The class case is already
complicated by __new__(); we probably don't want to automatically sort
out __init__() vs __new__(), but I think we do want regular functions and
methods to automatically have a __signature__ attribute.  Aside from the
issue with classes, are there any other drawbacks to automatically
creating __signature__?
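
For comparison, here is a sketch using inspect.signature() from later
Python versions, which builds the object on request and acts as the
helper function for the class case. The function and class below are
made up for the illustration.

```python
import inspect


def func(a, b=2, *args, **kwargs):
    pass


class Point:
    def __init__(self, x, y=0):
        self.x = x
        self.y = y


# A plain function: the signature is computed when asked for.
func_sig = str(inspect.signature(func))

# A class: the helper reports the constructor's parameters, minus self.
class_sig = str(inspect.signature(Point))
```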
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From tim.peters at gmail.com  Mon May  1 21:27:48 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 1 May 2006 15:27:48 -0400
Subject: [Python-Dev] introducing the experimental pyref wiki
In-Reply-To: <20060501174711.GA12953@localhost.localdomain>
References: <e30coa$u43$1@sea.gmane.org>
	<20060501174711.GA12953@localhost.localdomain>
Message-ID: <1f7befae0605011227o4b8873d4sf35e0951a4ecdd3e@mail.gmail.com>

[A.M. Kuchling]
> ...
> (Or are the two goals -- completeness and readability --
> incompossible, unable to be met at the same time by one document?)

No, but it's not easy, and it's not necessarily succinct.  For an
existence proof, see Guy Steele's "Common Lisp the Language".   I
don't think it's a coincidence that Steele worked on the readable "The
Java Language Specification" either, or on the original Scheme spec. 
Google should hire him to work on Python docs now ;-)

From fredrik at pythonware.com  Mon May  1 21:38:42 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 21:38:42 +0200
Subject: [Python-Dev] introducing the experimental pyref wiki
References: <e30coa$u43$1@sea.gmane.org><20060501174711.GA12953@localhost.localdomain>
	<1f7befae0605011227o4b8873d4sf35e0951a4ecdd3e@mail.gmail.com>
Message-ID: <e35o45$spj$1@sea.gmane.org>

Tim Peters wrote:

> > (Or are the two goals -- completeness and readability --
> > incompossible, unable to be met at the same time by one document?)
>
> No, but it's not easy, and it's not necessarily succinct.  For an
> existence proof, see Guy Steele's "Common Lisp the Language".   I
> don't think it's a coincidence that Steele worked on the readable "The
> Java Language Specification" either, or on the original Scheme spec.
> Google should hire him to work on Python docs now ;-)

on the other hand, it's important to realize that the Python audience
has changed a lot since Guido wrote the first (carefully crafted, and
mostly excellent) version of the language reference.

I'm sure Guy could create a document that even a martian could read
[1], and I'm pretty sure that we could untangle the huge pile of
peephole tweaks that the reference has accumulated and get back to
something close to Guido's original, but I'm not sure that is what the
Python community needs.

(my goal is to turn pyref into more of a random-access encyclopedia,
and less of an ISO-style "it's all there; just keep reading it over and
over again until you get it" specification.  it should be possible to link
from the tutorial to a reference page without causing brain implosions)

</F>

1) see http://pyref.infogami.com/introduction




From fredrik at pythonware.com  Mon May  1 21:50:30 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 21:50:30 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>	<e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de>	<e33mj1$5nd$1@sea.gmane.org><4455A0CF.3080008@v.loewis.de><445600B3.7040903@gradient.cis.upenn.edu><44562A4D.30102@v.loewis.de><e35a4q$c23$1@sea.gmane.org><e35g9r$275$1@sea.gmane.org>
	<e35isl$b4h$1@sea.gmane.org>
Message-ID: <e35oq8$v5g$1@sea.gmane.org>

Terry Reedy wrote:

> >> which reminds me of the following little absurdity gem from the language
> >> reference:
>
> > I am not sure of what you see as absurdity,
>
> Perhaps I do.  Were you referring to what I wrote in the last paragraph of
> my response to Guido?

I don't know; I've lost track of all the subthreads in this subthread.

it wasn't quite obvious to me that "be spelled exactly" meant "use the same
case" rather than "must have the same letters in the same order".  if you use
the latter interpretation, the paragraph looks a bit... odd.

</F>




From fdrake at acm.org  Mon May  1 22:14:28 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 1 May 2006 16:14:28 -0400
Subject: [Python-Dev] New methods for weakref.Weak*Dictionary types
Message-ID: <200605011614.30613.fdrake@acm.org>

I'd like to commit this for Python 2.5:

http://www.python.org/sf/1479988

The WeakKeyDictionary and WeakValueDictionary don't
provide any API to get just the weakrefs out, instead
of the usual mapping API. This can be desirable when
you want to get a list of everything without creating
new references to the underlying objects at that moment.

This patch adds methods to make the references
themselves accessible using the API, so that client
code does not have to depend on the implementation.
The WeakKeyDictionary gains the .iterkeyrefs() and
.keyrefs() methods, and the WeakValueDictionary gains
the .itervaluerefs() and .valuerefs() methods.

The patch includes tests and docs.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From sluggoster at gmail.com  Mon May  1 22:16:10 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Mon, 1 May 2006 13:16:10 -0700
Subject: [Python-Dev] More Path comments (PEP 355)
Message-ID: <6e9196d20605011316w749afa07u9594169b39ae46b5@mail.gmail.com>

I just read over the changes to the proposed Path class since the
discussion last summer.  A big thanks to Bjorn Lindqvist for writing a
PEP, Jason Orendorff for the original path.py and his suggestions on
how the Path class should be different, and the writers of the
Python-Dev Summary for bringing the discussion to my attention.  I've
been testing/using the interim Path class in the Python subversion
(/sandbox/trunk/path, last modified in September), and have a few
comments about PEP 355:

- .walk*() return a list rather than an iterator.  Was this an
intentional change or a typo?  Most typical uses yield thousands of
paths which do not need to be in memory simultaneously.

- An equivalent to os.listdir() is frequently useful in applications. 
This would return a list of filenames (strings) without the parent
info.  Path.listdir() calls os.listdir() and wraps all the items into
Paths, and then I have to unwrap them again, which seems like a waste.
 I end up calling os.listdir(my_path) instead.  If we decide not to
subsume many os.* functions into Path, that's fine, but if we
deprecate os.listdir(), it's not.

-  -1 on removing .joinpath(), whatever it's called.  Path(basepath,
*args) is good but not the same.  (1) it's less intuitive: I expect
this to be a method on a directory.  (2) the class name is hardcoded:
do I really have to do self.__class__(self, *args) to make my code
forward compatible with whatever nifty subclasses might appear?

-  +1 on renaming .directory back to .parent.

-  -1 on losing a 1-liner to read/iterate a file's contents.  This is
a frequent operation, and having to write a 2-liner or a custom
function is a pain.

-  +1 on consolidating mkdir/makedirs and rmdir/rmdirs.  I'd also
suggest not raising an error if the operation is already done, and a
.purge() method that deletes recursively no matter what it is.  This
was suggested last summer as a rename for my .delete_dammit()
proposal.  Unsure what to do if permission errors prevent the
operation; I guess propagating the exception is best. This would make
.rmtree() redundant, which chokes if the item is a file.

-  +1 for rationalizing .copy*().

-  +1 for .chdir().  This is a frequent operation, and it makes no
sense not to include it.

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From fredrik at pythonware.com  Mon May  1 22:21:30 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 22:21:30 +0200
Subject: [Python-Dev] more pyref: a better term for "string conversion"
Message-ID: <e35qkc$5t8$1@sea.gmane.org>

for some reason, the language reference uses the term "string
conversion" for the backtick form of "repr":

    http://docs.python.org/ref/string-conversions.html

any suggestions for a better term ?  should backticks be deprecated,
and documented in terms of repr (rather than the other way around) ?

</F>




From fuzzyman at voidspace.org.uk  Mon May  1 22:31:53 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 May 2006 21:31:53 +0100
Subject: [Python-Dev] more pyref: a better term for "string conversion"
In-Reply-To: <e35qkc$5t8$1@sea.gmane.org>
References: <e35qkc$5t8$1@sea.gmane.org>
Message-ID: <44567039.7040408@voidspace.org.uk>

Fredrik Lundh wrote:
> for some reason, the language reference uses the term "string con-
> version" for the backtick form of "repr":
>   
The language reference also says that trailing commas for expressions 
work with backticks. This is incorrect. I think rejecting the trailing 
comma is necessary to allow nested 'string conversions', so it is a doc 
error rather than an implementation error.

I can't think of a better term than string conversion. At least it is 
distinct from 'string formatting'.

Personally I think that backticks in code look like an ugly hack and 
``repr(expression)`` is clearer. If backticks were documented as a 
hackish shortcut for repr then great. :-)

Michael Foord

>     http://docs.python.org/ref/string-conversions.html
>
> any suggestions for a better term ?  should backticks be deprecated,
> and documented in terms of repr (rather than the other way around) ?
>
> </F>
>
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk
>
>   


From guido at python.org  Mon May  1 22:54:01 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 13:54:01 -0700
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <4456606E.2030100@v.loewis.de>
References: <20060430231024.674A.JCARLSON@uci.edu>
	<4455CD18.8030200@v.loewis.de> <20060501092208.6756.JCARLSON@uci.edu>
	<4456606E.2030100@v.loewis.de>
Message-ID: <ca471dc20605011354n2aa12387q4b445556a692d8e@mail.gmail.com>

This discussion seems to have gotten a bit out of hand. I believe it
belongs on the python-3000 list.

As a quick commentary, I see good points made by both sides. My
personal view is that we should *definitely* not introduce a third
type, and that *most* text-based activities should be done in the
(Unicode) string domain.

That said, I expect a certain amount of parsing to happen on bytes
objects -- for example, I would say that CPython's current parser is
parsing bytes since its input is UTF-8. There are also plenty of
text-based socket protocols that are explicitly defined in terms of
octets (mostly containing ASCII bytes only); I can see why some people
would want to write handlers that parse the bytes directly.

But instead of analyzing or arguing the situation to death, I'd like
to wait until we have a Py3k implementation that implements something
approximating the proposed end goal, where 'str' represents unicode
characters, and 'bytes' represents bytes, and we have separate I/O
APIs for binary (bytes) and character (str) data. I'm hoping to make
some progress towards this goal in the p3yk (sic) branch. It appears
that before we can switch the meaning of 'str' we will first have to
implement the new I/O library, which is what I'm focusing on right
now. I already have a fairly minimal but functional bytes type, which
I'll modify as I go along and understand more of the requirements.
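
For readers following the str/bytes split described above, here is a
minimal sketch of separate binary and character I/O layers.  It assumes
the io module as it exists in current Python 3, which was written after
this message and is not necessarily the design of the p3yk branch work
mentioned here; the sample data is invented.

```python
import io

# Binary layer: reads return bytes objects.
raw = io.BytesIO("h\u00e9llo\n".encode("utf-8"))
first_bytes = raw.read(3)      # 'h' plus the two UTF-8 bytes of e-acute
raw.seek(0)

# Character layer: wraps the binary layer and decodes to str.
text = io.TextIOWrapper(raw, encoding="utf-8")
line = text.readline()
```

The point of the layering is that decoding happens in exactly one place,
so code that wants octets reads from the binary object and code that
wants characters reads from the wrapper.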

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From sluggoster at gmail.com  Mon May  1 22:54:21 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Mon, 1 May 2006 13:54:21 -0700
Subject: [Python-Dev] Path.ancestor()
Message-ID: <6e9196d20605011354x429d0a56k48d9eb527f99061c@mail.gmail.com>

This is a potentially long discussion so I'm putting it in a separate thread.

When finding one file relative to another, it's difficult to read
multiple ".parent" attributes stacked together.  Worse, if you have
the wrong number, you end up at the wrong directory level, potentially
causing destructive damage.  Doubly worse, the number of ".parent" is
non-intuitive for those used to the near-universal "." and ".."
conventions.

Say I'm in apps/myapp/bin/myprogram.py and want to add apps/myapp/lib
and apps/shared/lib to sys.path  in a portable way.

    app_root = Path(__file__).realpath().abspath().parent.parent
    assert app_root.parent.name == 'apps'
    sys.path.insert(0, app_root.parent / 'shared/lib')
    sys.path.insert(0, app_root / 'lib')

Yikes!  At least it's better than:

    lib = os.path.join(os.path.dirname(os.path.dirname(x)), "lib")

which is completely unreadable.

(Silence to those who say __file__ is obsolete now that setuptools has
a function for finding a file in an egg.  (1) I don't understand that
part of the setuptools docs.  (2) It will be many months before most
Python programmers are ready to switch to it.)

The tricky thing with "." and ".." is they have a different meaning
depending on whether the original path is a file or directory.  With a
directory there's one less ".parent".  I've played a bit with the
argument and come up with this:

        # N is number of ".."; None (default arg) is special case for ".".
        .ancestor()   =>  "."      =>  p.parent                or  d
        .ancestor(0)  =>  ValueError
        .ancestor(1)  =>  ".."     =>  p.parent.parent         or  d.parent
        .ancestor(2)  =>  "../.."  =>  p.parent.parent.parent  or  d.parent.parent

The simplest alternative is making N the number of ".parent".  This
has some merit, and would solve the original problem of too many
".parent" stacking up.  But it means Path wouldn't have any equivalent
to "." and ".." behavior.

Another alternative is to make .ancestor(0) mean ".".  I don't like
this because "." is a special case, and this should be shown in the
syntax.

Another alternative is to move every number down by 1, so .ancestor(0)
is equivalent to "..".  The tidiness of this is outweighed by the
difficulty of remembering that N is not the number of "..".
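
The first numbering scheme above (N counts "..", None means ".") can be
sketched as a free function.  This is only an illustration under stated
assumptions: it uses pathlib from current Python 3 rather than the
sandbox Path class, the name ancestor() and the is_dir flag are
hypothetical, and the flag stands in for asking the filesystem whether
the path is a directory.

```python
from pathlib import PurePosixPath


def ancestor(path, n=None, is_dir=False):
    """Hypothetical helper: n counts ".." steps, None means "."."""
    path = PurePosixPath(path)
    # "." is the directory itself, or the directory containing a file.
    here = path if is_dir else path.parent
    if n is None:
        return here
    if n < 1:
        raise ValueError("n must be None or a positive number of '..' steps")
    for _ in range(n):
        here = here.parent
    return here


script = "apps/myapp/bin/myprogram.py"
bin_dir = ancestor(script)        # "."     relative to the script
app_root = ancestor(script, 1)    # ".."    relative to the script
apps_dir = ancestor(script, 2)    # "../.." relative to the script
```

With this numbering the argument always equals the number of ".." a
shell user would type, at the cost of treating "." as a special case.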

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From martin at v.loewis.de  Mon May  1 22:54:43 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 01 May 2006 22:54:43 +0200
Subject: [Python-Dev] more pyref: a better term for "string conversion"
In-Reply-To: <e35qkc$5t8$1@sea.gmane.org>
References: <e35qkc$5t8$1@sea.gmane.org>
Message-ID: <44567593.2010103@v.loewis.de>

Fredrik Lundh wrote:
> for some reason, the language reference uses the term "string con-
> version" for the backtick form of "repr":
> 
>     http://docs.python.org/ref/string-conversions.html
> 
> any suggestions for a better term ?  should backticks be deprecated,
> and documented in terms of repr (rather than the other way around) ?

I vaguely recall that they are deprecated, but I can't remember the
details. The one obvious way to invoke that functionality is the repr()
builtin.

Regards,
Martin

From tim.peters at gmail.com  Mon May  1 22:57:06 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 1 May 2006 16:57:06 -0400
Subject: [Python-Dev] New methods for weakref.Weak*Dictionary types
In-Reply-To: <200605011614.30613.fdrake@acm.org>
References: <200605011614.30613.fdrake@acm.org>
Message-ID: <1f7befae0605011357s6427602cnc6ca6a8bfa442ec8@mail.gmail.com>

[Fred L. Drake, Jr.]
> I'd like to commit this for Python 2.5:
>
> http://www.python.org/sf/1479988
>
> The WeakKeyDictionary and WeakValueDictionary don't
> provide any API to get just the weakrefs out, instead
> of the usual mapping API. This can be desirable when
> you want to get a list of everything without creating
> new references to the underlying objects at that moment.
>
> This patch adds methods to make the references
> themselves accessible using the API, avoiding requiring
> client code to have to depend on the implementation.
> The WeakKeyDictionary gains the .iterkeyrefs() and
> .keyrefs() methods, and the WeakValueDictionary gains
> the .itervaluerefs() and .valuerefs() methods.
>
> The patch includes tests and docs.

+1.  A real need for this is explained in ZODB's ZODB/util.py's
WeakSet class, which contains a WeakValueDictionary:

"""
    # Return a list of weakrefs to all the objects in the collection.
    # Because a weak dict is used internally, iteration is dicey (the
    # underlying dict may change size during iteration, due to gc or
    # activity from other threads).  as_weakref_list() is safe.
    #
    # Something like this should really be a method of Python's weak dicts.
    # If we invoke self.data.values() instead, we get back a list of live
    # objects instead of weakrefs.  If gc occurs while this list is alive,
    # all the objects move to an older generation (because they're strongly
    # referenced by the list!).  They can't get collected then, until a
    # less frequent collection of the older generation.  Before then, if we
    # invoke self.data.values() again, they're still alive, and if gc occurs
    # while that list is alive they're all moved to yet an older generation.
    # And so on.  Stress tests showed that it was easy to get into a state
    # where a WeakSet grows without bounds, despite that almost all its
    # elements are actually trash.  By returning a list of weakrefs instead,
    # we avoid that, although the decision to use weakrefs is now very
    # visible to our clients.
    def as_weakref_list(self):
        # We're cheating by breaking into the internals of Python's
        # WeakValueDictionary here (accessing its .data attribute).
        return self.data.data.values()
"""

As that implementation suggests, though, I'm not sure there's real
payback for the extra time taken in the patch's `valuerefs`
implementation to weed out weakrefs whose referents are already gone: 
the caller has to make this check anyway when it iterates over the
returned list of weakrefs.  Iterating inside the implementation, to
build the list via itervalues(), also creates that much more
vulnerability to "dict changed size during iteration" multi-threading
surprises.  For that last reason, if the patch went in as-is, I expect
ZODB would still need to "cheat"; obtaining the list of weakrefs
directly via plain .data.values() is atomic, and so immune to these
multi-threading surprises.
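
A minimal sketch of the caller-side check described above, assuming the
valuerefs() method as it exists on WeakValueDictionary in current
Python 3.  The Node class and the live_names() helper are placeholders
invented for the example.

```python
import gc
import weakref


class Node:
    """Placeholder object type; any weak-referenceable class would do."""

    def __init__(self, name):
        self.name = name


def live_names(refs):
    """Dereference each weakref once and skip the ones already dead."""
    names = []
    for ref in refs:
        obj = ref()          # strong reference lives only inside this call
        if obj is not None:  # the liveness check the caller must make anyway
            names.append(obj.name)
    return sorted(names)


a, b = Node("a"), Node("b")
cache = weakref.WeakValueDictionary()
cache["a"] = a
cache["b"] = b

refs = cache.valuerefs()   # list of weakref objects, not of Node instances
before = live_names(refs)

del b                      # drop the last strong reference to one value
gc.collect()               # not needed on CPython, harmless elsewhere
after = live_names(refs)
```

Holding the list of weakrefs does not keep any value alive, which is the
property the ZODB comment is after; the dead entry simply dereferences
to None.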

From guido at python.org  Mon May  1 23:05:43 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 14:05:43 -0700
Subject: [Python-Dev] more pyref: a better term for "string conversion"
In-Reply-To: <44567593.2010103@v.loewis.de>
References: <e35qkc$5t8$1@sea.gmane.org> <44567593.2010103@v.loewis.de>
Message-ID: <ca471dc20605011405o45107205ved003cb33d9b6386@mail.gmail.com>

Backticks certainly are deprecated -- Py3k won't have them (nor will
they become available for other syntax; they are undesirable
characters due to font issues and the tendency of word processing
tools to generate backticks in certain cases where you type forward
ticks).

So it would be a good idea to document them as a deprecated syntax for
spelling repr().

--Guido

On 5/1/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Fredrik Lundh wrote:
> > for some reason, the language reference uses the term "string con-
> > version" for the backtick form of "repr":
> >
> >     http://docs.python.org/ref/string-conversions.html
> >
> > any suggestions for a better term ?  should backticks be deprecated,
> > and documented in terms of repr (rather than the other way around) ?
>
> I vaguely recall that they are deprecated, but I can't remember the
> details. The one obvious way to invoke that functionality is the repr()
> builtin.
>
> Regards,
> Martin


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org  Mon May  1 23:26:34 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 1 May 2006 17:26:34 -0400
Subject: [Python-Dev] New methods for weakref.Weak*Dictionary types
In-Reply-To: <1f7befae0605011357s6427602cnc6ca6a8bfa442ec8@mail.gmail.com>
References: <200605011614.30613.fdrake@acm.org>
	<1f7befae0605011357s6427602cnc6ca6a8bfa442ec8@mail.gmail.com>
Message-ID: <200605011726.34659.fdrake@acm.org>

On Monday 01 May 2006 16:57, Tim Peters wrote:
 > +1.  A real need for this is explained in ZODB's ZODB/util.py's
 > WeakSet class, which contains a WeakValueDictionary:
...
 > As that implementation suggests, though, I'm not sure there's real
 > payback for the extra time taken in the patch's `valuerefs`
 > implementation to weed out weakrefs whose referents are already gone:
 > the caller has to make this check anyway when it iterates over the

Good point; I've updated the patch accordingly.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From fredrik at pythonware.com  Mon May  1 23:41:54 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 23:41:54 +0200
Subject: [Python-Dev] more pyref: comparison precedence
Message-ID: <e35vb4$l0d$1@sea.gmane.org>

one last one for tonight; the operator precedence summary says that
"in" and "not in" have lower precedence than "is" and "is not", which
have lower precedence than "<, <=, >, >=, <>, !=, ==":

    http://docs.python.org/ref/summary.html

but the comparisons chapter

    http://docs.python.org/ref/comparisons.html

says that they all have the same priority.  which one is right ?

</F>




From guido at python.org  Mon May  1 23:43:43 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 May 2006 14:43:43 -0700
Subject: [Python-Dev] more pyref: comparison precedence
In-Reply-To: <e35vb4$l0d$1@sea.gmane.org>
References: <e35vb4$l0d$1@sea.gmane.org>
Message-ID: <ca471dc20605011443w3b13e392s10b494160e647a61@mail.gmail.com>

They're all the same priority.

On 5/1/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> one last one for tonight; the operator precedence summary says that
> "in" and "not in" has lower precedence than "is" and "is not", which
> has lower precedence than "<, <=, >, >=, <>, !=, ==":
>
>     http://docs.python.org/ref/summary.html
>
> but the comparisons chapter
>
>     http://docs.python.org/ref/comparisons.html
>
> says that they all have the same priority.  which one is right ?
>
> </F>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jcarlson at uci.edu  Mon May  1 23:50:10 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 01 May 2006 14:50:10 -0700
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <4456606E.2030100@v.loewis.de>
References: <20060501092208.6756.JCARLSON@uci.edu>
	<4456606E.2030100@v.loewis.de>
Message-ID: <20060501144214.6765.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> >>> Certainly that is the case.  But how would you propose embedded bytes
> >>> data be represented? (I talk more extensively about this particular
> >>> issue later).
> >> Can't answer: I don't know what "embedded bytes data" are.
> 
> Ok. I think I would use base64, of possibly compressed content. It's
> more compact than your representation, as it only uses 1.3 characters
> per byte, instead of the up-to-four bytes that the img2py uses.

I never said it was the most efficient representation, just one that was
being used (and one whose definition I had no control over).
What I provided was automatically generated by a script provided with
wxPython.


> If ease-of-porting is an issue, img2py should just put an
> .encode("latin-1") at the end of the string.

Ultimately, this is still the storage of bytes in a textual string.  It
may be /encoded/ as text, but it is still conceptually bytes in text,
which is at least as confusing as text in bytes.


> >     return zlib.decompress(
> > 'x\xda\x01\x14\x02\xeb\xfd\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00 \
> [...]
> 
> > That data is non-textual.  It is bytes within a string literal.  And it
> > is embedded (within a .py file).
> 
> In Python 2.x, it is that, yes. In Python 3, it is a (meaningless)
> text.

Toss an .encode('latin-1'), and it isn't meaningless.

    >>> type(x)
    <type 'unicode'>
    >>> zlib.decompress(x.encode('latin-1'))[:4]
    '\x89PNG'


> >>> I am apparently not communicating this particular idea effectively
> >>> enough.  How would you propose that I store parsing literals for
> >>> non-textual data, and how would you propose that I set up a dictionary
> >>> to hold some non-trivial number of these parsing literals?
> > 
> > An operationX literal is a symbol that describes how to interpret the
> > subsequent or previous data.  For an example of this, see the pickle
> > module (portions of which I include below).
> 
> I don't think there can be, or should be, a general solution for
> all operationX literals, because the different applications of
> operationX all have different requirements wrt. their literals.
> 
> In binary data, integers are the most obvious choice for
> operationX literals. In text data, string literals are.
[snip]
> Yes. For pickle, the ordinals of the type code make good operationX
> literals.

But, as I brought up before, while single integers are sufficient for
some operationX literals, that may not be the case for others.  Say, for
example, a tool which discovers the various blobs from quicktime .mov
files (movie portions, audio portions, images, etc.).  I don't remember
all of the precise names to parse, but I do remember that they were all
4 bytes long.

This means that we would generally use the following...

    dispatch = {(ord(ch), ord(ch), ord(ch), ord(ch)): ...,
                #or
                tuple(ord(i) for i in '...'): ...,
               }

And in the actual operationX process...

    #if we are reading bytes...
        key = tuple(read(4))

    #if we are reading str...
        key = tuple(bytes(read(4), 'latin-1'))
        #or   tuple(read(4).encode('latin-1'))
        #or   tuple(ord(i) for i in read(4))

There are, of course, other options which could use struct and 8, 16, 32,
and/or 64 bit integers (with masks and/or shifts), for the dispatch = ...
or key = ... cases, but those, again, would rely on using Python 3.x
strings as a container for non-text data.


> > I described before how you would use this kind of thing to perform
> > operationX on structured information.  It turns out that pickle (in
> > Python) uses a dictionary of operationX symbols/literals -> unbound
> > instance methods to perform operationX on the pickled representation of
> > Python objects (literals where XXXX = '...' are defined, and symbols
> > using the XXXX names). The relevant code for unpickling is the while 1:
> > section of the following.
> 
> Right. I would convert the top of pickle.py to read
> 
> MARK            = ord('(')
> STOP            = ord('.')
> ...
> 
> > For an example of where people use '...' to represent non-textual
> > information in a literal, see the '# Protocol 2' section of pickle.py ...
> 
> Right.
> 
> > # Protocol 2
> > 
> > PROTO           = '\x80'  # identify pickle protocol
> 
> This should be changed to
> 
> PROTO           = 0x80  # identify pickle protocol
> etc.

I see that you don't see ord(...) as a case where strings are being used
to hold bytes data.  I would disagree, in much the same way that I would
disagree with the idea that bytes.encode('base64') only holds text. But
then again, I also see that the majority of this "rethink your data
structures and dispatching" would be unnecessary if there were an
immutable bytes literal in Python 3.x. People could then use...

    MARK            = b'('
    STOP            = b'.'
    ...
    PROTO           = b'\x80'
    ...

    dispatch = {b'...': fcn}

    key = read(X)
    dispatch[key](self)
    #regardless of X

... etc., as they have already been doing (only without the 'b' or other
prefix).
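
A runnable sketch of that dispatch style, assuming the b'...' literal and
hashable bytes type of current Python 3.  The four-byte tags, the
fixed-size payload, and the handler names are all invented for
illustration and do not describe any real file format.

```python
import io

# Hypothetical four-byte record tags, invented for this sketch.
HEAD = b"HEAD"
DATA = b"DATA"


def on_head(payload):
    return ("head", payload)


def on_data(payload):
    return ("data", payload)


# bytes objects are immutable and hashable, so they work as dict keys.
DISPATCH = {HEAD: on_head, DATA: on_data}


def parse(stream, payload_size=2):
    """Read tag + fixed-size payload records until the stream runs out."""
    results = []
    while True:
        key = stream.read(4)            # bytes in, no decoding step
        if len(key) < 4:
            break
        payload = stream.read(payload_size)
        results.append(DISPATCH[key](payload))
    return results


records = parse(io.BytesIO(b"HEADv1DATAxyDATAzz"))
```

The key read from the stream is used directly as the dictionary key, so
no ord() or encode() step is needed whatever the tag length is.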


> >                 key = read(1)
> 
> and then this to
>                   key = ord(read(1))

This, of course, presumes that read() will return Python 3.x strings,
which may be ambiguous and/or an error in the binary pickle case
(especially if people don't pass 'latin-1' as the encoding to the open()
call).


> > See any line-based socket protocol for where .find() is useful.
> 
> Any line-based protocol is textual, usually based on ASCII.

Not all of ASCII 0...127 is text, and the RFC for telnet describes how
ASCII 128...255 can be used as optional extensions.  Further, the FTP
protocol defines a mechanism whereby, in STREAM or BLOCK modes, data can
be terminated by an EOR, EOF, or even a different specified marker
(could be multiple contiguous bytes).

It is also the case that some filesystems, among other things, define
file names as a null-terminated string in a variable-length record, or
even pad the remaining portion of the field with nulls (which one would
presumably .rstrip('\0') in Python 2.x).  (On occasion, I have found the
need to write a filesystem explorer which opens volumes raw, especially
on platforms without drivers for that particular filesystem).


 - Josiah


From jcarlson at uci.edu  Mon May  1 23:52:31 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 01 May 2006 14:52:31 -0700
Subject: [Python-Dev] methods on the bytes object
In-Reply-To: <ca471dc20605011354n2aa12387q4b445556a692d8e@mail.gmail.com>
References: <4456606E.2030100@v.loewis.de>
	<ca471dc20605011354n2aa12387q4b445556a692d8e@mail.gmail.com>
Message-ID: <20060501145130.6768.JCARLSON@uci.edu>


"Guido van Rossum" <guido at python.org> wrote:
> This discussion seems to have gotten a bit out of hand. I believe it
> belongs on the python-3000 list.

I accidentally jumped the gun on hitting 'send' on my most recent reply;
I'll repost it in the Py3k list and expect further discussion to proceed
there.

 - Josiah


From fredrik at pythonware.com  Mon May  1 23:55:01 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 1 May 2006 23:55:01 +0200
Subject: [Python-Dev] methods on the bytes object
References: <20060430231024.674A.JCARLSON@uci.edu>	<4455CD18.8030200@v.loewis.de><20060501092208.6756.JCARLSON@uci.edu>
	<4456606E.2030100@v.loewis.de>
Message-ID: <e3603n$n5g$1@sea.gmane.org>

Martin v. Löwis wrote:


> Ok. I think I would use base64, of possibly compressed content. It's
> more compact than your representation, as it only uses 1.3 characters
> per byte, instead of the up-to-four bytes that the img2py uses.

only if you're shipping your code as PY files.  in PYC format (ZIP, PY2EXE,
etc), the img2py format is more efficient.

</F>




From fredrik at pythonware.com  Tue May  2 00:01:52 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 2 May 2006 00:01:52 +0200
Subject: [Python-Dev] more pyref: comparison precedence
References: <e35vb4$l0d$1@sea.gmane.org>
	<ca471dc20605011443w3b13e392s10b494160e647a61@mail.gmail.com>
Message-ID: <e360gi$oe0$1@sea.gmane.org>

Guido van Rossum wrote:

> They're all the same priority.

yet another description that is obvious only if you already know what
it says, in other words:

    Operators in the same box have the same precedence. /.../
    Operators in the same box group left to right (except for com-
    parisons, including tests, which all have the same precedence
    and chain from left to right /.../

I think I'll do something about this one too ;-)
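
a small sketch of what "same precedence, and chain" means in practice.
the values are made up; the behaviour shown is that of current Python 3,
where "a OP1 b OP2 c" is evaluated as "(a OP1 b) and (b OP2 c)" for any
mix of comparison operators, including "in" and "is".

```python
items = [1]

# Comparisons chain: this is (1 in items) and (items == True).
chained = 1 in items == True

# Explicit grouping gives the result a left-to-right reading suggests.
grouped = (1 in items) == True

# Spelled-out equivalent of the chained form.
spelled_out = (1 in items) and (items == True)
```

here chained and spelled_out are both False while grouped is True, which
is why the "group left to right" wording in the summary is misleading
for comparisons.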

</F>




From jjl at pobox.com  Tue May  2 00:25:57 2006
From: jjl at pobox.com (John J Lee)
Date: Mon, 1 May 2006 22:25:57 +0000 (UTC)
Subject: [Python-Dev] Assigning "Group" on SF tracker?
Message-ID: <Pine.LNX.4.64.0605012219350.473@localhost>

When opening patches on the SF tracker for bugs that affect Python 2.5, 
but may be candidates for backporting (to 2.4 ATM), should I leave "Group" 
as "None", or set it to "Python 2.5" to indicate it affects 2.5?

If it's known to be a candidate for backporting, should I set it to 
"Python 2.4" to indicate that?

I'm guessing I should always select "Python 2.5" if it affects 2.5, but 
I've been using "None" up till now, I think...


John


From tdelaney at avaya.com  Tue May  2 00:52:01 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 2 May 2006 08:52:01 +1000
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>

Greg Ewing wrote:

>    for x in stuff:
>      for x in otherstuff:
>        dosomethingelse(x)
> 
> would be a SyntaxError because the inner loop
> is trying to use x while it's still in use by the
> outer loop.

So would this also be a SyntaxError?

    for x in stuff:
        x = somethingelse

Tim Delaney

From tim.peters at gmail.com  Tue May  2 02:41:14 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 1 May 2006 20:41:14 -0400
Subject: [Python-Dev] Assigning "Group" on SF tracker?
In-Reply-To: <Pine.LNX.4.64.0605012219350.473@localhost>
References: <Pine.LNX.4.64.0605012219350.473@localhost>
Message-ID: <1f7befae0605011741i4541397btf60a1a78c33497b5@mail.gmail.com>

[John J Lee]
> When opening patches on the SF tracker for bugs that affect Python 2.5,
> but may be candidates for backporting (to 2.4 ATM), should I leave "Group"
> as "None", or set it to "Python 2.5" to indicate it affects 2.5?
>
> If it's known to be a candidate for backporting, should I set it to
> "Python 2.4" to indicate that?
>
> I'm guessing I should always select "Python 2.5" if it affects 2.5, but
> I've been using "None" up till now, I think...

I think it's best to set it to the earliest still-maintained Python
version to which it applies.  So that would be 2.4 now.  The body of
the report should say that the problem still exists in 2.5 (assuming
it does).

Or ;-) you could set it to 2.5, and note in the body that it's also a
bug in 2.4.

The _helpful_ part is that it be clear the bug exists in both 2.4 and
2.5 (when you know that), so that the next helpful elf doesn't have to
figure that out again.

From tjreedy at udel.edu  Tue May  2 04:12:31 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 1 May 2006 22:12:31 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org><e31b0t$7i4$1@sea.gmane.org>	<445476CC.3080008@v.loewis.de><e33mj1$5nd$1@sea.gmane.org>	<4455A0CF.3080008@v.loewis.de><e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de>
Message-ID: <e36f6e$pns$1@sea.gmane.org>


""Martin v. Löwis"" <martin at v.loewis.de> wrote in message 
news:445644D1.3050200 at v.loewis.de...
> You weren't asking for a reason, you were asking for an example:

No wonder we weren't connecting very well.  You somehow have it backwards. 
'Why' means "for what reason".  But to continue with examples:

my way to call your example (given the data in separate variables):
  make_person(name, age, phone, location)
your way:
  make_person(name=name, age=age, phone=phone, location = location)

my way (given the data in one sequence):
  make_person(*person_data)
your way:
  make_person(name=person_data[0], age=person_data[1],
      phone=person_data[2], location=person_data[3])
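
The two calling styles above can be compared in a small sketch.  It assumes
an interpreter that implements the bare '*' syntax the PEP proposes
(Python 3); make_person_kwonly, FIELDS and the sample data are hypothetical
names invented for illustration.

```python
# Hypothetical stand-ins for the make_person() example; the data is invented.
def make_person(name, age, phone, location):
    """Accept arguments by position or by name, as the caller chooses."""
    return {"name": name, "age": age, "phone": phone, "location": location}


def make_person_kwonly(*, name, age, phone, location):
    """Accept arguments by name only (the bare '*' ends positional binding)."""
    return {"name": name, "age": age, "phone": phone, "location": location}


FIELDS = ("name", "age", "phone", "location")
person_data = ("Ann", 30, "555-0100", "Oslo")

# Positional call from one sequence
by_position = make_person(*person_data)

# Keyword call built from the same sequence, without indexing by hand
by_keyword = make_person_kwonly(**dict(zip(FIELDS, person_data)))

# The keyword-only version refuses a positional call
try:
    make_person_kwonly(*person_data)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False
```

Both calls build the same record; the difference is only in which call
forms the callee permits.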

> Because there should be preferably only one obvious way to call that
> function.

It is a feature of Python that arguments can usually be matched to 
parameters either by position or name, as the *caller* chooses.  But if you 
want to (ab)use (my opinion) the 'one obvious way' mantra to disable that, 
then 'my way' above is obviously the more obvious way to do so ;-).  It is 
to me anyway.  Try typing the 'your way' version of the second pair without 
making and having to correct typos.

>Readers of the code should not need to remember the
> order of parameters,

And they need not; it is right there in front of them.  As for writers, 
modern IDEs should try to list the parameter signature upon typing 'func('.

> instead, the meaning of the parameters should
> be obvious in the call.

With good naming of variables in the calling code, I think it is.

 > I don't *want* callers to pass these
> parameters positionally, to improve readability.

The code I write is my code, not yours, and I consider your version to be 
less readable, as well as harder to type without mistake.  Do you think 
Python should be changed to prohibit more than one, or maybe two, named 
positional parameters?

Terry Jan Reedy




From brett at python.org  Tue May  2 04:39:41 2006
From: brett at python.org (Brett Cannon)
Date: Mon, 1 May 2006 19:39:41 -0700
Subject: [Python-Dev] signature object issues (to discuss while I am out
	of contact)
In-Reply-To: <20060501192527.GA11018@panix.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
	<20060501192527.GA11018@panix.com>
Message-ID: <bbaeab100605011939y44256370vb57a5a2283fde5cd@mail.gmail.com>

On 5/1/06, Aahz <aahz at pythoncraft.com> wrote:
> On Mon, May 01, 2006, Brett Cannon wrote:
> >
> > But there are two things that I can't quite decide upon.
> >
> > One is whether a signature object should be automatically created
> > for every function.  As of right now the PEP I am drafting has it
> > on a per-need basis and have it assigned to __signature__ through
> > a built-in function or putting it 'inspect'.  Now automatically
> > creating the object would possibly make it more useful, but it
> > could also be considered overkill.  Also not doing it automatically
> > allows signature objects to possibly make more sense for classes (to
> > represent __init__) and instances (to represent __call__).  But having
> > that same support automatically feels off for some reason to me.
>
> My take is that we should do it automatically and provide a helper
> function that does additional work.  The class case is already
> complicated by __new__(); we probably don't want to automatically sort
> out __init__() vs __new__(), but I think we do want regular functions and
> methods to automatically have a __signature__ attribute.  Aside from the
> issue with classes, are there any other drawbacks to automatically
> creating __signature__?

Well, one issue is the dichotomy between Python and C functions not
both providing a signature object.  There is no good way to provide a
signature object automatically for C functions (at least at the
moment; could add the signature string for PyArg_ParseTuple() to the
PyMethodDef and have it passed in to the wrapped C function so that
initialization of the class can get to the parameters string).  So you
can't fully rely on the object being available for all functions and
methods unless a worthless signature object is placed for C functions.

-Brett

From aahz at pythoncraft.com  Tue May  2 05:49:34 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 1 May 2006 20:49:34 -0700
Subject: [Python-Dev] signature object issues (to discuss while I am out
	of contact)
In-Reply-To: <bbaeab100605011939y44256370vb57a5a2283fde5cd@mail.gmail.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
	<20060501192527.GA11018@panix.com>
	<bbaeab100605011939y44256370vb57a5a2283fde5cd@mail.gmail.com>
Message-ID: <20060502034934.GA10329@panix.com>

On Mon, May 01, 2006, Brett Cannon wrote:
> On 5/1/06, Aahz <aahz at pythoncraft.com> wrote:
>>On Mon, May 01, 2006, Brett Cannon wrote:
>>>
>>> But there are two things that I can't quite decide upon.
>>>
>>> One is whether a signature object should be automatically created
>>> for every function.  As of right now the PEP I am drafting has it
>>> on a per-need basis and have it assigned to __signature__ through
>>> a built-in function or putting it 'inspect'.  Now automatically
>>> creating the object would possibly make it more useful, but it
>>> could also be considered overkill.  Also not doing it automatically
>>> allows signature objects to possibly make more sense for classes (to
>>> represent __init__) and instances (to represent __call__).  But having
>>> that same support automatically feels off for some reason to me.
>>
>>My take is that we should do it automatically and provide a helper
>>function that does additional work.  The class case is already
>>complicated by __new__(); we probably don't want to automatically sort
>>out __init__() vs __new__(), but I think we do want regular functions and
>>methods to automatically have a __signature__ attribute.  Aside from the
>>issue with classes, are there any other drawbacks to automatically
>>creating __signature__?
> 
> Well, one issue is the dichotomy between Python and C functions not
> both providing a signature object.  There is no good way to provide a
> signature object automatically for C functions (at least at the
> moment; could add the signature string for PyArg_ParseTuple() to the
> PyMethodDef and have it passed in to the wrapped C function so that
> initialization of the class can get to the parameters string).  So you
> can't fully rely on the object being available for all functions and
> methods unless a worthless signature object is placed for C functions.

>From my POV, that suggests changing the C API rather than not having
automatic signatures.  That probably requires Py3K, though.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From tim.peters at gmail.com  Tue May  2 07:52:57 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 2 May 2006 01:52:57 -0400
Subject: [Python-Dev] [Python-checkins] r45850 - in python/trunk:
	Doc/lib/libfuncs.tex Lib/test/test_subprocess.py Misc/NEWS
	Objects/fileobject.c Python/bltinmodule.c
In-Reply-To: <20060502044316.408951E4010@bag.python.org>
References: <20060502044316.408951E4010@bag.python.org>
Message-ID: <1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>

> Author: neal.norwitz
> Date: Tue May  2 06:43:14 2006
> New Revision: 45850
>
> Modified:
>    python/trunk/Doc/lib/libfuncs.tex
>    python/trunk/Lib/test/test_subprocess.py
>    python/trunk/Misc/NEWS
>    python/trunk/Objects/fileobject.c
>    python/trunk/Python/bltinmodule.c
> Log:
> SF #1479181: split open() and file() from being aliases for each other.

Umm ... why?  I suppose I wouldn't care, except it left
test_subprocess failing on all the Windows buildbots, and I don't feel
like figuring out why.  To a first approximation,
test_universal_newlines_communicate() now takes the

                # Interpreter without universal newline support

branch on Windows, but shouldn't.

From greg.ewing at canterbury.ac.nz  Tue May  2 08:19:33 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 02 May 2006 18:19:33 +1200
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>
Message-ID: <4456F9F5.1050701@canterbury.ac.nz>

Delaney, Timothy (Tim) wrote:

> So would this also be a SyntaxError?
> 
>     for x in stuff:
>         x = somethingelse

That would be something to be debated. I don't
really mind much one way or the other.

--
Greg

From nnorwitz at gmail.com  Tue May  2 08:25:47 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 1 May 2006 23:25:47 -0700
Subject: [Python-Dev] [Python-checkins] r45850 - in python/trunk:
	Doc/lib/libfuncs.tex Lib/test/test_subprocess.py Misc/NEWS
	Objects/fileobject.c Python/bltinmodule.c
In-Reply-To: <1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>
References: <20060502044316.408951E4010@bag.python.org>
	<1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>
Message-ID: <ee2a432c0605012325n2467cef8g9259d121b71e67a8@mail.gmail.com>

On 5/1/06, Tim Peters <tim.peters at gmail.com> wrote:
> > Author: neal.norwitz
> > Date: Tue May  2 06:43:14 2006
> > New Revision: 45850
> >
> > Modified:
> >    python/trunk/Doc/lib/libfuncs.tex
> >    python/trunk/Lib/test/test_subprocess.py
> >    python/trunk/Misc/NEWS
> >    python/trunk/Objects/fileobject.c
> >    python/trunk/Python/bltinmodule.c
> > Log:
> > SF #1479181: split open() and file() from being aliases for each other.
>
> Umm ... why?  I suppose I wouldn't care, except it left

I'll let Aahz answer that, it's his patch.

> test_subprocess failing on all the Windows buildbots, and I don't feel
> like figuring out why.  To a first approximation,
> test_universal_newlines_communicate() now takes the
>
>                 # Interpreter without universal newline support
>
> branch on Windows, but shouldn't.

I tried to fix that breakage.

n

From tim.peters at gmail.com  Tue May  2 08:51:13 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 2 May 2006 02:51:13 -0400
Subject: [Python-Dev] [Python-checkins] r45850 - in python/trunk:
	Doc/lib/libfuncs.tex Lib/test/test_subprocess.py Misc/NEWS
	Objects/fileobject.c Python/bltinmodule.c
In-Reply-To: <ee2a432c0605012325n2467cef8g9259d121b71e67a8@mail.gmail.com>
References: <20060502044316.408951E4010@bag.python.org>
	<1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>
	<ee2a432c0605012325n2467cef8g9259d121b71e67a8@mail.gmail.com>
Message-ID: <1f7befae0605012351k4655e6ebr1840d4661df3c01e@mail.gmail.com>

[Tim]
> ...
>> test_subprocess failing on all the Windows buildbots

[Neal]
> I tried to fix that breakage.

You succeeded!  Thanks.

From tim.peters at gmail.com  Tue May  2 09:47:42 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 2 May 2006 03:47:42 -0400
Subject: [Python-Dev] [Python-checkins] r45850 - in python/trunk:
	Doc/lib/libfuncs.tex Lib/test/test_subprocess.py Misc/NEWS
	Objects/fileobject.c Python/bltinmodule.c
In-Reply-To: <e37007$rpe$1@sea.gmane.org>
References: <20060502044316.408951E4010@bag.python.org>
	<1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>
	<e37007$rpe$1@sea.gmane.org>
Message-ID: <1f7befae0605020047o4c138450vd4dbc1385fa422a9@mail.gmail.com>

>>> SF #1479181: split open() and file() from being aliases for each other.

>> Umm ... why?

[/F]
> so that introspection tools can support GvR's pronouncement that "open"
> should be used to open files, and "file" should be used as a type representing
> standard (current stdio-based) file handles.

Maybe some of the intended changes are missing?  The post-patch
docstrings don't draw this distinction, and I'm lost about what else
introspection tools could be looking at to make the distinction
(special-casing the names?  but they could have done that before):

"""
>>> print open.__doc__
open(name[, mode[, buffering]]) -> file object

Open a file using the file() type, returns a file object.

>>> print file.__doc__
file(name[, mode[, buffering]]) -> file object

Open a file.  The mode can be 'r', 'w' or 'a' for reading (default),
writing or appending.  The file will be created if it doesn't exist
when opened for writing or appending; it will be truncated when
opened for writing.  Add a 'b' to the mode for binary files.
Add a '+' to the mode to allow simultaneous reading and writing.
If the buffering argument is given, 0 means unbuffered, 1 means line
buffered, and larger numbers specify the buffer size.
Add a 'U' to mode to open the file for input with universal newline
support.  Any line ending in the input file will be seen as a '\n'
in Python.  Also, a file so opened gains the attribute 'newlines';
the value for this attribute is one of None (no newline read yet),
'\r', '\n', '\r\n' or a tuple containing all the newline types seen.

'U' cannot be combined with 'w' or '+' mode.

>>>
"""

In Python 2.4, the docstrings were of course the same (the current
trunk's file.__doc__ except for a line at the end).  Since all useful
info about how to open a file has been purged from open()'s docstring,
if you've got the intent right, looks like the implementation got it
backwards ;-)

OK, maybe this is what you have in mind:

>>> type(open)
<type 'builtin_function_or_method'>
>>> type(file)
<type 'type'>

Fine, if that's all there really is to this, but I think it's a net
loss if open()'s docstring regression stands -- someone should finish
this job, or it should be reverted.

From jcarlson at uci.edu  Tue May  2 10:09:40 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 02 May 2006 01:09:40 -0700
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
In-Reply-To: <4456F9F5.1050701@canterbury.ac.nz>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>
	<4456F9F5.1050701@canterbury.ac.nz>
Message-ID: <20060502010809.677B.JCARLSON@uci.edu>


Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
> Delaney, Timothy (Tim) wrote:
> 
> > So would this also be a SyntaxError?
> > 
> >     for x in stuff:
> >         x = somethingelse
> 
> That would be something to be debated. I don't
> really mind much one way or the other.


    for line in lines:
        line = line.rstrip()
        ...

I'm generally -0 on the "raise a SyntaxError" in this particular case,
and am +0 on the double use below:

    for x in y:
        for x in z:
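
A minimal sketch of what both patterns do today, assuming current
semantics where the loop target is an ordinary local name (the sample data
is invented):

```python
# Rebinding the loop variable inside the body is harmless: the for
# statement fetches the next item from the iterator on every pass.
lines = ["alpha  \n", "beta\t\n"]
stripped = []
for line in lines:
    line = line.rstrip()      # rebinds the name only; the list is untouched
    stripped.append(line)

# Reusing the same name in a nested loop is legal, but the inner loop
# overwrites the outer value for the rest of the outer body.
seen = []
for x in "ab":
    for x in (1, 2):
        seen.append(x)
    seen.append(x)            # x is now 2, not "a" or "b"
```

The first loop is the common idiom; the second runs but loses the outer
value, which is the sort of thing a lint tool can flag.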


 - Josiah


From pedronis at strakt.com  Tue May  2 10:25:36 2006
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 02 May 2006 10:25:36 +0200
Subject: [Python-Dev] Reminder: call for proposals "Python Language and
 Libraries Track" for Europython 2006
Message-ID: <44571780.8080808@strakt.com>

"""
Python Language and Libraries

A track about Python the Language, all batteries included. Talks about 
the language, language evolution, patterns and idioms, implementations 
(CPython, IronPython, Jython, PyPy ...) and implementation issues belong 
to the track. So do talks about the standard library or interesting 
3rd-party libraries (and frameworks), unless the gravitational pull of 
other tracks is stronger.
"""

Of course talks to detail new features in the upcoming 2.5 release are 
more than welcome.

Deadline is 31st of May.

The full call and submission links are at:

http://www.europython.org/sections/tracks_and_talks/call-for-proposals

Thanks, Samuele Pedroni

From fredrik at pythonware.com  Tue May  2 10:47:08 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 2 May 2006 10:47:08 +0200
Subject: [Python-Dev] [Python-checkins] r45850 - in python/trunk:
	Doc/lib/libfuncs.tex Lib/test/test_subprocess.py Misc/NEWS
	Objects/fileobject.c Python/bltinmodule.c
References: <20060502044316.408951E4010@bag.python.org><1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com><e37007$rpe$1@sea.gmane.org>
	<1f7befae0605020047o4c138450vd4dbc1385fa422a9@mail.gmail.com>
Message-ID: <e376ac$eud$1@sea.gmane.org>

Tim Peters wrote:

>>>> SF #1479181: split open() and file() from being aliases for each other.
>
>>> Umm ... why?
>
> [/F]
>> so that introspection tools can support GvR's pronouncement that "open"
>> should be used to open files, and "file" should be used as a type representing
>> standard (current stdio-based) file handles.
>
> Maybe some of the intended changes are missing?  The post-patch
> docstrings don't draw this distinction, and I'm lost about what else
> introspection tools could be looking at to make the distinction
> (special-casing the names?  but they could have done that before):
>
> """
>>>> print open.__doc__
> open(name[, mode[, buffering]]) -> file object
>
> Open a file using the file() type, returns a file object.
>
>>>> print file.__doc__
> file(name[, mode[, buffering]]) -> file object
>
> Open a file.  The mode can be 'r', 'w' or 'a' for reading (default),
> writing or appending.  The file will be created if it doesn't exist
> when opened for writing or appending; it will be truncated when
> opened for writing.  Add a 'b' to the mode for binary files.
> Add a '+' to the mode to allow simultaneous reading and writing.
> If the buffering argument is given, 0 means unbuffered, 1 means line
> buffered, and larger numbers specify the buffer size.
> Add a 'U' to mode to open the file for input with universal newline
> support.  Any line ending in the input file will be seen as a '\n'
> in Python.  Also, a file so opened gains the attribute 'newlines';
> the value for this attribute is one of None (no newline read yet),
> '\r', '\n', '\r\n' or a tuple containing all the newline types seen.
>
> 'U' cannot be combined with 'w' or '+' mode.
>
>>>>
> """
>
> In Python 2.4, the docstrings were of course the same (the current
> trunk's file.__doc__ except for a line at the end).  Since all useful
> info about how to open a file has been purged from open()'s docstring,
> if you've got the intent right, looks like the implementation got it
> backwards ;-)

agreed.  imho, the detailed description should be moved to open, and the file
docstring should refer to open for a full description of how the arguments are
used.

and the open docstring could say something like "returns a file object.  in the
current version, this is always an instance of the file() type."

unless I'm missing something here, of course.

(I was asked to review this patch, but those mails arrived around 2 am, so I
didn't see them until now...)

</F> 




From ncoghlan at gmail.com  Tue May  2 12:13:16 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 02 May 2006 20:13:16 +1000
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
 decorator
In-Reply-To: <ca471dc20605010718v769d7215l81a83029e70ec5a3@mail.gmail.com>
References: <4454BA62.5080704@iinet.net.au>	
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>	
	<4455EB14.6000709@canterbury.ac.nz>
	<4455F415.3040104@gmail.com>	 <4455FBFF.3080301@gmail.com>	
	<8D9580FC-95C5-4A15-994C-6F91E31E3F45@fuhm.net>
	<ca471dc20605010718v769d7215l81a83029e70ec5a3@mail.gmail.com>
Message-ID: <445730BC.9030708@gmail.com>

Guido van Rossum wrote:
> On 5/1/06, James Y Knight <foom at fuhm.net> wrote:
>> Don't forget that the majority of users will never have heard any of
>> these discussions nor have used 2.5a1 or 2.5a2. Choose the best term
>> for them, not for the readers of python-dev.
> 
> I couldn't agree more! (Another thought, occasionally useful, is to
> consider that surely the number of Python programs yet to be written,
> and the number of Python programmers who yet have to learn the
> language, must surely exceed the current count. At least, one would
> hope so -- if that's not true, we might as well stop now. :-)

If reason 1 had been my only reason for agreeing with Greg, I wouldn't have 
said anything :)

The conflict with 'managed code' and thinking about the number of objects 
named 'manager' I've seen in different programs were enough to give me pause, 
though.

I've got no real idea as to how we can get a better feel for which terminology 
would be clearer to people that haven't already been exposed to this topic for 
months, though :(

> Nick, do you have it in you to fix PEP 343? Or at least come up with a
> draft patch? We can take this off-line; with all the +0's and +1's
> coming in I'm pretty comfortable with this change now, although we
> should probably wait until later today to commit.

I can handle the PEP (just getting rid of __context__ for now). I'll willingly 
cede the joy of actually fixing SVN to someone else, though :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Tue May  2 12:36:42 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 02 May 2006 20:36:42 +1000
Subject: [Python-Dev] Adding functools.decorator
In-Reply-To: <ca471dc20605010751s50394141lf2a966beb15882b9@mail.gmail.com>
References: <4454C203.2060707@iinet.net.au>
	<e32kqa$bq6$1@sea.gmane.org>	<ca471dc20604300936s2a1bdf0fof8395428230963d9@mail.gmail.com>	<e32r0a$s69$1@sea.gmane.org>
	<ca471dc20605010751s50394141lf2a966beb15882b9@mail.gmail.com>
Message-ID: <4457363A.8050802@gmail.com>

Guido van Rossum wrote:
> On 4/30/06, Georg Brandl <g.brandl at gmx.net> wrote:
>> Guido van Rossum wrote:
>>> I expect that at some point people will want to tweak what gets copied
>>> by _update_wrapper() -- e.g. some attributes may need to be
>>> deep-copied, or personalized, or skipped, etc.
>> What exactly do you have in mind there? If someone wants to achieve this,
>> she can write her own version of @decorator.
> 
> I meant that the provided version should make writing your own easier
> than copying the source and editing it. Some form of subclassing might
> make sense, or a bunch of smaller functions that can be called for
> various actions. You'll probably have to discover some real use cases
> before you'll be able to design the right API for this.

Maybe we should just expose a basic interface to replace the four lines of 
boilerplate (the docstring makes this look more complicated than it really is!):

   WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')
   WRAPPER_UPDATES = ('__dict__',)  # trailing comma: a 1-tuple, not a string
   def update_wrapper(wrapper, wrapped,
                      assigned = WRAPPER_ASSIGNMENTS,
                      updated = WRAPPER_UPDATES):
       """Update a wrapper function to look like the wrapped function

          Attributes of the wrapped function named in the assigned
          argument are assigned directly to the corresponding attributes
          of the wrapper function.
          The update() method of each wrapper function attribute named in
          the updated argument is called with the corresponding attribute
          of the wrapped function as its sole argument.
       """
       for attr in assigned:
           setattr(wrapper, attr, getattr(wrapped, attr))
       for attr in updated:
           getattr(wrapper, attr).update(getattr(wrapped, attr))
       # Return the wrapper so the function can also be used as a decorator
       return wrapper


The two global constants provide clear documentation of the default behaviour 
and the keyword arguments make it easy to copy or update a couple of extra 
attributes if you need to, or to prevent the standard copying.

Including the return statement allows this to be used as a decorator if you 
prefer:

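from functools import partial, update_wrapper

# Bind func by keyword so the decorated function is passed as 'wrapper'
@partial(update_wrapper, wrapped=func)
def wrapper(*args, **kwds):
     # wrap func here.
     return func(*args, **kwds)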
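
As a fuller sketch of the decorator form, assuming functools provides
update_wrapper with the signature proposed above and returns the wrapper
(as the standard library version does in later releases); logged, add and
the calls attribute are hypothetical names used only for illustration:

```python
from functools import partial, update_wrapper


def logged(func):
    """Hypothetical decorator that records each call made to func."""
    calls = []

    # Passing func as wrapped= leaves the first positional slot free for
    # the function being decorated, which becomes the 'wrapper' argument.
    @partial(update_wrapper, wrapped=func)
    def wrapper(*args, **kwds):
        calls.append((args, kwds))
        return func(*args, **kwds)

    wrapper.calls = calls
    return wrapper


@logged
def add(a, b):
    """Return a + b."""
    return a + b


result = add(2, 3)
```

After decoration, add still reports its own __name__ and __doc__ rather
than those of the inner wrapper function.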

Cheers,
Nick.



-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Tue May  2 13:04:06 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 02 May 2006 21:04:06 +1000
Subject: [Python-Dev] signature object issues (to discuss while I am out
 of	contact)
In-Reply-To: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
Message-ID: <44573CA6.5010608@gmail.com>

Brett Cannon wrote:
> One is whether a signature object should be automatically created for
> every function.  As of right now the PEP I am drafting has it on a
> per-need basis and have it assigned to __signature__ through a
> built-in function or putting it 'inspect'.  Now automatically creating
> the object would possibly make it more useful, but it could also be
> considered overkill.  Also not doing it automatically allows signature
> objects to possibly make more sense for classes (to represent
> __init__) and instances (to represent __call__).  But having that same
> support automatically feels off for some reason to me.

My current impulse is to put the signature object in the inspect module to 
start with, and not give it a special attribute at all.

All of the use cases I can think of (introspection for documentation purposes 
or argument checking purposes) don't really suffer either way regardless of 
whether the signature retrieval is spelt "obj.__signature__" or 
"inspect.getsignature(obj)".

Since it doesn't make much difference from a usability point of view, let's 
start with the object in the library module first.

> The second question is whether it is worth providing a function that
> will either figure out if a tuple and dict representing arguments
> would work in calling the function.  Some have even suggested a
> function that returns the actual bindings if the call were to occur. 
> Personally I don't see a huge use for either, but even less for the
> latter version.  If people have a legit use case for either please
> speak up, otherwise I am tempted to keep the object simple.

A "bind" method on the signature objects is pretty much essential for any kind 
of argument checking usage.

In addition to Aahz's precondition checking example, Talin gave a good example 
on the Py3k list of a function decorator for logging all calls to a function, 
and including the argument bindings in the log message.

And just in case you think the operation would be easy to implement if you 
need it, I've included below the bind method from the signature object I wrote 
to play around with the ideas posted to the Py3k list. It took a fair bit of 
work to get it to spit out the right answers :)

Cheers,
Nick.

     def bind(*args, **kwds):
         """Return a dict mapping parameter names to bound arguments"""
         self = args[0]
         args = args[1:]
         bound_params = {}
         num_args = len(args)
         missing_args = set(self.required_args)
         arg_names = self.required_args + self.optional_args
         num_names = len(arg_names)
         # Handle excess positional arguments
         if self.extra_args:
             bound_params[self.extra_args] = tuple(args[num_names:])
         elif num_args > num_names:
             self._raise_args_error(num_args)
         # Bind positional arguments
         for name, value in zip(arg_names, args):
             bound_params[name] = value
             missing_args -= set((name,))
         # Bind keyword arguments (and handle excess)
         if self.extra_kwds:
             extra_kwds = dict()
             bound_params[self.extra_kwds] = extra_kwds
         else:
             extra_kwds = None
         for name, value in kwds.items():
             if name in bound_params:
                 raise TypeError(
                     "Got multiple values for argument '%s'" % name)
             elif name in arg_names:
                 missing_args -= set((name,))
                 bound_params[name] = value
             elif extra_kwds is not None:
                 extra_kwds[name] = value
             else:
                 raise TypeError(
                     "Got unexpected keyword argument '%s'" % name)
         if missing_args:
             self._raise_args_error(num_args)
         # All done
         return bound_params
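
For comparison, a short sketch of the same operation using the signature
support that later Python versions (3.3 onward) ship in the inspect module.
This is not the prototype above; describe() and the sample arguments are
hypothetical.

```python
import inspect


def describe(name, age, *extra, phone=None, **kwds):
    """Hypothetical function whose calls we want to check or log."""


sig = inspect.signature(describe)

# bind() maps the call's arguments onto parameter names without
# actually calling the function.
bound = sig.bind("Ann", 30, "spare", phone="555-0100", city="Oslo")
bindings = dict(bound.arguments)

# A call that could not succeed is rejected with TypeError, which is
# what makes bind() usable for argument checking.
try:
    sig.bind("Ann")           # 'age' is missing
except TypeError:
    missing_detected = True
else:
    missing_detected = False
```

Excess positional arguments end up under the *extra name and excess
keywords under the **kwds name, much as in the prototype's extra_args and
extra_kwds handling.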


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From greg.ewing at canterbury.ac.nz  Tue May  2 13:07:33 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 02 May 2006 23:07:33 +1200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e36f6e$pns$1@sea.gmane.org>
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de> <e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de> <e36f6e$pns$1@sea.gmane.org>
Message-ID: <44573D75.40007@canterbury.ac.nz>

Terry Reedy wrote:

> my way to call your example (given the data in separate variables):
>   make_person(name, age, phone, location)
> your way:
>   make_person(name=name, age=age, phone=phone, location = location)

For situations like that, I've sometimes thought
it would be useful to be able to say something like

   make_person(=name, =age, =phone, =location)

> It is a feature of Python that arguments can usually be matched to 
> parameters either by position or name, as the *caller* chooses.

Except that the caller doesn't always get that option
even now, if the callee has chosen to use *args or **kwds.
So I wouldn't consider that a very strong argument.

> And they need not; it is right there in front of them.  As for writers, 
> modern IDEs should try to list the parameter signature upon typing 'func('.

Which is extremely difficult for an IDE to do without
static type information.

--
Greg

From greg.ewing at canterbury.ac.nz  Tue May  2 13:13:56 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 02 May 2006 23:13:56 +1200
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
In-Reply-To: <20060502010809.677B.JCARLSON@uci.edu>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>
	<4456F9F5.1050701@canterbury.ac.nz>
	<20060502010809.677B.JCARLSON@uci.edu>
Message-ID: <44573EF4.9080306@canterbury.ac.nz>

Josiah Carlson wrote:

>     for line in lines:
>         line = line.rstrip()
>         ...
> 
> I'm generally -0 on the "raise a SyntaxError" in this particular case,

That's a good point. I'm inclined to agree.
I think I might have even done something like
that recently, but I can't remember the
details.

> and am +0 on the double use below:
> 
>     for x in y:
>         for x in z:

Can anyone think of a plausible use case for that?

--
Greg

From ncoghlan at gmail.com  Tue May  2 13:28:49 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 02 May 2006 21:28:49 +1000
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
In-Reply-To: <44573EF4.9080306@canterbury.ac.nz>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>	<4456F9F5.1050701@canterbury.ac.nz>	<20060502010809.677B.JCARLSON@uci.edu>
	<44573EF4.9080306@canterbury.ac.nz>
Message-ID: <44574271.7050408@gmail.com>

Greg Ewing wrote:
> Josiah Carlson wrote:
>> and am +0 on the double use below:
>>
>>     for x in y:
>>         for x in z:
> 
> Can anyone think of a plausible use case for that?

This really seems more like the domain of pychecker/pylint rather than the 
compiler. The code may be a bad idea, but I don't see any reason to make it a 
syntax error.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Tue May  2 13:32:15 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 02 May 2006 21:32:15 +1000
Subject: [Python-Dev] global variable modification in functions [Re:
 elimination of scope bleeding of iteration variables]
In-Reply-To: <5.1.1.6.0.20060501112424.03760840@mail.telecommunity.com>
References: <445584BB.8090708@666.com>
	<44532F0B.5050804@666.com>	<4454C87A.7020701@gmail.com>
	<445584BB.8090708@666.com>
	<5.1.1.6.0.20060501112424.03760840@mail.telecommunity.com>
Message-ID: <4457433F.8080306@gmail.com>

Phillip J. Eby wrote:
> And for the case where the compiler can tell the variable is accessed 
> before it's defined, there's definitely something wrong.  This code, for 
> example, is definitely missing a "global" and the compiler could in 
> principle tell:
> 
>      foo = 1
> 
>      def bar():
>          foo+=1
> 
> So I see no problem (in principle, as opposed to implementation) with 
> issuing a warning or even a compilation error for that code.  (And it's 
> wrong even if the snippet I showed is in a nested function definition, 
> although the error would be different.)
> 
> If I recall correctly, the new compiler uses a control-flow graph that 
> could possibly be used to determine whether there is a path on which a 
> local could be read before it's stored.

I think symtable.c could wander back up the lexical block stack checking that 
target names are defined for augmented assignment statements.
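
As a rough illustration of the information involved (not a patch to
symtable.c), the sketch below uses the Python-level symtable module to
show that the compiler already classifies foo as local to bar(), and then
shows the runtime failure that results. The helper name local_names is an
assumption made for this example.

```python
import symtable

SOURCE = """
foo = 1

def bar():
    foo += 1
"""


def local_names(source, func_name):
    """Return the names the compiler treats as local in func_name."""
    top = symtable.symtable(source, "<example>", "exec")
    for child in top.get_children():
        if child.get_name() == func_name:
            return sorted(
                sym.get_name()
                for sym in child.get_symbols()
                if sym.is_local()
            )
    raise LookupError(func_name)


print(local_names(SOURCE, "bar"))

# Running the code shows the error a compile-time check would anticipate.
namespace = {}
exec(SOURCE, namespace)
try:
    namespace["bar"]()
except UnboundLocalError as exc:
    print("runtime error:", exc)
```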

That said, while I'd be happy to review a patch, I'm not going to try to write 
one :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From aahz at pythoncraft.com  Tue May  2 16:02:37 2006
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 2 May 2006 07:02:37 -0700
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
In-Reply-To: <44574271.7050408@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6AB@au3010avexu1.global.avaya.com>
	<4456F9F5.1050701@canterbury.ac.nz>
	<20060502010809.677B.JCARLSON@uci.edu>
	<44573EF4.9080306@canterbury.ac.nz> <44574271.7050408@gmail.com>
Message-ID: <20060502140237.GA8154@panix.com>

On Tue, May 02, 2006, Nick Coghlan wrote:
> Greg Ewing wrote:
>> Josiah Carlson wrote:
>>>
>>> and am +0 on the double use below:
>>>
>>>     for x in y:
>>>         for x in z:
>> 
>> Can anyone think of a plausible use case for that?
> 
> This really seems more like the domain of pychecker/pylint rather than
> the compiler. The code may be a bad idea, but I don't see any reason
> to make it a syntax error.

My sentiments precisely.  Unless someone can come up with a good reason
for changing the current semantics, I'm -1.

Side note: if people do want to continue discussing this, I think it
should go to python-3000.  There is absolutely no reason to break
currently-running code that happens to use this pattern.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From amk at amk.ca  Tue May  2 17:32:21 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 2 May 2006 11:32:21 -0400
Subject: [Python-Dev] Date for DC-area Python sprint?
Message-ID: <20060502153221.GA29734@rogue.amk.ca>

I'm working on setting up a sprint in the Washington DC area;
currently I have a lead that would be in Arlington.  I'd like to
discuss the date on python-dev in order to coordinate with other
events.

The sprint has no particular goal, so people might just hack on their
core-related projects.  It certainly doesn't need to be limited to
core Python; if space is available, people could certainly come to
work on other projects.

I'm thinking May 27th might be a good date.  The Need For Speed sprint
is May 21-28 (Sunday to Sunday), so the DC-area people can coordinate
with the other sprinters via IRC.  On the 21st the Need For Speed
sprinters might be scattered, with people still arriving; on the 28th
people will be leaving.  The 27th seems like a good compromise (though
maybe everyone in Iceland will be burnt out by that point).

Also, does someone want to organize a bug day for the same date, or a
sprint on the West Coast or in the Midwest?

--amk


From aahz at pythoncraft.com  Tue May  2 16:34:28 2006
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 2 May 2006 07:34:28 -0700
Subject: [Python-Dev] [Python-checkins] r45850 - in python/trunk:
	Doc/lib/libfuncs.tex Lib/test/test_subprocess.py Misc/NEWS
	Objects/fileobject.c Python/bltinmodule.c
In-Reply-To: <1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>
References: <20060502044316.408951E4010@bag.python.org>
	<1f7befae0605012252s1c563b04o8ed6a797d2861f87@mail.gmail.com>
Message-ID: <20060502143428.GA4605@panix.com>

On Tue, May 02, 2006, Tim Peters wrote:
>
>> Author: neal.norwitz
>> Date: Tue May  2 06:43:14 2006
>> New Revision: 45850
>>
>> Modified:
>>    python/trunk/Lib/test/test_subprocess.py
>>    python/trunk/Objects/fileobject.c
>>    python/trunk/Python/bltinmodule.c
>> Log:
>> SF #1479181: split open() and file() from being aliases for each other.
> 
> Umm ... why?  I suppose I wouldn't care, except it left
> test_subprocess failing on all the Windows buildbots, and I don't feel
> like figuring out why.  To a first approximation,
> test_universal_newlines_communicate() now takes the
> 
>                 # Interpreter without universal newline support
> 
> branch on Windows, but shouldn't.

That's a bug in the Windows implementation or in the docs.
test_subprocess failed because open() was no longer an alias for file();
I originally planned to just s/open/file/ but when I read the docs, the
docs said that the newlines attribute was supposed to go on file objects, not
the file type.  Neal's decision to use my original idea is probably fine,
but I'm not sure.

http://docs.python.org/dev/lib/bltin-file-objects.html

I'll answer your other post later when I have more time, but you might
check

http://mail.python.org/pipermail/python-dev/2005-December/thread.html#59073

for the basic history.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From guido at python.org  Tue May  2 16:41:25 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 07:41:25 -0700
Subject: [Python-Dev] More on contextlib - adding back a contextmanager
	decorator
In-Reply-To: <445730BC.9030708@gmail.com>
References: <4454BA62.5080704@iinet.net.au>
	<ca471dc20604300953q112991baw698564024b0eec15@mail.gmail.com>
	<4455EB14.6000709@canterbury.ac.nz> <4455F415.3040104@gmail.com>
	<4455FBFF.3080301@gmail.com>
	<8D9580FC-95C5-4A15-994C-6F91E31E3F45@fuhm.net>
	<ca471dc20605010718v769d7215l81a83029e70ec5a3@mail.gmail.com>
	<445730BC.9030708@gmail.com>
Message-ID: <ca471dc20605020741g72f0eaf2w461b7e077baf3c78@mail.gmail.com>

On 5/2/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Nick, do you have it in you to fix PEP 343? Or at least come up with a
> > draft patch? We can take this off-line; with all the +0's and +1's
> > coming in I'm pretty comfortable with this change now, although we
> > should probably wait until later today to commit.
>
> I can handle the PEP (just getting rid of __context__ for now). I'll willingly
> cede the joy of actually fixing SVN to someone else, though :)

OK, if you fix the PEP, I'll fix the code to match; I added most of
those __context__ methods so I can delete them. Unless someone else
beats me to it. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  2 16:42:07 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 07:42:07 -0700
Subject: [Python-Dev] Adding functools.decorator
In-Reply-To: <4457363A.8050802@gmail.com>
References: <4454C203.2060707@iinet.net.au> <e32kqa$bq6$1@sea.gmane.org>
	<ca471dc20604300936s2a1bdf0fof8395428230963d9@mail.gmail.com>
	<e32r0a$s69$1@sea.gmane.org>
	<ca471dc20605010751s50394141lf2a966beb15882b9@mail.gmail.com>
	<4457363A.8050802@gmail.com>
Message-ID: <ca471dc20605020742k3b2abf0epd32f5335bd4ac294@mail.gmail.com>

On 5/2/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Maybe we should just expose a basic interface to replace the four lines of
> boilerplate (the docstring makes this look more complicated than it really is!):

+1

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May  2 16:44:00 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 07:44:00 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <44573D75.40007@canterbury.ac.nz>
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de> <e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de> <e36f6e$pns$1@sea.gmane.org>
	<44573D75.40007@canterbury.ac.nz>
Message-ID: <ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>

On 5/2/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Terry Reedy wrote:
>
> > my way to call your example (given the data in separate variables):
> >   make_person(name, age, phone, location)
> > your way:
> >   make_person(name=name, age=age, phone=phone, location = location)
>
> For situations like that, I've sometimes thought
> it would be useful to be able to say something like
>
>    make_person(=name, =age, =phone, =location)

And even with Terry's use case quoted I can't make out what you meant
that to do.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Tue May  2 19:10:36 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 02 May 2006 19:10:36 +0200
Subject: [Python-Dev] more pyref: a better term for "string conversion"
In-Reply-To: <ca471dc20605011405o45107205ved003cb33d9b6386@mail.gmail.com>
References: <e35qkc$5t8$1@sea.gmane.org> <44567593.2010103@v.loewis.de>
	<ca471dc20605011405o45107205ved003cb33d9b6386@mail.gmail.com>
Message-ID: <e383qd$q4n$1@sea.gmane.org>

Guido van Rossum wrote:
> Backticks certainly are deprecated -- Py3k won't have them (nor will
> they become available for other syntax; they are undesirable
> characters due to font issues and the tendency of word processing
> tools to generate backticks in certain cases where you type forward
> ticks).

And would it be acceptable for 2.5 to issue a DeprecationWarning for
backticks?

Georg


From rhettinger at ewtllc.com  Tue May  2 19:24:33 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Tue, 02 May 2006 10:24:33 -0700
Subject: [Python-Dev] more pyref: a better term for "string conversion"
In-Reply-To: <e383qd$q4n$1@sea.gmane.org>
References: <e35qkc$5t8$1@sea.gmane.org>
	<44567593.2010103@v.loewis.de>	<ca471dc20605011405o45107205ved003cb33d9b6386@mail.gmail.com>
	<e383qd$q4n$1@sea.gmane.org>
Message-ID: <445795D1.8050403@ewtllc.com>

Georg Brandl wrote:

>Guido van Rossum wrote:
>  
>
>>Backticks certainly are deprecated -- Py3k won't have them (nor will
>>they become available for other syntax; they are undesirable
>>characters due to font issues and the tendency of word processing
>>tools to generate backticks in certain cases where you type forward
>>ticks).
>>    
>>
>
>And would it be acceptable for 2.5 to issue a DeprecationWarning for
>backticks?
>
>  
>
No.  Backticks do not disappear until Py3.0 and we're not going to 
muck-up the 2.x series with warnings and deprecations for everything 
that is going to change in Py3.0.


Raymond

From guido at python.org  Tue May  2 19:29:07 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 10:29:07 -0700
Subject: [Python-Dev] test failures in test_ctypes (HEAD)
Message-ID: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>

I see test failures in current HEAD on my Google Red Hat Linux desktop
that the buildbots don't seem to have:

./python -E -tt ../Lib/test/regrtest.py test_ctypes
test_ctypes
test test_ctypes failed -- errors occurred; run in verbose mode for details

More details from running this manually:
$ ./python ../Lib/test/test_ctypes.py
.
. (lots of passing tests; then:)
.
test_gl (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
test_glu (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
test_glut (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR

======================================================================
ERROR: test_gl (ctypes.test.test_find.Test_OpenGL_libs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/guido/projects/python/trunk/Lib/ctypes/test/test_find.py",
line 42, in setUp
    self.glut = CDLL(lib_glut)
  File "/home/guido/projects/python/trunk/Lib/ctypes/__init__.py",
line 288, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/libglut.so.3: undefined symbol: XGetExtensionVersion

======================================================================
ERROR: test_glu (ctypes.test.test_find.Test_OpenGL_libs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/guido/projects/python/trunk/Lib/ctypes/test/test_find.py",
line 42, in setUp
    self.glut = CDLL(lib_glut)
  File "/home/guido/projects/python/trunk/Lib/ctypes/__init__.py",
line 288, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/libglut.so.3: undefined symbol: XGetExtensionVersion

======================================================================
ERROR: test_glut (ctypes.test.test_find.Test_OpenGL_libs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/guido/projects/python/trunk/Lib/ctypes/test/test_find.py",
line 42, in setUp
    self.glut = CDLL(lib_glut)
  File "/home/guido/projects/python/trunk/Lib/ctypes/__init__.py",
line 288, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/libglut.so.3: undefined symbol: XGetExtensionVersion

----------------------------------------------------------------------

This seems libglut related -- I don't know what that is but find /lib
/usr/lib -name \*glut\* produces this output:

/usr/lib/libglut.so.3
/usr/lib/libglut.so.3.7

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From benji at zope.com  Tue May  2 19:34:52 2006
From: benji at zope.com (Benji York)
Date: Tue, 02 May 2006 13:34:52 -0400
Subject: [Python-Dev] Positional-only Arguments
Message-ID: <4457983C.5080109@zope.com>

I've not followed the PEP 3102 (keyword-only arguments) discussion
closely enough to know if this has been mentioned, but we were
discussing a need at work today for the ability to enforce position-only
arguments.

The specific instance was an argument that was intended to be used as a
positional argument which a group had begun using as a keyword argument
instead.  There was no way for the users to know it was intended for
positional use only.  And it makes refactoring the signature difficult.

Of course you could use *args from the get-go, but ugly signatures as
default isn't nice.

Is there any reason to pursue the idea, or is this mostly a misguided
desire for symmetry?
--
Benji York


From g.brandl at gmx.net  Tue May  2 19:46:56 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 02 May 2006 19:46:56 +0200
Subject: [Python-Dev] test failures in test_ctypes (HEAD)
In-Reply-To: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
References: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
Message-ID: <e385ug$39q$1@sea.gmane.org>

Guido van Rossum wrote:
> I see test failures in current HEAD on my Google Red Hat Linux desktop
> that the buildbots don't seem to have:
> 
> ./python -E -tt ../Lib/test/regrtest.py test_ctypes
> test_ctypes
> test test_ctypes failed -- errors occurred; run in verbose mode for details
> 
> More details from running this manually:
> $ ./python ../Lib/test/test_ctypes.py
> .
> . (lots of passing tests; then:)
> .
> test_gl (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> test_glu (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> test_glut (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> 
> ======================================================================
> ERROR: test_gl (ctypes.test.test_find.Test_OpenGL_libs)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/guido/projects/python/trunk/Lib/ctypes/test/test_find.py",
> line 42, in setUp
>     self.glut = CDLL(lib_glut)
>   File "/home/guido/projects/python/trunk/Lib/ctypes/__init__.py",
> line 288, in __init__
>     self._handle = _dlopen(self._name, mode)
> OSError: /usr/lib/libglut.so.3: undefined symbol: XGetExtensionVersion

You might be interested in
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1478253&group_id=5470

Georg


From ark at acm.org  Tue May  2 19:55:03 2006
From: ark at acm.org (Andrew Koenig)
Date: Tue, 2 May 2006 13:55:03 -0400
Subject: [Python-Dev] Any reason that any()/all() do not
	take a predicate argument?
In-Reply-To: <4440B49C.7040902@sweetapp.com>
Message-ID: <002401c66e11$8d352e90$6402a8c0@arkdesktop>


> > How about this?
> >
> > 	if any(x==5 for x in seq):
> 
> Aren't all of these equivalent to:
> 
> if 5 in seq:
>      ...

Of course.  However, the original example was pretty clearly intended to be
an illustrative instance of a more general problem.  Rewriting the example
as any(x==5 for x in seq) preserves the generality; rewriting it as 5 in seq
doesn't.
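
To make the point concrete, here is a small sketch; the predicate
is_negative and the sample data are made up for illustration. The
generator-expression form works with any predicate, while the "in" form
only covers equality tests.

```python
def is_negative(x):
    """Example predicate; any callable returning a truth value would do."""
    return x < 0


seq = [3, 5, -2, 8]

# The equality case: both spellings give the same answer.
has_five = any(x == 5 for x in seq)
same_as_in = 5 in seq

# The general case: there is no "in" spelling for an arbitrary predicate.
has_negative = any(is_negative(x) for x in seq)
all_small = all(abs(x) < 10 for x in seq)

print(has_five, same_as_in, has_negative, all_small)
```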




From theller at python.net  Tue May  2 19:57:32 2006
From: theller at python.net (Thomas Heller)
Date: Tue, 02 May 2006 19:57:32 +0200
Subject: [Python-Dev] test failures in test_ctypes (HEAD)
In-Reply-To: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
References: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
Message-ID: <44579D8C.6080808@python.net>

Guido van Rossum wrote:
> I see test failures in current HEAD on my Google Red Hat Linux desktop
> that the buildbots don't seem to have:
> 
> ./python -E -tt ../Lib/test/regrtest.py test_ctypes
> test_ctypes
> test test_ctypes failed -- errors occurred; run in verbose mode for details
> 
> More details from running this manually:
> $ ./python ../Lib/test/test_ctypes.py
> .
> . (lots of passing tests; then:)
> .
> test_gl (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> test_glu (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> test_glut (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR

Can you try this patch for Lib/ctypes/test/test_find, please?

Index: test_find.py
===================================================================
--- test_find.py	(Revision 45791)
+++ test_find.py	(Arbeitskopie)
@@ -39,9 +39,9 @@
         if lib_glu:
             self.glu = CDLL(lib_glu, RTLD_GLOBAL)
         if lib_glut:
-            self.glut = CDLL(lib_glut)
+            self.glut = CDLL(lib_glut, RTLD_GLOBAL)
         if lib_gle:
-            self.gle = CDLL(lib_gle)
+            self.gle = CDLL(lib_gle, RTLD_GLOBAL)
 
     if lib_gl:
         def test_gl(self):


The test should try to test the RTLD_GLOBAL flag for loading shared libs.
(The patch is pure guesswork, inspired by a recent thread on the ctypes-users list.
It would be great if someone could provide some insight into this, or suggest
a better test).
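
For readers unfamiliar with the flag, here is a minimal sketch of passing
RTLD_GLOBAL as the mode argument to CDLL. It is platform dependent: the
library name "m" is only an example, the helper load_global is
hypothetical, and the mode argument has no effect on Windows.

```python
import ctypes
import ctypes.util


def load_global(name):
    """Load a shared library with RTLD_GLOBAL so that its symbols are
    visible to libraries loaded afterwards. Returns None if the library
    cannot be found or loaded on this platform."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None


libm = load_global("m")
if libm is None:
    print("math library not available on this platform")
else:
    print("loaded", libm)
```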

Thomas


From brett at python.org  Tue May  2 20:43:56 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 2 May 2006 11:43:56 -0700
Subject: [Python-Dev] signature object issues (to discuss while I am out
	of contact)
In-Reply-To: <44573CA6.5010608@gmail.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>
	<44573CA6.5010608@gmail.com>
Message-ID: <bbaeab100605021143w532d102fpc33250ca662d8cd5@mail.gmail.com>

On 5/2/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > One is whether a signature object should be automatically created for
> > every function.  As of right now the PEP I am drafting has it on a
> > per-need basis and have it assigned to __signature__ through a
> > built-in function or putting it 'inspect'.  Now automatically creating
> > the object would possibly make it more useful, but it could also be
> > considered overkill.  Also not doing it automatically allows signature
> > objects to possibly make more sense for classes (to represent
> > __init__) and instances (to represent __call__).  But having that same
> > support automatically feels off for some reason to me.
>
> My current impulse is to put the signature object in the inspect module to
> start with, and don't give it a special attribute at all.
>
> All of the use cases I can think of (introspection for documentation purposes
> or argument checking purposes) don't really suffer either way regardless of
> whether the signature retrieval is spelt "obj.__signature__" or
> "inspect.getsignature(obj)".
>

It does for decorators.  How do you make sure that a decorator uses
the signature object of the wrapped function instead of the decorator?
 Or are you saying to just not worry about that right now?
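
For reference, a sketch of how this question plays out in later Python 3
releases, which postdate this thread: functools.wraps records the wrapped
function, and inspect.signature follows that link by default. The
decorator name logged and the function make_person are illustrative only.

```python
import functools
import inspect


def logged(func):
    """A pass-through decorator that keeps a link to the wrapped function."""
    @functools.wraps(func)  # sets wrapper.__wrapped__ = func
    def wrapper(*args, **kwds):
        return func(*args, **kwds)
    return wrapper


@logged
def make_person(name, age, phone=None):
    return (name, age, phone)


# By default the signature of the wrapped function is reported.
print(inspect.signature(make_person))
# The wrapper's own signature is still available on request.
print(inspect.signature(make_person, follow_wrapped=False))
```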

> Since it doesn't make much difference from a usability point of view, let's
> start with the object in the library module first.
>
> > The second question is whether it is worth providing a function that
> > will either figure out if a tuple and dict representing arguments
> > would work in calling the function.  Some have even suggested a
> > function that returns the actual bindings if the call were to occur.
> > Personally I don't see a huge use for either, but even less for the
> > latter version.  If people have a legit use case for either please
> > speak up, otherwise I am tempted to keep the object simple.
>
> A "bind" method on the signature objects is pretty much essential for any kind
> of argument checking usage.
>
> In addition to Aahz's precondition checking example, Talin gave a good example
> on the Py3k list of a function decorator for logging all calls to a function,
> and including the argument bindings in the log message.
>
> And just in case you think the operation would be easy to implement if you
> need it, I've included below the bind method from the signature object I wrote
> to play around with the ideas posted to the Py3k list. It took a fair bit of
> work to get it to spit out the right answers :)
>

Thanks, Nick!

-Brett

> Cheers,
> Nick.
>
>      def bind(*args, **kwds):
>          """Return a dict mapping parameter names to bound arguments"""
>          self = args[0]
>          args = args[1:]
>          bound_params = {}
>          num_args = len(args)
>          missing_args = set(self.required_args)
>          arg_names = self.required_args + self.optional_args
>          num_names = len(arg_names)
>          # Handle excess positional arguments
>          if self.extra_args:
>              bound_params[self.extra_args] = tuple(args[num_names:])
>          elif num_args > num_names:
>              self._raise_args_error(num_args)
>          # Bind positional arguments
>          for name, value in zip(arg_names, args):
>              bound_params[name] = value
>              missing_args -= set((name,))
>          # Bind keyword arguments (and handle excess)
>          if self.extra_kwds:
>              extra_kwds = dict()
>              bound_params[self.extra_kwds] = extra_kwds
>          else:
>              extra_kwds = None
>          for name, value in kwds.items():
>              if name in bound_params:
>                  raise TypeError(
>                      "Got multiple values for argument '%s'" % name)
>              elif name in arg_names:
>                  missing_args -= set((name,))
>                  bound_params[name] = value
>              elif extra_kwds is not None:
>                  extra_kwds[name] = value
>              else:
>                  raise TypeError(
>                      "Got unexpected keyword argument '%s'" % name)
>          if missing_args:
>              self._raise_args_error(num_args)
>          # All done
>          return bound_params
>
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> ---------------------------------------------------------------
>              http://www.boredomandlaziness.org
>
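
The bind operation quoted above can be tried out with the
inspect.Signature.bind method found in later Python 3 releases. This is a
sketch under that assumption, not the prototype from this thread; the
helper describe_call and the sample function are hypothetical.

```python
import inspect


def make_person(name, age, phone=None, *extra, **details):
    return name


def describe_call(func, *args, **kwds):
    """Return the argument bindings for a call, or an error message if
    the call would not match the function's signature."""
    try:
        bound = inspect.signature(func).bind(*args, **kwds)
    except TypeError as exc:
        return "invalid call: %s" % exc
    return dict(bound.arguments)


print(describe_call(make_person, "Ann", 30, location="Brisbane"))
print(describe_call(make_person, "Ann"))              # missing 'age'
print(describe_call(make_person, "Ann", 30, age=31))  # duplicate 'age'
```

Only arguments actually supplied appear in the mapping; defaults such as
phone are left out unless apply_defaults() is called on the bound object.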

From tjreedy at udel.edu  Tue May  2 21:46:56 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 2 May 2006 15:46:56 -0400
Subject: [Python-Dev] Positional-only Arguments
References: <4457983C.5080109@zope.com>
Message-ID: <e38cve$uhs$1@sea.gmane.org>


"Benji York" <benji at zope.com> wrote in message 
news:4457983C.5080109 at zope.com...
> I've not followed the PEP 3102 (keyword-only arguments) discussion
> closely enough to know if this has been mentioned, but we were
> discussing a need at work today for the ability to enforce position-only
> arguments.

You could discourage name use by not documenting the actual, internal name 
of the parameters.  Something like

def weather_guess(temp, pres):
   '''Guess weather from temperature in degrees Kelvin and pressure
      in millibars passed by position in that order.'''

Terry Jan Reedy




From guido at python.org  Tue May  2 21:48:25 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 12:48:25 -0700
Subject: [Python-Dev] Positional-only Arguments
In-Reply-To: <e38cve$uhs$1@sea.gmane.org>
References: <4457983C.5080109@zope.com> <e38cve$uhs$1@sea.gmane.org>
Message-ID: <ca471dc20605021248m6d68efe3n67134e85002e0234@mail.gmail.com>

I've used a double leading underscore on the name. Works great for methods!
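
A minimal sketch of why this works for methods: inside a class body a
parameter named with two leading underscores is name-mangled, so callers
outside the class cannot easily pass it by keyword. The class names are
made up, and the second class uses the "/" marker from later Python 3
releases (3.8 and up) for comparison.

```python
class Account:
    def deposit(self, __amount):
        # Inside the class, __amount is compiled as _Account__amount.
        return __amount * 2


class StrictAccount:
    def deposit(self, amount, /):
        # Parameters before "/" are positional-only (Python 3.8+).
        return amount * 2


acct = Account()
print(acct.deposit(5))

try:
    acct.deposit(__amount=5)  # the unmangled name is not a parameter
except TypeError as exc:
    print("rejected:", exc)

try:
    StrictAccount().deposit(amount=5)
except TypeError as exc:
    print("rejected:", exc)
```

The mangled name can still be used as a keyword by a determined caller,
so this discourages keyword use rather than forbidding it.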

On 5/2/06, Terry Reedy <tjreedy at udel.edu> wrote:
>
> "Benji York" <benji at zope.com> wrote in message
> news:4457983C.5080109 at zope.com...
> > I've not followed the PEP 3102 (keyword-only arguments) discussion
> > closely enough to know if this has been mentioned, but we were
> > discussing a need at work today for the ability to enforce position-only
> > arguments.
>
> You could discourage name use by not documenting the actual, internal name
> of the parameters.  Something like
>
> def weather_guess(temp, pres):
>    '''Guess weather from temperature in degrees Kelvin and pressure
>       in millibars passed by position in that order.'''
>
> Terry Jan Reedy
>
>
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rowen at cesmail.net  Tue May  2 22:22:45 2006
From: rowen at cesmail.net (Russell E. Owen)
Date: Tue, 02 May 2006 13:22:45 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de> <e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de> <e36f6e$pns$1@sea.gmane.org>
	<44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
Message-ID: <rowen-28229D.13224502052006@sea.gmane.org>

In article 
<ca471dc20605020744m333c855cqbf9016340ea5d699 at mail.gmail.com>,
 "Guido van Rossum" <guido at python.org> wrote:

> On 5/2/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> > Terry Reedy wrote:
> >
> > > my way to call your example (given the data in separate variables):
> > >   make_person(name, age, phone, location)
> > > your way:
> > >   make_person(name=name, age=age, phone=phone, location = location)
> >
> > For situations like that, I've sometimes thought
> > it would be useful to be able to say something like
> >
> >    make_person(=name, =age, =phone, =location)
> 
> And even with Terry's use case quoted I can't make out what you meant
> that to do.

I'm pretty sure he wants it to mean:

make_person(name=name, age=age, phone=phone, location=location).

In other words it's a shortcut to avoid needless repetition.

Personally I'd like some way to do that, but the initial "=" is pretty 
startling at first glance. Not that I have a better suggestion.


As far as the rest of the thread goes (and I may be late to the party on 
this), I personally would *love* to be able to write:
def func(arg0, arg1, *args, key1=def1):
and force key1 to be specified by name. I've coded this before using 
**kargs for the keyword-only args, but I'd much rather be able to list 
them in the def (making it more self-documenting).
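
A short sketch of the two spellings being compared. The first uses the
syntax that PEP 3102 later added in Python 3; the second is the
**kwargs workaround described above. The function names and the None
default are placeholders for illustration.

```python
def func(arg0, arg1, *args, key1=None):
    """key1 can only be passed by name; extra positionals go to args."""
    return (arg0, arg1, args, key1)


def func_emulated(arg0, arg1, *args, **kwargs):
    """The same behavior coded by hand with **kwargs."""
    key1 = kwargs.pop("key1", None)
    if kwargs:
        raise TypeError(
            "unexpected keyword arguments: %s" % ", ".join(sorted(kwargs)))
    return (arg0, arg1, args, key1)


print(func(1, 2, 3, 4))          # 4 lands in args, not key1
print(func(1, 2, 3, key1=4))     # key1 must be named
print(func_emulated(1, 2, 3, key1=4))
```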

But that's as far as I'd take it.

I don't see the point to keyword-only arguments that do not have default 
values. And I don't think it's worth the potential confusion to allow 
keyword-only args after a fixed # of positional args. The proposed 
syntax reads like exactly the wrong thing to me; "|" as a separator 
might do if one is desperate enough for this feature, i.e.:
def foo(arg0, arg1 | karg=None):

-- Russell


From benji at benjiyork.com  Tue May  2 22:59:06 2006
From: benji at benjiyork.com (Benji York)
Date: Tue, 02 May 2006 16:59:06 -0400
Subject: [Python-Dev] Positional-only Arguments
In-Reply-To: <e38cve$uhs$1@sea.gmane.org>
References: <4457983C.5080109@zope.com> <e38cve$uhs$1@sea.gmane.org>
Message-ID: <4457C81A.3050104@benjiyork.com>

Terry Reedy wrote:
> You could discourage name use by not documenting the actual, internal name 
> of the parameters.

The issue we had was that the name wasn't documented at all, the users 
simply looked at the code and began using the keyword name.  This may 
well be an area where "we're all adults here" wins.

OTOH there is a /slight/ possibility that it'd be better to disallow 
using an argument as a keyword unless explicitly flagged as such. 
Similar to how the C API works now.

This has the same advantages of the keyword-only PEP (3102): tools will 
know which arguments are positional and which are keyword.  Doing so 
would recast PEP 3102 from "how to get a keyword-only argument" into 
"how to get a keyword (non-positional) argument".  A downside would be 
the need to specify when an argument can be positional *or* keyword.
--
Benji York

From benji at benjiyork.com  Tue May  2 23:01:50 2006
From: benji at benjiyork.com (Benji York)
Date: Tue, 02 May 2006 17:01:50 -0400
Subject: [Python-Dev] Positional-only Arguments
In-Reply-To: <ca471dc20605021248m6d68efe3n67134e85002e0234@mail.gmail.com>
References: <4457983C.5080109@zope.com> <e38cve$uhs$1@sea.gmane.org>
	<ca471dc20605021248m6d68efe3n67134e85002e0234@mail.gmail.com>
Message-ID: <4457C8BE.8030208@benjiyork.com>

Guido van Rossum wrote:
> I've used a double leading underscore on the name. Works great for methods!

We discussed that.  My main issue with that is that it's possible/likely 
that all arguments should be positional by default, should they all then 
begin with underscores?  Makes for ugly function bodies (or lots of name 
rebinding).
--
Benji York


From guido at python.org  Tue May  2 23:49:28 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 14:49:28 -0700
Subject: [Python-Dev] mail to talin is bouncing
Message-ID: <ca471dc20605021449p74a5300ew7e61258efb45f495@mail.gmail.com>

Sorry to bother the list -- talin, mail to you is bouncing:

  ----- The following addresses had permanent fatal errors -----
<talin at viridia.org>

  ----- Transcript of session follows -----
... while talking to viridia.org
>>> RCPT To:<talin at viridia.org>
<<< 550 message to verify they are valid."


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tdelaney at avaya.com  Wed May  3 00:29:43 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Wed, 3 May 2006 08:29:43 +1000
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E6B3@au3010avexu1.global.avaya.com>

Josiah Carlson wrote:

>     for line in lines:
>         line = line.rstrip()
>         ...

Exactly the use case I was thinking of (and one I used yesterday BTW).

I'm -1 on *dis*allowing reusing a name bound in a for loop in any
construct i.e. +1 for the status quo.

Tim Delaney

From noamraph at gmail.com  Wed May  3 01:02:14 2006
From: noamraph at gmail.com (Noam Raphael)
Date: Wed, 3 May 2006 02:02:14 +0300
Subject: [Python-Dev] Alternative path suggestion
Message-ID: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>

Hello,

I saw the discussion about including the path type in the standard
library. As it turned out, I recently wrote a program which does quite
a lot of path manipulation. This caused me to think that the proposed
path module:

 * Makes path manipulation significantly easier
 * Can be improved.

So I tried to write my version of it. My basic problem with the
current proposed path module is that it's a bit... messy. It contains
a lot of methods, collected from various modules, and for me it looks
too crowded - there are too many methods and too many details for me
to easily learn.

So I tried to organize it all. I think that the result may make file
and path manipulation considerably easier.

Here are my ideas. It's a copy of what I posted a few minutes ago in
the wiki - you can view it at
http://wiki.python.org/moin/AlternativePathClass (it looks better
there).

You can find the implementation at
http://wiki.python.org/moin/AlternativePathModule?action=raw
(By the way, is there some "code wiki" available? It can simply be a
public svn repository. I think it will be useful for those things.)

All these are ideas - I would like to hear what you think about them.

= Major Changes =

== a tuple instead of a string ==

The biggest conceptual change is that my path object is a subclass of
''tuple'', not a subclass of str. For example,
{{{
>>> tuple(path('a/b/c'))
('a', 'b', 'c')
>>> tuple(path('/a/b/c'))
(path.ROOT, 'a', 'b', 'c')
}}}

This means that path objects aren't the string representation of a
path; they are a ''logical'' representation of a path. Remember why a
filesystem path is called a path - because it's a way to get from one
place on the filesystem to another. Paths can be relative, which means
that they don't define where the walk starts, or non-relative, which
means that they do. In the tuple representation, relative paths are
simply tuples of strings, and non-relative paths
are tuples of strings with a first "root" element.
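
A minimal sketch of the tuple representation, assuming POSIX-style
separators only. The marker class _Root, the isrel property body and the
__str__ rendering are illustrative guesses, not the code from the wiki
page; drives and UNC roots are left out.

```python
class _Root(object):
    """Marker object standing in for the filesystem root (hypothetical)."""

    def __repr__(self):
        return "path.ROOT"


class path(tuple):
    """POSIX-only sketch of a path stored as a tuple of components."""

    ROOT = _Root()

    def __new__(cls, text=""):
        if isinstance(text, str):
            # Empty components and "." carry no meaning, so drop them.
            parts = [part for part in text.split("/") if part not in ("", ".")]
            if text.startswith("/"):
                parts.insert(0, cls.ROOT)
        else:
            parts = list(text)
        return super().__new__(cls, parts)

    @property
    def isrel(self):
        # A relative path has no root element at the front.
        return not (self and self[0] is self.ROOT)

    def __str__(self):
        if self.isrel:
            return "/".join(self) or "."
        return "/" + "/".join(self[1:])


print(tuple(path("a/b/c")))
print(str(path("/a//b/./c")))
```

Because normalization happens in the constructor, two spellings of the
same path compare equal as tuples without a separate normpath step.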

The advantage of using a logical representation is that you can forget
about the textual representation, which can be really complex. You
don't have to call normpath when you're unsure about how a path looks,
you don't have to search for seps and altseps, and... you don't need
to remember a lot of names of functions or methods. To show that, take
a look at those methods from the original path class and their
equivalent in my path class:

{{{
p.normpath()  -> Isn't needed - done by the constructor
p.basename()  -> p[-1]
p.splitpath() -> (p[:-1], p[-1])
p.splitunc()  -> (p[0], p[1:]) (if isinstance(p[0], path.UNCRoot))
p.splitall()  -> Isn't needed
p.parent      -> p[:-1]
p.name        -> p[-1]
p.drive       -> p[0] (if isinstance(p[0], path.Drive))
p.uncshare    -> p[0] (if isinstance(p[0], path.UNCRoot))

and of course:
p.join(q) [or anything like it] -> p + q
}}}

The only drawback I can see in using a logical representation is that
giving a path object to functions which expect a path string won't
work. The immediate solution is to simply use str(p) instead of p. The
long-term solution is to make all related functions accept a path
object.

Having a logical representation of a path calls for a bit of term
clearing-up. What's an absolute path? On POSIX, it's very simple: a
path starting with a '/'. But what about Windows? Is "\temp\file" an
absolute path? I claim that it isn't really. The reason is that if you
change the current working directory, its meaning changes: It's now
not "c:\temp\file", but "a:\temp\file". The same goes for
"c:temp\file". So I decided on these two definitions:

 * A ''relative path'' is a path without a root element, so it can be
concatenated to other paths.
 * An ''absolute path'' is a path whose meaning doesn't change when
the current working directory changes.

This means that paths starting with a drive letter alone
(!UnrootedDrive instance, in my module) and paths starting with a
backslash alone (the CURROOT object, in my module) are not relative
and not absolute.

I really think that it's a better way to handle paths. If you want an
example, compare the current implementation of relpathto and my
implementation.

== Easier attributes for stat objects ==

The current path object includes:
 * isdir, isfile, islink, and -
 * atime, mtime, ctime, size.
The first line does file mode checking, and the second simply gives
attributes from the stat object.

I suggest that these should be added to the stat_result object. isdir,
isfile and islink are true if a specific bit in st_mode is set, and
atime, mtime, ctime and size are simply other names for st_atime,
st_mtime, st_ctime and st_size.

It means that instead of using the atime, mtime etc. methods, you will
write {{{ p.stat().atime }}}, {{{ p.stat().size }}}, etc.

This is good, because:
 * If you want to make only one system call, it's very easy to save
the stat object and use it.
 * If you have to deal with symbolic links, you can simply use {{{
p.lstat().mtime }}}. Yes, symbolic links have a modification time. The
alternative is to add three methods with ugly names (latime, lmtime,
lctime) or to have an incomplete interface without a good reason.
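
A rough sketch of what the shorter names could look like, written as a
wrapper around the existing os.stat_result rather than a change to it.
The class name StatView is hypothetical; the mode checks use the
standard stat module.

```python
import os
import stat


class StatView(object):
    """Hypothetical wrapper giving an os.stat_result the shorter names."""

    def __init__(self, st):
        self._st = st

    @property
    def isdir(self):
        return stat.S_ISDIR(self._st.st_mode)

    @property
    def isfile(self):
        return stat.S_ISREG(self._st.st_mode)

    @property
    def islink(self):
        return stat.S_ISLNK(self._st.st_mode)

    def __getattr__(self, name):
        # atime, mtime, ctime, size -> st_atime, st_mtime, st_ctime, st_size
        if name in ("atime", "mtime", "ctime", "size"):
            return getattr(self._st, "st_" + name)
        raise AttributeError(name)


view = StatView(os.stat(os.curdir))   # one system call, several answers
print(view.isdir, view.isfile, view.size)
```

Passing os.lstat(...) instead of os.stat(...) gives the symbolic-link
variants without any extra method names.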

I think that isfile, isdir should be kept (along with lisfile,
lisdir), since I think that doing what they do is quite common, and
requires six lines:
{{{
try:
    st = p.stat()
except OSError:
    return False
else:
    return st.isdir
}}}

I think that still, isdir, isfile and islink should be added to
stat_result objects: They turned out pretty useful in writing some of
the more complex path methods.

== One Method for Finding Files ==

(There are actually two, but with exactly the same interface.) The
original path object has these methods for finding files:

{{{
def listdir(self, pattern = None): ...
def dirs(self, pattern = None): ...
def files(self, pattern = None): ...
def walk(self, pattern = None): ...
def walkdirs(self, pattern = None): ...
def walkfiles(self, pattern = None): ...
def glob(self, pattern):
}}}

I suggest one method that replaces all those:
{{{
def glob(self, pattern='*', topdown=True, onlydirs=False, onlyfiles=False): ...
}}}

pattern is the good old glob pattern, with one additional extension:
"**" matches any number of subdirectories, including 0. This means
that '**' means "all the files in a directory", '**/a' means "all the
files in a directory called a", and '**/a*/**/b*' means "all the files
in a directory whose name starts with 'b' and the name of one of their
parent directories starts with 'a'".

onlydirs and onlyfiles filter the results (they can't be combined, of
course). topdown has the same meaning as in os.walk (it isn't
supported by the original path class). So, let's show how these
methods can be replaced:

{{{
p.listdir()   -> p.glob()
p.dirs()      -> p.glob(onlydirs=1)
p.files()     -> p.glob(onlyfiles=1)
p.walk()      -> p.glob('**')
p.walkdirs()  -> p.glob('**', onlydirs=1)
p.walkfiles() -> p.glob('**', onlyfiles=1)
p.glob(patt)  -> p.glob(patt)
}}}
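
The single-method idea can be approximated with the standard glob
module, which in later Python versions (3.5 and up) understands "**"
when recursive=True is passed. The function name find is hypothetical,
topdown is not modeled, and the standard module skips dot-files, so this
is only a sketch of the interface, not of the proposed implementation.

```python
import glob
import os


def find(base, pattern="*", onlydirs=False, onlyfiles=False):
    """Yield matches of pattern under base; '**' spans any number of dirs."""
    if onlydirs and onlyfiles:
        raise ValueError("onlydirs and onlyfiles cannot be combined")
    for name in glob.iglob(os.path.join(base, pattern), recursive=True):
        if onlydirs and not os.path.isdir(name):
            continue
        if onlyfiles and not os.path.isfile(name):
            continue
        yield name
```

Like the proposal, this returns an iterator. One difference to keep in
mind: with the standard module a bare '**' pattern may also report the
base directory itself.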

Now, for the promised additional method. The current implementation of
glob doesn't follow symbolic links. In my implementation, there's
"lglob", which does what the current glob does. However, the (default)
glob does follow symbolic links. To avoid infinite recursion, it keeps
the set of filesystem ids on the current path, and checks each dir to
see if it was already encountered. (It does so only if there's '**' in
the pattern, because otherwise a finite number of results is
guaranteed.) Note that it doesn't keep the ids of all the files
traversed, only those on the path from the base node to the current
node. This means that as long as there're no cycles, everything will
go fine - for example, 'a' and 'b' pointing at the same dir will just
cause the same files to be reported twice, once under 'a' and once
under 'b'. One last note: On Windows there are no file ids, but there
are no symbolic links, so everything is fine.
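
The cycle check described above can be sketched as follows, assuming a
POSIX system where (st_dev, st_ino) identifies a directory. The function
name walk_following_links is hypothetical and no pattern matching is
done; only the "ids on the current path" bookkeeping is shown.

```python
import os


def walk_following_links(top, _on_path=None):
    """Yield every entry under top, following symlinked directories.

    Only the ids of directories on the path from the starting directory
    to the current one are remembered, not every directory ever seen.
    """
    st = os.stat(top)
    on_path = set() if _on_path is None else _on_path
    dir_id = (st.st_dev, st.st_ino)
    if dir_id in on_path:
        return  # already on the current path: descending again would loop
    on_path.add(dir_id)
    try:
        for name in sorted(os.listdir(top)):
            full = os.path.join(top, name)
            yield full
            if os.path.isdir(full):  # isdir() follows symlinks
                for found in walk_following_links(full, on_path):
                    yield found
    finally:
        # Leaving this directory: it is no longer on the current path.
        on_path.discard(dir_id)
```

A link that points back at one of its own ancestors is reported but not
entered, while two links to the same directory are both walked, so its
contents appear once under each name.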

Oh, and it returns an iterator, not a list.

== Separation of Calculations and System Calls ==

I like to know when I'm using system calls and when I don't. It turns
out that using tuples instead of strings makes it possible to define
all operations which do not use system calls as properties or
operators, and all operations which do use system calls as methods.

The only exception currently is .match(). What can I do?

== Reduce the Number of Methods ==

I think that the number of methods should be reduced. The most obvious
example is the copy functions. In the current proposal:

{{{
def copyfile(self, dst): ...
def copymode(self, dst): ...
def copystat(self, dst): ...
def copy(self, dst): ...
def copy2(self, dst): ...
}}}

In my proposal:

{{{
def copy(self, dst, copystat=False): ...
}}}

It's just that I think that copyfile, copymode and copystat aren't
usually useful, and there's no reason not to unite copy and copy2.
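
A sketch of the united copy, written as a plain function on top of
shutil rather than as a path method. Only the signature comes from the
proposal; the body is an assumption about how it might be implemented.

```python
import shutil


def copy(src, dst, copystat=False):
    """Copy src to dst; also copy timestamps and other stat info if asked."""
    if copystat:
        return shutil.copy2(src, dst)   # data, permission bits and stat info
    return shutil.copy(src, dst)        # data and permission bits only
```

With this shape, copy(src, dst, copystat=True) stands in for the
original copy2.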

= Other Changes =

Here is a list of the smaller things I've changed in my proposal.

The current normpath removes '..' with the name before them. I didn't
do that, because it doesn't return an equivalent path if the path
before the '..' is a symbolic link.
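
The difference can be reproduced with the standard os.path functions on
a system that supports symbolic links. The directory names below are
made up for the demonstration.

```python
import os
import tempfile

# base/link is a symbolic link to base/real/sub.
base = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(base, "real", "sub"))
os.symlink(os.path.join("real", "sub"), os.path.join(base, "link"))

dotted = os.path.join(base, "link", "..")
textual = os.path.normpath(dotted)   # drops "link/.." purely textually
actual = os.path.realpath(dotted)    # follows the link before applying ".."

print(textual == base)
print(actual == os.path.join(base, "real"))
```

The textual result names base, while the filesystem resolves the same
string to base/real, so the two are not equivalent paths.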

I removed the methods associated with file extensions. I don't recall
using them, and since they're purely textual and not OS-dependent, I
think that you can always do p[-1].rsplit('.', 1).
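
For comparison, a short sketch of the purely textual split next to
os.path.splitext. The file names are arbitrary examples; note that the
two approaches differ for names without a dot and for names that start
with one.

```python
import os.path

for name in ("archive.tar.gz", "README", ".bashrc"):
    by_rsplit = name.rsplit(".", 1)          # list of one or two pieces
    by_splitext = os.path.splitext(name)     # always a (root, ext) pair
    print(name, by_rsplit, by_splitext)
```

Code that relies on rsplit therefore has to handle the one-element
result itself.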

I removed renames. Why not use makedirs, rename, removedirs?

I removed unlink. It's an alias to remove, as far as I know.

I removed expand. There's no need to use normpath, so it's equivalent
to .expanduser().expandvars(), and I think that the explicit form is
better.

removedirs - I added another argument, basedir, which won't be removed
even if it's empty. I also allowed the first directory to be unempty
(I required that it should be a directory). This version is useful for
me.

readlinkabs - The current path class returns abspath(readlink). This
is meaningless - symbolic links are interpreted relative to the
directory they are in, not relative to the current working directory
of the program. Instead, I wrote readlinkpath, which returns the
correct path object. However, I'm not sure if it's needed - why not
use realpath()?

copytree - I removed it. In shutil it's documented as being mostly a
demonstration, and I'm not sure if it's really useful.

symlink - Instead of a function like copy, with the destination as the
second (actually, the only) argument, I wrote "writelink", which gets
a string and creates a symbolic link with that value. The reason is
that symbolic links can be any string, not necessarily a legal path.

I added mknod and mkfifo, which for some reason weren't there.

I added chdir, which I don't see why shouldn't be defined.

relpathto - I used realpath() instead of abspath(). abspath() may be
incorrect if some of the dirs are symlinks.

I removed relpath. It doesn't seem useful to me, and I think that
writing path.cwd().relpathto(p) is easy enough.

join - I decided that p+q should only work if q is a relative path. In
my first implementation, it returned q, which is consistent with the
current os.path.join(). However, I think that in the spirit of
"explicit is better than implicit", code like
{{{
if q.isrel:
    return p + q
else:
    return q
}}}
is pretty easy and pretty clear. I think that many times you want q to
be relative, so an exception if it isn't would be helpful. I also
think that it's nice that {{{ len(p+q) == len(p) + len(q) }}}.
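
A sketch of the stricter '+', assuming a bare-bones tuple subclass. The
ROOT marker and the choice of ValueError are assumptions; the text above
only says that an exception would be raised.

```python
ROOT = object()   # hypothetical stand-in for the module's root marker


class path(tuple):
    """Tuple-based path sketch; only the '+' behaviour is shown."""

    @property
    def isrel(self):
        return not (self and self[0] is ROOT)

    def __add__(self, other):
        other = path(other)
        if not other.isrel:
            raise ValueError("right-hand operand must be a relative path")
        return path(tuple(self) + tuple(other))


p = path((ROOT, "usr"))
q = path(("lib", "python"))
print(len(p + q) == len(p) + len(q))
```

Since the right-hand side is never silently substituted for the whole
result, the length identity holds for every pair that can be added.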

match - The current implementation matches the base name of the path
against a pattern. My implementation matches a relative path against a
pattern, which is also a relative path (it's of the same form as the
pattern of glob - may include '**')

matchcase - I removed it. If you see a reason for keeping it, tell me.

= Comparison to the Current Path Class =

Here's a comparison of doing things using the current path class and
doing things using my proposed path class.

{{{
# Operations on path strings:
p.cwd()        -> p.cwd()
p.abspath()    -> p.abspath()
p.normcase()   -> p.normcase
Also added p.normcasestr, to normalize path elements.
p.normpath()   -> Unneeded
p.realpath()   -> p.realpath()
p.expanduser() -> p.expanduser()
p.expandvars() -> p.expandvars()
p.basename()   -> p[-1]
p.expand()     -> p.expanduser().expandvars()
p.splitpath()  -> Unneeded
p.stripext()   -> p[-1].rsplit('.', 1)[0]
p.splitunc()   -> Unneeded
p.splitall()   -> Unneeded
p.relpath()    -> path.cwd().relpathto(p)
p.relpathto(dst) -> p.relpathto(dst)

# Properties about the path:
p.parent       -> p[:-1]
p.name         -> p[-1]
p.ext          -> ''.join(p[-1].rsplit('.', 1)[1:])
p.drive        -> p[0] if p and isinstance(p[0], path.Drive) else None
p.namebase     -> p[-1].rsplit('.', 1)[0]
p.uncshare     -> p[0] if p and isinstance(p[0], path.UNCRoot) else None

# Operations that return lists of paths:
p.listdir()    -> p.glob()
p.listdir(patt)-> p.glob(patt)
p.dirs()       -> p.glob(onlydirs=1)
p.dirs(patt)   -> p.glob(patt, onlydirs=1)
p.files()      -> p.glob(onlyfiles=1)
p.files(patt)  -> p.glob(patt, onlyfiles=1)
p.walk()       -> p.glob('**')
p.walk(patt)   -> p.glob('**/patt')
p.walkdirs()   -> p.glob('**', onlydirs=1)
p.walkdirs(patt) -> p.glob('**/patt', onlydirs=1)
p.walkfiles()  -> p.glob('**', onlyfiles=1)
p.walkfiles(patt) -> p.glob('**/patt', onlyfiles=1)
p.match(patt)  -> p[-1:].match(patt)
(The current match matches the base name. Mine matches a relative path.)
p.matchcase(patt) -> Removed
p.glob(patt)   -> p.glob(patt)

# Methods for retrieving information about the filesystem
# path:
p.exists()     -> p.exists()
Added p.lexists()
p.isabs()      -> not p.isrel
(That's the meaning of the current isabs().)
Added p.isabs
p.isdir()      -> p.isdir()
Added p.lisdir()
p.isfile()     -> p.isfile()
Added p.lisfile()
p.islink()     -> p.islink()
p.ismount()    -> p.ismount()
p.samefile(other) -> p.samefile(other)
p.getatime()   -> p.stat().atime
p.getmtime()   -> p.stat().mtime
p.getctime()   -> p.stat().ctime
p.getsize()    -> p.stat().size
p.access(mode) -> p.access(mode)
p.stat()       -> p.stat()
p.lstat()      -> p.lstat()
p.statvfs()    -> p.statvfs()
p.pathconf(name) -> p.pathconf(name)

# Filesystem properties for path.
atime, mtime, ctime, size - Removed

# Methods for manipulating information about the filesystem
# path.
utime, chmod, chown, rename - unchanged
p.renames(new)   -> new[:-1].makedirs(); p.rename(new); p[:-1].removedirs()

# Create/delete operations on directories
mkdir, makedirs, rmdir, removedirs - unchanged (added an option to removedirs)

# Modifying operations on files
touch, remove - unchanged
unlink - removed

# Modifying operations on links
p.link(newpath)   -> p.link(newpath)
p.symlink(newlink) -> newlink.writelink(p)
p.readlink()      -> p.readlink()
p.readlinkabs()   -> p.readlinkpath()

# High-level functions from shutil
copyfile, copymode, copystat, copytree - removed
p.copy(dst)   -> p.copy(dst)
p.copy2(dst)  -> p.copy(dst, copystat=1)
move, rmtree - unchanged.

# Special stuff from os
chroot, startfile - unchanged.

}}}

= Open Issues =

Unicode - I have no idea about unicode paths. My current
implementation simply uses str. This should be changed, I guess.

Slash-terminated paths - In my current implementation, paths ending
with a slash are normalized to paths without a slash (this is also the
behaviour of os.path.normpath). However, they aren't really the same:
stat() on paths ending with a slash fails if they aren't directories,
and lstat() treats them as directories even if they are symlinks.
Perhaps a final empty string should be allowed.

= Finally =

Please say what you think, either here or on the wiki. Not every
change that I suggested must be accepted, but I would be happy if they
were considered.

I hope it proves useful.
Noam

From guido at python.org  Wed May  3 01:46:43 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 16:46:43 -0700
Subject: [Python-Dev] elimination of scope bleeding of iteration variables
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E6B3@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6B3@au3010avexu1.global.avaya.com>
Message-ID: <ca471dc20605021646y1838de9fx86e4765769833f3d@mail.gmail.com>

Don't worry. This isn't going to change. Someone please update PEP 3099.

On 5/2/06, Delaney, Timothy (Tim) <tdelaney at avaya.com> wrote:
> Josiah Carlson wrote:
>
> >     for line in lines:
> >         line = line.rstrip()
> >         ...
>
> Exactly the use case I was thinking of (and one I used yesterday BTW).
>
> I'm -1 on *dis*allowing reusing a name bound in a for loop in any
> construct i.e. +1 for the status quo.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Wed May  3 03:22:50 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 03 May 2006 13:22:50 +1200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
References: <4453B025.3080100@acm.org> <e31b0t$7i4$1@sea.gmane.org>
	<445476CC.3080008@v.loewis.de> <e33mj1$5nd$1@sea.gmane.org>
	<4455A0CF.3080008@v.loewis.de> <e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de> <e36f6e$pns$1@sea.gmane.org>
	<44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
Message-ID: <445805EA.7000501@canterbury.ac.nz>

Guido van Rossum wrote:
> On 5/2/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> >    make_person(=name, =age, =phone, =location)
> 
> And even with Terry's use case quoted I can't make out what you meant
> that to do.

I meant it to do the same thing as

   make_person(name=name, age=age, phone=phone, location=location)

I come across use cases for this fairly frequently, usually
when I have an __init__ method that supplies default values
for a bunch of arguments, and then wants to pass them on to
an inherited __init__ with the same names. It feels very
wanky having to write out all those foo=foo expressions.
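
A small illustration of the pattern being described, using valid
present-day syntax. The class names and default values are invented for
the example; only make_person's argument names come from the message.

```python
class Person(object):
    def __init__(self, name, age, phone, location):
        self.name = name
        self.age = age
        self.phone = phone
        self.location = location


class Employee(Person):
    def __init__(self, name="unknown", age=0, phone=None, location="HQ"):
        # Every argument is forwarded under its own name: the foo=foo pattern.
        Person.__init__(self, name=name, age=age, phone=phone,
                        location=location)


worker = Employee(name="Ann")
print(worker.name, worker.location)
```

The proposed spelling would shorten the forwarding call to
(=name, =age, =phone, =location); that form is not valid Python.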

--
Greg

From guido at python.org  Wed May  3 04:32:48 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 May 2006 19:32:48 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <445805EA.7000501@canterbury.ac.nz>
References: <4453B025.3080100@acm.org> <445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org> <4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
Message-ID: <ca471dc20605021932j1a94a8bcjef3bd9258245f547@mail.gmail.com>

On 5/2/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
> > On 5/2/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> > >    make_person(=name, =age, =phone, =location)
> >
> > And even with Terry's use case quoted I can't make out what you meant
> > that to do.
>
> I meant it to do the same thing as
>
>    make_person(name=name, age=age, phone=phone, location=location)
>
> I come across use cases for this fairly frequently, usually
> when I have an __init__ method that supplies default values
> for a bunch of arguments, and then wants to pass them on to
> an inherited __init__ with the same names. It feels very
> wanky having to write out all those foo=foo expressions.

Sorry, but leading = signs feel even more wanky. (That's a technical
term. ;-) It violates the guideline that Python's punctuation should
preferably mimic English; or other mainstream languages (as with 'x.y'
and '@deco').

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org  Wed May  3 05:06:02 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 2 May 2006 23:06:02 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <ca471dc20605021932j1a94a8bcjef3bd9258245f547@mail.gmail.com>
References: <4453B025.3080100@acm.org> <445805EA.7000501@canterbury.ac.nz>
	<ca471dc20605021932j1a94a8bcjef3bd9258245f547@mail.gmail.com>
Message-ID: <200605022306.03128.fdrake@acm.org>

On Tuesday 02 May 2006 22:32, Guido van Rossum wrote:
 > and '@deco').

Pronounced "at-deck-oh", @deco is an art-deco variant favored in "r"-deprived 
regions.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From murman at gmail.com  Wed May  3 05:49:33 2006
From: murman at gmail.com (Michael Urman)
Date: Tue, 2 May 2006 22:49:33 -0500
Subject: [Python-Dev] more pyref: continue in finally statements
In-Reply-To: <44565CF0.4000207@v.loewis.de>
References: <e35ivt$biu$1@sea.gmane.org> <44565CF0.4000207@v.loewis.de>
Message-ID: <dcbbbb410605022049j6f03d496u667aebe7b0328666@mail.gmail.com>

On 5/1/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> then what should be the meaning of "continue" here? The finally
> block *eventually* needs to re-raise the exception. When should
> that happen?

It should behave similarly to return and swallow the exception. In
your example this would result in an infinite loop. Alternately the
behavior of return should be changed, and the below code would no
longer work as it does today.

>>> def foo():
...   try: raise Exception
...   finally: return 'Done'
...
>>> foo()
'Done'

Michael
--
Michael Urman  http://www.tortall.net/mu/blog

From nnorwitz at gmail.com  Wed May  3 09:15:33 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 3 May 2006 00:15:33 -0700
Subject: [Python-Dev] Seeking students for the Summer of Code
Message-ID: <ee2a432c0605030015x3a974b06ke2ecc3b9e23b0ffa@mail.gmail.com>

There is less than a week left before students must submit a final
application.  There are a bunch of ideas up on the wiki:

    http://wiki.python.org/moin/SummerOfCode/

The wiki has instructions for how to submit a proposal.  There are
many different areas including:  core language features, libraries,
and applications. This is a great opportunity to get real coding
experience.  Not to mention the chance to work with a nice and fun
group of people.

The earlier you submit an application, the more feedback you can get
to improve it and increase your likelihood of getting accepted.

Feel free to contact me if you have any questions.

Cheers,
n

From jcarlson at uci.edu  Wed May  3 10:25:35 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 03 May 2006 01:25:35 -0700
Subject: [Python-Dev] binary trees. Review obmalloc.c
In-Reply-To: <44508866.60503@renet.ru>
References: <20060426094148.66EE.JCARLSON@uci.edu> <44508866.60503@renet.ru>
Message-ID: <20060502222033.678D.JCARLSON@uci.edu>


"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:

> Comparison of functions of sorting and binary trees not absolutely 
> correctly. I think that function sort will lose considerably on
> greater lists. Especially after an insert or removal of all one element.

Generally speaking, people who understand at least some of Python's
internals (list internals specifically), will not be *removing* entries
from lists one at a time (at least not using list.pop(0) ), because that
is slow.  If they need to remove items one at a time from the smallest
to the largest, they will usually use list.reverse(), then repeatedly
list.pop(), as this is quite fast (in general).

However, as I just said, people usually don't remove items from
just-sorted lists, they tend to iterate over them via 'for i in list:' .
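
A short sketch of the two ways of draining a sorted list from smallest
to largest without list.pop(0). The sample data is arbitrary, and the
collections.deque variant is an addition not mentioned above.

```python
from collections import deque

items = sorted([5, 1, 4, 2, 3])

# Reverse once, then pop from the end: each pop() takes constant time.
stack = items[:]
stack.reverse()
smallest_first = [stack.pop() for _ in range(len(stack))]

# A deque removes from the left in constant time without reversing.
queue = deque(items)
also_smallest_first = [queue.popleft() for _ in range(len(queue))]

print(smallest_first == also_smallest_first)
```

By contrast, list.pop(0) has to shift every remaining element on each
call, which is what makes it slow on large lists.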


 - Josiah

As an aside, I have personally implemented trees a few times for
different reasons.  One of the issues I have with most tree
implementations is that it is generally difficult to do things like
"give me the kth smallest/largest item".  Of course the standard
solution is what is generally called a "partial order" or "order
statistic" tree, but then you end up with yet another field in your
structure.  One nice thing about Red-Black trees combined with
order-statistic trees, is that you can usually use the uppermost bit of
the order statistics for red/black information.  Of course, this is
really only interesting from an "implement this in C" perspective;
because if one were implementing it in Python, one may as well be really
lazy and not bother implementing guaranteed balanced trees and be
satisfied with randomized-balancing via Treaps.

Of course all of this discussion about trees in Python is fine, though
there is one major problem.  With minimal work, one can construct 3
values such that ((a < b < c) and (c < a)) is true.
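
One way to obtain such values, shown only as an illustration: a class
whose '<' is deliberately cyclic. This is not necessarily the
construction meant above, and the class name Cyclic is hypothetical.

```python
class Cyclic(object):
    """Hypothetical type whose '<' is deliberately not transitive."""

    _next = {"a": "b", "b": "c", "c": "a"}

    def __init__(self, label):
        self.label = label

    def __lt__(self, other):
        # Each label is "less than" exactly the one that follows it.
        return self._next[self.label] == other.label


a, b, c = Cyclic("a"), Cyclic("b"), Cyclic("c")
print((a < b < c) and (c < a))
```

A search tree keyed on values like these cannot keep its ordering
invariant, whichever balancing scheme is used.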


From ncoghlan at gmail.com  Wed May  3 12:15:54 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 03 May 2006 20:15:54 +1000
Subject: [Python-Dev] signature object issues (to discuss while I am out
 of contact)
In-Reply-To: <bbaeab100605021143w532d102fpc33250ca662d8cd5@mail.gmail.com>
References: <bbaeab100605011042y501482fauf830d9803a3a4d67@mail.gmail.com>	
	<44573CA6.5010608@gmail.com>
	<bbaeab100605021143w532d102fpc33250ca662d8cd5@mail.gmail.com>
Message-ID: <445882DA.1020806@gmail.com>

Brett Cannon wrote:
> On 5/2/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> All of the use cases I can think of (introspection for documentation 
>> purposes
>> or argument checking purposes) don't really suffer either way 
>> regardless of
>> whether the signature retrieval is spelt "obj.__signature__" or
>> "inspect.getsignature(obj)".
>>
> 
> It does for  decorators.  How do you  make sure that a decorator uses 
> the signature object of the wrapped function instead of the decorator?
> Or are you saying to just not worry about that right now?

Ah, good point. In that case, I'd be in favour of including the attribute in 
the PEP, and having inspect.getsignature() check for that attribute before 
manually building a signature object from the function and its code object.

The other nice thing about the attribute is that it allows otherwise 
uninspectable functions (like C functions) to choose to declare their API for 
introspection purposes.
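
For reference, later Python versions (3.3 and up) provide
inspect.signature(), which consults a __signature__ attribute before
inspecting the function itself. The sketch below uses that function; the
inspect.getsignature() name discussed here was a proposal, and the
decorator name logged is made up.

```python
import inspect


def logged(func):
    """Decorator whose wrapper reports the wrapped function's signature."""

    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Publish the original signature on the wrapper via the attribute.
    wrapper.__signature__ = inspect.signature(func)
    return wrapper


@logged
def make_person(name, age=0):
    return (name, age)


print(inspect.signature(make_person))
```

Without the attribute assignment, introspection would report the
wrapper's own (*args, **kwargs) signature.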

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From amk at amk.ca  Wed May  3 18:08:07 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 3 May 2006 12:08:07 -0400
Subject: [Python-Dev] Date for DC-area Python sprint?
Message-ID: <20060503160807.GB30707@rogue.amk.ca>

I found out that May 27th is on Memorial Day weekend, and some people
will doubtless have travel plans.  Let's aim for the next Saturday,
June 3rd.  (No intersection with the Need for Speed sprint; oh well.)

I'll post a detailed announcement and set up a wiki page when things
are nailed down, but it looks like the sprint can be from 9 or 10AM to
4PM, or thereabouts.  Wireless will be available.

Does anyone else want to organize sprints for that day?

--amk

From coder_infidel at hotmail.com  Wed May  3 17:11:17 2006
From: coder_infidel at hotmail.com (Luke Dunstan)
Date: Wed, 3 May 2006 23:11:17 +0800
Subject: [Python-Dev] Python for Windows CE
Message-ID: <BAY109-DAV2BF904600E22663CCC5038AB70@phx.gbl>


Hi,

I would like to explore the possibility of submitting patches to allow 
Python to work on Windows CE out of the box. I have a few questions in 
preparation:

1. Is there any reason in principle why patches for Windows CE support would 
be rejected?

2. Should I submit unified diffs or context diffs?

One page says:
    "Context diffs are preferred, so generate the patch using diff -c."
    http://www.python.org/dev/tools/

Another says:
    "We like unified diffs. We grudgingly accept contextual diffs..."
    http://www.python.org/dev/patches/

3. The page http://www.python.org/dev/patches/style/ says that 
platform-specific code should use preprocessor symbols documented for the 
compiler, but existing Python code uses e.g. the invented MS_WIN32 symbol 
instead of the standard _WIN32. Should we create a symbol MS_WINCE for 
consistency or use the more common _WIN32_WCE?

4. The latest existing patch set uses os.name = "ce", and this can be used 
to distinguish Windows CE from other operating systems in Python code. The 
existing patches also set sys.platform = "Pocket PC", but I am not so keen 
on keeping this, mainly because Windows CE is equivalent to Win32 in many 
cases. Any comments?

5. Windows CE lacks some header files that are present on other platforms, 
e.g. <direct.h>. If we ignore the actual declarations in the header for a 
moment, there are a few ways to solve this problem:

(a) The obvious solution is to wrap each #include <direct.h> inside #ifndef 
MS_WINCE...#endif

(b) Instead we could use #ifdef HAVE_DIRECT_H, requiring patches to #define 
HAVE_DIRECT_H for other platforms

(c) The existing patch set uses a simpler solution by creating a (possibly 
empty) direct.h in a Windows CE-specific subdirectory of the Python source 
tree, and adding that directory to the compiler include path. This means 
that the source files probably won't need to be patched and it may help when 
building C extension modules.

Is there any objection to the method (c) of providing missing headers? What 
is the preferred solution?

In later emails I will of course go into more detail about the patches 
required. Any general tips on how to submit good patches are welcome.

Thanks,
Luke Dunstan

From theller at python.net  Wed May  3 20:53:48 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 03 May 2006 20:53:48 +0200
Subject: [Python-Dev] Shared libs on Linux (Was: test failures in
	test_ctypes (HEAD))
In-Reply-To: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
References: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
Message-ID: <4458FC3C.1040407@python.net>

[Crossposting to both python-dev and ctypes-users, please respond to the list
that seems most appropriate]

Guido van Rossum wrote:
> I see test failures in current HEAD on my Google Red Hat Linux desktop
> that the buildbots don't seem to have:
> 
> ./python -E -tt ../Lib/test/regrtest.py test_ctypes
> test_ctypes
> test test_ctypes failed -- errors occurred; run in verbose mode for details
> 
> More details from running this manually:
> $ ./python ../Lib/test/test_ctypes.py
> .
> . (lots of passing tests; then:)
> .
> test_gl (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> test_glu (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> test_glut (ctypes.test.test_find.Test_OpenGL_libs) ... ERROR
> 
> ======================================================================
> ERROR: test_gl (ctypes.test.test_find.Test_OpenGL_libs)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/guido/projects/python/trunk/Lib/ctypes/test/test_find.py",
> line 42, in setUp
>     self.glut = CDLL(lib_glut)
>   File "/home/guido/projects/python/trunk/Lib/ctypes/__init__.py",
> line 288, in __init__
>     self._handle = _dlopen(self._name, mode)
> OSError: /usr/lib/libglut.so.3: undefined symbol: XGetExtensionVersion

I have now changed the test to ignore the case when the libglut.so library
cannot be loaded because of missing symbols.  I could not reproduce the failure
on any of the systems I have access to.  I've installed fedora 5, but the test
succeeds on this system (I assume redhat desktop is different from fedora, though).

Unfortunately I don't know enough about shared libs on linux to make the test more correct.

I would appreciate explanations or pointers to explanations of how shared library loading
on linux works in detail.  The most important question now is:

- When I load shared libs, sometimes it is required to use the RTLD_GLOBAL flag
  (RTLD_LOCAL is the default), although in most cases this is not needed.  I think
  I do understand what RTLD_GLOBAL does, I just do not know how to determine if it
  is required or not.  The runtime loader seems to have access to this information -
  is there a way for me to find out?
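
For readers unfamiliar with the flag, a minimal sketch of how the mode
is passed in ctypes on a POSIX system. It does not answer the question
of how to tell when RTLD_GLOBAL is required; it only shows the two ways
of opening a library. Passing None (the running process) instead of a
real library name is a choice made to keep the example self-contained.

```python
import ctypes

# Passing None opens the running process itself (POSIX only).
local_handle = ctypes.CDLL(None)                            # default mode
global_handle = ctypes.CDLL(None, mode=ctypes.RTLD_GLOBAL)  # export symbols

print(ctypes.RTLD_LOCAL, ctypes.RTLD_GLOBAL)
```

With RTLD_GLOBAL the symbols of the loaded library become available for
resolving references in libraries loaded afterwards, which is what a
library with undefined symbols of its own may depend on.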


Thanks,

Thomas


From martin at v.loewis.de  Wed May  3 23:03:27 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 03 May 2006 23:03:27 +0200
Subject: [Python-Dev] Python for Windows CE
In-Reply-To: <BAY109-DAV2BF904600E22663CCC5038AB70@phx.gbl>
References: <BAY109-DAV2BF904600E22663CCC5038AB70@phx.gbl>
Message-ID: <44591A9F.1090304@v.loewis.de>

Luke Dunstan wrote:
> 1. Is there any reason in principle why patches for Windows CE support would 
> be rejected?

No, not in principle. Of course,
- the patch shouldn't break anything
- you should state an explicit commitment to support the port for
  some years (or else it might get removed at the slightest problem)

> 2. Should I submit unified diffs or context diffs?

Yes, please :-) Actually, unified diffs are common these days,
in particular as subversion generates them.

> 3. The page http://www.python.org/dev/patches/style/ says that 
> platform-specific code should use preprocessor symbols documented for the 
> compiler, but existing Python code uses e.g. the invented MS_WIN32 symbol 
> instead of the standard _WIN32. Should we create a symbol MS_WINCE for 
> consistency or use the more common _WIN32_WCE?

Depends on who defines it. If the compiler defines it on its own, it is
best to use that. If some header file that gets unconditionally included
defines it, that is just as well.

If you explicitly define something, try to follow the conventions you
like best.

> 4. The latest existing patch set uses os.name = "ce", and this can be used 
> to distinguish Windows CE from other operating systems in Python code. The 
> existing patches also set sys.platform = "Pocket PC", but I am not so keen 
> on keeping this, mainly because Windows CE is equivalent to Win32 in many 
> cases. Any comments?

I would guide the decision based on the number of API changes you need
to make to the standard library (e.g. distutils). In any case, the
platform module should be able to reliably report the specific system
(including the specific processor architecture).
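
A small sketch of the kind of report meant here, using only calls the platform
module already has.  The function name describe_platform is hypothetical, and
what these calls would return on Windows CE is not something the sketch can show:

```python
import platform


def describe_platform():
    """Collect the system name, release and processor architecture."""
    return {
        "system": platform.system(),    # e.g. "Linux" or "Windows"
        "release": platform.release(),
        "machine": platform.machine(),  # processor architecture
    }
```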

> (b) Instead we could use #ifdef HAVE_DIRECT_H, requiring patches to #define 
> HAVE_DIRECT_H for other platforms

That would be best. Python generally uses autoconf methodology, which
implies that conditionally-existing headers should be detected using
HAVE_*_H.

Regards,
Martin


From bjourne at gmail.com  Thu May  4 01:21:13 2006
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Thu, 4 May 2006 01:21:13 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <445805EA.7000501@canterbury.ac.nz>
References: <4453B025.3080100@acm.org> <445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org> <4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
Message-ID: <740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>

> > >    make_person(=name, =age, =phone, =location)
> >
> > And even with Terry's use case quoted I can't make out what you meant
> > that to do.
>
> I meant it to do the same thing as
>
>    make_person(name=name, age=age, phone=phone, location=location)
>
> I come across use cases for this fairly frequently, usually
> when I have an __init__ method that supplies default values
> for a bunch of arguments, and then wants to pass them on to

Me too! I would have thought that the one obvious way to get rid of
the wanky feeling would have been to write:

def make_person(name, age, phone, location): ...

make_person(name, age, phone, location)

IMHO, keyword arguments are often overused like that. Many times they
don't improve readability any more than naming your variables sensibly
does. No, I have not studied how my API:s are used (and how they evolve)
over a longer period of time.

--
mvh Björn

From talin at acm.org  Thu May  4 03:25:15 2006
From: talin at acm.org (Talin)
Date: Thu, 4 May 2006 01:25:15 +0000 (UTC)
Subject: [Python-Dev] mail to talin is bouncing
References: <ca471dc20605021449p74a5300ew7e61258efb45f495@mail.gmail.com>
Message-ID: <loom.20060504T032223-950@post.gmane.org>

Guido van Rossum <guido <at> python.org> writes:

> Sorry to bother the list -- talin, mail to you is bouncing:

Someone sent me mail? Cool! :)

Sorry about that, I'm in the process of migrating hosting providers, and I 
forgot to add an email account for myself :) It should be better now, I'll do 
some more tests tonight.

(This is all part of my mad scheme to get my TurboGears/AJAX-based online 
collaborative Thesaurus project available to the outside world.)

<End of Thread>

-- Talin



From xah at xahlee.org  Thu May  4 04:09:06 2006
From: xah at xahlee.org (xahlee)
Date: Wed, 3 May 2006 19:09:06 -0700
Subject: [Python-Dev] lambda in Python
Message-ID: <C4BE8293-5F0D-44EE-AE1F-6655B4A5474C@xahlee.org>

Today i ran into one of Guido van Rossum's blog article titled  
"Language Design Is Not Just Solving Puzzles" at
http://www.artima.com/weblogs/viewpost.jsp?thread=147358

The article reads very kooky. The bottom line is that Guido simply  
does not like the solution proposed for fixing the lambda construct  
in Python, and for whatever reasons thinks that no solution would  
satisfy him about this. But instead, he went thru sophistry on the  
ignorance and psychology of coder mass in the industry, with mentions  
of the mysterious Zen, the cool Google, the Right Brain, Rube  
Goldberg contraption irrelevancies.

 From his article, i noticed that there's largish thread of  
discussions on lambda.
The following is an essay i wrote after reading another one of Guido's
blog posts, which shows prejudice and ignorance about functional
programing. I hope it can reduce the ignorance about lambda and  
functional programing.

--------------------------
Lambda in Python 3000

Xah Lee, 20050930

On Guido van Rossum's website:
http://www.artima.com/weblogs/viewpost.jsp?thread=98196 (local copy)
dated 20050310, he muses with the idea that he would like to remove  
lambda, reduce(), filter() and map() constructs in a future version  
Python 3000.

Guido wrote:

     "filter(P, S) is almost always written clearer as [x for x in S
if P(x)], and this has the huge advantage that the most common usages
involve predicates that are comparisons, e.g. x==42, and defining a
lambda for that just requires much more effort for the reader (plus
the lambda is slower than the list comprehension)"

The form "[x for x in S if P(x)]" is certainly not more clear
than "filter(P, S)". The latter is clearly a function, what is the
former? A function every programer in any language can understand and
appreciate its form and function. Why would anyone expect everyone
to appreciate a Python syntactical idiosyncrasy "[x for ...]"?

Also, the argument that the form "filter(F,S)" being cumbersome
because the first argument is a function and that mostly likely it
would be a function that returns true/false thus most people will
probably use the inline "lambda" construct and that is quite
cumbersome than if the whole thing is written with the syntactical
idiosyncrasy "[x for ...]", is rather inane, as you can now see.

The filter(decision_function,list) form is clean, concise, and helps  
thinking. Why it helps thinking? Because it condenses the whole  
operation into its mathematical essence with the most clarity. That  
is, it filters, of a list, and by a yes/no decision function. Nothing  
is more, and nothing can be less. It is unfortunate that we have the  
jargon Lambda and Predicate developed by the tech geekers of the  
functional programing community. The lambda could be renamed Pure  
Function and the Predicate could be called True/False function, but  
the world of things being the way they are already, it is unwise to  
rewrite every existing Perl program just because somebody invented  
another language.

If the predicate P in filter(P,S) is cumbersome, so would exactly the
same thing appear in the syntactical idiosyncrasy: "[x for x in S if
P(x)]".
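
To make the two spellings concrete, here is a small sketch with a named
predicate and with the inline comparison from Guido's x==42 example.  The
names is_even and numbers are made up for illustration; list() is only needed
on Python versions where filter() returns an iterator:

```python
def is_even(x):
    """A yes/no decision function (a predicate)."""
    return x % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

# Named predicate: both forms call the same function.
via_filter = list(filter(is_even, numbers))
via_comprehension = [x for x in numbers if is_even(x)]

# Inline comparison: filter() needs a lambda, the comprehension does not.
via_lambda = list(filter(lambda x: x == 42, [7, 42, 13]))
via_inline = [x for x in [7, 42, 13] if x == 42]
```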

Guido added this sting as an afterthought:

     "(plus the lambda is slower than the list comprehension)"

Which is faster is really the whim and capacity of Python compiler  
implementators. And, weren't we using clarity as the judgement a  
moment ago? The concept of a function every programer understands,  
but what the heck is a List Comprehension? Why don't you scrap list  
comprehension in Python 3000 and create a table() function that's  
simpler in syntax and more powerful in semantics? ( See
http://xahlee.org/perl-python/list_comprehension.html )

     "Why drop lambda? Most Python users are unfamiliar with Lisp or
Scheme, so the name is confusing; also, there is a widespread
misunderstanding that lambda can do things that a nested function
can't -- I still recall Laura Creighton's Aha!-erlebnis after I
showed her there was no difference! Even with a better name, I think
having the two choices side-by-side just requires programmers to
think about making a choice that's irrelevant for their program; not
having the choice streamlines the thought process. Also, once map(),
filter() and reduce() are gone, there aren't a whole lot of places
where you really need to write very short local functions; Tkinter
callbacks come to mind, but I find that more often than not the
callbacks should be methods of some state-carrying object anyway (the
exception being toy programs)."

In the outset Guido here assumes a moronitude about the set of Python  
users and what they are familiar of. Python users 10 years ago are  
not the same Python users today, and will certainly not be the same  
10 years later if you chop off lambda. Things change, math literacy  
advances, and what users you have changes with what you are. A  
function is the gist of a mathematical idea embodied in computer  
languages, not something from LISP or Scheme as tech geekers wont to  
think.

     "... there is a widespread misunderstanding that lambda can do
things that a nested function can't..."

One is so insulted by a industrial big shot in quoting something so  
disparate then shot it down as if showing his perspicacity.

A lambda is a syntax for function or a name for the concept of  
function. What does it mean that a lambda isn't as powerful as nested  
function??

The lambda in Python is really ill. It is designed with a built-in  
limitation in the first place, and regarded as some foreign substance  
in the Imperative Crowd such as the Pythoners. If there's any problem  
with lambda, it is with lambda in Python and Pythoner's attitude.

     "Also, once map(), filter() and reduce() are gone, there aren't
a whole lot of places where you really need to write very short local
functions;"

Of course, one begins to write things like Java: in three thousand  
words just to show you are a moron.

The removing of elements in a language is in general not a good idea.  
Removing powerful features so that coding monkeys can use it is  
moronic. (e.g. Java) Removing "redundant" constructs is not always
smart (e.g. Scheme), because it pinches on practicality. Removing  
existing language features by a visionary upgrade is a waste. It  
forces unnecessary shakeup and can cause death.

     "So now reduce(). This is actually the one I've always hated
most, because, apart from a few examples involving + or *, almost
every time I see a reduce() call with a non-trivial function
argument, I need to grab pen and paper to diagram what's actually
being fed into that function before I understand what the reduce() is
supposed to do. So in my mind, the applicability of reduce() is
pretty much limited to associative operators, and in all other cases
it's better to write out the accumulation loop explicitly."

The existence of reduce() in Python is probably caused by tech  
geeking clowns of the computing industry. Basically, nobody really  
have a clear understanding of mathematics or computing semantics, but  
every elite tech geeker knew one bag of constructs of various  
languages. So, you add this, i want that, and the language becomes a  
incoherent soup of constructs, with the backlash of wanting to chop  
off things again, with that good things.

reduce() in fact embodies a form of iteration/recursion on lists,  
very suitable in a functional language environment. If Python's  
lambda and other functional facilities are more powerful or complete,  
reduce() would be a good addition. For instance, in functional  
programing, it is a paradigm to nest or sequence functions. (most  
readers will be familiar in the form of unix shell's "pipe"). When
you sequence functions, you can't stop in the middle and do a loop  
construct. So, reduce() and other functional forms of iteration are  
convenient and necessary.
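
As a rough sketch of both positions: a reduce() call next to the explicit
accumulation loop Guido prefers, and reduce() used to sequence functions in
the pipe style described above.  The names total_with_reduce, total_with_loop
and compose are hypothetical; in current Python reduce() is imported from
functools rather than being a builtin:

```python
from functools import reduce
import operator


def total_with_reduce(values):
    """Sum by folding the + operator over the list."""
    return reduce(operator.add, values, 0)


def total_with_loop(values):
    """The same accumulation written out explicitly."""
    total = 0
    for value in values:
        total += value
    return total


def compose(*funcs):
    """Chain functions left to right, like a shell pipe."""
    return reduce(lambda f, g: lambda x: g(f(x)), funcs, lambda x: x)


clean = compose(str.strip, str.upper)
```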

Suggestions: lambda, reduce(), filter() and map() all should stay.  
I'm not sure exactly what's the ins and outs of Python 3000. If one  
wants to shake things up based on a vision: don't. There are already  
gazillion languages and visions; the world really don't need another  
bigshot's say for their personal advancement. As for improvement,  
lambda in Python should be expanded to remove its built-in limitation  
(and Imperative Programing Crowd such as Pythoners should cease and  
desist with their lambda attitude problem). The function map() could  
also be considered for expansion. (see "What is Expressiveness in a
Computer Language" at
http://xahlee.org/perl-python/what_is_expresiveness.html )
Function reduce() should stay because
it's already there, even if it is not very useful and odd in Python.  
filter() should stay as it is as it is superb and proper.

...

the rest of the article explaining the functions of the functions in  
question is at:
http://xahlee.org/perl-python/python_3000.html

    Xah
    xah at xahlee.org
http://xahlee.org/







From anthony at python.org  Thu May  4 05:20:27 2006
From: anthony at python.org (Anthony Baxter)
Date: Thu, 4 May 2006 13:20:27 +1000
Subject: [Python-Dev] lambda in Python
In-Reply-To: <C4BE8293-5F0D-44EE-AE1F-6655B4A5474C@xahlee.org>
References: <C4BE8293-5F0D-44EE-AE1F-6655B4A5474C@xahlee.org>
Message-ID: <200605041320.31979.anthony@python.org>

This is utterly irrelevant for python-dev. Please take it elsewhere.

Thanks,
Anthony

From greg.ewing at canterbury.ac.nz  Thu May  4 06:20:56 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 04 May 2006 16:20:56 +1200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
References: <4453B025.3080100@acm.org> <445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org> <4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
Message-ID: <44598128.8060506@canterbury.ac.nz>

BJörn Lindqvist wrote:
> would have thought that the one obvious way to get rid of
> the wanky feeling would have been to write:
> 
> def make_person(name, age, phone, location): ...
> 
> make_person(name, age, phone, location)

This doesn't fly in something like PyGUI, where there
are literally dozens of potential arguments to many
of the constructors. The only sane way to deal with
that is for them to be keyword-only, at least
conceptually if not in actual implementation.
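
A minimal sketch of such a constructor, written with the bare `*` syntax that
PEP 3102 proposes.  The function make_widget and its arguments are made up
for illustration and are not PyGUI's actual API:

```python
def make_widget(*, title="", width=100, height=50, visible=True):
    """Every argument after the bare * must be passed by keyword."""
    return {
        "title": title,
        "width": width,
        "height": height,
        "visible": visible,
    }


# Callers name only the options they care about.
button = make_widget(title="OK", width=20)
```

Calling make_widget("OK") raises a TypeError, so the keyword-only nature is
enforced rather than merely conventional.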

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiam!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From me+python-dev at modelnine.org  Thu May  4 08:29:52 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 4 May 2006 08:29:52 +0200
Subject: [Python-Dev] Python long command line options
Message-ID: <200605040829.53140.me+python-dev@modelnine.org>

Hi all!

Martin von Löwis said that it'd be necessary to discuss a patch to Python's 
ability to parse command-line parameters to the interpreter here, and I 
thought I might start the ball rolling.

The associated patch of mine is:

http://sourceforge.net/tracker/index.php?func=detail&aid=1481112&group_id=5470&atid=305470

which refers to:

http://groups.google.de/group/comp.lang.python/browse_thread/thread/e3db1619040d7c6c/3c119e09122c83ed?hl=de#3c119e09122c83ed

(sorry for the long URLs, tinyurl isn't working for me at the moment...)

The patch itself is an extension of the Python/getopt.c module to handle long 
command-line parameters (including option data passed as --<name>=<data>), 
which are specified in the optstring as:

"<shortopt>[(longopt)][:]"

for example:

"V(version)"

or

"c(command):"

The patch doesn't change behaviour on old optstrings, except where an old 
optstring relied on the fact that one of :, (, ) are valid parameter names 
(because of the unconditional strchr to find the option name). I've not found 
any reference to _PyOS_GetOpt besides the one in Modules/main.c, so I don't 
think this change in behaviour breaks any existing code. The patch relies 
only on usability of strchr (which the old getopt.c did too), removes a usage 
of strcmp which was completely unnecessary, fixes some checks which were 
unnecessarily broad (>= replaced by ==), and compilation doesn't show any 
warnings on gcc 3.4.6; as for Windows (and MSVC), I don't have the means to 
test there.
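
To illustrate the optstring format described above (the patch itself is C),
here is a sketch in Python of how such a string could be split into options.
The function parse_optstring and its regular expression are assumptions made
for illustration and are not taken from the patch:

```python
import re

# One short option character, an optional "(longname)", an optional ":".
_OPTION = re.compile(r"([^:()])(?:\(([^)]*)\))?(:?)")


def parse_optstring(optstring):
    """Return a list of (short, long_or_None, takes_argument) tuples."""
    return [
        (match.group(1), match.group(2), match.group(3) == ":")
        for match in _OPTION.finditer(optstring)
    ]
```

With this sketch an old-style string such as "c:d" still parses as before,
while "V(version)" additionally yields the long name.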

The current patch offers prefix matching: when only a prefix of a long option 
is given, and that prefix doesn't match any other long option, the patch 
assumes that the user meant this option. For example:

--ver

would also be interpreted as command-line parameter

--version

just as would

--v
--ve
--vers
--versi
--versio

if there are no other long options that start with the corresponding prefix. 
This "trick" is easy enough to remove from the sources, though, so that only 
correctly spelled options are actually returned.
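
The matching rule can be sketched in a few lines of Python.  The function name
match_long_option is hypothetical, and preferring an exact match over a longer
option that shares the prefix is an assumption, not something the patch is
known to do:

```python
def match_long_option(given, long_options):
    """Return the option that `given` unambiguously abbreviates, else None."""
    if given in long_options:
        return given  # assumed: an exact match always wins
    candidates = [opt for opt in long_options if opt.startswith(given)]
    if len(candidates) == 1:
        return candidates[0]
    return None  # unknown or ambiguous prefix
```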

Anyway, back on topic, I personally agree with the people who posted to 
comp.lang.python that --version (and possibly --help, possibly other long 
parameters too) would be useful additions to Python's command-line parameters, 
as it's increasingly getting more common amongst GNU and BSD utilities to 
support these command-line options to get information about the utility at 
hand (very popular amongst system administrators) and to use long commandline 
options to be able to easily see what an option does when encountered in a 
shell script of some sort.

And, as this doesn't break any old code (-V isn't going away by the patch), I 
personally find this to be a very useful change.

Your thoughts?

--- Heiko.

From talin at acm.org  Thu May  4 08:47:20 2006
From: talin at acm.org (Talin)
Date: Thu, 4 May 2006 06:47:20 +0000 (UTC)
Subject: [Python-Dev] lambda in Python
References: <C4BE8293-5F0D-44EE-AE1F-6655B4A5474C@xahlee.org>
Message-ID: <loom.20060504T084448-523@post.gmane.org>

xahlee <xah <at> xahlee.org> writes:

> Today i ran into one of Guido van Rossum's blog article titled  
> "Language Design Is Not Just Solving Puzzles" at
> http://www.artima.com/weblogs/viewpost.jsp?thread=147358

The confrontational tone of this post makes it pretty much impossible
to have a reasonable debate on the subject. I'd suggest that if you
really want to be heard (instead of merely having that "I'm right"
feeling) that you try a different approach.

-- Talin



From sluggoster at gmail.com  Thu May  4 10:13:18 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Thu, 4 May 2006 01:13:18 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
Message-ID: <6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>

On 5/2/06, Noam Raphael <noamraph at gmail.com> wrote:
> Here are my ideas. It's a copy of what I posted a few minutes ago in
> the wiki - you can view it at
> http://wiki.python.org/moin/AlternativePathClass (it looks better
> there).
>
> You can find the implementation at
> http://wiki.python.org/moin/AlternativePathModule?action=raw
> (By the way, is there some "code wiki" available? It can simply be a
> public svn repository. I think it will be useful for those things.)


Intriguing idea, Noam, and excellent thinking.  I'd say it's worth a
separate PEP.  It's too different to fit into PEP 355, and too big to
be summarized in the "Open Issues" section.  Of course, one PEP will
be rejected if the other is approved.

The main difficulty with this approach is it's so radical.  It would
require a serious champion to convince people it's as good as our
tried-and-true strings.


> == a tuple instead of a string ==
>
> The biggest conceptual change is that my path object is a subclass of
> ''tuple'', not a subclass of str. For example,
> {{{
> >>> tuple(path('a/b/c'))
> ('a', 'b', 'c')
> >>> tuple(path('/a/b/c'))
> (path.ROOT, 'a', 'b', 'c')
> }}}


How about  an .isabsolute attribute instead of prepending path.ROOT? 
I can see arguments both ways.  An attribute is easy to query and easy
for str() to use, but it wouldn't show up in a tuple-style repr().
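
A rough sketch of the attribute variant, to show both sides of that trade-off.
The class TuplePath is hypothetical, handles only POSIX-style strings, and is
not Noam's implementation:

```python
class TuplePath(tuple):
    """Hypothetical: components in the tuple, absoluteness as an attribute."""

    def __new__(cls, text):
        parts = [part for part in text.split("/") if part not in ("", ".")]
        self = super().__new__(cls, parts)
        self.isabsolute = text.startswith("/")
        return self

    def __str__(self):
        return ("/" if self.isabsolute else "") + "/".join(self)


p = TuplePath("/a/b/c")
```

Here str(p) gives back "/a/b/c", but repr(p) is just ('a', 'b', 'c'), and
TuplePath("a/b") compares equal to TuplePath("/a/b") because the flag is not
part of the tuple.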


> This means that path objects aren't the string representation of a
> path; they are a ''logical'' representation of a path. Remember why a
> filesystem path is called a path - because it's a way to get from one
> place on the filesystem to another. Paths can be relative, which means
> that they don't define from where to start the walk, and can be not
> relative, which means that they do. In the tuple representation,
> relative paths are simply tuples of strings, and not relative paths
> are tuples of strings with a first "root" element.
>
> The advantage of using a logical representation is that you can forget
> about the textual representation, which can be really complex. You
> don't have to call normpath when you're unsure about how a path looks,
> you don't have to search for seps and altseps, and... you don't need
> to remember a lot of names of functions or methods. To show that, take
> a look at those methods from the original path class and their
> equivalent in my path class:
>
> {{{
> p.normpath()  -> Isn't needed - done by the constructor
> p.basename()  -> p[-1]
> p.splitpath() -> (p[:-1], p[-1])
> p.splitunc()  -> (p[0], p[1:]) (if isinstance(p[0], path.UNCRoot))
> p.splitall()  -> Isn't needed
> p.parent      -> p[:-1]
> p.name        -> p[-1]
> p.drive       -> p[0] (if isinstance(p[0], path.Drive))
> p.uncshare    -> p[0] (if isinstance(p[0], path.UNCRoot))
>
> and of course:
> p.join(q) [or anything like it] -> p + q
> }}}


All that slicing is cool.
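
For a feel of how the table above reads in practice, here is the same idea
with a plain tuple standing in for the proposed path object (an assumption
made to keep the sketch self-contained; the real class is in the wiki module):

```python
p = ("usr", "lib", "python2.4", "os.py")

basename = p[-1]                 # p.basename() / p.name
parent = p[:-1]                  # p.parent
head, tail = p[:-1], p[-1]       # p.splitpath()
sibling = p[:-1] + ("site.py",)  # p.parent.join("site.py")
```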


> The only drawback I can see in using a logical representation is that
> giving a path object to functions which expect a path string won't
> work. The immediate solution is to simply use str(p) instead of p. The
> long-term solution is to make all related functions accept a path
> object.


That's a big drawback.  PEP 355 can choose between string and
non-string, but this way is limited to non-string.  That raises the
minor issue of changing the open() functions etc in the standard
library, and the major issue of changing them in third-party
libraries.


> Having a logical representation of a path calls for a bit of term
> clearing-up. What's an absolute path? On POSIX, it's very simple: a
> path starting with a '/'. But what about Windows? Is "\temp\file" an
> absolute path? I claim that it isn't really. The reason is that if you
> change the current working directory, its meaning changes: It's now
> not "c:\temp\file", but "a:\temp\file". The same goes for
> "c:temp\file". So I decided on these two definitions:
>
>  * A ''relative path'' is a path without a root element, so it can be
> concatenated to other paths.
>  * An ''absolute path'' is a path whose meaning doesn't change when
> the current working directory changes.
>
> This means that paths starting with a drive letter alone
> (!UnrootedDrive instance, in my module) and paths starting with a
> backslash alone (the CURROOT object, in my module) are not relative
> and not absolute.


I guess that's plausible.  We'll need feedback from Windows users.
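
As a sketch of the three-way split Noam describes, using ntpath.splitdrive so
it runs on any platform.  The function classify and its labels are made up for
illustration and do not come from his module:

```python
import ntpath


def classify(path):
    """Label a Windows path as 'absolute', 'relative' or 'neither'."""
    drive, rest = ntpath.splitdrive(path)
    rooted = rest.startswith(("\\", "/"))
    if drive and rooted:
        return "absolute"   # meaning is independent of the cwd
    if not drive and not rooted:
        return "relative"   # no root element, can be concatenated
    return "neither"        # drive only, or leading backslash only
```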


> I really think that it's a better way to handle paths. If you want an
> example, compare the current implementation of relpathto and my
> implementation.


In my enhanced Path class (I posted the docstring last summer), I made
a .relpathfrom() function because .relpathto() was so confusing.


> == Easier attributes for stat objects ==
>
> The current path objects includes:
>  * isdir, isfile, islink, and -
>  * atime, mtime, ctime, size.
> The first line does file mode checking, and the second simply gives
> attributes from the stat object.
>
> I suggest that these should be added to the stat_result object. isdir,
> isfile and islink are true if a specific bit in st_mode is set, and
> atime, mtime, ctime and size are simply other names for st_atime,
> st_mtime, st_ctime and st_size.
>
> It means that instead of using the atime, mtime etc. methods, you will
> write {{{ p.stat().atime }}}, {{{ p.stat().size }}}, etc.
>
> This is good, because:
>  * If you want to make only one system call, it's very easy to save
> the stat object and use it.
>  * If you have to deal with symbolic links, you can simply use {{{
> p.lstat().mtime }}}. Yes, symbolic links have a modification time. The
> alternative is to add three methods with ugly names (latime, lmtime,
> lctime) or to have an incomplete interface without a good reason.
>
> I think that isfile, isdir should be kept (along with lisfile,
> lisdir), since I think that doing what they do is quite common, and
> requires six lines:
> {{{
> try:
>     st = p.stat()
> except OSError:
>     return False
> else:
>     return st.isdir
> }}}
>
> I think that still, isdir, isfile and islink should be added to
> stat_result objects: They turned out pretty useful in writing some of
> the more complex path methods.


Not sure about this.  I see the point in not duplicating .foo() vs
.stat().foo.  .foo() exists in os.path to avoid the ugliness of
os.stat() in the middle of an expression.  I think the current
recommendation is to just do stats all the time because the overhead
is minimal and it's not worth getting out of sync.

The question is, does forcing people to use .stat() expose an
implementation detail that should be hidden, and does it smell of
Unixism?  Most people think a file *is* a regular file or a directory.
 The fact that this is encoded in the file's permission bits -- which
stat() examines -- is a quirk of Unix.
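
For comparison, this is roughly what "save the stat object and use it" looks
like with today's os and stat modules.  The function describe is a
hypothetical name, and the dictionary layout is only for illustration:

```python
import os
import stat


def describe(path):
    """Answer several questions about a path with one system call."""
    st = os.stat(path)  # use os.lstat() to look at a symlink itself
    return {
        "isdir": stat.S_ISDIR(st.st_mode),
        "isfile": stat.S_ISREG(st.st_mode),
        "size": st.st_size,
        "mtime": st.st_mtime,
    }
```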


> == One Method for Finding Files ==
>
> (They're actually two, but with exactly the same interface). The
> original path object has these methods for finding files:
>
> {{{
> def listdir(self, pattern = None): ...
> def dirs(self, pattern = None): ...
> def files(self, pattern = None): ...
> def walk(self, pattern = None): ...
> def walkdirs(self, pattern = None): ...
> def walkfiles(self, pattern = None): ...
> def glob(self, pattern):
> }}}
>
> I suggest one method that replaces all those:
> {{{
> def glob(self, pattern='*', topdown=True, onlydirs=False, onlyfiles=False): ...
> }}}
>
> pattern is the good old glob pattern, with one additional extension:
> "**" matches any number of subdirectories, including 0. This means
> that '**' means "all the files in a directory", '**/a' means "all the
> files in a directory called a", and '**/a*/**/b*' means "all the files
> in a directory whose name starts with 'b' and the name of one of their
> parent directories starts with 'a'".


I like the separate methods, but OK.  I hope it doesn't *really* call
glob if the pattern is the default.

How about 'dirs' and 'files' instead of 'onlydirs' and 'onlyfiles'? 
Or one could, gasp, pass a constant or the 'find' command's
abbreviation ("d" directory, "f" file, "s" socket, "b" block
special...).

In my enhanced Path class, I used a boolean 'symlinks' arg meaning
"follow symlinks" (default True).  This allows the user to process
symbolic links separately if he wishes, or to ignore them if he
doesn't care.  Separate .symlinks() and .walklinks() methods handle
the case that the user does want to treat them specially.  This seems
to be more intuitive and flexible than just your .symlinks() method
below.

Not sure I like "**/" over a 'recursive' argument because the syntax
is so novel and nonstandard.
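
For what it's worth, the two ideas need not exclude each other: the glob
module in later Python versions (3.5 and up) accepts "**" but only honours it
when recursive=True is passed.  A small sketch, with find_recursive as a
hypothetical helper name:

```python
import glob
import os


def find_recursive(root, name_pattern):
    """Match name_pattern in root and in any number of subdirectories."""
    pattern = os.path.join(root, "**", name_pattern)
    return sorted(glob.glob(pattern, recursive=True))
```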

You mean "ancestor" in the last paragraph, not "parent".  Parent is
often read as immediate parent, and ancestor as any number of levels
up.


> == Reduce the Number of Methods ==
>
> I think that the number of methods should be reduced. The most obvious
> example are the copy functions. In the current proposal:
>
> {{{
> def copyfile(self, dst): ...
> def copymode(self, dst): ...
> def copystat(self, dst): ...
> def copy(self, dst): ...
> def copy2(self, dst): ...
> }}}
>
> In my proposal:
>
> {{{
> def copy(self, dst, copystat=False): ...
> }}}
>
> It's just that I think that copyfile, copymode and copystat aren't
> usually useful, and there's no reason not to unite copy and copy2.


Sounds good.
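
In terms of the existing shutil functions, the united method could be
sketched as below.  This is a plain function rather than a path method, and
mapping copystat=True onto shutil.copy2 is an assumption about what Noam
intends:

```python
import shutil


def copy(src, dst, copystat=False):
    """Copy a file; with copystat=True also try to preserve metadata."""
    if copystat:
        return shutil.copy2(src, dst)  # data, mode and timestamps
    return shutil.copy(src, dst)       # data and permission mode only
```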


> = Other Changes =
>
> Here is a list of the smaller things I've changed in my proposal.
>
> The current normpath removes '..' with the name before them. I didn't
> do that, because it doesn't return an equivalent path if the path
> before the '..' is a symbolic link.
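
For reference, this is what the current string-based normalization does to
such paths (posixpath is used so the results are the same on every platform).
The '..' case is the one that stops being equivalent when 'a' is a symbolic
link:

```python
import posixpath

collapsed_parent = posixpath.normpath("a/../b")  # drops 'a' and '..'
collapsed_dot = posixpath.normpath("a/./b")      # drops the '.'
collapsed_slash = posixpath.normpath("a//b")     # merges the separators
```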


I was wondering what the fallout would be of normalizing "a/../b" and
"a/./b" and "a//b", but it sounds like you're thinking about it.


> I removed the methods associated with file extensions. I don't recall
> using them, and since they're purely textual and not OS-dependent, I
> think that you can always do p[-1].rsplit('.', 1).

No, .ext and .namebase are important!  I use them all the time; e.g.,
to force the extension to lowercase, to insert/update a suffix to the
basename and re-add the extension, etc.  I can see cutting the other
properties but not these.

.namebase is an obnoxious name though.  I wish we could come up with
something better.
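
The two uses mentioned above, sketched with os.path.splitext rather than path
properties.  The helper names lower_ext and add_suffix are made up for
illustration:

```python
import os.path


def lower_ext(filename):
    """Force the extension to lowercase, leaving the base name alone."""
    base, ext = os.path.splitext(filename)
    return base + ext.lower()


def add_suffix(filename, suffix):
    """Insert a suffix between the base name and the extension."""
    base, ext = os.path.splitext(filename)
    return base + suffix + ext
```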


> I removed unlink. It's an alias to remove, as far as I know.


It was added because .unlink() is cryptic to non-Unix users.  Although
I'd argue that .delete() is more universally understood than either.


> I removed expand. There's no need to use normpath, so it's equivalent
> to .expanduser().expandvars(), and I think that the explicit form is
> better.


Expand is useful though, so you don't forget one or the other.
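
Written with os.path functions, the combined call might look like this.
Including normpath follows Noam's remark that the original method used it,
which is an assumption about the original implementation:

```python
import os


def expand(path):
    """Expand environment variables and '~', then normalize the result."""
    return os.path.normpath(os.path.expanduser(os.path.expandvars(path)))
```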


> copytree - I removed it. In shutil it's documented as being mostly a
> demonstration, and I'm not sure if it's really useful.


Er, not sure I've used it, but it seems useful.  Why force people to
reinvent the wheel with their own recursive loops that they may get
wrong?


> symlink - Instead of a function like copy, with the destination as the
> second (actually, the only) argument, I wrote "writelink", which gets
> a string and creates a symbolic link with that value. The reason is
> that symbolic links can be any string, not necessarily a legal path.


Does that mean you have to build a relative link yourself if you're
going from one directory to another?
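
If it does, the string to store could be computed along these lines.  The
function relative_link_target is a hypothetical helper, not part of either
proposal; the result would then be handed to os.symlink() or the proposed
writelink:

```python
import os


def relative_link_target(target, link_path):
    """Return `target` expressed relative to the directory of `link_path`."""
    link_dir = os.path.dirname(link_path) or os.curdir
    return os.path.relpath(target, start=link_dir)
```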


> p.ext          -> ''.join(p[-1].rsplit('.', 1)[1:])


People shouldn't have to do this.


> Unicode - I have no idea about unicode paths. My current
> implementation simply uses str. This should be changed, I guess.


No idea about that either.

You've got two issues here.  One is to go to a tuple base and replace
several properties with slicing.  The other is all your other proposed
changes.  Ideally the PEP would be written in a way that these other
changes can be propagated back and forth between the PEPs as consensus
builds.

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From fredrik at pythonware.com  Thu May  4 14:32:26 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 4 May 2006 14:32:26 +0200
Subject: [Python-Dev] Alternative path suggestion
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
Message-ID: <e3cs8q$t60$1@sea.gmane.org>

Noam Raphael wrote:

> You can find the implementation at
> http://wiki.python.org/moin/AlternativePathModule?action=raw
> (By the way, is there some "code wiki" available? It can simply be a
> public svn repository. I think it will be useful for those things.)

pastebin is quite popular:

    http://python.pastebin.com/

</F> 




From ncoghlan at gmail.com  Thu May  4 14:57:05 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 May 2006 22:57:05 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
Message-ID: <4459FA21.6030508@gmail.com>

Mike Orr wrote:
> Intriguing idea, Noam, and excellent thinking.  I'd say it's worth a
> separate PEP.  It's too different to fit into PEP 355, and too big to
> be summarized in the "Open Issues" section.  Of course, one PEP will
> be rejected if the other is approved.

I agree that a competing PEP is probably the best way to track this idea.

> The main difficulty with this approach is it's so radical.  It would
> require a serious champion to convince people it's as good as our
> tried-and-true strings.

Guido has indicated strong dissatisfaction with the idea of subclassing 
str/unicode with respect to PEP 355. And if you're not subclassing those types 
in order to be able to pass the object transparently to existing APIs, then it 
makes much more sense to choose a sensible internal representation like a 
tuple, so that issues like os.sep can be deferred until they matter.

That's generally been my problem with PEP 355 - a string is an excellent 
format for describing and displaying paths, but it's a *hopeless* format for 
manipulating them.

A real data type that acts as a convenience API for the various 
string-accepting filesystem APIs would be a good thing.

>> == a tuple instead of a string ==
>>
>> The biggest conceptual change is that my path object is a subclass of
>> ''tuple'', not a subclass of str.

Why subclass anything? The path should internally represent the filesystem 
path as a list or tuple, but it shouldn't *be* a tuple.

Also, was it a deliberate design decision to make the path objects immutable, 
or was that simply copied from the fact that strings are immutable?

Given that immutability allows the string representation to be calculated once 
and then cached, I'll go with that interpretation.

>>>>> tuple(path('/a/b/c'))
>> (path.ROOT, 'a', 'b', 'c')
>> }}}
> 
> How about  an .isabsolute attribute instead of prepending path.ROOT? 
> I can see arguments both ways.  An attribute is easy to query and easy
> for str() to use, but it wouldn't show up in a tuple-style repr().

You can have your cake and eat it too by storing most of the state in the 
internal tuple, but providing convenience attributes to perform certain queries.

This would actually be the criterion for the property/method distinction. 
Properties can be determined solely by inspecting the internal data store, 
whereas methods would require accessing the filesystem.
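
A minimal sketch of that split, assuming a hypothetical class named SketchPath that keeps its state in a tuple of components plus an absolute flag (neither name comes from PEP 355 or Noam's module):

```python
import os


class SketchPath(object):
    """Hypothetical path type: the state lives in a tuple of components."""

    def __init__(self, parts, absolute=False):
        self._parts = tuple(parts)
        self._absolute = absolute

    # Properties: answered purely from the internal data store.
    @property
    def isabsolute(self):
        return self._absolute

    @property
    def name(self):
        return self._parts[-1] if self._parts else ""

    # Methods: these have to ask the filesystem.
    def exists(self):
        return os.path.exists(str(self))

    def __str__(self):
        prefix = os.sep if self._absolute else ""
        return prefix + os.sep.join(self._parts)
```

Reading p.name or p.isabsolute never touches the disk, while p.exists() does, so the call syntax itself tells the reader which kind of operation it is.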

>> This means that path objects aren't the string representation of a
>> path; they are a ''logical'' representation of a path. Remember why a
>> filesystem path is called a path - because it's a way to get from one
>> place on the filesystem to another. Paths can be relative, which means
>> that they don't define from where to start the walk, and can be not
>> relative, which means that they do. In the tuple representation,
>> relative paths are simply tuples of strings, and not relative paths
>> are tuples of strings with a first "root" element.

I suggest storing the first element separately from the rest of the path. The 
reason for suggesting this is that you use 'os.sep' to separate elements in 
the normal path, but *not* to separate the first element from the rest.

Possible values for the path's root element would then be:

   None ==> relative path
   path.ROOT ==> Unix absolute path
   path.DRIVECWD ==> Windows drive relative path
   path.DRIVEROOT ==> Windows drive absolute path
   path.UNCSHARE  ==> UNC path
   path.URL  ==> URL path

The last four would have attributes (the two Windows ones to get at the drive 
letter, the UNC one to get at the share name, and the URL one to get at the 
text of the URL).

Similarly, I would separate out the extension to a distinct attribute, as it 
too uses a different separator from the normal path elements ('.' most places, 
but '/' on RISC OS, for example)

The string representation would then be:

   def __str__(self):
       return (str(self.root)
               + os.sep.join(self.path)
               + os.extsep + self.ext)
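
Fleshed out slightly, the three-piece structure could be sketched like this. The class name ThreePiecePath is hypothetical, and treating None as "no root" and "no extension" is an assumption made for the example, so those cases are skipped rather than rendered.

```python
import os


class ThreePiecePath(object):
    """Hypothetical root/path/ext structure; not the PEP 355 class."""

    def __init__(self, root, path, ext=None):
        self.root = root          # None for a relative path, else a prefix
        self.path = tuple(path)   # directory and file name components
        self.ext = ext            # extension without the separator, or None

    def __str__(self):
        text = os.sep.join(self.path)
        if self.root is not None:
            text = str(self.root) + text
        if self.ext is not None:
            text = text + os.extsep + self.ext
        return text
```

os.sep and os.extsep are only consulted when the string is built, which is the point of keeping the pieces apart until then.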

>> The advantage of using a logical representation is that you can forget
>> about the textual representation, which can be really complex.

As noted earlier - text is a great format for path related I/O. It's a lousy 
format for path manipulation.

>> {{{
>> p.normpath()  -> Isn't needed - done by the constructor
>> p.basename()  -> p[-1]
>> p.splitpath() -> (p[:-1], p[-1])
>> p.splitunc()  -> (p[0], p[1:]) (if isinstance(p[0], path.UNCRoot))
>> p.splitall()  -> Isn't needed
>> p.parent      -> p[:-1]
>> p.name        -> p[-1]
>> p.drive       -> p[0] (if isinstance(p[0], path.Drive))
>> p.uncshare    -> p[0] (if isinstance(p[0], path.UNCRoot))
>> }}}

These same operations using separate root and path attributes:

p.basename()  -> p[-1]
p.splitpath() -> (p[:-1], p[-1])
p.splitunc()  -> (p.root, p.path)
p.splitall()  -> Isn't needed
p.parent      -> p[:-1]
p.name        -> p[-1]
p.drive       -> p.root.drive  (AttributeError if not drive based)
p.uncshare    -> p.root.share  (AttributeError if not UNC based)
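
To show how the AttributeError behaviour falls out naturally, here is a sketch with hypothetical root classes (DriveRoot, UNCShareRoot) and a hypothetical RootedPath wrapper; none of these names exist in path.py.

```python
class DriveRoot(object):
    """Hypothetical root for drive-based Windows paths."""

    def __init__(self, drive):
        self.drive = drive


class UNCShareRoot(object):
    """Hypothetical root for UNC paths."""

    def __init__(self, share):
        self.share = share


class RootedPath(object):
    def __init__(self, root, path):
        self.root = root
        self.path = tuple(path)

    @property
    def drive(self):
        # Raises AttributeError when the root has no drive attribute.
        return self.root.drive

    @property
    def uncshare(self):
        # Raises AttributeError when the root has no share attribute.
        return self.root.share

    def splitunc(self):
        return (self.root, self.path)
```

No isinstance checks are needed: asking a drive-based path for its share simply fails on the missing attribute.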

> That's a big drawback.  PEP 355 can choose between string and
> non-string, but this way is limited to non-string.  That raises the
> minor issue of changing the open() functions etc in the standard
> library, and the major issue of changing them in third-party
> libraries.

It's not that big a drama, really. All you need to do is call str() on your 
path objects when you're done manipulating them. The third party libraries 
don't need to know how you created your paths, only what you ended up with.

Alternatively, if the path elements are stored in separate attributes, there's 
nothing stopping the main object from inheriting from str or unicode the way 
the PEP 355 path object does.

Either way, this object would still be far more convenient for manipulating 
paths than a string based representation that has to deal with OS-specific 
issues on every operation, rather than only during creation and conversion to 
a string. The path objects would also serve as an OS-independent 
representation of filesystem paths.

In fact, I'd leave most of the low-level API's working only on strings - the 
only one I'd change to accept path objects directly is open() (which would be 
fairly easy, as that's a factory function now).

>> This means that paths starting with a drive letter alone
>> (!UnrootedDrive instance, in my module) and paths starting with a
>> backslash alone (the CURROOT object, in my module) are not relative
>> and not absolute.
> 
> I guess that's plausible.  We'll need feedback from Windows users.

As suggested above, I think the root element should be stored separately from 
the rest of the path. Then adding a new kind of root element (such as a URL) 
becomes trivial.

> The question is, does forcing people to use .stat() expose an
> implementation detail that should be hidden, and does it smell of
> Unixism?  Most people think a file *is* a regular file or a directory.
>  The fact that this is encoded in the file's permission bits -- which
> stat() examines -- is a quirk of Unix.

I wouldn't expose stat() - as you say, it's a Unixism. Instead, I'd provide a 
subclass of Path that used lstat instead of stat for symbolic links.

So if I want symbolic links followed, I use the normal Path class. This class 
would just generally treat symbolic links as if they were the file pointed to 
(except for the whole not recursing into symlinked subdirectories thing).

The SymbolicPath subclass would treat normal files as usual, but *wouldn't* 
follow symbolic links when stat'ting files (instead, it would stat the symlink).
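
The subclass idea can be sketched by swapping the stat function used internally. FollowingPath and the size() method are hypothetical names chosen for the example; only os.stat and os.lstat are real.

```python
import os


class FollowingPath(object):
    """Hypothetical path whose queries follow symbolic links."""

    _stat = staticmethod(os.stat)

    def __init__(self, text):
        self._text = text

    def __str__(self):
        return self._text

    def size(self):
        return self._stat(self._text).st_size


class SymbolicPath(FollowingPath):
    """Same interface, but inspects the link itself rather than its target."""

    _stat = staticmethod(os.lstat)
```

For a regular file the two classes give the same answers; they only differ when the path names a symlink.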

>> == One Method for Finding Files ==
>>
>> (They're actually two, but with exactly the same interface). The
>> original path object has these methods for finding files:
>>
>> {{{
>> def listdir(self, pattern = None): ...
>> def dirs(self, pattern = None): ...
>> def files(self, pattern = None): ...
>> def walk(self, pattern = None): ...
>> def walkdirs(self, pattern = None): ...
>> def walkfiles(self, pattern = None): ...
>> def glob(self, pattern):
>> }}}
>>
>> I suggest one method that replaces all those:
>> {{{
>> def glob(self, pattern='*', topdown=True, onlydirs=False, onlyfiles=False): ...
>> }}}

Swiss army methods are even more evil than wide APIs. And I consider the term 
'glob' itself to be a Unixism - I've found the technique to be far more 
commonly known as wildcard matching in the Windows world.

The path module has those methods for 7 distinct use cases:
   - list the contents of this directory
   - list the subdirectories of this directory
   - list the files in this directory
   - walk the directory tree rooted at this point, yielding both files and dirs
   - walk the directory tree rooted at this point, yielding only the dirs
   - walk the directory tree rooted at this point, yielding only the files
   - walk this pattern

The first 3 operations are far more common than the last 4, so they need to stay.

   # These sketches assume "import fnmatch, os" at module level
   def entries(self, pattern=None):
       """Return list of all entries in directory"""
       _path = type(self)
       # os.listdir returns plain strings, so wrap them before calling matches()
       all_entries = [_path(x) for x in os.listdir(str(self))]
       if pattern is not None:
           return [x for x in all_entries if x.matches(pattern)]
       return all_entries

   def subdirs(self, pattern=None):
       """Return list of all subdirectories in directory"""
       return [x for x in self.entries(pattern) if x.is_dir()]

   def files(self, pattern=None):
       """Return list of all files in directory"""
       return [x for x in self.entries(pattern) if x.is_file()]

   # here are sample implementations of the test methods used above
   def matches(self, pattern):
       return fnmatch.fnmatch(str(self), pattern)
   def is_dir(self):
       return os.path.isdir(str(self))
   def is_file(self):
       return os.path.isfile(str(self))
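
For anyone who wants to try the three listing operations without a path class, here is a standalone sketch on plain strings. One assumption differs from the methods above: each entry is joined to its directory, so the isdir/isfile tests work regardless of the current working directory.

```python
import fnmatch
import os


def entries(directory, pattern=None):
    """Return full paths of the entries in directory, optionally filtered by name."""
    names = sorted(os.listdir(directory))
    if pattern is not None:
        names = [n for n in names if fnmatch.fnmatch(n, pattern)]
    return [os.path.join(directory, n) for n in names]


def subdirs(directory, pattern=None):
    """Return only the entries that are directories."""
    return [p for p in entries(directory, pattern) if os.path.isdir(p)]


def files(directory, pattern=None):
    """Return only the entries that are regular files."""
    return [p for p in entries(directory, pattern) if os.path.isfile(p)]
```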

For the tree traversal operations, there are still multiple use cases:

   def walk(self, topdown=True, onerror=None):
       """ Walk directories and files just as os.walk does"""
       # Similar to os.walk, only yielding Path objects instead of strings
       # For each directory, effectively returns:
       #    yield dirpath, dirpath.subdirs(), dirpath.files()


   def walkdirs(self, pattern=None, onerror=None):
       """Only walk directories matching pattern"""
       for dirpath, subdirs, files in self.walk(onerror=onerror):
           yield dirpath
           if pattern is not None:
               # Modify in-place so that walk() responds to the change
               subdirs[:] = [x for x in subdirs if x.matches(pattern)]

   def walkfiles(self, pattern=None, onerror=None):
       """Only walk file names matching pattern"""
       for dirpath, subdirs, files in self.walk(onerror=onerror):
           if pattern is not None:
               for f in files:
                   if f.matches(pattern):
                       yield f
           else:
               for f in files:
                   yield f

   def walkpattern(self, pattern):
       """Only walk paths matching glob pattern"""
       _factory = type(self)
       for pathname in glob.glob(pattern):
           yield _factory(pathname)
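
The in-place pruning trick is the part most worth seeing run. Below is a standalone sketch of walkdirs and walkfiles on top of os.walk, working on plain strings instead of Path objects (that substitution is an assumption made to keep the example self-contained).

```python
import fnmatch
import os


def walkdirs(top, pattern=None):
    """Yield directories under top, descending only into names matching pattern."""
    for dirpath, dirnames, filenames in os.walk(top):
        yield dirpath
        if pattern is not None:
            # Slice assignment so os.walk sees the pruned list.
            dirnames[:] = [d for d in dirnames if fnmatch.fnmatch(d, pattern)]


def walkfiles(top, pattern=None):
    """Yield files under top whose names match pattern."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if pattern is None or fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)
```

Rebinding dirnames to a new list would have no effect on the traversal; only the slice assignment changes the list os.walk is holding.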


>> pattern is the good old glob pattern, with one additional extension:
>> "**" matches any number of subdirectories, including 0. This means
>> that '**' means "all the files in a directory", '**/a' means "all the
>> files in a directory called a", and '**/a*/**/b*' means "all the files
>> in a directory whose name starts with 'b' and the name of one of their
>> parent directories starts with 'a'".
> 
> I like the separate methods, but OK.  I hope it doesn't *really* call
> glob if the pattern is the default.

Keep the separate methods. Trying to squeeze too many disparate use cases 
through a single API is a bad idea. Directory listing and tree-traversal are 
not the same thing. Path matching and filename matching are not the same thing 
either.
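
For experimenting with the "**" idea as a separate path-matching operation, the standard glob module in Python 3.5 and later accepts recursive=True, which makes "**" span any number of directory levels. The sketch below assumes that version; the function name find is made up, and the glob module's matching rules are not guaranteed to be identical to the proposal's.

```python
import glob
import os


def find(top, pattern):
    """Return sorted paths under top matching pattern; '**' spans directory levels."""
    return sorted(glob.glob(os.path.join(top, pattern), recursive=True))
```

This keeps path matching in its own function, distinct from directory listing and from tree traversal.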

> Or one could, gasp, pass a constant or the 'find' command's
> abbreviation ("d" directory, "f" file, "s" socket, "b" block
> special...).

Magic letters in an API are just as bad as magic numbers :)

More importantly, these things don't port well between systems.

>> In my proposal:
>>
>> {{{
>> def copy(self, dst, copystat=False): ...
>> }}}
>>
>> It's just that I think that copyfile, copymode and copystat aren't
>> usually useful, and there's no reason not to unite copy and copy2.
> 
> Sounds good.

OK, this is one case where a swiss army method may make sense. Specifically, 
something like:

   def copy_to(self, dest, copyfile=True, copymode=True, copytime=False)

Whether or not to copy the file contents, the permission settings and the last 
access and modification time are then all independently selectable.

The different method name also makes the direction of the copying clear (with 
a bare 'copy', it's slightly ambiguous as the 'cp src dest' parallel isn't as 
strong as it is with a function).
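
As a sketch of how the three flags could map onto existing standard library calls (a function on plain strings here, rather than a method, to keep it self-contained):

```python
import os
import shutil


def copy_to(src, dest, copyfile=True, copymode=True, copytime=False):
    """Copy selected aspects of src onto dest.

    dest must already exist unless copyfile is true.
    """
    if copyfile:
        shutil.copyfile(src, dest)   # file contents only
    if copymode:
        shutil.copymode(src, dest)   # permission bits
    if copytime:
        st = os.stat(src)
        os.utime(dest, (st.st_atime, st.st_mtime))  # access and modification time
```

Calling it with copyfile=False and copytime=True, for instance, would only bring the timestamps of an existing destination file into line with the source.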

> I was wondering what the fallout would be of normalizing "a/../b" and
> "a/./b" and "a//b", but it sounds like you're thinking about it.

The latter two are OK, but normalizing the first one can give you the wrong 
answer if 'a' is a symlink (since 'a/../b' is then not necessarily the same as 
'b').

Better to just leave the '..' in and not treat it as something that can be 
normalised away.
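
A conservative normalisation along those lines could be sketched as below. It only handles the component list of a relative path (the root is ignored), and the function name is hypothetical.

```python
def normalise_components(text, sep="/"):
    """Drop empty and '.' components but leave '..' untouched."""
    return [part for part in text.split(sep) if part not in ("", ".")]
```

So "a/./b" and "a//b" both come out as ['a', 'b'], while "a/../b" keeps its '..' and is left for the filesystem to resolve.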

>> I removed the methods associated with file extensions. I don't recall
>> using them, and since they're purely textual and not OS-dependent, I
>> think that you can always do p[-1].rsplit('.', 1).

Most modern OS's use '.' as the extension separator, true, but os.extsep still 
exists for a reason :)

> .namebase is an obnoxious name though.  I wish we could come up with
> something better.

p.path[-1] :)

Then p.name can just do (p.path[-1] + os.extsep + p.ext) to rebuild the full 
filename including the extension (if p.ext was None, then p.name would be the 
same as p.path[-1])

>> I removed expand. There's no need to use normpath, so it's equivalent
>> to .expanduser().expandvars(), and I think that the explicit form is
>> better.
> 
> Expand is useful though, so you don't forget one or the other.

And as you'll usually want to do both, adding about 15 extra characters for no 
good reason seems like a bad idea. . .

>> copytree - I removed it. In shutil it's documented as being mostly a
>> demonstration, and I'm not sure if it's really useful.
> 
> Er, not sure I've used it, but it seems useful.  Why force people to
> reinvent the wheel with their own recursive loops that they may get
> wrong?

Because the handling of exceptional cases is almost always going to be 
application specific. Note that even os.walk provides a callback hook for if 
the call to os.listdir() fails when attempting to descend into a directory.

For copytree, the issues to be considered are significantly worse:
   - what to do if listdir fails in the source tree?
   - what to do if reading a file fails in the source tree?
   - what to do if a directory doesn't exist in the target tree?
   - what to do if a directory already exists in the target tree?
   - what to do if a file already exists in the target tree?
   - what to do if writing a file fails in the target tree?
   - should the file contents/mode/time be copied to the target tree?
   - what to do with symlinks in the source tree?

Now, what might potentially be genuinely useful is paired walk methods that 
allowed the following:

   # Do path.walk over this directory, and also return the corresponding
   # information for a destination directory (so the dest dir information
   # probably *won't* match that file system)
   for src_info, dest_info in src_path.pairedwalk(dest_path):
       src_dirpath, src_subdirs, src_files = src_info
       dest_dirpath, dest_subdirs, dest_files = dest_info
       # Do something useful

   # Ditto for path.walkdirs
   for src_dirpath, dest_dirpath in src_path.pairedwalkdirs(dest_path):
       # Do something useful

   # Ditto for path.walkfiles
   # (loop variables renamed so they don't shadow src_path and dest_path)
   for src_file, dest_file in src_path.pairedwalkfiles(dest_path):
       src_file.copy_to(dest_file)
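
The file-level variant is simple enough to sketch on top of os.walk. This version works on plain strings, only computes the destination names, and never touches the destination tree, so every policy decision listed above stays with the caller.

```python
import os


def pairedwalkfiles(src_root, dest_root):
    """Yield (source file, corresponding destination file) pairs."""
    for dirpath, dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        if rel == os.curdir:
            dest_dir = dest_root
        else:
            dest_dir = os.path.join(dest_root, rel)
        for name in filenames:
            yield os.path.join(dirpath, name), os.path.join(dest_dir, name)
```

A caller would loop over the pairs, create missing directories or skip existing files as the application requires, and copy each file itself.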

> You've got two issues here.  One is to go to a tuple base and replace
> several properties with slicing.  The other is all your other proposed
> changes.  Ideally the PEP would be written in a way that these other
> changes can be propagated back and forth between the PEPs as consensus
> builds.

The main thing Jason's path object has going for it is that it brings together 
Python's disparate filesystem manipulation API's into one place. Using it is a 
definite improvement over using the standard lib directly.

However, because it uses a string as the internal storage instead of a more 
appropriate format (such as the three-piece structure I suggest of root, path, 
extension), it doesn't do as much as it could to abstract away the hassles of 
os.sep and os.extsep.

By focusing on the idea that strings are for path input and output operations, 
rather than for path manipulation, it should be possible to build something 
even more usable than path.py

If it was done within the next several months and released on PyPI, it might 
even be a contender for 2.6.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From jparlar at cogeco.ca  Thu May  4 15:21:57 2006
From: jparlar at cogeco.ca (Jay Parlar)
Date: Thu, 4 May 2006 09:21:57 -0400
Subject: [Python-Dev] lambda in Python
In-Reply-To: <mailman.21.1146736804.29262.python-dev@python.org>
References: <mailman.21.1146736804.29262.python-dev@python.org>
Message-ID: <7bf78738c8226235c5e4290d41a8922f@cogeco.ca>


On May 4, 2006, at 6:00 AM, Talin wrote:

> xahlee <xah <at> xahlee.org> writes:
>
>> Today i ran into one of Guido van Rossum's blog article titled
>> "Language Design Is Not Just Solving Puzzles" at
>> http://www.artima.com/weblogs/viewpost.jsp?thread=147358
>
> The confrontational tone of this post makes it pretty much impossible
> to have a reasonable debate on the subject. I'd suggest that if you
> really want to be heard (instead of merely having that "I'm right"
> feeling) that you try a different approach.
>
> -- Talin

Xah Lee is a well known troll, he does stuff like this on c.l.p. all 
the time. Best to just ignore him, he doesn't listen to reason.

Jay P.


From stefan.rank at ofai.at  Thu May  4 15:28:29 2006
From: stefan.rank at ofai.at (Stefan Rank)
Date: Thu, 04 May 2006 15:28:29 +0200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <4459FA21.6030508@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com>
Message-ID: <445A017D.4080207@ofai.at>

on 04.05.2006 14:57 Nick Coghlan said the following:
> Mike Orr wrote:
>> Intriguing idea, Noam, and excellent thinking.  I'd say it's worth a
>> separate PEP.  It's too different to fit into PEP 355, and too big to
>> be summarized in the "Open Issues" section.  Of course, one PEP will
>> be rejected if the other is approved.
> 
> I agree that a competing PEP is probably the best way to track this idea.

<snip>

>>> This means that path objects aren't the string representation of a
>>> path; they are a ''logical'' representation of a path. Remember why a
>>> filesystem path is called a path - because it's a way to get from one
>>> place on the filesystem to another. Paths can be relative, which means
>>> that they don't define from where to start the walk, and can be not
>>> relative, which means that they do. In the tuple representation,
>>> relative paths are simply tuples of strings, and not relative paths
>>> are tuples of strings with a first "root" element.
> 
> I suggest storing the first element separately from the rest of the path. The 
> reason for suggesting this is that you use 'os.sep' to separate elements in 
> the normal path, but *not* to separate the first element from the rest.

I want to add that people might want to manipulate paths that are not 
for the currently running OS. Therefore I think the `sep` should be an 
attribute of the "root" element.
For the same reason I'd like to add two values to the following list:

> Possible values for the path's root element would then be:
> 
>    None ==> relative path
               (uses os.sep)
+    path.UNIXRELATIVE ==> uses '/'
+    path.WINDOWSRELATIVE ==> uses '\\' unconditionally
>    path.ROOT ==> Unix absolute path
>    path.DRIVECWD ==> Windows drive relative path
>    path.DRIVEROOT ==> Windows drive absolute path
>    path.UNCSHARE  ==> UNC path
>    path.URL  ==> URL path
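
To make the "sep belongs to the root" idea concrete, here is a small sketch. The Root class, its prefix and sep attributes, and the render function are all hypothetical; only the constant names UNIXRELATIVE and WINDOWSRELATIVE come from the list above.

```python
class Root(object):
    """Hypothetical root element that also carries the separator to use."""

    def __init__(self, prefix, sep):
        self.prefix = prefix
        self.sep = sep


UNIXRELATIVE = Root("", "/")
WINDOWSRELATIVE = Root("", "\\")
UNIXROOT = Root("/", "/")


def render(root, parts):
    """Build the string form using the root's own separator, not os.sep."""
    return root.prefix + root.sep.join(parts)
```

With this, a Windows-style path can be built and printed on a Unix machine, which is what the standard ntpath and posixpath modules already allow for string paths.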
> 
> The last four would have attributes (the two Windows ones to get at the drive 
> letter, the UNC one to get at the share name, and the URL one to get at the 
> text of the URL).
> 
> Similarly, I would separate out the extension to a distinct attribute, as it 
> too uses a different separator from the normal path elements ('.' most places, 
> but '/' on RISC OS, for example)
> 
> The string representation would then be:
> 
>    def __str__(self):
>        return (str(self.root)
>                + os.sep.join(self.path)
>                + os.extsep + self.ext)
> 
>>> The advantage of using a logical representation is that you can forget
>>> about the textual representation, which can be really complex.
> 
> As noted earlier - text is a great format for path related I/O. It's a lousy 
> format for path manipulation.
> 
>>> {{{
>>> p.normpath()  -> Isn't needed - done by the constructor
>>> p.basename()  -> p[-1]
>>> p.splitpath() -> (p[:-1], p[-1])
>>> p.splitunc()  -> (p[0], p[1:]) (if isinstance(p[0], path.UNCRoot))
>>> p.splitall()  -> Isn't needed
>>> p.parent      -> p[:-1]
>>> p.name        -> p[-1]
>>> p.drive       -> p[0] (if isinstance(p[0], path.Drive))
>>> p.uncshare    -> p[0] (if isinstance(p[0], path.UNCRoot))
>>> }}}
> 
> These same operations using separate root and path attributes:
> 
> p.basename()  -> p[-1]
> p.splitpath() -> (p[:-1], p[-1])
> p.splitunc()  -> (p.root, p.path)
> p.splitall()  -> Isn't needed
> p.parent      -> p[:-1]
> p.name        -> p[-1]
> p.drive       -> p.root.drive  (AttributeError if not drive based)
> p.uncshare    -> p.root.share  (AttributeError if not UNC based)
> 
>> That's a big drawback.  PEP 355 can choose between string and
>> non-string, but this way is limited to non-string.  That raises the
>> minor issue of changing the open() functions etc in the standard
>> library, and the major issue of changing them in third-party
>> libraries.
> 
> It's not that big a drama, really. All you need to do is call str() on your 
> path objects when you're done manipulating them. The third party libraries 
> don't need to know how you created your paths, only what you ended up with.
> 
> Alternatively, if the path elements are stored in separate attributes, there's 
> nothing stopping the main object from inheriting from str or unicode the way 
> the PEP 355 path object does.
> 
> Either way, this object would still be far more convenient for manipulating 
> paths than a string based representation that has to deal with OS-specific 
> issues on every operation, rather than only during creation and conversion to 
> a string. The path objects would also serve as an OS-independent 
> representation of filesystem paths.
> 
> In fact, I'd leave most of the low-level API's working only on strings - the 
> only one I'd change to accept path objects directly is open() (which would be 
> fairly easy, as that's a factory function now).
> 
>>> This means that paths starting with a drive letter alone
>>> (!UnrootedDrive instance, in my module) and paths starting with a
>>> backslash alone (the CURROOT object, in my module) are not relative
>>> and not absolute.
>> I guess that's plausible.  We'll need feedback from Windows users.
> 
> As suggested above, I think the root element should be stored separately from 
> the rest of the path. Then adding a new kind of root element (such as a URL) 
> becomes trivial.
> 
>> The question is, does forcing people to use .stat() expose an
>> implementation detail that should be hidden, and does it smell of
>> Unixism?  Most people think a file *is* a regular file or a directory.
>>  The fact that this is encoded in the file's permission bits -- which
>> stat() examines -- is a quirk of Unix.
> 
> I wouldn't expose stat() - as you say, it's a Unixism. Instead, I'd provide a 
> subclass of Path that used lstat instead of stat for symbolic links.
> 
> So if I want symbolic links followed, I use the normal Path class. This class 
> would just generally treat symbolic links as if they were the file pointed to 
> (except for the whole not recursing into symlinked subdirectories thing).
> 
> The SymbolicPath subclass would treat normal files as usual, but *wouldn't* 
> follow symbolic links when stat'ting files (instead, it would stat the symlink).
> 
>>> == One Method for Finding Files ==
>>>
>>> (They're actually two, but with exactly the same interface). The
>>> original path object has these methods for finding files:
>>>
>>> {{{
>>> def listdir(self, pattern = None): ...
>>> def dirs(self, pattern = None): ...
>>> def files(self, pattern = None): ...
>>> def walk(self, pattern = None): ...
>>> def walkdirs(self, pattern = None): ...
>>> def walkfiles(self, pattern = None): ...
>>> def glob(self, pattern):
>>> }}}
>>>
>>> I suggest one method that replaces all those:
>>> {{{
>>> def glob(self, pattern='*', topdown=True, onlydirs=False, onlyfiles=False): ...
>>> }}}
> 
> Swiss army methods are even more evil than wide APIs. And I consider the term 
> 'glob' itself to be a Unixism - I've found the technique to be far more 
> commonly known as wildcard matching in the Windows world.
> 
> The path module has those methods for 7 distinct use cases:
>    - list the contents of this directory
>    - list the subdirectories of this directory
>    - list the files in this directory
>    - walk the directory tree rooted at this point, yielding both files and dirs
>    - walk the directory tree rooted at this point, yielding only the dirs
>    - walk the directory tree rooted at this point, yielding only the files
>    - walk this pattern
> 
> The first 3 operations are far more common than the last 4, so they need to stay.
> 
>    # These sketches assume "import fnmatch, os" at module level
>    def entries(self, pattern=None):
>        """Return list of all entries in directory"""
>        _path = type(self)
>        # os.listdir returns plain strings, so wrap them before calling matches()
>        all_entries = [_path(x) for x in os.listdir(str(self))]
>        if pattern is not None:
>            return [x for x in all_entries if x.matches(pattern)]
>        return all_entries
> 
>    def subdirs(self, pattern=None):
>        """Return list of all subdirectories in directory"""
>        return [x for x in self.entries(pattern) if x.is_dir()]
> 
>    def files(self, pattern=None):
>        """Return list of all files in directory"""
>        return [x for x in self.entries(pattern) if x.is_file()]
> 
>    # here are sample implementations of the test methods used above
>    def matches(self, pattern):
>        return fnmatch.fnmatch(str(self), pattern)
>    def is_dir(self):
>        return os.path.isdir(str(self))
>    def is_file(self):
>        return os.path.isfile(str(self))
> 
> For the tree traversal operations, there are still multiple use cases:
> 
>    def walk(self, topdown=True, onerror=None):
>        """ Walk directories and files just as os.walk does"""
>        # Similar to os.walk, only yielding Path objects instead of strings
>        # For each directory, effectively returns:
>        #    yield dirpath, dirpath.subdirs(), dirpath.files()
> 
> 
>    def walkdirs(self, pattern=None, onerror=None):
>        """Only walk directories matching pattern"""
>        for dirpath, subdirs, files in self.walk(onerror=onerror):
>            yield dirpath
>            if pattern is not None:
>                # Modify in-place so that walk() responds to the change
>                subdirs[:] = [x for x in subdirs if x.matches(pattern)]
> 
>    def walkfiles(self, pattern=None, onerror=None):
>        """Only walk file names matching pattern"""
>        for dirpath, subdirs, files in self.walk(onerror=onerror):
>            if pattern is not None:
>                for f in files:
>                    if f.matches(pattern):
>                        yield f
>            else:
>                for f in files:
>                    yield f
> 
>    def walkpattern(self, pattern):
>        """Only walk paths matching glob pattern"""
>        _factory = type(self)
>        for pathname in glob.glob(pattern):
>            yield _factory(pathname)
> 
> 
>>> pattern is the good old glob pattern, with one additional extension:
>>> "**" matches any number of subdirectories, including 0. This means
>>> that '**' means "all the files in a directory", '**/a' means "all the
>>> files in a directory called a", and '**/a*/**/b*' means "all the files
>>> in a directory whose name starts with 'b' and the name of one of their
>>> parent directories starts with 'a'".
>> I like the separate methods, but OK.  I hope it doesn't *really* call
>> glob if the pattern is the default.
> 
> Keep the separate methods. Trying to squeeze too many disparate use cases 
> through a single API is a bad idea. Directory listing and tree-traversal are 
> not the same thing. Path matching and filename matching are not the same thing 
> either.
> 
>> Or one could, gasp, pass a constant or the 'find' command's
>> abbreviation ("d" directory, "f" file, "s" socket, "b" block
>> special...).
> 
> Magic letters in an API are just as bad as magic numbers :)
> 
> More importantly, these things don't port well between systems.
> 
>>> In my proposal:
>>>
>>> {{{
>>> def copy(self, dst, copystat=False): ...
>>> }}}
>>>
>>> It's just that I think that copyfile, copymode and copystat aren't
>>> usually useful, and there's no reason not to unite copy and copy2.
>> Sounds good.
> 
> OK, this is one case where a swiss army method may make sense. Specifically, 
> something like:
> 
>    def copy_to(self, dest, copyfile=True, copymode=True, copytime=False)
> 
> Whether or not to copy the file contents, the permission settings and the last 
> access and modification time are then all independently selectable.
> 
> The different method name also makes the direction of the copying clear (with 
> a bare 'copy', it's slightly ambiguous as the 'cp src dest' parallel isn't as 
> strong as it is with a function).
> 
>> I was wondering what the fallout would be of normalizing "a/../b" and
>> "a/./b" and "a//b", but it sounds like you're thinking about it.
> 
> The latter two are OK, but normalizing the first one can give you the wrong 
> answer if 'a' is a symlink (since 'a/../b' is then not necessarily the same as 
> 'b').
> 
> Better to just leave the '..' in and not treat it as something that can be 
> normalised away.
> 
>>> I removed the methods associated with file extensions. I don't recall
>>> using them, and since they're purely textual and not OS-dependent, I
>>> think that you can always do p[-1].rsplit('.', 1).
> 
> Most modern OS's use '.' as the extension separator, true, but os.extsep still 
> exists for a reason :)
> 
>> .namebase is an obnoxious name though.  I wish we could come up with
>> something better.
> 
> p.path[-1] :)
> 
> Then p.name can just do (p.path[-1] + os.extsep + p.ext) to rebuild the full 
> filename including the extension (if p.ext was None, then p.name would be the 
> same as p.path[-1])
> 
>>> I removed expand. There's no need to use normpath, so it's equivalent
>>> to .expanduser().expandvars(), and I think that the explicit form is
>>> better.
>> Expand is useful though, so you don't forget one or the other.
> 
> And as you'll usually want to do both, adding about 15 extra characters for no 
> good reason seems like a bad idea. . .
> 
>>> copytree - I removed it. In shutil it's documented as being mostly a
>>> demonstration, and I'm not sure if it's really useful.
>> Er, not sure I've used it, but it seems useful.  Why force people to
>> reinvent the wheel with their own recursive loops that they may get
>> wrong?
> 
> Because the handling of exceptional cases is almost always going to be 
> application specific. Note that even os.walk provides a callback hook for if 
> the call to os.listdir() fails when attempting to descend into a directory.
> 
> For copytree, the issues to be considered are significantly worse:
>    - what to do if listdir fails in the source tree?
>    - what to do if reading a file fails in the source tree?
>    - what to do if a directory doesn't exist in the target tree?
>    - what to do if a directory already exists in the target tree?
>    - what to do if a file already exists in the target tree?
>    - what to do if writing a file fails in the target tree?
>    - should the file contents/mode/time be copied to the target tree?
>    - what to do with symlinks in the source tree?
> 
> Now, what might potentially be genuinely useful is paired walk methods that 
> allowed the following:
> 
>    # Do path.walk over this directory, and also return the corresponding
>    # information for a destination directory (so the dest dir information
>    # probably *won't* match that file system)
>    for src_info, dest_info in src_path.pairedwalk(dest_path):
>        src_dirpath, src_subdirs, src_files = src_info
>        dest_dirpath, dest_subdirs, dest_files = dest_info
>        # Do something useful
> 
>    # Ditto for path.walkdirs
>    for src_dirpath, dest_dirpath in src_path.pairedwalkdirs(dest_path):
>        # Do something useful
> 
>    # Ditto for path.walkfiles
>    for src_path, dest_path in src_path.pairedwalkfiles(dest_path):
>        src_path.copy_to(dest_path)
> 
>> You've got two issues here.  One is to go to a tuple base and replace
>> several properties with slicing.  The other is all your other proposed
>> changes.  Ideally the PEP would be written in a way that these other
>> changes can be propagated back and forth between the PEPs as consensus
>> builds.
> 
> The main thing Jason's path object has going for it is that it brings together 
> Python's disparate filesystem manipulation API's into one place. Using it is a 
> definite improvement over using the standard lib directly.
> 
> However, by choosing to use a string as the internal storage instead of a more 
> appropriate format (such as the three-piece structure I suggest of root, path, 
> extension), it doesn't do as much as it could to abstract away the hassles of 
> os.sep and os.extsep.
> 
> By focusing on the idea that strings are for path input and output operations, 
> rather than for path manipulation, it should be possible to build something 
> even more usable than path.py
> 
> If it was done within the next several months and released on PyPI, it might 
> even be a contender for 2.6.
> 
> Cheers,
> Nick.
> 
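
The symlink caveat about 'a/../b' in the quoted message can be checked with
a short experiment. This is only a sketch for a POSIX system where the
process is allowed to create symlinks; the names used (real, sub, a, b) are
made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Layout: top/real/sub/ (a directory), top/real/b (a file),
    # and top/a as a symlink pointing at top/real/sub.
    os.makedirs(os.path.join(top, "real", "sub"))
    open(os.path.join(top, "real", "b"), "w").close()
    os.symlink(os.path.join(top, "real", "sub"), os.path.join(top, "a"))

    via_dotdot = os.path.join(top, "a", "..", "b")
    normalized = os.path.normpath(via_dotdot)  # purely textual: top/b

    # The OS follows the symlink first, so 'a/..' is top/real, not top.
    exists_via_dotdot = os.path.exists(via_dotdot)
    exists_normalized = os.path.exists(normalized)

print(exists_via_dotdot, exists_normalized)
```

The textual normalization points at a different (here nonexistent) file,
which is the reason given above for leaving '..' alone.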


From ncoghlan at gmail.com  Thu May  4 15:36:50 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 May 2006 23:36:50 +1000
Subject: [Python-Dev] Python long command line options
In-Reply-To: <200605040829.53140.me+python-dev@modelnine.org>
References: <200605040829.53140.me+python-dev@modelnine.org>
Message-ID: <445A0372.3030007@gmail.com>

Heiko Wundram wrote:
> Anyway, back on topic, I personally agree with the people who posted to 
> comp.lang.python that --version (and possibly --help, possibly other long 
> parameters too) would be useful additions to Python's command-line parameters, 
> as it's increasingly getting more common amongst GNU and BSD utilities to 
> support these command-line options to get information about the utility at 
> hand (very popular amongst system administrators) and to use long commandline 
> options to be able to easily see what an option does when encountered in a 
> shell script of some sort.

+0 from me, as long as the 'best guess' bit is removed. Otherwise '-v' would be 
"verbose", while "--v" was "version". And if we added '--verbose' later, 
scripts relying on '--v' would start getting an error. Much better to force 
people to spell the option right in the first place.
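
The forward-compatibility problem with prefix guessing can be reproduced
with Python's argparse module. It is used here only as a stand-in: whether
the guessing code in the proposed C patch matches prefixes the way
argparse's allow_abbrev does is an assumption, and the parser below is
hypothetical.

```python
import argparse

def build_parser(with_verbose):
    # allow_abbrev=True lets any unambiguous prefix select a long option.
    parser = argparse.ArgumentParser(prog="demo", allow_abbrev=True,
                                     add_help=False)
    parser.add_argument("--version", action="store_true")
    if with_verbose:
        # The option that gets "added later" in the scenario above.
        parser.add_argument("--verbose", action="store_true")
    return parser

def accepts(parser, argv):
    """Return True if argv parses, False if argparse rejects it."""
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:
        return False

before = accepts(build_parser(with_verbose=False), ["--v"])
after = accepts(build_parser(with_verbose=True), ["--v"])
print(before, after)
```

A script that relied on '--v' keeps working only until a second option
with the same prefix appears, at which point the abbreviation is rejected
as ambiguous.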

And any such patch would require additional tests in test_cmd_line.py if the 
new option is equivalent to an option that's already tested by that file.

Being able to give less cryptic names to the various options would be good for 
shell scripts and sys calls.

e.g.:

-c  ==> --command
-d  ==> --debugparser
-E  ==> --noenv
-h  ==> --help
-i  ==> --interactive
-m  ==> --runmodule
-O  ==> <no long equivalent>
-OO ==> <no long equivalent>
-Q  ==> --div old, --div warn, --div warnall, --div new
-S  ==> --nosite
-t  ==> --warntabs
-tt ==> --errtabs
-u  ==> --unbuffered
-v  ==> --verbose
-V  ==> --version
-W  ==> --warn
-x  ==> --skipfirstline

As far as I know, the only place these are documented is in "python -h" and 
the Linux man pages.

And good luck to you if you encounter a sys call containing "python -U". Even 
though not recognising the meaning of the option is likely to be the least of 
your worries in that case, at least you'd have some idea *why* some Python 
scripts start collapsing in a screaming heap when you try it ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From g.brandl at gmx.net  Thu May  4 15:41:54 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 04 May 2006 15:41:54 +0200
Subject: [Python-Dev] Python long command line options
In-Reply-To: <200605040829.53140.me+python-dev@modelnine.org>
References: <200605040829.53140.me+python-dev@modelnine.org>
Message-ID: <e3d0b2$bu7$1@sea.gmane.org>

Heiko Wundram wrote:

> Anyway, back on topic, I personally agree with the people who posted to 
> comp.lang.python that --version (and possibly --help, possibly other long 
> parameters too) would be useful additions to Python's command-line parameters, 
> as it's increasingly getting more common amongst GNU and BSD utilities to 
> support these command-line options to get information about the utility at 
> hand (very popular amongst system administrators) and to use long commandline 
> options to be able to easily see what an option does when encountered in a 
> shell script of some sort.
> 
> And, as this doesn't break any old code (-V isn't going away by the patch), I 
> personally find this to be a very useful change.
> 
> Your thoughts?

Personally, I'm +1, but wonder if it would be enough to support '--help' and
'--version'. We then could cut out the "best guess" code and the code to enable
--opt=value.

Georg


From me+python-dev at modelnine.org  Thu May  4 15:51:22 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 4 May 2006 15:51:22 +0200
Subject: [Python-Dev] Python long command line options
In-Reply-To: <e3d0b2$bu7$1@sea.gmane.org>
References: <200605040829.53140.me+python-dev@modelnine.org>
	<e3d0b2$bu7$1@sea.gmane.org>
Message-ID: <200605041551.22595.me+python-dev@modelnine.org>

Am Donnerstag 04 Mai 2006 15:41 schrieb Georg Brandl:
> Heiko Wundram wrote:
> > Your thoughts?
>
> Personally, I'm +1, but wonder if it would be enough to support '--help'
> and '--version'. We then could cut out the "best guess" code and the code
> to enable --opt=value.

The code for the above functionality is about 10-12 lines of C of the whole 
patch.

If there's consensus to remove it, I'll gladly prepare a patch that doesn't 
contain this, but I don't think it's sensible to do so just because it makes 
the code shorter. But, the "best guess" functionality most certainly is 
debatable on other grounds.

--- Heiko.

From ncoghlan at gmail.com  Thu May  4 15:55:34 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 04 May 2006 23:55:34 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445A017D.4080207@ofai.at>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>	<4459FA21.6030508@gmail.com>
	<445A017D.4080207@ofai.at>
Message-ID: <445A07D6.403@gmail.com>

Stefan Rank wrote:
>> I suggest storing the first element separately from the rest of the path. The 
>> reason for suggesting this is that you use 'os.sep' to separate elements in 
>> the normal path, but *not* to separate the first element from the rest.
> 
> I want to add that people might want to manipulate paths that are not 
> for the currently running OS. Therefore I think the `sep` should be an 
> attribute of the "root" element.

I meant to mention that. The idea I had was for normal path objects to use 
os.sep and os.extsep (so they can be pickled and unpickled successfully 
between platforms), and then have a mechanism that allowed a platform specific 
path to be selected based on the desired platform (i.e. one of the possible 
values of os.name: 'posix', 'nt', 'os2', 'mac', 'ce' or 'riscos').

My inclination was to have a PlatformPath subclass that accepted 'os', 'sep' 
and 'extsep' keyword arguments to the constructor, and provided the 
appropriate 'sep' and 'extsep' attributes (supplying 'os' would just be a 
shortcut to avoid specifying the separators explicitly).

That way the main class can avoid being complicated by the relatively rare 
need to operate on another platform's paths, while still supporting the ability.
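
A minimal sketch of what such a subclass might look like. Everything here
is hypothetical: the toy Path class, the elements attribute and the
separator table are invented for the illustration, and only two os.name
values are filled in.

```python
import os as _os

# Hypothetical separator table: os.name -> (sep, extsep).
_SEPARATORS = {
    "posix": ("/", "."),
    "nt": ("\\", "."),
}

class Path:
    """Toy path: a tuple of elements joined with the native separator."""
    sep = _os.sep
    extsep = _os.extsep

    def __init__(self, *elements):
        self.elements = tuple(elements)

    def __str__(self):
        return self.sep.join(self.elements)

class PlatformPath(Path):
    """Path for another platform, chosen by name or explicit separators."""

    def __init__(self, *elements, os=None, sep=None, extsep=None):
        super().__init__(*elements)
        if os is not None:
            default_sep, default_extsep = _SEPARATORS[os]
        else:
            default_sep, default_extsep = Path.sep, Path.extsep
        # Explicit separators win over the 'os' shortcut.
        self.sep = default_sep if sep is None else sep
        self.extsep = default_extsep if extsep is None else extsep

win = PlatformPath("usr", "lib", os="nt")
custom = PlatformPath("usr", "lib", sep=":")
print(str(win))     # usr\lib
print(str(custom))  # usr:lib
```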

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From xah at xahlee.org  Thu May  4 15:56:38 2006
From: xah at xahlee.org (xahlee)
Date: Thu, 4 May 2006 06:56:38 -0700
Subject: [Python-Dev] lambda in Python
In-Reply-To: <7bf78738c8226235c5e4290d41a8922f@cogeco.ca>
References: <mailman.21.1146736804.29262.python-dev@python.org>
	<7bf78738c8226235c5e4290d41a8922f@cogeco.ca>
Message-ID: <593B03DB-0FE3-49DC-ADD1-EF4BD7B05884@xahlee.org>

I do not wish to be the subject of mobbing here.

If you have opinions on what i wrote, respond to the subject on topic  
as with any discussion. Please do not start a gazillion war-cry on me.

If you cannot tolerate the way i express my opinions, at this moment  
write a polite request to me and cc to the sys op of this forum. If  
the sysop deems fit, I'd be happy to be asked to leave by the sysop.

Thanks.

    Xah
    xah at xahlee.org
    http://xahlee.org/


On 2006 May 4, at 6:21 AM, Jay Parlar wrote:


On May 4, 2006, at 6:00 AM, Talin wrote:

> xahlee <xah <at> xahlee.org> writes:
>
>> Today i ran into one of Guido van Rossum's blog article titled
>> "Language Design Is Not Just Solving Puzzles" at
>> http://www.artima.com/weblogs/viewpost.jsp?thread=147358
>
> The confrontational tone of this post makes it pretty much impossible
> to have a reasonable debate on the subject. I'd suggest that if you
> really want to be heard (instead of merely having that "I'm right"
> feeling) that you try a different approach.
>
> -- Talin

Xah Lee is a well known troll, he does stuff like this on c.l.p. all
the time. Best to just ignore him, he doesn't listen to reason.

Jay P.

From p.f.moore at gmail.com  Thu May  4 16:18:29 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 4 May 2006 15:18:29 +0100
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445A07D6.403@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445A017D.4080207@ofai.at>
	<445A07D6.403@gmail.com>
Message-ID: <79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>

On 5/4/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> My inclination was to have a PlatformPath subclass that accepted 'os', 'sep'
> and 'extsep' keyword arguments to the constructor, and provided the
> appropriate 'sep' and 'extsep' attributes (supplying 'os' would just be a
> shortcut to avoid specifying the separators explicitly).
>
> That way the main class can avoid being complicated by the relatively rare
> need to operate on another platform's paths, while still supporting the ability.

You ought to have predefined classes for the standard OSes. Expecting
people to know the values for sep and extsep seems unhelpful.

In fact, unless you expect to support the os.path interface forever,
as well as the new interface,  I'd assume there would have to be
internal WindowsPath and PosixPath classes anyway - much like the
current ntpath and posixpath modules. So keeping that structure, and
simply having

if os.name == 'posix':
    Path = PosixPath
elif os.name == 'nt':
    Path = WindowsPath
... etc

at the end, would seem simplest.
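
For comparison, the pathlib module that later entered the standard library
has much the same shape: per-OS classes plus a selection on os.name. A
small sketch using only its pure (non-filesystem) classes; the NativePath
name is made up here.

```python
import os
from pathlib import PurePath, PurePosixPath, PureWindowsPath

# Explicit per-OS classes work on any host platform.
posix = PurePosixPath("usr", "lib", "python")
windows = PureWindowsPath("C:/", "Python", "Lib")
print(posix)    # usr/lib/python
print(windows)  # C:\Python\Lib

# The same selection as in the snippet above, spelled by hand.
NativePath = PureWindowsPath if os.name == "nt" else PurePosixPath

# PurePath() performs this selection itself when instantiated.
native = PurePath("some", "dir")
print(type(native) is NativePath)
```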

(But all the current proposals seem to build on os.path, so maybe I
should assume otherwise, that os.path will remain indefinitely...)

Paul.

From sanxiyn at gmail.com  Tue May  2 09:55:24 2006
From: sanxiyn at gmail.com (Sanghyeon Seo)
Date: Tue, 2 May 2006 16:55:24 +0900
Subject: [Python-Dev] Time since the epoch
Message-ID: <5b0248170605020055g2d6747baya099b967c9ae83fc@mail.gmail.com>

Hello,

Python library reference 6.11 says, "The epoch is the point where the
time starts. On January 1st of that year, at 0 hours, the "time since
the epoch'' is zero. For Unix, the epoch is 1970."

To me this seems to suggest that the epoch may vary among platforms
and implementations as long as it's consistent. Am I correct?

For example, does it make sense to file bug reports to Python projects
assuming that time.time() returns seconds since the Unix epoch?

I am asking because currently Python and IronPython return very
different values for time.time() even if they run on the same computer
and at the same time.

Seo Sanghyeon

From fredrik at pythonware.com  Thu May  4 16:21:16 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 4 May 2006 16:21:16 +0200
Subject: [Python-Dev] Python long command line options
References: <200605040829.53140.me+python-dev@modelnine.org><e3d0b2$bu7$1@sea.gmane.org>
	<200605041551.22595.me+python-dev@modelnine.org>
Message-ID: <e3d2ks$ldf$1@sea.gmane.org>

Heiko Wundram wrote:

>> Personally, I'm +1, but wonder if it would be enough to support '--help'
>> and '--version'. We then could cut out the "best guess" code and the code
>> to enable --opt=value.
>
> The code for the above functionality is about 10-12 lines of C of the whole
> patch.
>
> If there's consensus to remove it, I'll gladly prepare a patch that doesn't
> contain this, but I don't think it's sensible to do so just because it makes
> the code shorter.

> But, the "best guess" functionality most certainly is  debatable on other grounds.

exactly.  as the zen says, "explicit is better than I know what you mean, even if you
don't mean what I think you mean".

I'm +1 on adding --help and --version, +1 on adding -? and /? for windows only,
-0=slightly sceptical if adding a generic long option machinery is worth it, and -1
on a guessing machinery.

</F> 




From thomas at python.org  Thu May  4 16:45:55 2006
From: thomas at python.org (Thomas Wouters)
Date: Thu, 4 May 2006 16:45:55 +0200
Subject: [Python-Dev] Time since the epoch
In-Reply-To: <5b0248170605020055g2d6747baya099b967c9ae83fc@mail.gmail.com>
References: <5b0248170605020055g2d6747baya099b967c9ae83fc@mail.gmail.com>
Message-ID: <9e804ac0605040745ia9582a7l3f1bd4d2a0ce077@mail.gmail.com>

On 5/2/06, Sanghyeon Seo <sanxiyn at gmail.com> wrote:
>
> Hello,
>
> Python library reference 6.11 says, "The epoch is the point where the
> time starts. On January 1st of that year, at 0 hours, the "time since
> the epoch'' is zero. For Unix, the epoch is 1970."
>
> To me this seems to suggest that the epoch may vary among platforms
> and implementations as long as it's consistent. Am I correct?


Yes. (I believe the C standard doesn't specify the 'epoch', just POSIX does
-- for C. Regardless of that, Python's 'epoch' isn't guaranteed anywhere,
and the docs you quote are probably the most authoritative docs on the
question.)

> For example, does it make sense to file bug reports to Python projects
> assuming that time.time() returns seconds since the Unix epoch?


I would say so, yes.

> I am asking because currently Python and IronPython return very
> different values for time.time() even if they run on the same computer
> and at the same time.


As long as the value returned by time.time() is still 'seconds', and
everything in the time and datetime modules properly considers the same
epoch, I would say it's a bug (for third-party modules or for other parts of
the standard library) to break.
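
One way to write code that does not depend on a particular epoch is to ask
the interpreter what its epoch is, via time.gmtime(0). A minimal sketch;
the EPOCH and to_utc_datetime names are made up for the illustration.

```python
import time
from datetime import datetime, timedelta

# The epoch as this interpreter sees it, expressed in UTC.
EPOCH = datetime(*time.gmtime(0)[:6])

def to_utc_datetime(timestamp):
    """Convert a time.time()-style value without assuming 1970."""
    return EPOCH + timedelta(seconds=timestamp)

print(EPOCH)
print(to_utc_datetime(time.time()))
```

As long as time.time() and time.gmtime() agree on the epoch within one
implementation, this gives the same wall-clock result whatever that epoch
is.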

--
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From stefan.rank at ofai.at  Thu May  4 17:22:34 2006
From: stefan.rank at ofai.at (Stefan Rank)
Date: Thu, 04 May 2006 17:22:34 +0200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>	<4459FA21.6030508@gmail.com>
	<445A017D.4080207@ofai.at>	<445A07D6.403@gmail.com>
	<79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
Message-ID: <445A1C3A.9040309@ofai.at>

on 04.05.2006 16:18 Paul Moore said the following:
> On 5/4/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> My inclination was to have a PlatformPath subclass that accepted 'os', 'sep'
>> and 'extsep' keyword arguments to the constructor, and provided the
>> appropriate 'sep' and 'extsep' attributes (supplying 'os' would just be a
>> shortcut to avoid specifying the separators explicitly).
>>
>> That way the main class can avoid being complicated by the relatively rare
>> need to operate on another platform's paths, while still supporting the ability.
> 
> You ought to have predefined classes for the standard OSes. Expecting
> people to know the values for sep and extsep seems unhelpful.
> 

I assume that having subclasses (at least for the different os 
filesystem paths) is not necessary.
The sep and extsep that needs to be used is determined by the root of 
the path.

The root, I suppose, is set when constructing the path from a string.
The ambiguous cases are relative paths and `p = path()`.

(Also think of the proposed URL root which normally would mandate '/' as 
sep. Actually, the format depends on the 'scheme' part of the URL...)

On the output side ( `str(pathinstance)` ) the format is determined by 
the root.
At least if you ignore people who want to have 
C:/this/style/of/acceptable/windows/path

When constructing a relative path, I suggest creating a os dependent one 
(root==None) by default, even if you use::

   p = path('./unix/relative/style')

on Windows.
Daring people can later use::

   p.root = path.WINDOWSRELATIVE    # or
   p.root = path.UNIXRELATIVE

if they need to.
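
A rough sketch of how the root could carry the separator. All of it is
hypothetical: the Root helper, the elements attribute, and the choice to
accept both '/' and '\' as separators on input are assumptions made for
the illustration.

```python
import os

class Root:
    """Hypothetical root marker that carries the separator to use."""

    def __init__(self, name, sep):
        self.name = name
        self.sep = sep

class path:
    WINDOWSRELATIVE = Root("windows-relative", "\\")
    UNIXRELATIVE = Root("unix-relative", "/")

    def __init__(self, text=""):
        # Accept either separator on input. Relative paths start with
        # root None, meaning "use the running OS's separator".
        self.elements = [e for e in text.replace("\\", "/").split("/") if e]
        self.root = None

    def __str__(self):
        sep = os.sep if self.root is None else self.root.sep
        return sep.join(self.elements)

p = path("./unix/relative/style")
p.root = path.WINDOWSRELATIVE
windows_form = str(p)
p.root = path.UNIXRELATIVE
unix_form = str(p)
print(windows_form)  # .\unix\relative\style
print(unix_form)     # ./unix/relative/style
```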

stefan


From g.brandl at gmx.net  Thu May  4 17:31:16 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 04 May 2006 17:31:16 +0200
Subject: [Python-Dev] lambda in Python
In-Reply-To: <593B03DB-0FE3-49DC-ADD1-EF4BD7B05884@xahlee.org>
References: <mailman.21.1146736804.29262.python-dev@python.org>	<7bf78738c8226235c5e4290d41a8922f@cogeco.ca>
	<593B03DB-0FE3-49DC-ADD1-EF4BD7B05884@xahlee.org>
Message-ID: <e3d6o4$5np$1@sea.gmane.org>

xahlee wrote:
> I do not wish to be the subject of mobbing here.
> 
> If you have opinions on what i wrote, respond to the subject on topic  
> as with any discussion. Please do not start a gazillion war-cry on me.
> 
> If you cannot tolerate the way i express my opinions, at this moment  
> write a polite request to me and cc to the sys op of this forum. If  
> the sysop deems fit, I'd be happy to be asked to leave by the sysop.

There's no sysop needed to tell you this: This is a mailing list and
newsgroup (no "forum") devoted to the development *of* *Python*.
Obviously, we all like Python the way it is and people who disagree
(especially those who accuse the BDFL of being ignorant) don't belong
here.

Georg


From vys at renet.ru  Thu May  4 18:29:58 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Thu, 04 May 2006 20:29:58 +0400
Subject: [Python-Dev] binary trees. Review obmalloc.c
In-Reply-To: <1f7befae0604282117r61b3e1ewf451542aa4cb7ac0@mail.gmail.com>
References: <20060424114024.66D0.JCARLSON@uci.edu> <444F1E36.30302@renet.ru>	
	<20060426094148.66EE.JCARLSON@uci.edu>
	<44508866.60503@renet.ru>	 <445294E8.1080401@v.loewis.de>
	<1f7befae0604282117r61b3e1ewf451542aa4cb7ac0@mail.gmail.com>
Message-ID: <445A2C06.4090606@renet.ru>

Tim Peters wrote:
> [Vladimir 'Yu' Stepanov]
>>> * To adapt allocation of blocks of memory with other alignment. Now
>>> alignment is rigidly set on 8 bytes. As a variant, it is possible to
>>> use alignment on 4 bytes. And this value can be set at start of the
>>> interpreter through arguments/variable environments/etc. At testing
>>> with alignment on 4 or 8 bytes difference in speed of work was not
>>> appreciable.
>
> [Martin v. Löwis]
>> That depends on the hardware you use, of course. Some architectures
>> absolutely cannot stand mis-aligned accesses, and programs will just
>> crash if they try to perform such accesses.
>
> Note that we _had_ to add a goofy "long double" to the PyGC_Head union
> because some Python platform required 8-byte alignment for some types
> ... see rev 25454.  Any spelling of malloc() also needs to return
> memory aligned for any legitimate purpose, and 8-byte alignment is the
> strictest requirement we know of across current Python platforms.
Is it possible to get a reference to these tests? I have run some tests
that show a very small difference in speed between alignment on 4 and 8
bytes (8% at most, and mostly less than one percent).

The tests used a sequential scan of elements in a single block of memory.
I understand that the test still needs to be run with randomly chosen
addresses, to minimize the effect of the processor cache. And certainly
not all platforms will show the same result (I shall try not to forget to
attach the source code of the test and its output :) ).

On the other hand, reducing the memory size of each individual object
can increase speed by the same percentage that is lost by changing the
alignment from 8 bytes to 4 :) Besides, it would probably become possible
to drop the custom memory allocation layers used by some objects,
starting with int.


>> So Python should err on the safe side, and only use a smaller alignment
>> when it is known not to hurt.
>>
>> OTOH, I don't see the *advantage* in reducing the alignment.
>
> It could cut wasted bytes.  There is no per-object memory overhead in
> a release-build obmalloc:  the allocatable part of a pool is entirely
> used for user-visible object memory, except when the alignment
> requirement ends up forcing unused (by both the user and by obmalloc)
> pad bytes.  For example, Python ints consume 12 bytes on 32-bit boxes,
> but if space were allocated for them by obmalloc (it's not), obmalloc
> would have to use 16 bytes per int to preserve 8-byte alignment.
Good note.

> OTOH, obmalloc (unlike PyGC_Head) has always used 8-byte alignment,
> because that's what worked best for Vladimir Marangozov during his
> extensive timing tests.  It's not an isolated decision -- e.g., it's
> also influenced by, and influences, "the best" pool size, and (of
> course) cutting alignment to 4 would double the number of "size
> classes" obmalloc has to juggle.
Yes, the maximum number of buffers will indeed be doubled. But that
should not affect either the performance of the system or the total amount
of memory consumed. The change in memory fragmentation cannot be
calculated, since it will depend on the particular application.
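
The figures quoted above (a 12-byte int padded to 16 bytes, and twice as
many size classes at 4-byte alignment) follow from simple rounding. A
small sketch of the arithmetic; MAX_SMALL_REQUEST and the helper names are
illustrative and not taken from obmalloc.c.

```python
MAX_SMALL_REQUEST = 256  # largest request size served, per the thread

def round_up(size, alignment):
    """Smallest multiple of alignment that is >= size."""
    return (size + alignment - 1) // alignment * alignment

def size_class_count(alignment):
    """Number of distinct block sizes up to the maximum request size."""
    return MAX_SMALL_REQUEST // alignment

for alignment in (8, 4):
    print(alignment, round_up(12, alignment), size_class_count(alignment))
```

With 8-byte alignment a 12-byte request occupies 16 bytes and there are 32
size classes; with 4-byte alignment it occupies 12 bytes and there are 64.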

>>> * To expand the maximal size which can be allocated by means of the
>>> given module. Now the maximal size is 256 bytes.
>
>> Right. This is somewhat deliberate, though; the expectation is that
>> fragmentation will increase dramatically if even larger size classes
>> are supported.
>
> It's entirely deliberate ;-)  obmalloc is in no way trying to be a
> general-purpose allocator.  It was designed specifically for the
> common memory patterns in Python, aiming at large numbers of small
> objects of the same size.  That's extremely common in Python, and
> allocations larger than 256 bytes are not. The maximum size was
> actually 64 bytes at first.    After that, dicts grew an embedded
> 8-element table, which vastly increased the size of the dict struct. 
> obmalloc's limit was boosted to 256 then, although it could have
> stopped at the size of a dict (in the rough ballpark of 150 bytes). 
> There was no reason (then or now) to go beyond 256.
See below.
>>> * At the size of the allocated memory close to maximal, the
>>> allocation of blocks becomes inefficient. For example, for the
>>> sizes of blocks 248 and 256 (blocksize), values of quantity of
>>> elements (PAGESIZE/blocksize) it is equal 16. I.e. it would be
>>> possible to use for the sizes of the block 248 same page, as for the
>>> size of the block 256.
>
>> Good idea; that shouldn't be too difficult to implement: for sizes above
>> 215, pools need to be spaced apart only at 16 bytes.
>
> I'd rather drop the limit to 215 then <0.3 wink>.  Allocations that
> large probably still aren't important to obmalloc, but it is important
> that finding a requested allocation's "size index" be as cheap as
> possible.  Uniformity helps that.
The existing memory allocation module is indeed not suited to general
purposes. And speed can fluctuate because the existing implementation of
memory allocation is non-uniform. In some cases memory allocation in
`obmalloc.c' works as a wrapper around the standard malloc/free, which
does not improve speed :) .

For example, let us assume that two blocks of memory are allocated. One is
allocated by `obmalloc.c'. The other is allocated by the standard malloc.
Both blocks are small enough to be handled by similar methods. On a number
of OSes malloc will be slower for multi-threaded programs, because it
takes a global lock on memory allocation. Certainly this does not apply to
all operating systems.
> On modern boxes, it may be desirable to boost POOL_SIZE -- nobody has
> run timing experiments as comprehensive as Vladimir ran way back when,
> and there's no particular reason to believe that the same set of
> tradeoffs would look best on current architectures.
As for speed, it will be no worse than in the existing memory allocation
systems. Naturally, there are no pluses without some minuses. In my
opinion the existing minuses can be made insignificant. The Python
language has clearly pronounced specifics in how it allocates memory, and
they can be put to good use by developing a memory allocation system
designed directly for it.

The properties inherent in Python:
* In multi-threaded mode the GIL is already held, and it can also be
  relied on when allocating memory.
* For objects that are used often, it is better to fit the real size
  instead of log2(size(ob)), because realloc is called extremely seldom
  or even less often :) .
* Objects whose allocated size is a multiple of 512, 1024 or 2048 are
  extremely rare, since the Python object header and the parameters
  describing the object's properties are placed in front of the data.

The first two items also apply to the present `obmalloc.c', but only for
a narrow range of allocated block sizes.

Some interpreters use their own memory allocation system. Tuning the
memory management system is not the least important step toward speeding
up Python.

Thanks to everyone who has answered. The topic remains interesting.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: aligntest.c
Url: http://mail.python.org/pipermail/python-dev/attachments/20060504/8a7a928e/attachment.asc 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: out-ia32.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20060504/8a7a928e/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: out-amd64.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20060504/8a7a928e/attachment-0001.txt 

From zbir at urbanape.com  Thu May  4 18:03:39 2006
From: zbir at urbanape.com (Zachery Bir)
Date: Thu, 4 May 2006 12:03:39 -0400
Subject: [Python-Dev] lambda in Python
In-Reply-To: <e3d6o4$5np$1@sea.gmane.org>
References: <mailman.21.1146736804.29262.python-dev@python.org>	<7bf78738c8226235c5e4290d41a8922f@cogeco.ca>
	<593B03DB-0FE3-49DC-ADD1-EF4BD7B05884@xahlee.org>
	<e3d6o4$5np$1@sea.gmane.org>
Message-ID: <A7BCACD1-8B02-4F86-923C-9D632508F4A1@urbanape.com>

On May 4, 2006, at 11:31 AM, Georg Brandl wrote:

> Obviously, we all like Python the way it is and people who disagree
> (especially those who accuse the BDFL of being ignorant) don't belong
> here.

If that were really the case, there wouldn't be much point to having  
a Python-Dev list, now would there?

Zac


From gjcarneiro at gmail.com  Thu May  4 18:36:19 2006
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Thu, 4 May 2006 17:36:19 +0100
Subject: [Python-Dev] Shared libs on Linux (Was: test failures in
	test_ctypes (HEAD))
In-Reply-To: <4458FC3C.1040407@python.net>
References: <ca471dc20605021029r44467206t7af7b7dd637bded8@mail.gmail.com>
	<4458FC3C.1040407@python.net>
Message-ID: <a467ca4f0605040936x4125a207r7833b47b3342355c@mail.gmail.com>

On 5/3/06, Thomas Heller <theller at python.net> wrote:
>
> [Crossposting to both python-dev and ctypes-users, please respond to the list
> that seems most appropriate]
>
> Guido van Rossum wrote:
> >   File "/home/guido/projects/python/trunk/Lib/ctypes/__init__.py",
> > line 288, in __init__
> >     self._handle = _dlopen(self._name, mode)
> > OSError: /usr/lib/libglut.so.3: undefined symbol: XGetExtensionVersion


It means that /usr/lib/libglut.so.3 is not explicitly linking to xlibs, but
it should.  Nothing wrong with python, it's the GL library that was not
correctly linked.

From guido at python.org  Thu May  4 18:37:02 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 4 May 2006 09:37:02 -0700
Subject: [Python-Dev] lambda in Python
In-Reply-To: <A7BCACD1-8B02-4F86-923C-9D632508F4A1@urbanape.com>
References: <mailman.21.1146736804.29262.python-dev@python.org>
	<7bf78738c8226235c5e4290d41a8922f@cogeco.ca>
	<593B03DB-0FE3-49DC-ADD1-EF4BD7B05884@xahlee.org>
	<e3d6o4$5np$1@sea.gmane.org>
	<A7BCACD1-8B02-4F86-923C-9D632508F4A1@urbanape.com>
Message-ID: <ca471dc20605040937r131328f3q30a423e104f7598@mail.gmail.com>

Reminder: the best way to get rid of a troll is to ignore him.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu May  4 18:47:47 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 4 May 2006 09:47:47 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <4459FA21.6030508@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com>
Message-ID: <ca471dc20605040947g3967d2as7258c3faaf63d9e7@mail.gmail.com>

On 5/4/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Guido has indicated strong dissatisfaction with the idea of subclassing
> str/unicode with respect to PEP 355.

That's not my only problem with that PEP; I'm not at all convinced
that an object is a better solution than a module for this particular
use case.

FWIW, subclassing tuple isn't any better in my book than subclassing
str/unicode.

My only other comment on this discussion: I don't really have time for
it; but don't let that stop you all from having a jolly good time. If
you need me to pronounce on something, put me (and python-dev) in the
To: header and limit your post to 20 lines.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gustavo at niemeyer.net  Thu May  4 18:47:52 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Thu, 4 May 2006 13:47:52 -0300
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
Message-ID: <20060504164751.GA13442@localhost.localdomain>

> The biggest conceptual change is that my path object is a subclass of
> ''tuple'', not a subclass of str. For example,

Using tuples is a nice idea!  Object paths using tuples is a somewhat
common concept.  I don't see an immediate reason to be a *subclass*
of tuple, though, as opposed to using it as an internal representation.

> {{{
> p.normpath()  -> Isn't needed - done by the constructor
> p.basename()  -> p[-1]
> p.splitpath() -> (p[:-1], p[-1])
> p.splitunc()  -> (p[0], p[1:]) (if isinstance(p[0], path.UNCRoot))
> p.splitall()  -> Isn't needed
> p.parent      -> p[:-1]
> p.name        -> p[-1]
> p.drive       -> p[0] (if isinstance(p[0], path.Drive))
> p.uncshare    -> p[0] (if isinstance(p[0], path.UNCRoot))
> 
> and of course:
> p.join(q) [or anything like it] -> p + q
> }}}

Nice indeed.
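
For illustration only, here is a minimal sketch of a path that keeps a
tuple as its internal representation instead of subclassing it.  The
class name TuplePath and its whole interface are hypothetical and are
not taken from the proposal's code; only the indexing and concatenation
behaviour mirrors the table quoted above.

```python
class TuplePath(object):
    """Hypothetical path that stores its elements in a tuple internally."""

    def __init__(self, *parts):
        self._parts = tuple(parts)

    def __getitem__(self, index):
        # A slice gives back a path; a single index gives back one element.
        if isinstance(index, slice):
            return TuplePath(*self._parts[index])
        return self._parts[index]

    def __add__(self, other):
        # p + q plays the role of p.join(q).
        return TuplePath(*(self._parts + other._parts))

    def __eq__(self, other):
        return isinstance(other, TuplePath) and self._parts == other._parts

    def __hash__(self):
        return hash(self._parts)

    def __repr__(self):
        return "TuplePath%r" % (self._parts,)


p = TuplePath("usr", "lib", "python")
name = p[-1]        # what p.basename() or p.name would return
parent = p[:-1]     # what p.parent would return
joined = parent + TuplePath("site-packages")
```

Nothing here depends on inheriting from tuple; the tuple is an
implementation detail that could be swapped out later.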

> The only drawback I can see in using a logical representation is that
> giving a path object to functions which expect a path string won't
> work. The immediate solution is to simply use str(p) instead of p. The
> long-term solution is to make all related functions accept a path
> object.

Let's use __path_.. I mean, generic functions! ;-)

(...)
>  * A ''relative path'' is a path without a root element, so it can be
> concatenated to other paths.
>  * An ''absolute path'' is a path whose meaning doesn't change when
> the current working directory changes.

That sounds right.

> == Easier attributes for stat objects ==
> 
> The current path object includes:
>  * isdir, isfile, islink, and -
>  * atime, mtime, ctime, size.
(...)

These indeed shouldn't be attributes of path, IMO, since they are
operations that depend on the state of the filesystem, and will
change behind your back.

> I think that isfile, isdir should be kept (along with lisfile,
> lisdir), since I think that doing what they do is quite common, and
> requires six lines:

I don't agree. "isdir" is an attribute of the filesystem, not of
the path object. I'd never expect that e.g. a network operation
could result from accessing an attribute in Python, and that could
certainly happen if the path is referencing a network filesystem.

-- 
Gustavo Niemeyer
http://niemeyer.net

From guido at python.org  Thu May  4 18:49:32 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 4 May 2006 09:49:32 -0700
Subject: [Python-Dev] Python long command line options
In-Reply-To: <e3d2ks$ldf$1@sea.gmane.org>
References: <200605040829.53140.me+python-dev@modelnine.org>
	<e3d0b2$bu7$1@sea.gmane.org>
	<200605041551.22595.me+python-dev@modelnine.org>
	<e3d2ks$ldf$1@sea.gmane.org>
Message-ID: <ca471dc20605040949i2f8eca46wfebfae7a9a35df53@mail.gmail.com>

On 5/4/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> I'm +1 on adding --help and --version, +1 on adding -? and /? for windows only,
> -0=slightly sceptical if adding a generic long option machinery is worth it, and -1
> on a guessing machinery.

I fully support Fredrik's position on this issue.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gustavo at niemeyer.net  Thu May  4 18:53:57 2006
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Thu, 4 May 2006 13:53:57 -0300
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445A017D.4080207@ofai.at>
	<445A07D6.403@gmail.com>
	<79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
Message-ID: <20060504165357.GB13442@localhost.localdomain>

> You ought to have predefined classes for the standard OSes. Expecting
> people to know the values for sep and extsep seems unhelpful.
(...)

Why not something as simple as having path.sep == None mean the
default for the platform, and a defined value otherwise?
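
Roughly what I have in mind, as a sketch only (the class name Flavour
and the method effective_sep() are made up for this example):

```python
import os


class Flavour(object):
    """Hypothetical holder for separator settings."""

    def __init__(self, sep=None):
        # None means "use whatever the running platform uses".
        self.sep = sep

    def effective_sep(self):
        # Resolve the None placeholder only when a separator is needed.
        return os.sep if self.sep is None else self.sep


native = Flavour()
windows_style = Flavour(sep="\\")
```

With this, nobody has to know the separator values unless they want
something other than the platform default.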

-- 
Gustavo Niemeyer
http://niemeyer.net

From martin at v.loewis.de  Thu May  4 19:52:23 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 04 May 2006 19:52:23 +0200
Subject: [Python-Dev] Python long command line options
In-Reply-To: <e3d2ks$ldf$1@sea.gmane.org>
References: <200605040829.53140.me+python-dev@modelnine.org><e3d0b2$bu7$1@sea.gmane.org>	<200605041551.22595.me+python-dev@modelnine.org>
	<e3d2ks$ldf$1@sea.gmane.org>
Message-ID: <445A3F57.9020002@v.loewis.de>

Fredrik Lundh wrote:
> I'm +1 on adding --help and --version, +1 on adding -? and /? for windows only,
> -0=slightly sceptical if adding a generic long option machinery is worth it, and -1
> on a guessing machinery.

I also agree with all these judgments.

Regards,
Martin

From tjreedy at udel.edu  Thu May  4 20:01:24 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 4 May 2006 14:01:24 -0400
Subject: [Python-Dev] Alternative path suggestion
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
Message-ID: <e3dfhi$89o$1@sea.gmane.org>


"Mike Orr" <sluggoster at gmail.com> wrote in message 
news:6e9196d20605040113x2a2d563ah23fabf02c49a0778 at mail.gmail.com...
> Intriguing idea, Noam, and excellent thinking.  I'd say it's worth a
> separate PEP.

I am glad to see this idea resurface and developed.  So +1 on an alternate 
PEP.

> The main difficulty with this approach is it's so radical.

Only locally.  Last July, I wrote the following for clp:

http://article.gmane.org/gmane.comp.python.general/412997
Subject: Re: PEP on path module for standard library
Date: 2005-07-22 18:47:00 GMT
"
 Very interesting.  Java's File class is a system-independent "abstract
representation of file and directory pathnames" which is constructed from 
and converted to system-dependent string-form pathnames (including URL/URI 
file:... forms).  A File consist of an optional prefix and a *sequence* of 
zero or more string names.

In other words, Java's File class is what Duncan and I thought Python's
Path class might or possibly should be.  So this internal representation
might be worth considering as an option.  Of course, if the interface is
done right, it mostly should not make too much difference to the user.
"
I believe a previous post gave the reference to the File doc, which might 
be worth consulting.  As I remember, the general response at that time was 
that the existing string-based path class was tried, useful, and well 
accepted, so why try anything more different.  That seems to have become 
clearer with time.

Terry Jan Reedy




From tim.peters at gmail.com  Thu May  4 20:08:49 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 4 May 2006 14:08:49 -0400
Subject: [Python-Dev] [Python-checkins] r45898 - in python/trunk:
	Lib/test/test_os.py Lib/test/test_shutil.py Misc/NEWS
	Modules/posixmodule.c
In-Reply-To: <445A3E4A.8060402@v.loewis.de>
References: <20060504100844.5C1691E4007@bag.python.org>
	<ca471dc20605041000j6898b011ya6ee4417f8e4433e@mail.gmail.com>
	<445A3E4A.8060402@v.loewis.de>
Message-ID: <1f7befae0605041108j65d7f6f4jc320282696b6893f@mail.gmail.com>

[Guido]
>> I wonder if it's time to move the Win32 code out of posixmodule.c? It
>> seems the current mess of #ifdefs can't be very maintainable.

[Martin v. Löwis]
> I vaguely recall that we had considered that before, and rejected it,
> for some reason. Not sure what the reason was, but one might have been
> that there still is OS/2 code in there, also.

Martin worked up a patch for this in 2002, which he withdrew based on
mixed reviews (a +0.5 and two -0); the comments still seem relevant:

    http://www.python.org/sf/592529

From rasky at develer.com  Thu May  4 21:01:01 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 4 May 2006 21:01:01 +0200
Subject: [Python-Dev] [Python-checkins] r45898 - in
	python/trunk:Lib/test/test_os.py Lib/test/test_shutil.py
	Misc/NEWSModules/posixmodule.c
References: <20060504100844.5C1691E4007@bag.python.org><ca471dc20605041000j6898b011ya6ee4417f8e4433e@mail.gmail.com><445A3E4A.8060402@v.loewis.de>
	<1f7befae0605041108j65d7f6f4jc320282696b6893f@mail.gmail.com>
Message-ID: <04d901c66fad$15edf720$bf03030a@trilan>

Tim Peters wrote:

> [Martin v. Löwis]
>> I vaguely recall that we had considered that before, and rejected it,
>> for some reason. Not sure what the reason was, but one might have
>> been that there still is OS/2 code in there, also.
>
> Martin worked up a patch for this in 2002, which he withdrew based on
> mixed reviews (a +0.5 and two -0); the comments still seem relevant:
>
>     http://www.python.org/sf/592529

I think another way of doing it would be to keep the single module
posixmodule (maybe appropriately renamed) and then internally dispatch
function implementations where possible. If the dispatching is done
correctly (through a fixed size virtual table for all the platforms),
BuildBot should be able to tell you quite fast if you forgot something.

At least, this setup would fix the docstring and the largefile issues raised
in your comment in that bug.
-- 
Giovanni Bajo


From amk at amk.ca  Thu May  4 22:22:32 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Thu, 4 May 2006 16:22:32 -0400
Subject: [Python-Dev] Confirmed: DC-area sprint on Sat. June 3rd
Message-ID: <20060504202232.GA31662@rogue.amk.ca>

The DC-area sprint is now confirmed.  It'll be on Saturday June 3,
from 10 AM to 5 PM at the Arlington Career Center in Arlington VA.

I've created a wiki page at
<http://wiki.python.org/moin/ArlingtonSprint>; please add your name if
you'll be coming.  The wiki page can also be used to brainstorm about
tasks to work on.  

While this started out as a Python core sprint, there's no problem if
people want to come and work on something other than the Python core
(space permitting).  I'm still waiting to get an upper limit on
attendance, and will add that figure to the wiki page once I get it.

--amk


From me+python-dev at modelnine.org  Thu May  4 21:25:49 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 4 May 2006 21:25:49 +0200
Subject: [Python-Dev] Python long command line options
In-Reply-To: <e3d2ks$ldf$1@sea.gmane.org>
References: <200605040829.53140.me+python-dev@modelnine.org>
	<200605041551.22595.me+python-dev@modelnine.org>
	<e3d2ks$ldf$1@sea.gmane.org>
Message-ID: <200605042125.49602.me+python-dev@modelnine.org>

Am Donnerstag 04 Mai 2006 16:21 schrieb Fredrik Lundh:
> I'm +1 on adding --help and --version, +1 on adding -? and /? for windows
> only, -0=slightly sceptical if adding a generic long option machinery is
> worth it, and -1 on a guessing machinery.

I've updated the patch on the SourceForge tracker to reflect this criticism. 
In effect:

1) --help and --version are added unconditionally on all platforms
2) -? and /? are recognized on Windows, as are /help and /version,
   because / is just a different longopt-specifier on Windows for the
   getopt machinery in _PyOS_GetOpt
3) I removed the prefix matching
4) I removed the last reference to a string function in _PyOS_GetOpt
5) --command, which was a test-case for me, has been removed as a long-opt
   again.
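
For those who don't want to read the C diff, the matching rules above
boil down to something like the following Python sketch.  This is not
the code from the patch; the function classify_option() and its
on_windows parameter are invented here purely to illustrate the rules.

```python
def classify_option(arg, on_windows=False):
    """Return 'help', 'version', or None for one command-line argument."""
    long_names = {"help": "help", "version": "version"}
    if arg.startswith("--"):
        # Exact match only: no prefix guessing such as --ver.
        return long_names.get(arg[2:])
    if on_windows:
        if arg in ("-?", "/?"):
            return "help"
        if arg.startswith("/"):
            # On Windows, / acts as another long-option prefix.
            return long_names.get(arg[1:])
    return None
```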

The module has undergone a complete rewrite with this patch, and I've adapted 
the module header to reflect this (because there's absolutely no legacy code 
left now, which can be seen pretty clearly when you look at a diff...). The 
usage output in Modules/main.c has also been adapted, as have the test cases 
in test_cmd_line.py for usage and version, which pass cleanly for me.

The patch doesn't produce any warnings on gcc 3.4.6, and I've tested the 
Windows-specific additions to _PyOS_GetOpt by injecting MS_WINDOWS into the 
compilation on my system for Python/getopt.c, and it produces correct results 
on /?, /version, /help, and -?, but as I said before I can't tell whether 
there are bugs left which surface when it's being compiled on Windows 
directly, as I don't have a Windows system to test this patch on.

Anyway, would be glad to hear any feedback.

--- Heiko.

From me+python-dev at modelnine.org  Thu May  4 22:12:09 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 4 May 2006 22:12:09 +0200
Subject: [Python-Dev] Python long command line options
In-Reply-To: <200605042125.49602.me+python-dev@modelnine.org>
References: <200605040829.53140.me+python-dev@modelnine.org>
	<e3d2ks$ldf$1@sea.gmane.org>
	<200605042125.49602.me+python-dev@modelnine.org>
Message-ID: <200605042212.10035.me+python-dev@modelnine.org>

Am Donnerstag 04 Mai 2006 21:25 schrieb Heiko Wundram:
> 2) -? and /? are recognized on Windows, as are /help and /version,
>    because / is just a different longopt-specifier on Windows for the
>    getopt machinery in _PyOS_GetOpt

Just on a side-note: I chose for '/' to be a long-opt identifier on Windows, 
because it is for pretty much any Microsoft programming tool out there, 
starting with the linker, ending with the compiler of MSVC++, amongst others.

I know that shell commands such as dir, etc. take / to be similar to - on *nix 
(as single character command line arguments), but I guess the former is more 
useful. It's no problem to change this behaviour, though.

--- Heiko.

From fredrik at pythonware.com  Thu May  4 23:20:09 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 4 May 2006 23:20:09 +0200
Subject: [Python-Dev] context guards, context entry values, context managers,
	context contexts
Message-ID: <e3dr6b$hof$1@sea.gmane.org>

fwiw, I just tested

    http://pyref.infogami.com/with

on a live audience, and most people seemed to grok the "context
guard" concept quite quickly.

not sure about the "context entry value" term, though.  anyone
has a better idea ?

</F>




From pje at telecommunity.com  Thu May  4 23:39:24 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 May 2006 17:39:24 -0400
Subject: [Python-Dev] context guards, context entry values,
 context  managers, context contexts
In-Reply-To: <e3dr6b$hof$1@sea.gmane.org>
Message-ID: <5.1.1.6.0.20060504173609.01e89930@mail.telecommunity.com>

At 11:20 PM 5/4/2006 +0200, Fredrik Lundh wrote:
>fwiw, I just tested
>
>     http://pyref.infogami.com/with
>
>on a live audience, and most people seemed to grok the "context
>guard" concept quite quickly.
>
>not sure about the "context entry value" term, though.  anyone
>has a better idea ?

"guarded value"?  That works for files, at least.


From ncoghlan at gmail.com  Thu May  4 23:43:45 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 05 May 2006 07:43:45 +1000
Subject: [Python-Dev] context guards, context entry values,
 context managers, context contexts
In-Reply-To: <e3dr6b$hof$1@sea.gmane.org>
References: <e3dr6b$hof$1@sea.gmane.org>
Message-ID: <445A7591.8040302@gmail.com>

Fredrik Lundh wrote:
> fwiw, I just tested
> 
>     http://pyref.infogami.com/with
> 
> on a live audience, and most people seemed to grok the "context
> guard" concept quite quickly.

Compared to the various other changes made lately to PEP 343, s/manager/guard/ 
would be a fairly straightforward one :)

I'm personally +1 on switching to 'guard' for the following reasons:
   - 'guarding' block entry and exit seems much more natural terminology to me 
than 'managing' entry and exit (since the block entry and exit is really still 
managed by the interpreter - the guard is just given the chance to intervene 
in both cases).
   - 'manager' is a pretty generic term ('wooly' as Greg put it), so it needs 
to be qualified a lot more often than 'guard' does.
   - .NET uses 'managed code' in relation to sandboxes (as I understand it), 
so I suspect 'context manager' would end up causing at least a bit of confusion in 
relation to IronPython

OTOH, I also think getting rid of __context__() has made the problems with the 
term context manager far less severe, so I'm not averse to the idea of leaving 
it alone, either.

> not sure about the "context entry value" term, though.  anyone
> has a better idea ?

The latest version of the PEP punts on this entirely - for most real use 
cases, you're going to be talking about a specific thing, anyway (such as the 
new decimal context returned by a decimal context manager).

"context entry value" really isn't that much better than "result of the 
__enter__ method", and the latter has the virtue of being explicit. . .
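
A small sketch of what that phrase refers to (the class name Guard and
the string it returns are made up for the example): the name bound by
"as" is whatever __enter__ returns, which need not be the object named
in the with statement itself.

```python
class Guard(object):
    """Hypothetical context manager that records entry and exit."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        # This return value is what "with ... as value" binds.
        return "entry value"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions


guard = Guard()
with guard as value:
    inside = list(guard.events)
```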

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Thu May  4 23:49:42 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 05 May 2006 07:49:42 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <ca471dc20605040947g3967d2as7258c3faaf63d9e7@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>	
	<4459FA21.6030508@gmail.com>
	<ca471dc20605040947g3967d2as7258c3faaf63d9e7@mail.gmail.com>
Message-ID: <445A76F6.4080101@gmail.com>

Guido van Rossum wrote:
> On 5/4/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Guido has indicated strong dissatisfaction with the idea of subclassing
>> str/unicode with respect to PEP 355.
> 
> That's not my only problem with that PEP; I'm not at all convinced
> that an object is a better solution than a module for this particular
> use case.

Object vs module isn't my objection to the standard library status quo, either 
- it's the fact that it's a choice of using one object vs 5 or 6 different 
modules ;)

However, I'm going to have to bow out of this discussion for now, too - it's 
interesting, but Python 2.5 is a lot closer than Python 2.6. . .

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From tdelaney at avaya.com  Fri May  5 00:46:52 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Fri, 5 May 2006 08:46:52 +1000
Subject: [Python-Dev] lambda in Python
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E6C5@au3010avexu1.global.avaya.com>

Guido van Rossum wrote:

> Reminder: the best way to get rid of a troll is to ignore him.

Indeed. Xah got past Spambayes for me because this time he posted on
Python-dev - I doubt that he'll get past Spambayes again by doing that
:)

Should we add an explicit rule to the Python-dev spam filter for Xah?
Based on his past history, I doubt we'll ever see anything useful from
him.

Tim Delaney

From greg.ewing at canterbury.ac.nz  Fri May  5 02:59:24 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 05 May 2006 12:59:24 +1200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
Message-ID: <445AA36C.9080700@canterbury.ac.nz>

Mike Orr wrote:

> The main difficulty with this approach is it's so radical.  It would
> require a serious champion to convince people it's as good as our
> tried-and-true strings.

Another thing you would need to do is implement it for
some of the less Unix-like path syntaxes, such as Classic
MacOS and VMS, to make sure that it's feasible to fit
them into your tuple-like format.


> The question is, does forcing people to use .stat() expose an
> implementation detail that should be hidden, and does it smell of
> Unixism?  Most people think a file *is* a regular file or a directory.
>  The fact that this is encoded in the file's permission bits -- which
> stat() examines -- is a quirk of Unix.

Permission bits aren't the only thing that stat() examines.
I don't see anything wrong with having stat() be the way to
get at whatever metadata there is about a file on a platform.
And having .isdir etc. attributes on the stat() result
abstracts whether they're obtained from the permission bits
or not.
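
Something along these lines, as a sketch (the wrapper class StatInfo is
hypothetical; os.stat() and the stat.S_IS* helpers are the existing
standard library calls):

```python
import os
import stat


class StatInfo(object):
    """Hypothetical wrapper exposing file-type checks on a stat() result."""

    def __init__(self, st):
        self._st = st

    @property
    def isdir(self):
        return stat.S_ISDIR(self._st.st_mode)

    @property
    def isfile(self):
        return stat.S_ISREG(self._st.st_mode)

    @property
    def size(self):
        return self._st.st_size


# One explicit filesystem access; the attributes then just read the result.
info = StatInfo(os.stat(os.curdir))
```

The caller sees info.isdir and never needs to know where the platform
keeps that piece of information.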

--
Greg

From greg.ewing at canterbury.ac.nz  Fri May  5 03:29:15 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 05 May 2006 13:29:15 +1200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <4459FA21.6030508@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com>
Message-ID: <445AAA6B.7030108@canterbury.ac.nz>

Nick Coghlan wrote:

> Similarly, I would separate out the extension to a distinct attribute, as it 
> too uses a different separator from the normal path elements ('.' most places, 
> but '/' on RISC OS, for example)

-1. What constitutes "the extension" is not well-defined in
all cases. What about filenames with multiple suffixes,
such as "spam.tar.gz"? What part of that would you put in
the extension attribute?
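
For comparison, the existing os.path.splitext() picks only the last
suffix, while splitting at the first dot gives a different answer
(the variable names are just for this example):

```python
import os.path

filename = "spam.tar.gz"

# The standard library treats only the final suffix as the extension.
root, ext = os.path.splitext(filename)

# Splitting at the first dot is an equally defensible reading.
stem, all_suffixes = filename.split(".", 1)
```

Here root is "spam.tar" and ext is ".gz", whereas stem is "spam" and
all_suffixes is "tar.gz".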

--
Greg


From greg.ewing at canterbury.ac.nz  Fri May  5 04:03:44 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 05 May 2006 14:03:44 +1200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e3dcj3$sdq$1@sea.gmane.org>
References: <4453B025.3080100@acm.org> <445476CC.3080008@v.loewis.de>
	<e33mj1$5nd$1@sea.gmane.org> <4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz> <e3dcj3$sdq$1@sea.gmane.org>
Message-ID: <445AB280.8060701@canterbury.ac.nz>

Terry Reedy wrote:
> The dispute is about the sensibility and 
> politeness of requiring a small fixed number of required, no-default args 
> to be passed by name only

There seems to be some confusion between two different
subthreads here. BJörn Lindqvist seemed to be saying that
instead of my suggested

   make_person(=name, =age, =phone, =location)

as a substitute for

   make_person(name=name, age=age, phone=phone, location=location)

it would be better to pass the arguments positionally. I
was pointing out that you can't do this when the thing you're
calling requires them to be passed by keyword.
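
A sketch of that situation, assuming the bare * marker that PEP 3102
proposes (the syntax Python 3 went on to adopt); the function body and
the sample values are invented for the example:

```python
def make_person(*, name, age, phone, location):
    """Hypothetical function whose arguments must all be passed by keyword."""
    return {"name": name, "age": age, "phone": phone, "location": location}


name, age, phone, location = "Ann", 30, "555-0100", "Oslo"

# Passing by keyword works.
person = make_person(name=name, age=age, phone=phone, location=location)

# Passing the same values positionally is rejected.
try:
    make_person(name, age, phone, location)
except TypeError:
    positional_call_failed = True
else:
    positional_call_failed = False
```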

--
Greg

From tim.peters at gmail.com  Fri May  5 07:16:15 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 5 May 2006 01:16:15 -0400
Subject: [Python-Dev] Python sprint mechanics
Message-ID: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>

There's going to be a Python sprint in Iceland later this month, and
it raises some management issues we also see on Bug Days, but more so:
 a relatively large number of people slinging code without commit
privileges, and a relative handful with commit privileges.  The latter
end up spending all their time reviewing and doing commits for the
former.

While that may be unavoidable for Bug Days, a major difference for the
sprint is that little of the code is likely _intended_ to end up on
the trunk at this time.  Instead it would make best sense for each
sprint project to work in its own branch, something SVN makes very
easy, but only for those who _can_ commit.  This year's PyCon core
sprint isn't a good model here, as everyone there did have commit
privs -- and that's unusual.

Since I hope we see a lot more of these problems in the future, what
can be done to ease the pain?  I don't know enough about SVN admin to
know what might be realistic.  Adding a pile of  "temporary
committers" comes to mind, but wouldn't really work since people will
want to keep working on their branches after the sprint ends.  Purely
local SVN setups wouldn't work either, since sprint projects will
generally have more than one worker bee, and they need to share code
changes.  There:  I think that's enough to prove I don't have a clue
:-)

From nnorwitz at gmail.com  Fri May  5 07:25:18 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 4 May 2006 22:25:18 -0700
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
Message-ID: <ee2a432c0605042225m6dc8369eu32661d24f5884bfd@mail.gmail.com>

On 5/4/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> Since I hope we see a lot more of these problems in the future, what

Me too.

> can be done to ease the pain?  I don't know enough about SVN admin to
> know what might be realistic.  Adding a pile of  "temporary
> committers" comes to mind, but wouldn't really work since people will
> want to keep working on their branches after the sprint ends.  Purely
> local SVN setups wouldn't work either, since sprint projects will
> generally have more than one worker bee, and they need to share code
> changes.  There:  I think that's enough to prove I don't have a clue

I guess that means it's my turn to let my ignorance shine yet again.  :-)

Could we take a snapshot of the SVN repo, hosted on python.org, open
it up for public commit during the period?

That seems easy to setup.  The difficulty would come in when we need
to merge back to the real SVN.

n

From martin at v.loewis.de  Fri May  5 07:37:15 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 05 May 2006 07:37:15 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
Message-ID: <445AE48B.1070508@v.loewis.de>

Tim Peters wrote:
> Since I hope we see a lot more of these problems in the future, what
> can be done to ease the pain?  I don't know enough about SVN admin to
> know what might be realistic.  Adding a pile of  "temporary
> committers" comes to mind, but wouldn't really work since people will
> want to keep working on their branches after the sprint ends.  Purely
> local SVN setups wouldn't work either, since sprint projects will
> generally have more than one worker bee, and they need to share code
> changes.

I think Fredrik Lundh points to svk at such occasions.

People usually get commit privileges when they have demonstrated that they
simultaneously meet a number of requirements, such as
- following style guides
- never committing anything without discussion which might cause
  strong opposition
- always committing tests and documentation along with the actual
  code changes
- licensing all of their own changes to the PSF
- ...more things I can't put into words right now.
Of course, many committers miss some of these rules some of
the time, but I hope they all feel guilty when doing so :-)

It might be reasonable to waive the "have demonstrated" part for
some group of people, i.e. give them commit privs after telling
them what the rules are. However, in this case, I'd really
like to see some "shepherd" who oversees this group of people.
gcc has a "commit-after-approval" privilege; if you have that,
you can only commit after somebody (a "maintainer") has approved
a change. Approvals are 'on file' in a mailing list archive.

I could imagine such a model for sprint participants, if
there would be somebody to volunteer as a shepherd. That person
would have to track the status of the individual projects,
and either revoke commit privs when the project is eventually
completed, or grant the contributors permanent (without-approval)
commit privs if he considers this appropriate.

Regards,
Martin

From anthony at interlink.com.au  Fri May  5 07:54:17 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 5 May 2006 15:54:17 +1000
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
Message-ID: <200605051554.23062.anthony@interlink.com.au>

On Friday 05 May 2006 15:16, Tim Peters wrote:
> While that may be unavoidable for Bug Days, a major difference for
> the sprint is that little of the code is likely _intended_ to end
> up on the trunk at this time. 

Given where we are in the 2.5 release cycle, I'd _hope_ that only safe 
changes are considered for landing on the trunk. Massive refactorings 
really don't fill me with happy thoughts.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From sluggoster at gmail.com  Fri May  5 09:05:49 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Fri, 5 May 2006 00:05:49 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445AAA6B.7030108@canterbury.ac.nz>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445AAA6B.7030108@canterbury.ac.nz>
Message-ID: <6e9196d20605050005u432ea0a4w9666d81cc2537f37@mail.gmail.com>

I've updated the wiki with a second proposal based on this thread, and
also summarized the Python-dev discussions.  Please make sure your
favorite feature or pet peeve is adequately represented.

http://wiki.python.org/moin/AlternativePathClass

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From sluggoster at gmail.com  Fri May  5 09:15:39 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Fri, 5 May 2006 00:15:39 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <4459FA21.6030508@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com>
Message-ID: <6e9196d20605050015g492ae7c9p3e3d63e238729fe6@mail.gmail.com>

On 5/4/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Mike Orr wrote:
> >> == a tuple instead of a string ==
> >>
> >> The biggest conceptual change is that my path object is a subclass of
> >> ''tuple'', not a subclass of str.
>
> Why subclass anything? The path should internally represent the filesystem
> path as a list or tuple, but it shouldn't *be* a tuple.

Good point.

> Also, was it a deliberate design decision to make the path objects immutable,
> or was that simply copied from the fact that strings are immutable?
>
> Given that immutability allows the string representation to be calculated once
> and then cached, I'll go with that interpretation.

Immutability also allows them to be dictionary keys.  This is
convenient in many applications.
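
As a rough illustration of the dictionary-key point, here is a minimal
sketch in current Python 3 syntax. The class layout is hypothetical (it is
not the wiki proposal): it keeps the components in a tuple internally
without subclassing tuple, refuses attribute assignment, and caches the
string form, with "/" hard-coded as the separator for simplicity.

```python
class Path(object):
    """Hypothetical immutable path; components live in a private tuple."""

    __slots__ = ("_parts", "_str")

    def __init__(self, *parts):
        # Bypass our own __setattr__, which rejects all assignments.
        object.__setattr__(self, "_parts", tuple(parts))
        object.__setattr__(self, "_str", None)

    def __setattr__(self, name, value):
        raise AttributeError("Path objects are immutable")

    def __str__(self):
        # Safe to compute once and cache, because the parts never change.
        if self._str is None:
            object.__setattr__(self, "_str", "/".join(self._parts))
        return self._str

    def __eq__(self, other):
        return isinstance(other, Path) and self._parts == other._parts

    def __hash__(self):
        return hash(self._parts)


# Equal paths hash alike, so they can serve as dictionary keys.
cache = {Path("usr", "lib"): "seen"}
print(cache[Path("usr", "lib")])
```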

> Similarly, I would separate out the extension to a distinct attribute, as it
> too uses a different separator from the normal path elements ('.' most places,
> but '/' on RISC OS, for example)

That would be too surprising to users.  p[-1] should be the full
filename.  We can use p.name, p.extsep, and p.ext for the parts.
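
Roughly what I have in mind, as a sketch only. The attribute names come
from the sentence above; the class itself, the choice that ext excludes the
separator, and the treatment of dotfiles are assumptions made for the
example.

```python
import os


class Path(object):
    """Hypothetical sketch: p[-1] is the whole filename, parts are attributes."""

    extsep = os.extsep  # the platform's extension separator

    def __init__(self, *parts):
        self._parts = tuple(parts)

    def __getitem__(self, index):
        return self._parts[index]

    def _split(self):
        filename = self._parts[-1]
        base, sep, ext = filename.rpartition(self.extsep)
        # No separator at all, or a leading one (".bashrc"): no extension.
        if not sep or not base:
            return filename, ""
        return base, ext

    @property
    def name(self):
        return self._split()[0]

    @property
    def ext(self):
        return self._split()[1]


p = Path("home", "mso", "notes.txt")
print(p[-1], p.name, p.ext)
```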

> Alternatively, if the path elements are stored in separate attributes, there's
> nothing stopping the main object from inheriting from str or unicode the way
> the PEP 355 path object does.

Do the string methods call .__str__(), or how do they get the string value?

> I wouldn't expose stat() - as you say, it's a Unixism. Instead, I'd provide a
> subclass of Path that used lstat instead of stat for symbolic links.
>
> So if I want symbolic links followed, I use the normal Path class. This class
> would just generally treat symbolic links as if they were the file pointed to
> (except for the whole not recursing into symlinked subdirectories thing).
>
> The SymbolicPath subclass would treat normal files as usual, but *wouldn't*
> follow symbolic links when stat'ting files (instead, it would stat the symlink).


Normally I'd be processing the subdirectories, then the symlinks, off
the same path.  I'd rather just call a different method rather than
having to construct another object.
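
A sketch of the two-methods-on-one-object approach, assuming a hypothetical
Path wrapper around a plain string. Only os.stat(), os.lstat() and
os.path.islink() are real library calls here.

```python
import os


class Path(object):
    """Hypothetical wrapper: one object, two ways of stat'ing it."""

    def __init__(self, pathname):
        self._pathname = pathname

    def stat(self):
        # Follows symbolic links.
        return os.stat(self._pathname)

    def lstat(self):
        # Reports on the link itself instead of its target.
        return os.lstat(self._pathname)

    def islink(self):
        return os.path.islink(self._pathname)


p = Path(os.curdir)
print(p.islink(), p.stat().st_mode == p.lstat().st_mode)
```

The caller picks the behaviour per call, so the same object can be used for
the subdirectory pass and the symlink pass.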

> >> == One Method for Finding Files ==
> >>
> >> (There are actually two, but with exactly the same interface). The
> >> original path object has these methods for finding files:
> >>
> >> {{{
> >> def listdir(self, pattern = None): ...
> >> def dirs(self, pattern = None): ...
> >> def files(self, pattern = None): ...
> >> def walk(self, pattern = None): ...
> >> def walkdirs(self, pattern = None): ...
> >> def walkfiles(self, pattern = None): ...
> >> def glob(self, pattern):
> >> }}}
> >>
> >> I suggest one method that replaces all those:
> >> {{{
> >> def glob(self, pattern='*', topdown=True, onlydirs=False, onlyfiles=False): ...
> >> }}}
>
> Swiss army methods are even more evil than wide APIs.


I don't see the point in combining these either.  There's the problem
of walking special files, but these are so rare and diverse that we
either have to provide methods for all of them, or an argument, or
force the user to use .walk().  Since they are rare, the latter is OK.


> Now, what might potentially be genuinely useful is paired walk methods that
> allowed the following:
>
>    # Do path.walk over this directory, and also return the corresponding
>    # information for a destination directory (so the dest dir information
>    # probably *won't* match that file system)
>    for src_info, dest_info in src_path.pairedwalk(dest_path):
>        src_dirpath, src_subdirs, src_files = src_info
>        dest_dirpath, dest_subdirs, dest_files = dest_info
>        # Do something useful
>
>    # Ditto for path.walkdirs
>    for src_dirpath, dest_dirpath in src_path.pairedwalkdirs(dest_path):
>        # Do something useful
>
>    # Ditto for path.walkfiles
>    for src_path, dest_path in src_path.pairedwalkfiles(dest_path):
>        src_path.copy_to(dest_path)

These look like os.walk() derivatives.  I've never found that as
useful as getting a flat iteration of paths.  But if enough people
want it...
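
For comparison, a flat paired iteration can be sketched on top of os.walk()
in a few lines. The function name is borrowed from the suggestion quoted
above; taking plain strings instead of path objects is an assumption made
to keep the example self-contained.

```python
import os


def pairedwalkfiles(src_root, dest_root):
    """Yield (src_file, dest_file) for every file below src_root."""
    for dirpath, dirnames, filenames in os.walk(src_root):
        # Position of this directory relative to the source root.
        rel = os.path.relpath(dirpath, src_root)
        if rel == os.curdir:
            dest_dir = dest_root
        else:
            dest_dir = os.path.join(dest_root, rel)
        for filename in filenames:
            yield (os.path.join(dirpath, filename),
                   os.path.join(dest_dir, filename))


for src, dest in pairedwalkfiles(os.curdir, "backup"):
    pass  # e.g. copy src to dest here
```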

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From sluggoster at gmail.com  Fri May  5 09:26:45 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Fri, 5 May 2006 00:26:45 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445A017D.4080207@ofai.at>
	<445A07D6.403@gmail.com>
	<79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
Message-ID: <6e9196d20605050026o6675b2d4jc97380bb4dde4ced@mail.gmail.com>

On 5/4/06, Paul Moore <p.f.moore at gmail.com> wrote:
> On 5/4/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > My inclination was to have a PlatformPath subclass that accepted 'os', 'sep'
> > and 'extsep' keyword arguments to the constructor, and provided the
> > appropriate 'sep' and 'extsep' attributes (supplying 'os' would just be a
> > shortcut to avoid specifying the separators explicitly).
> >
> > That way the main class can avoid being complicated by the relatively rare
> > need to operate on another platform's paths, while still supporting the ability.
>
> You ought to have predefined classes for the standard OSes. Expecting
> people to know the values for sep and extsep seems unhelpful.
>
> In fact, unless you expect to support the os.path interface forever,
> as well as the new interface,  I'd assume there would have to be
> internal WindowsPath and PosixPath classes anyway - much like the
> current ntpath and posixpath modules. So keeping that structure, and
> simply having
>
> if os.name == 'posix':
>     Path = PosixPath
> elif os.name == 'nt':
>     Path = WindowsPath
> ... etc
>
> at the end, would seem simplest.

Why not just put a platform-specific Path class inside posixpath,
macpath, etc.?  Python already chooses os.path for the actual platform,
and you can import a foreign module directly if you want to.
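
The existing modules already behave this way, which is what a per-module
Path class would piggyback on. A small demonstration using only the current
string functions (the sample paths are made up):

```python
import ntpath
import os.path
import posixpath

# os.path is simply whichever module matches the running platform.
print(os.path.__name__)

# The foreign modules can be imported and used directly on any platform.
print(posixpath.join("usr", "lib", "python"))
print(ntpath.join("C:\\", "Python24", "Lib"))
print(ntpath.splitdrive("C:\\Python24\\Lib"))
```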

> (But all the current proposals seem to build on os.path, so maybe I
> should assume otherwise, that os.path will remain indefinitely...)

They build on os.path because that's what we're familiar with using. 
There's no reason to write the platform-specific classes until we
agree on an API; that would just be work down the drain.  When the new
classes are in the library, we can:  (one or more)

- Leave os.path.foo() alone because it works and most existing programs need it.

- Discourage os.path.foo() in the documentation but continue to support it.

- Rewrite os.path.foo() to use Path.foo().  A lot of useless work if we...

- Remove os.path.foo() in Python 3.0.

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From fredrik at pythonware.com  Fri May  5 09:27:16 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 5 May 2006 09:27:16 +0200
Subject: [Python-Dev] Python sprint mechanics
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<445AE48B.1070508@v.loewis.de>
Message-ID: <e3euoo$h6a$1@sea.gmane.org>

Martin v. Löwis wrote:

> I think Fredrik Lundh points to svk at such occasions.

SVK makes it trivial to mirror a remote SVN repository, and make zillions
of local light-weight branches against that repository (e.g. one branch
per bug you're working on); see e.g.

    http://gcc.gnu.org/wiki/SvkHelp

for a brief tutorial.

I think you can set things up so others can work against your local
repository, but I haven't done that myself.  does anyone here know more
about this?

(if that turns out to be hard, it's trivial to work with patch sets under
SVK).

</F>




From vys at renet.ru  Fri May  5 09:50:12 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Fri, 05 May 2006 11:50:12 +0400
Subject: [Python-Dev] binary trees. Review obmalloc.c
In-Reply-To: <20060502222033.678D.JCARLSON@uci.edu>
References: <20060426094148.66EE.JCARLSON@uci.edu> <44508866.60503@renet.ru>
	<20060502222033.678D.JCARLSON@uci.edu>
Message-ID: <445B03B4.6070600@renet.ru>

Josiah Carlson wrote:
> "Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
>
>   
>> Comparison of functions of sorting and binary trees not absolutely 
>> correctly. I think that function sort will lose considerably on
>> greater lists. Especially after an insert or removal of all one element.
>>     
>
> Generally speaking, people who understand at least some of Python's
> internals (list internals specifically), will not be *removing* entries
> from lists one at a time (at least not using list.pop(0) ), because that
> is slow.  If they need to remove items one at a time from the smallest
> to the largest, they will usually use list.reverse(), then repeatedly
> list.pop(), as this is quite fast (in general).
>   
Yes. I understood that when I gave the example.
> However, as I just said, people usually don't remove items from
> just-sorted lists, they tend to iterate over them via 'for i in list:' .
>   
This problem arises when maintaining a list of timers. Such a list changes
constantly: elements are added and removed all the time. collections.deque
is not suitable here, because inserting into a big list, or searching for
the nearest value, takes N/2 checks on average. For a binary tree the number
of checks needed is still log2(N).

Another use of binary trees is matching a value against ranges. Such ranges
can be blocks of IP addresses. This kind of dictionary can also be used to
find quickly whether a given point falls into one of a set of segments.
These problems can be solved with binary search, but binary search has the
same shortcomings as in the example above: inserting into and removing from
lists is slow, and the list has to be sorted after an insert. Besides that,
a binary search function is absent from the standard Python package. It is
possible to write an analogue in pure Python, but it will be more than twice
as slow.
>  - Josiah
>
> As an aside, I have personally implemented trees a few times for
> different reasons.  One of the issues I have with most tree
> implementations is that it is generally difficult to do things like
> "give me the kth smallest/largest item".  Of course the standard
> solution is what is generally called a "partial order" or "order
> statistic" tree, but then you end up with yet another field in your
> structure.  One nice thing about Red-Black trees combined with
> order-statistic trees, is that you can usually use the uppermost bit of
> the order statistics for red/black information.  Of course, this is
> really only interesting from an "implement this in C" perspective;
> because if one were implementing it in Python, one may as well be really
> lazy and not bother implementing guaranteed balanced trees and be
> satisfied with randomized-balancing via Treaps.
Here is what I found through Google for the query "order statistic binary
tree":

    http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic20/

I realized that I have already made something similar :). The test results
are not the worst possible, but they are certainly worse than for a plain
tree. Unfortunately I have not tested this implementation thoroughly. I am
about 90 percent sure that it works.

One question remains: what API should be used? If the object is addressed as
a mapping, __getitem__ will return one result; if it is addressed as a list,
it will return another.

Here is the list of existing attributes and methods:

    class xtree(__builtin__.object)
     |  X-Tree. Binary tree with AVL balance mechanism.
     |
     |  Methods defined here:
     |
     |  __contains__(...)
     |      x.__contains__(y) <==> y in x
     |
     |  __delitem__(...)
     |      x.__delitem__(y) <==> del x[y]
     |
     |  __getitem__(...)
     |      x.__getitem__(y) <==> x[y]
     |
     |  __iter__(...)
     |      x.__iter__() <==> iter(x)
     |
     |  __len__(...)
     |      x.__len__() <==> len(x)
     |
     |  __repr__(...)
     |      x.__repr__() <==> repr(x)
     |
     |  __setitem__(...)
     |      x.__setitem__(i, y) <==> x[i]=y
     |
     |  append(...)
     |      D.append((k: v)) -> None, append new pair element into tree with sorting.
     |      Return pair element if argument is exist.
     |
     |  clear(...)
     |      D.clear() -> None. Remove all items from D.
     |
     |  copy(...)
     |      D.copy() -> a shallow copy of D.
     |
     |  get(...)
     |      D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
     |
     |  has_key(...)
     |      D.has_key(k) -> True if D has a key k, else False.
     |
     |  items(...)
     |      D.items() -> list of D's (key, value) pairs.
     |
     |  iteritems(...)
     |      D.iteritems() -> an iterator over the (key, value) items of D.
     |
     |  iterkeys(...)
     |      D.iterkeys() -> an iterator over the keys of D.
     |
     |  itervalues(...)
     |      D.itervalues() -> an iterator over the values of D.
     |
     |  keys(...)
     |      D.keys() -> list of D's keys.
     |
     |  pop(...)
     |      D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
     |      If key is not found, d is returned if given, otherwise KeyError is raised.
     |
     |  popitem(...)
     |      D.popmin() -> (k, v), remove and return minimal (key, value) pair as a
     |      2-tuple; but raise KeyError if D is empty.
     |
     |  popmax(...)
     |      D.popmax() -> (k, v), remove and return maximal (key, value) pair as a
     |      2-tuple; but raise KeyError if D is empty.
     |
     |  popmin(...)
     |      D.popmin() -> (k, v), remove and return minimal (key, value) pair as a
     |      2-tuple; but raise KeyError if D is empty.
     |
     |  pushmax(...)
     |      D.pushmax(item) -> (k, v), remove and return maximal (key, value) pair as a
     |      2-tuple; but raise KeyError if D is empty.
     |
     |  pushmin(...)
     |      D.pushmin(item) -> (k, v), remove and return minimal (key, value) pair as a
     |      2-tuple; but raise KeyError if D is empty.
     |
     |  rebuild(...)
     |      D.rebuild() -> None. Take compare function for rebuild dictionary.
     |      It works without use of additional memory on a place.
     |
     |  setdefault(...)
     |      D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D.
     |
     |  update(...)
     |      D.update(E) -> None.  Update D from E and F: for k in E: D[k] = E[k]
     |      (if E has keys else: for (k, v) in E: D[k] = v).
     |
     |  values(...)
     |      D.values() -> list of D's values.
     |
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |
     |  cmp
     |      function of comparison of objects
     |
     |  freeze
     |      to block an opportunity change of object.
     |
     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |
     |  __new__ = <built-in method __new__ of type object at 0x282a4aa0>
     |      T.__new__(S, ...) -> a new object with type S, a subtype of T

> Of course all of this discussion about trees in Python is fine, though
> there is one major problem.  With minimal work, one can construct 3
> values such that ((a < b < c) and (c < a)) is true.
>   
The same problem is inherent in the standard dictionary, dict(); only its
roots are different. For example:

import random

class test:
        def __hash__(self):
                return random.randint(0, 1000000)

A binary tree is very easy to break. Therefore, whenever it is accessed, a
lock should be taken, either for reading or for writing. If an iterator is
opened on the object, a read lock should also be held until the iterator
reaches the end, or is closed or destroyed. In this way the integrity of the
object can be guaranteed. The locking functions are similar to
pthread_rwlock_tryrdlock and pthread_rwlock_trywrlock. If the lock cannot be
taken, an RDLockError, WRLockError or FreezeLockError exception is raised.
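
The locking idea can be sketched in pure Python roughly as follows. This is
an illustration only, not the xtree implementation: GuardedMap is a
hypothetical stand-in that wraps a dict, it is single-threaded, and only
the exception names and the freeze attribute are taken from the
description above.

```python
class WRLockError(Exception):
    """Raised when a write is attempted while a reader holds the object."""


class FreezeLockError(Exception):
    """Raised when a write is attempted on a frozen object."""


class GuardedMap(object):
    """Hypothetical stand-in for the tree: a dict plus lock bookkeeping."""

    def __init__(self):
        self._data = {}
        self._readers = 0      # number of iterators currently running
        self.freeze = False    # when true, every modification is refused

    def __setitem__(self, key, value):
        if self.freeze:
            raise FreezeLockError("object is frozen")
        if self._readers:
            raise WRLockError("cannot write while an iterator is open")
        self._data[key] = value

    def __iter__(self):
        # The read lock is taken when iteration starts and released when
        # the iterator is exhausted, closed or destroyed.
        self._readers += 1
        try:
            for key in sorted(self._data):
                yield key
        finally:
            self._readers -= 1


m = GuardedMap()
m["a"] = 1
it = iter(m)
next(it)
try:
    m["b"] = 2
except WRLockError:
    print("write refused while iterating")
it.close()
m["b"] = 2   # allowed again once the iterator is closed
```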

Thanks.

--
Vladimir Stepanov
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: avl-timeit.py
Url: http://mail.python.org/pipermail/python-dev/attachments/20060505/6dd159df/attachment.asc 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test-normal.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20060505/6dd159df/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test-order-statistic.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20060505/6dd159df/attachment-0001.txt 

From p.f.moore at gmail.com  Fri May  5 10:13:00 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 5 May 2006 09:13:00 +0100
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
Message-ID: <79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>

On 5/5/06, Tim Peters <tim.peters at gmail.com> wrote:
> Since I hope we see a lot more of these problems in the future, what
> can be done to ease the pain?  I don't know enough about SVN admin to
> know what might be realistic.  Adding a pile of  "temporary
> committers" comes to mind, but wouldn't really work since people will
> want to keep working on their branches after the sprint ends.  Purely
> local SVN setups wouldn't work either, since sprint projects will
> generally have more than one worker bee, and they need to share code
> changes.  There:  I think that's enough to prove I don't have a clue
> :-)

Is it possible to create a branch in the main Python svn, and grant
commit privs to that branch only, for sprint participants? I seem to
recall something like mod_authz_svn being involved, but I don't know
much more...

Paul.

From ncoghlan at gmail.com  Fri May  5 10:48:39 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 05 May 2006 18:48:39 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <6e9196d20605050026o6675b2d4jc97380bb4dde4ced@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>	<4459FA21.6030508@gmail.com>
	<445A017D.4080207@ofai.at>	<445A07D6.403@gmail.com>	<79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
	<6e9196d20605050026o6675b2d4jc97380bb4dde4ced@mail.gmail.com>
Message-ID: <445B1167.5000401@gmail.com>

Mike Orr wrote:
> On 5/4/06, Paul Moore <p.f.moore at gmail.com> wrote:
>> (But all the current proposals seem to build on os.path, so maybe I
>> should assume otherwise, that os.path will remain indefinitely...)
> 
> They build on os.path because that's what we're familiar with using. 
> There's no reason to write the platform-specific classes until we
> agree on an API; that would just be work down the drain.  When the new
> classes are in the library, we can:  (one or more)
> 
> - Leave os.path.foo() alone because it works and most existing programs need it.

The threading module doesn't really obsolete the thread module, it just 
provides a higher level, more convenient API.

Similarly, I don't believe it's a given that a nice path object will obsolete 
the low level operations. When translating a shell script to Python (or vice 
versa), having access to the comparable low level operations would be of benefit.

At most, I would expect provision of an OO path API to result in a comment in 
the documentation of various modules (os.path, shutil, fnmatch, glob) saying 
that "pathlib.Path" (or whatever it ends up being called) is generally a more 
convenient API.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From murman at gmail.com  Fri May  5 15:20:02 2006
From: murman at gmail.com (Michael Urman)
Date: Fri, 5 May 2006 08:20:02 -0500
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e3ert4$93f$1@sea.gmane.org>
References: <4453B025.3080100@acm.org> <e36f6e$pns$1@sea.gmane.org>
	<44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz> <e3dcj3$sdq$1@sea.gmane.org>
	<445AB280.8060701@canterbury.ac.nz> <e3ert4$93f$1@sea.gmane.org>
Message-ID: <dcbbbb410605050620s5f0844f0idb9b852bab924e7f@mail.gmail.com>

On 5/5/06, Terry Reedy <tjreedy at udel.edu> wrote:
> At present, Python allows this as a choice.

Not always - take a look from another perspective:

def make_person(**kwds):
    name = kwds.pop('name', None)
    age = kwds.pop('age', None)
    phone = kwds.pop('phone', None)
    location = kwds.pop('location', None)
    ...

This already requires the caller to use keywords, but results in
horrid introspection based documentation. You know it takes some
keywords, but you have no clue what keywords they are. It's as bad as
calling help() on many of the C functions in the python stdlib.

So what allowing named keyword-only arguments does for us is allow us
to document this case. That is an absolute win.
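
For illustration, here is the same function written with the bare "*"
marker that PEP 3102 proposes, shown in the form later adopted in Python 3.
The introspection call (inspect.signature) is from current Python, and the
return value is made up for the example.

```python
import inspect


def make_person(*, name=None, age=None, phone=None, location=None):
    """Build a person record; every argument must be passed by keyword."""
    return {"name": name, "age": age, "phone": phone, "location": location}


# The accepted keywords are now visible to introspection and help().
print(inspect.signature(make_person))

try:
    make_person("Alice")
except TypeError as exc:
    print("rejected:", exc)
```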

Michael
--
Michael Urman  http://www.tortall.net/mu/blog

From benji at benjiyork.com  Fri May  5 15:48:05 2006
From: benji at benjiyork.com (Benji York)
Date: Fri, 05 May 2006 09:48:05 -0400
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>
Message-ID: <445B5795.60104@benjiyork.com>

Paul Moore wrote:
> Is it possible to create a branch in the main Python svn, and grant
> commit privs to that branch only, for sprint participants?

I'm not familiar with the mechanics, but recent versions of Subversion allow 
per-directory security.  We do this to give some customers read access 
to parts of the repo, and read-write to others.  It shouldn't be 
difficult (given a recent enough Subversion) to set up a sprint area in 
the repo.
--
Benji York


From fdrake at acm.org  Fri May  5 15:51:03 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 May 2006 09:51:03 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <e3ert4$93f$1@sea.gmane.org>
References: <4453B025.3080100@acm.org> <445AB280.8060701@canterbury.ac.nz>
	<e3ert4$93f$1@sea.gmane.org>
Message-ID: <200605050951.04048.fdrake@acm.org>

On Friday 05 May 2006 02:38, Terry Reedy wrote:
 > My point has been that the function writer should not make such a
 > requirement (for four no-default, required params) and that proposing to do
 > so with the proposed '*' is an abuse (for public code).  The caller should

And what exactly is the point at which constraining use goes from unreasonable 
to reasonable?  Perhaps that involves a judgement call?  I think it does.  
Since we're all consenting adults, we should have the tools to make our 
judgements easy to apply.

Since it requires specific action to make the constraint (insertion of the "*" 
marker), there doesn't appear to be any real issue here.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From exarkun at divmod.com  Fri May  5 16:01:23 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 5 May 2006 10:01:23 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <dcbbbb410605050620s5f0844f0idb9b852bab924e7f@mail.gmail.com>
Message-ID: <20060505140123.22481.872593371.divmod.quotient.21698@ohm>

On Fri, 5 May 2006 08:20:02 -0500, Michael Urman <murman at gmail.com> wrote:
>On 5/5/06, Terry Reedy <tjreedy at udel.edu> wrote:
>> At present, Python allows this as a choice.
>
>Not always - take a look from another perspective:
>
>def make_person(**kwds):
>    name = kwds.pop('name', None)
>    age = kwds.pop('age', None)
>    phone = kwds.pop('phone', None)
>    location = kwds.pop('location', None)
>    ...
>
>This already requires the caller to use keywords, but results in
>horrid introspection based documentation. You know it takes some
>keywords, but you have no clue what keywords they are. It's as bad as
>calling help() on many of the C functions in the python stdlib.
>
>So what allowing named keyword-only arguments does for us is allow us
>to document this case. That is an absolute win.

Here you go:

    import inspect

    @decorator
    def keyword(f):
        def g(*a, **kw):
            if a:
                raise TypeError("Arguments must be passed by keyword")
            args = []
            expectedArgs = inspect.getargspec(f)[0]
            # getargspec() reports None when f has no defaults at all
            defaults = inspect.getargspec(f)[-1] or ()
            for i, a in enumerate(expectedArgs):
                if a in kw:
                    args.append(kw[a])
                elif i >= len(expectedArgs) - len(defaults):
                    args.append(defaults[i - len(expectedArgs)])
                else:
                    raise TypeError("Missing argument for " + a)
            return f(*args)
        return g

    @keyword
    def foo(a, b, c=10, d=20, e=30):
        return a, b, c, d, e

    try:
        foo()
    except TypeError:
        pass
    else:
        assert False, "Should have raised TypeError"


    try:
        foo(1)
    except TypeError:
        pass
    else:
        assert False, "Should have raised TypeError"


    try:
        foo(a=1)
    except TypeError:
        pass
    else:
        assert False, "Should have raised TypeError"


    assert foo(a=1, b=2) == (1, 2, 10, 20, 30)
    assert foo(a=1, b=2, c=3) == (1, 2, 3, 20, 30)
    assert foo(a=1, b=2, d=4) == (1, 2, 10, 4, 30)
    assert foo(a=1, b=2, c=3, d=4, e=5) == (1, 2, 3, 4, 5)
    print 'Success!'

From ncoghlan at gmail.com  Fri May  5 16:16:15 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 06 May 2006 00:16:15 +1000
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <200605050951.04048.fdrake@acm.org>
References: <4453B025.3080100@acm.org>
	<445AB280.8060701@canterbury.ac.nz>	<e3ert4$93f$1@sea.gmane.org>
	<200605050951.04048.fdrake@acm.org>
Message-ID: <445B5E2F.4080900@gmail.com>

Fred L. Drake, Jr. wrote:
> On Friday 05 May 2006 02:38, Terry Reedy wrote:
>  > My point has been that the function writer should not make such a
>  > requirement (for four no-default, required params) and that proposing to do
>  > so with the proposed '*' is an abuse (for public code).  The caller should
> 
> And what exactly is the point at which constraining use goes from unreasonable 
> to reasonable?  Perhaps that involves a judgement call?  I think it does.  
> Since we're all consenting adults, we should have the tools to make our 
> judgements easy to apply.
> 
> Since it requires specific action to make the constraint (insertion of the "*" 
> marker), there doesn't appear to be any real issue here.

And I imagine API designers that abused the feature would end up being abused 
by their users :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From fdrake at acm.org  Fri May  5 16:50:45 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 May 2006 10:50:45 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <445B5E2F.4080900@gmail.com>
References: <4453B025.3080100@acm.org> <200605050951.04048.fdrake@acm.org>
	<445B5E2F.4080900@gmail.com>
Message-ID: <200605051050.45802.fdrake@acm.org>

On Friday 05 May 2006 10:16, Nick Coghlan wrote:
 > And I imagine API designers that abused the feature would end up being
 > abused by their users :)

If used to create poor APIs, I think that's a reasonable outcome.  :-)

I don't think using such a feature to constrain APIs is necessarily abuse, 
which is what seems to be the point of disagreement in this thread.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From coder_infidel at hotmail.com  Fri May  5 19:06:54 2006
From: coder_infidel at hotmail.com (Luke Dunstan)
Date: Sat, 6 May 2006 01:06:54 +0800
Subject: [Python-Dev] Python for Windows CE
References: <BAY109-DAV2BF904600E22663CCC5038AB70@phx.gbl>
	<44591A9F.1090304@v.loewis.de>
Message-ID: <BAY109-DAV141C5E22ABDD28FAF9765E8AB50@phx.gbl>


----- Original Message ----- 
From: "Martin v. Löwis" <martin at v.loewis.de>
To: "Luke Dunstan" <coder_infidel at hotmail.com>
Cc: <Python-Dev at python.org>
Sent: Thursday, May 04, 2006 5:03 AM
Subject: Re: [Python-Dev] Python for Windows CE


> Luke Dunstan wrote:
>> 1. Is there any reason in principle why patches for Windows CE support
>> would be rejected?
>
> No, not in principle. Of course,
> - the patch shouldn't break anything
> - you should state an explicit commitment to support the port for
>  some years (or else it might get removed at the slightest problem)

Thanks for your response. I hope I can continue to support Python for 
Windows CE for a while at least.

>> 4. The latest existing patch set uses os.name = "ce", and this can be used
>> to distinguish Windows CE from other operating systems in Python code. The
>> existing patches also set sys.platform = "Pocket PC", but I am not so keen
>> on keeping this, mainly because Windows CE is equivalent to Win32 in many
>> cases. Any comments?
>
> I would guide the decision based on the number of API changes you need
> to make to the standard library (e.g. distutils). In any case, the
> platform module should be able to reliably report the specific system
> (including the specific processor architecture).

OK. Actually I think distutils will be the last thing to be ported because 
it is not necessary for using the rest of Python. Does distutils have support 
for cross-compiling anyway?

>
>> (b) Instead we could use #ifdef HAVE_DIRECT_H, requiring patches to #define
>> HAVE_DIRECT_H for other platforms
>
> That would be best. Python generally uses autoconf methodology, which
> implies that conditionally-existing headers should be detected using
> HAVE_*_H.
>
> Regards,
> Martin

OK, but what about ANSI C headers like signal.h? I expected HAVE_SIGNAL_H to 
exist already but I just came across this patch:

http://svn.python.org/view/python/trunk/pyconfig.h.in?rev=35255&r1=35221&r2=35255

I am guessing that the reason for this patch was to reduce the number of 
#ifdefs in the Python source to ease maintenance. Would you want to add some 
of them back again?

Luke

From jcarlson at uci.edu  Fri May  5 20:27:42 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 05 May 2006 11:27:42 -0700
Subject: [Python-Dev] binary trees. Review obmalloc.c
In-Reply-To: <445B03B4.6070600@renet.ru>
References: <20060502222033.678D.JCARLSON@uci.edu> <445B03B4.6070600@renet.ru>
Message-ID: <20060505105630.67BE.JCARLSON@uci.edu>


"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> Josiah Carlson wrote:
> > However, as I just said, people usually don't remove items from
> > just-sorted lists, they tend to iterate over them via 'for i in list:' .
> >   
> This problem arises when maintaining a list of timers.

I've never seen that particular use-case.

Please understand something.  I believe that trees can be useful in some
cases. However, I don't believe that they are generally useful
enough in Python, given that we already have key,value dictionaries and
sortable lists.  They become even less worthwhile in Python, given the
total ordering problem that I describe later in this message.


> Except for that function of binary search is absent in standard
> package Python.

Binary search exists in the standard library as the bisect module.
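
As a minimal sketch (the timer values are made up for illustration),
bisect covers both searching a sorted list and inserting into it while
keeping it sorted:

```python
import bisect

timers = [5, 10, 20, 40]            # must already be sorted

# Insert a new value at the position that keeps the list sorted.
bisect.insort(timers, 15)

# Binary search: index of the leftmost slot where 20 could be inserted.
i = bisect.bisect_left(timers, 20)
found = i < len(timers) and timers[i] == 20
```

The search is O(log n); the insertion still has to shift list elements,
so it is O(n) in the worst case.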


> Here that I have found through Google on a line " order statistic binary 
> tree ":
> 
>     http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic20/

Yes, that link does describe what an 'order statistic tree' is.  One
thing to note is that you can add order statistics to any tree with one
more field on each tree node.  That is, you can have your AVL or
Red-Black tree, and add order statistics to it.  Allowing tree['x'] to
return the associated value for the key 'x' (the same as a dictionary),
as well as offer the ability to do tree.select(10) to get the 10th (or
11th if one counts from 0) smallest key (or key,value), or get the 'rank'
of a key with tree.rank('x').
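
To pin down what select and rank mean, here is a rough sketch that uses
a plain sorted list and bisect instead of an augmented tree.  It is not
an order statistic tree; the helper names select and rank simply mirror
the method names above, and the keys are invented:

```python
import bisect

keys = sorted(["pear", "apple", "fig", "kiwi"])

def select(k):
    """Return the k-th smallest key, counting from 0."""
    return keys[k]

def rank(key):
    """Return how many stored keys are smaller than key."""
    return bisect.bisect_left(keys, key)

third = select(2)          # third smallest key when counting from 1
position = rank("kiwi")    # number of keys that sort before "kiwi"
```

A balanced tree with a subtree-size field offers the same two operations
in O(log n) while keeping insertion and deletion at O(log n), which the
sorted list cannot do.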

> > Of course all of this discussion about trees in Python is fine, though
> > there is one major problem.  With minimal work, one can construct 3
> > values such that ((a < b < c) and (c < a)) is true.
> >   
> The same problem is inherent in the standard dictionary - dict (), only
> roots at it, it is others. For example:

This problem has nothing to do with dictionaries and hashing, it has to
do with the fact that there may not be a total ordering on the elements
of a sequence.

>>> import random
>>> 'b' < (0,) < u'a' < 'b'
True
>>> x = ['b', (0,), u'a']
>>> random.shuffle(x)
>>> x.sort()
>>> x
[u'a', 'b', (0,)]
>>> random.shuffle(x)
>>> x.sort()
>>> x
['b', (0,), u'a']
>>>


What effect does this have on trees?  Imagine, for a moment, that your
tree looked like:

    (0,)
   /    \
 'b'    u'a'

Then you added sufficient data so that your tree was rebalanced to be
something like:

    u'a'
    /   \
  (0,) (stuff)
  /
'b'

If you were searching for 'b', you would never find it, because in your
search, you would compare 'b' against u'a', and take the right branch,
where 'b' isn't.
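
The failure can be reproduced without mixed types.  The sketch below
assumes a made-up class, Cyclic, whose '<' is deliberately circular in
the same way as the three values above, and a hand-built tree of
(value, left, right) tuples; neither comes from any real library:

```python
class Cyclic:
    """Hypothetical value type whose '<' is circular: 0 < 1 < 2 < 0."""

    def __init__(self, rank, label):
        self.rank = rank
        self.label = label

    def __lt__(self, other):
        return (other.rank - self.rank) % 3 == 1

    def __eq__(self, other):
        return self.rank == other.rank


def search(node, target):
    """Standard binary-tree lookup over (value, left, right) tuples."""
    while node is not None:
        value, left, right = node
        if target == value:
            return True
        node = left if target < value else right
    return False


b = Cyclic(0, "'b'")
t = Cyclic(1, "(0,)")
a = Cyclic(2, "u'a'")

# Same shape as the rebalanced tree: u'a' at the root, (0,) to its
# left, and 'b' to the left of (0,).
tree = (a, (t, (b, None, None), None), None)

found_t = search(tree, t)   # t < a, so the search goes left and finds it
found_b = search(tree, b)   # a < b, so the search goes right and misses
```

The element is in the tree, yet the lookup for it reports a miss, which
is exactly the problem a non-total ordering causes.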

 - Josiah


From jcarlson at uci.edu  Fri May  5 20:32:30 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 05 May 2006 11:32:30 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <20060505140123.22481.872593371.divmod.quotient.21698@ohm>
References: <dcbbbb410605050620s5f0844f0idb9b852bab924e7f@mail.gmail.com>
	<20060505140123.22481.872593371.divmod.quotient.21698@ohm>
Message-ID: <20060505113104.67C1.JCARLSON@uci.edu>


Jean-Paul Calderone <exarkun at divmod.com> wrote:
> 
> On Fri, 5 May 2006 08:20:02 -0500, Michael Urman <murman at gmail.com> wrote:
> >On 5/5/06, Terry Reedy <tjreedy at udel.edu> wrote:
> >> At present, Python allows this as a choice.
> >
> >Not always - take a look from another perspective:
> >
> >def make_person(**kwds):
> >    name = kwds.pop('name', None)
> >    age = kwds.pop('age', None)
> >    phone = kwds.pop('phone', None)
> >    location = kwds.pop('location', None)
> >    ...
> >
> >This already requires the caller to use keywords, but results in
> >horrid introspection based documentation. You know it takes some
> >keywords, but you have no clue what keywords they are. It's as bad as
> >calling help() on many of the C functions in the python stdlib.
> >
> >So what allowing named keyword-only arguments does for us is allows us
> >to document this case. That is an absolute win.
> 
> Here you go:
[snip]

Nice work!  So, can we add this to the functools module right next to
decorator, and bypass the syntax change?

 - Josiah


From ian.bollinger at gmail.com  Fri May  5 21:25:22 2006
From: ian.bollinger at gmail.com (Ian D. Bollinger)
Date: Fri, 05 May 2006 15:25:22 -0400
Subject: [Python-Dev] lambda in Python
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E6C5@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6C5@au3010avexu1.global.avaya.com>
Message-ID: <445BA6A2.50901@gmail.com>

Delaney, Timothy (Tim) wrote:
> Should we add an explicit rule to the Python-dev spam filter for Xah?
> Based on his past history, I doubt we'll ever see anything useful from
> him.
>   
I'm not sure Xah is so much a troll as he is completely out of his 
mind.  At any rate, it seems he never has anything coherent to 
contribute.  I, however, dislike the idea of censoring someone just for 
being incoherent and obnoxious.  Ignoring him is probably the most 
pragmatic thing to do.

-- Ian D. Bollinger

From tjreedy at udel.edu  Fri May  5 22:19:19 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 5 May 2006 16:19:19 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>
	<e36f6e$pns$1@sea.gmane.org><44573D75.40007@canterbury.ac.nz><ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com><445805EA.7000501@canterbury.ac.nz><740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com><44598128.8060506@canterbury.ac.nz>
	<e3dcj3$sdq$1@sea.gmane.org><445AB280.8060701@canterbury.ac.nz>
	<e3ert4$93f$1@sea.gmane.org>
	<dcbbbb410605050620s5f0844f0idb9b852bab924e7f@mail.gmail.com>
Message-ID: <e3gc05$g74$1@sea.gmane.org>


"Michael Urman" <murman at gmail.com> wrote in message 
news:dcbbbb410605050620s5f0844f0idb9b852bab924e7f at mail.gmail.com...

> On 5/5/06, Terry Reedy <tjreedy at udel.edu> wrote:
>> At present, Python allows this as a choice.

I made that statement in the context of comparing these syntaxes

def make_person(name, age, phone, location) # current
def make_person(*, name, age, phone, location) # possible future

for a function with required, no-default parameters.  In that context, my 
statement is true.

> Not always - take a look from another perspective:

As I wrote it, I was aware that it was not true in the alternate context 
(not perspective) of using **names.  Many true statements become otherwise 
when their context is ripped away.

> def make_person(**kwds):
>    name = kwds.pop('name', None)
>    age = kwds.pop('age', None)
>    phone = kwds.pop('phone', None)
>    location = kwds.pop('location', None)

This version gives the parameters default values and makes them optional. 
The current version, ignoring error messages, of the possible future above 
is

def make_person(**names):
    name = names.pop('name')
    age = names.pop('age')
    phone = names.pop('phone')
    location = names.pop('location')
    if names: raise KeyError("Extra named args passed")

Now there are no defaults and all are required.
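
For comparison, here is a sketch of the proposed form, assuming the bare
'*' syntax from the PEP is available (it was later adopted in Python 3).
The return value and the sample data are invented for illustration:

```python
def make_person(*, name, age, phone, location):
    # Every parameter after the bare '*' must be passed by keyword.
    return {"name": name, "age": age, "phone": phone, "location": location}


person = make_person(name="Ann", age=30, phone="555-0100", location="Perth")

# A positional call is rejected before the function body runs.
try:
    make_person("Ann", 30, "555-0100", "Perth")
except TypeError as exc:
    positional_error = exc
else:
    positional_error = None
```

Unlike the **names version, the required names appear in the signature
itself, so help() and other introspection tools can report them.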

> This already requires the caller to use keywords,

Your version allows the seemingly senseless no-keyword call make_person().

> but results in horrid introspection based documentation.
> You know it takes some
> keywords, but you have no clue what keywords they are. It's as bad as
> calling help() on many of the C functions in the python stdlib.

Improving doc strings for C functions is a separate issue.

> So what allowing named keyword-only arguments does for us is allows us
> to document this case. That is an absolute win.

That is your opinion, whereas mine is that this is generally a bad case that 
should not be written and that making it easier to write and therefore more 
likely is a loss.

Terry Jan Reedy




From benji at benjiyork.com  Fri May  5 23:01:26 2006
From: benji at benjiyork.com (Benji York)
Date: Fri, 05 May 2006 17:01:26 -0400
Subject: [Python-Dev] lambda in Python
In-Reply-To: <445BA6A2.50901@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E6C5@au3010avexu1.global.avaya.com>
	<445BA6A2.50901@gmail.com>
Message-ID: <445BBD26.9010302@benjiyork.com>

Ian D. Bollinger wrote:
> I'm not sure Xah is so much a troll as he is completely out of his 
> mind.

Is that Bollinger's law?

   Any sufficiently advanced insanity is indistinguishable from trolling.

--
Benji York


From tjreedy at udel.edu  Fri May  5 23:26:51 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 5 May 2006 17:26:51 -0400
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
References: <4453B025.3080100@acm.org>
	<445AB280.8060701@canterbury.ac.nz><e3ert4$93f$1@sea.gmane.org>
	<200605050951.04048.fdrake@acm.org>
Message-ID: <e3gfuo$s6i$1@sea.gmane.org>


"Fred L. Drake, Jr." <fdrake at acm.org> wrote in message 
news:200605050951.04048.fdrake at acm.org...
> On Friday 05 May 2006 02:38, Terry Reedy wrote:
> > My point has been that the function writer should not make such a
> > requirement (for four no-default, required params) and that proposing
> > to do so with the proposed '*' is an abuse (for public code).  The
> > caller should
>
> And what exactly is the point at which constraining use goes from
> unreasonable to reasonable?  Perhaps that involves a judgement call?  I
> think it does.

Well, if so, it is my increasingly strong judgment that requiring another 
programmer to junk up his or her code with

  make_person(name=name, age=age, phone=phone, location = location)

# instead of #

  make_person(name, age, phone, location)

# or #

  make_person(name=person_data[0], age=person_data[1],
      phone=person_data[2], location=person_data[3])

# instead of #

  make_person(*person_data)

is generally unreasonable.  Remember, any person who actually prefers to 
write the longer forms is free to do so.

> Since we're all consenting adults,

That is the basis for my objection.  Most of the 'reasons' given for 
imposing the longer forms above have been about *not* treating 
function-calling programmers as fellow consenting adults, or rather, about 
treating them as non-adults.

One exception is the claim that in the alpha phase of a library project, 
the developer might be willing to freeze parameter names before freezing 
parameter positions.  But this should be rare, temporary, and might be 
dealt with as well by the common procedure of stating that function 
signatures (APIs) are subject to change, while indicating which aspects are 
considered most stable or unstable.

>  we should have the tools to make our judgements easy to apply.

The discussion is about a tool to make a very disputable judgment easier to 
impose on others.

> Since it requires specific action to make the constraint (insertion of
> the "*" marker), there doesn't appear to be any real issue here.

I do not see the logic of this statement.  I could just as well claim that 
it is constraints that require action, and which are therefore, in a sense, 
optional, that are disputable issues.

Terry Jan Reedy




From martin at v.loewis.de  Fri May  5 23:50:10 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 05 May 2006 23:50:10 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>
Message-ID: <445BC892.1060808@v.loewis.de>

Paul Moore wrote:
> Is it possible to create a branch in the main Python svn, and grant
> commit privs to that branch only, for sprint participants? I seem to
> recall  something like mod_authzsvn being involved, but I don't know
> much more...

We couldn't technically enforce it - but I'm sure sprint participants
would restrict checkins to a branch if they were told to.

However, I don't see how that would help.

Regards,
Martin

From martin at v.loewis.de  Fri May  5 23:51:09 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 05 May 2006 23:51:09 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445B5795.60104@benjiyork.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	<79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>
	<445B5795.60104@benjiyork.com>
Message-ID: <445BC8CD.7060909@v.loewis.de>

Benji York wrote:
> I'm not familiar with the mechanics, recent versions of Subversion allow 
> per-directory security.  We do this to give some customers read access 
> to parts of the repo, and read-write to others.  It shouldn't be 
> difficult (given a recent enough Subversion) to set up a sprint area in 
> the repo.

It works fine for http(s), but not for svn+ssh.

Regards,
Martin

From martin at v.loewis.de  Fri May  5 23:59:10 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 05 May 2006 23:59:10 +0200
Subject: [Python-Dev] Python for Windows CE
In-Reply-To: <BAY109-DAV141C5E22ABDD28FAF9765E8AB50@phx.gbl>
References: <BAY109-DAV2BF904600E22663CCC5038AB70@phx.gbl>	<44591A9F.1090304@v.loewis.de>
	<BAY109-DAV141C5E22ABDD28FAF9765E8AB50@phx.gbl>
Message-ID: <445BCAAE.2010000@v.loewis.de>

Luke Dunstan wrote:
> OK. Actually I think distutils will be the last thing to be ported because
> it is not necessary for using the rest of Python. Does distutils have support
> for cross-compiling anyway?

No, it doesn't.

> OK, but what about ANSI C headers like signal.h? I expected HAVE_SIGNAL_H to 
> exist already but I just came across this patch:
> 
> http://svn.python.org/view/python/trunk/pyconfig.h.in?rev=35255&r1=35221&r2=35255
> 
> I am guessing that the reason for this patch was to reduce the number of 
> #ifdefs in the Python source to ease maintenance. Would you want to add some 
> of them back again?

Yes, with a record of which system (and version) requires them, so we can
remove them again when the system becomes unsupported.

Regards,
Martin

From benji at benjiyork.com  Sat May  6 00:18:39 2006
From: benji at benjiyork.com (Benji York)
Date: Fri, 05 May 2006 18:18:39 -0400
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445BC8CD.7060909@v.loewis.de>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	<79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>	<445B5795.60104@benjiyork.com>
	<445BC8CD.7060909@v.loewis.de>
Message-ID: <445BCF3F.5040909@benjiyork.com>

Martin v. Löwis wrote:
> Benji York wrote:
> 
>>I'm not familiar with the mechanics, recent versions of Subversion allow 
>>per-directory security.
> 
> It works fine for http(s), but not for svn+ssh.

Versions prior to 1.3 could use Apache's authorization system. 
Subversion 1.3 added a path-based authorization feature to svnserve.
--
Benji York

From murman at gmail.com  Sat May  6 01:07:21 2006
From: murman at gmail.com (Michael Urman)
Date: Fri, 5 May 2006 18:07:21 -0500
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <20060505140123.22481.872593371.divmod.quotient.21698@ohm>
References: <dcbbbb410605050620s5f0844f0idb9b852bab924e7f@mail.gmail.com>
	<20060505140123.22481.872593371.divmod.quotient.21698@ohm>
Message-ID: <dcbbbb410605051607m4d6875a6re101db9828ac23ac@mail.gmail.com>

On 5/5/06, Jean-Paul Calderone <exarkun at divmod.com> wrote:
>     @keyword
>     def foo(a, b, c=10, d=20, e=30):
>         return a, b, c, d, e

Cute, indeed. That decorator implementation is not as flexible as the
* which can go after positional parameters, but of course that is easy
to tweak. However the part I didn't bring up as a reason to prefer
explicit syntax for required-as-keyword is performance. I suspect
syntactic support will be faster than **kw.pop; I'm almost certain
it's faster than decorator gimmicks. I'm not certain at all that it's
a concern either way.
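
For anyone reading without the snipped code: a decorator of this general
kind could be sketched as below.  This is not Jean-Paul's
implementation, only a hypothetical minimal version that refuses all
positional arguments; the name keyword and the function foo are taken
from the quoted example.

```python
import functools


def keyword(func):
    """Hypothetical sketch: make func callable with keywords only."""

    @functools.wraps(func)
    def wrapper(*args, **kwds):
        if args:
            raise TypeError(
                "%s() accepts keyword arguments only" % func.__name__)
        return func(**kwds)

    return wrapper


@keyword
def foo(a, b, c=10, d=20, e=30):
    return a, b, c, d, e


result = foo(a=1, b=2, e=5)
```

Every call goes through an extra Python-level function, which is the
overhead that dedicated syntax would avoid.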

Michael
--
Michael Urman  http://www.tortall.net/mu/blog

From rasky at develer.com  Sat May  6 01:12:33 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 6 May 2006 01:12:33 +0200
Subject: [Python-Dev] Alternative path suggestion
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
Message-ID: <151d01c67099$63e3f9b0$8d472597@bagio>

Noam Raphael <noamraph at gmail.com> wrote:

> The only drawback I can see in using a logical representation is that
> giving a path object to functions which expect a path string won't
> work. The immediate solution is to simply use str(p) instead of p. The
> long-term solution is to make all related functions accept a path
> object.


I'm not an expert in this field, but I believe that if you can make your Path
object support the so-called buffer interface, it would be directly usable for
functions like open() without an explicit conversion.

Giovanni Bajo


From rasky at develer.com  Sat May  6 01:16:07 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 6 May 2006 01:16:07 +0200
Subject: [Python-Dev] Alternative path suggestion
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com><6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com><4459FA21.6030508@gmail.com>
	<445AAA6B.7030108@canterbury.ac.nz>
Message-ID: <152b01c67099$e3a89020$8d472597@bagio>

Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

>> Similarly, I would separate out the extension to a distinct
>> attribute, as it too uses a different separator from the normal path
>> elements ('.' most places, but '/' on RISC OS, for example)
>
> -1. What constitutes "the extension" is not well-defined in
> all cases. What about filenames with multiple suffixes,
> such as "spam.tar.gz"? What part of that would you put in
> the extension attribute?


.gz of course:

Path = "foo.tar.gz"
Path.ext = ".gz"
Path.name.ext = ".tar"
Path.name.name.ext = ""

Which is exactly the *same* thing that os.path.splitext() does. And yes, I do
use splitext quite a lot.
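
A short sketch of that behaviour, peeling one suffix at a time with
os.path.splitext (the filename is the one from the example above):

```python
import os.path

# Each call splits off only the last suffix.
name1, ext1 = os.path.splitext("foo.tar.gz")
name2, ext2 = os.path.splitext(name1)
name3, ext3 = os.path.splitext(name2)
```

After these three calls, ext1, ext2 and ext3 hold ".gz", ".tar" and ""
respectively.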

Giovanni Bajo


From xah at xahlee.org  Sat May  6 02:23:14 2006
From: xah at xahlee.org (xahlee)
Date: Fri, 5 May 2006 17:23:14 -0700
Subject: [Python-Dev] A critic of Guido's blog on Python's lambda
In-Reply-To: <C4BE8293-5F0D-44EE-AE1F-6655B4A5474C@xahlee.org>
References: <C4BE8293-5F0D-44EE-AE1F-6655B4A5474C@xahlee.org>
Message-ID: <9271BA58-9A0A-4713-AE3D-388C487ECC8F@xahlee.org>

In this post, i'd like to deconstruct one of Guido's recent blog  
about lambda in Python.

In Guido's blog written in 2006-02-10 at
http://www.artima.com/weblogs/viewpost.jsp?thread=147358

is first of all, the title "Language Design Is Not Just Solving
Puzzles". In the outset, and in between the lines, we are told
that "I'm the supreme intellect, and I created Python".

This seems impressive, except that the tech geekers due to their  
ignorance of sociology as well as lack of analytic abilities of the  
mathematician, do not know that creating a language is a act that  
requires little qualifications. However, creating a language that is  
used by a lot people takes considerable skill, and a big part of that  
skill is salesmanship. Guido seems to have done it well and seems to  
continue selling it well, where, he can put up a title of  
belittlement and get away with it too.

Gaudy title aside, let's look at the content of his say. If you  
peruse the 700 words, you'll find that it amounts to that Guido does  
not like the suggested lambda fix due to its multi-line nature, and  
says that he don't think there could possibly be any proposal he'll  
like. The reason? Not much! Zen is bantered about, mathematician's  
impractical ways is waved, undefinable qualities are given, human's  
right brain (neuroscience) is mentioned for support, Rube Goldberg  
contrivance phraseology is thrown, and coolness of Google Inc is  
reminded for the tech geekers (in juxtaposition of a big notice that  
Guido works there.).

If you are serious, doesn't this writing sounds bigger than its  
content? Look at the gorgeous ending: "This is also the reason why
Python will never have continuations, and even why I'm uninterested
in optimizing tail recursion. But that's for another installment.".

This benevolent geeker is gonna give us another INSTALLMENT!

There is a computer language leader by the name of Larry Wall, who  
said that "The three chief virtues of a programmer are: Laziness,
Impatience and Hubris" among quite a lot of other ingenious
outpourings. It seems to me, the more i learn about Python and its  
leader, the more similarities i see.

So Guido, i understand that selling oneself is a inherent and  
necessary part of being a human animal. But i think the lesser beings  
should be educated enough to know that fact. So that when minions  
follow a leader, they have a clear understanding of why and what.

----

Regarding the lambda in Python situation... conceivably you are right  
that Python lambda is perhaps at best left as it is crippled, or even  
eliminated. However, this is what i want: I want Python literatures,  
and also in Wikipedia, to cease and desist stating that Python  
supports functional programing. (this is not necessarily a bad  
publicity) And, I want the Perl literatures to cease and desist  
saying they support OOP. But that's for another installment.

This post is archived at:
http://xahlee.org/UnixResource_dir/writ/python_lambda_guido.html

     Xah
     xah at xahlee.org
http://xahlee.org/

From kbk at shore.net  Sat May  6 05:35:38 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 5 May 2006 23:35:38 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200605060335.k463Zc9n003686@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  378 open ( +0) /  3216 closed (+17) /  3594 total (+17)
Bugs    :  894 open ( -7) /  5811 closed (+19) /  6705 total (+12)
RFE     :  216 open ( +2) /   215 closed ( +1) /   431 total ( +3)

New / Reopened Patches
______________________

Rename functional to functools  (2006-04-29)
       http://python.org/sf/1478788  opened by  Collin Winter

Take advantage of BaseException/Exception split in cookielib  (2006-04-29)
       http://python.org/sf/1478993  opened by  John J Lee

Split open() and file()  (2006-04-29)
CLOSED http://python.org/sf/1479181  opened by  Aahz

ColorDelegator - Several bug fixes  (2006-04-30)
       http://python.org/sf/1479219  opened by  Tal Einat

Fix building with SWIG's -c++ option set in setup.py  (2006-04-30)
       http://python.org/sf/1479255  opened by  dOb

Make urllib2 digest auth and basic auth play together  (2006-04-30)
       http://python.org/sf/1479302  opened by  John J Lee

with statement context managers - with should be keyword  (2006-04-30)
CLOSED http://python.org/sf/1479438  opened by  Matt Fleming

speed up function calls  (2006-04-30)
       http://python.org/sf/1479611  opened by  Neal Norwitz

Heavy revisions to urllib2 howto  (2006-05-01)
       http://python.org/sf/1479977  opened by  John J Lee

weakref dict methods  (2006-05-01)
CLOSED http://python.org/sf/1479988  opened by  Fred L. Drake, Jr.

urllib2 digest auth redirection bug causes 400 error  (2006-05-01)
CLOSED http://python.org/sf/1480067  opened by  John J Lee

patch smtplib:when SMTPDataError, rset crashes with sslerror  (2006-05-03)
       http://python.org/sf/1481032  opened by  kxroberto

Support HTTP_REFERER in CGIHTTPServer.py  (2006-05-03)
       http://python.org/sf/1481079  opened by  Sébastien Martini

Python long option support  (2006-05-03)
       http://python.org/sf/1481112  opened by  Heiko Wundram

Cleaned up 16x16px icons for windows.  (2006-05-03)
       http://python.org/sf/1481304  opened by  goxe

imputil "from" os.path import bug  (2006-05-03)
CLOSED http://python.org/sf/1481530  opened by  Eric Huss

Patches Closed
______________

--enable-universalsdk on Mac OS X  (2006-04-17)
       http://python.org/sf/1471883  closed by  ronaldoussoren

Split open() and file()  (2006-04-29)
       http://python.org/sf/1479181  closed by  nnorwitz

urllib2 ProxyBasicAuthHandler broken  (2006-04-15)
       http://python.org/sf/1470846  closed by  gbrandl

Forbid iteration over strings  (2006-04-16)
       http://python.org/sf/1471291  closed by  gbrandl

Fix for urllib/urllib2 ftp bugs 1357260 and 1281692  (2006-04-15)
       http://python.org/sf/1470976  closed by  gbrandl

cPickle produces locale-dependent dumps  (2006-04-20)
       http://python.org/sf/1473625  closed by  gbrandl

IndentationError for unexpected indent  (2006-04-25)
       http://python.org/sf/1475845  closed by  loewis

rlcompleter to be usable without readline  (2006-04-19)
       http://python.org/sf/1472854  closed by  gbrandl

fix for #1472251  (2006-04-18)
       http://python.org/sf/1472263  closed by  gbrandl

with statement context managers - with should be keyword  (2006-04-30)
       http://python.org/sf/1479438  closed by  gbrandl

fixed handling of nested  comments in mail addresses  (2006-04-05)
       http://python.org/sf/1464708  closed by  bwarsaw

weakref dict methods  (2006-05-01)
       http://python.org/sf/1479988  closed by  fdrake

urllib2 digest auth redirection bug causes 400 error  (2006-05-01)
       http://python.org/sf/1480067  closed by  gbrandl

Fix to allow urllib2 digest auth to talk to livejournal.com  (2005-02-18)
       http://python.org/sf/1143695  closed by  gbrandl

pdb: fix for #1472191('clear' command bug)  (2006-04-18)
       http://python.org/sf/1472184  closed by  gbrandl

__init__.py'less package import warnings  (2006-04-26)
       http://python.org/sf/1477281  closed by  gbrandl

imputil "from" os.path import bug  (2006-05-04)
       http://python.org/sf/1481530  closed by  gbrandl

New / Reopened Bugs
___________________

urllib2.Request constructor to urllib.quote the url given  (2006-04-20)
CLOSED http://python.org/sf/1473560  reopened by  gbrandl

'compile' built-in function failures when missing EOL  (2006-04-30)
       http://python.org/sf/1479099  opened by  Ori Peleg

test_inspect fails on WinXP (2.5a2)  (2006-04-30)
CLOSED http://python.org/sf/1479226  opened by  Miki Tebeka

Segmentation fault while using Tkinter  (2006-05-01)
       http://python.org/sf/1479586  opened by  Ali Gholami Rudi

Uninstall does not clearn registry  (2006-05-01)
       http://python.org/sf/1479626  opened by  Miki Tebeka

float->str rounding bug  (2006-05-01)
CLOSED http://python.org/sf/1479776  opened by  Michael Nolta

Quitter object masked  (2006-05-01)
       http://python.org/sf/1479785  opened by  Jim Jewett

mmap tries to truncate special files  (2006-05-02)
CLOSED http://python.org/sf/1480678  opened by  Carl Banks

long(float('nan'))!=0L  (2006-05-03)
       http://python.org/sf/1481296  opened by  Erik Dahl

parse_makefile doesn't handle $$  correctly  (2006-05-03)
       http://python.org/sf/1481347  opened by  Han-Wen Nienhuys

Docs on import of email.MIMExxx classes wrong  (2006-05-04)
       http://python.org/sf/1481650  opened by  Hugh Gibson

hpux ia64 shared lib ext should be ".so"  (2006-05-04)
       http://python.org/sf/1481770  opened by  David Everly

Shift+Backspace exhibits odd behavior  (2006-05-04)
       http://python.org/sf/1482122  opened by  NothingCanFulfill

socket.getsockopt bug  (2006-05-05)
CLOSED http://python.org/sf/1482328  opened by  ganges master

Forwarding events and Tk.mainloop problem  (2006-05-05)
       http://python.org/sf/1482402  opened by  Matthias Kievernagel

asyncore is not listed in test_sundry  (2006-05-05)
       http://python.org/sf/1482746  opened by  jackilyn

Bugs Closed
___________

urllib2.Request constructor to urllib.quote the url given  (2006-04-20)
       http://python.org/sf/1473560  closed by  gbrandl

size limit exceeded for read() from network drive  (2006-04-28)
       http://python.org/sf/1478529  closed by  tim_one

urllib2 AuthHandlers can pass a bad host to HTTPPasswordMgr  (2004-02-20)
       http://python.org/sf/900898  closed by  gbrandl

test_inspect fails on WinXP (2.5a2)  (2006-04-30)
       http://python.org/sf/1479226  closed by  gbrandl

urllib violates rfc 959  (2005-09-04)
       http://python.org/sf/1281692  closed by  gbrandl

urllib/urllib2 cannot ftp files which are not listable.  (2005-11-15)
       http://python.org/sf/1357260  closed by  gbrandl

test_ctypes: undefined symbol: XGetExtensionVersion  (2006-04-28)
       http://python.org/sf/1478253  closed by  gbrandl

compiler module does not detect a syntax error  (2005-12-19)
       http://python.org/sf/1385040  closed by  gbrandl

Tkinter hangs in test_tcl  (2006-04-23)
       http://python.org/sf/1475162  closed by  loewis

float->str rounding bug  (2006-05-01)
       http://python.org/sf/1479776  closed by  tim_one

setup.py --help-commands exception  (2006-04-16)
       http://python.org/sf/1471407  closed by  jwpye

HTMLParser doesn't treat endtags in <script> tags as CDATA  (2004-10-21)
       http://python.org/sf/1051840  closed by  fdrake

HTMLParser can't handle page with javascript  (2004-11-30)
       http://python.org/sf/1076070  closed by  fdrake

mmap tries to truncate special files  (2006-05-02)
       http://python.org/sf/1480678  closed by  tim_one

urllib2 authentication problem  (2003-02-05)
       http://python.org/sf/680577  closed by  gbrandl

urllib has trouble with Windows filenames  (2006-02-22)
       http://python.org/sf/1436428  closed by  gbrandl

Defining a class with __dict__ brakes attributes assignment  (2006-03-11)
       http://python.org/sf/1448042  closed by  gbrandl

pdb 'clear' command doesn't clear selected bp's  (2006-04-18)
       http://python.org/sf/1472191  closed by  gbrandl

socket.getsockopt bug  (2006-05-05)
       http://python.org/sf/1482328  closed by  gangesmaster

New / Reopened RFE
__________________

Add SeaMonkey to webbrowser.py  (2006-04-25)
CLOSED http://python.org/sf/1476166  reopened by  phd

IOBaseError  (2006-05-03)
       http://python.org/sf/1481036  opened by  kxroberto

RFE Closed
__________

Add SeaMonkey to webbrowser.py  (2006-04-25)
       http://python.org/sf/1476166  closed by  gbrandl

"idna" encoding (drawing unicodedata) necessary in httplib?  (2006-04-18)
       http://python.org/sf/1472176  closed by  gbrandl


From martin at v.loewis.de  Sat May  6 08:11:29 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 May 2006 08:11:29 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445BCF3F.5040909@benjiyork.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	<79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>	<445B5795.60104@benjiyork.com>
	<445BC8CD.7060909@v.loewis.de> <445BCF3F.5040909@benjiyork.com>
Message-ID: <445C3E11.8090009@v.loewis.de>

Benji York wrote:
>>> I'm not familiar with the mechanics, recent versions of Subversion
>>> allow per-directory security.
>>
>> It works fine for http(s), but not for svn+ssh.
> 
> Versions prior to 1.3 could use Apache's authorization system.

How would that work? Apache is not involved at all.

> Subversion 1.3 added a path-based authorization feature to svnserve.

That's what I mean by "does not work fine": I would need to update
to subversion 1.3.

Regards,
Martin

From martin at v.loewis.de  Sat May  6 08:20:26 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 May 2006 08:20:26 +0200
Subject: [Python-Dev] binary trees. Review obmalloc.c
In-Reply-To: <445B03B4.6070600@renet.ru>
References: <20060426094148.66EE.JCARLSON@uci.edu>
	<44508866.60503@renet.ru>	<20060502222033.678D.JCARLSON@uci.edu>
	<445B03B4.6070600@renet.ru>
Message-ID: <445C402A.1090809@v.loewis.de>

Vladimir 'Yu' Stepanov wrote:
> Yes. I understood it when resulted a set example.
>> However, as I just said, people usually don't remove items from
>> just-sorted lists, they tend to iterate over them via 'for i in list:' .
>>   
> Such problem arises at creation of the list of timers.

For a list of timers/timeout events, a priority queue is often the best
data structure, which is already available through the heapq module.
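
A minimal sketch of a timer queue built on heapq; the deadlines, the
event names and the fixed value of now are invented for illustration:

```python
import heapq

timers = []  # heap of (deadline, name) pairs, earliest deadline first

heapq.heappush(timers, (30.0, "flush cache"))
heapq.heappush(timers, (5.0, "send heartbeat"))
heapq.heappush(timers, (12.5, "retry connect"))

# Fire every timer whose deadline has passed.
now = 15.0
fired = []
while timers and timers[0][0] <= now:
    deadline, name = heapq.heappop(timers)
    fired.append(name)
```

Both push and pop are O(log n), and the next timer to expire is always
at timers[0], so no full sort or tree is needed.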

Regards,
Martin

From sluggoster at gmail.com  Sat May  6 08:29:36 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Fri, 5 May 2006 23:29:36 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445B1167.5000401@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445A017D.4080207@ofai.at>
	<445A07D6.403@gmail.com>
	<79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>
	<6e9196d20605050026o6675b2d4jc97380bb4dde4ced@mail.gmail.com>
	<445B1167.5000401@gmail.com>
Message-ID: <6e9196d20605052329s444e7a06i2850b6be16c7dbc9@mail.gmail.com>

On 5/5/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Mike Orr wrote:
> > On 5/4/06, Paul Moore <p.f.moore at gmail.com> wrote:
> >> (But all the current proposals seem to build on os.path, so maybe I
> >> should assume otherwise, that os.path will remain indefinitely...)
> >
> > They build on os.path because that's what we're familiar with using.
> > There's no reason to write the platform-specific classes until we
> > agree on an API; that would just be work down the drain.  When the new
> > classes are in the library, we can:  (one or more)
> >
> > - Leave os.path.foo() alone because it works and most existing programs need it.
>
> The threading module doesn't really obsolete the thread module, it just
> provides a higher level, more convenient API.
>
> Similarly, I don't believe it's a given that a nice path object will obsolete
> the low level operations. When translating a shell script to Python (or vice
> versa), having access to the comparable low level operations would be of benefit.

I think you meant to say Perl in that sentence.  In Python there
should be one way to do it, and beautiful is better than ugly.  The
os.path functions are bad enough, but shutil.copy2 is just plain evil.
   Is it that much of a step to translate:

    Y="$(dirname $(dirname $X))/lib"

as

    y = Path(x).parent.parent + "lib"
    y = Path(x).parent.parent.join("lib")
    or whatever the syntax du jour is

rather than

    y = os.path.join(os.path.dirname(os.path.dirname(x)), "lib")

?

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From ncoghlan at gmail.com  Sat May  6 08:40:50 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 06 May 2006 16:40:50 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <152b01c67099$e3a89020$8d472597@bagio>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com><6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com><4459FA21.6030508@gmail.com>	<445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio>
Message-ID: <445C44F2.2090005@gmail.com>

[Nick]
>>> Similarly, I would separate out the extension to a distinct
>>> attribute, as it too uses a different separator from the normal path
>>> elements ('.' most places, but '/' on RISC OS, for example)

[Greg]
>> -1. What constitutes "the extension" is not well-defined in
>> all cases. What about filenames with multiple suffixes,
>> such as "spam.tar.gz"? What part of that would you put in
>> the extension attribute?

[Giovanni]
> .gz of course:
> 
> Path = "foo.tar.gz"
> Path.ext = ".gz"
> Path.name.ext = ".tar"
> Path.name.name.ext = ""
> 
> Which is exactly the *same* thing that os.path.splitext() does. And yes, I do
> use splitext quite a lot.

Remember, the idea with portable path information is to *never* store os.sep 
and os.extsep anywhere in the internal data - those should only be added when 
needed to produce strings to pass to OS-specific functions or to display to users.

So I suggest splitting the internal data into 'path elements separated by 
os.sep', 'name elements separated by os.extsep' and 'the base path' (which can 
be any kind of object that supports __str__, including another path object).

This leads to the following:

import os
import pathlib
from pathlib import Path, PosixPath, WindowsPath, HOMEDIR

pth = Path("~/foo/bar/baz.tar.gz")

assert pth.basepath == HOMEDIR
assert pth.dirparts == ('foo', 'bar')
assert pth.nameparts == ('baz', 'tar', 'gz')

I would also provide an alternate constructor "from_parts" which looked like:

pth = Path.from_parts(HOMEDIR, ('foo', 'bar'), ('baz', 'tar', 'gz'))

Now consider PosixPath and WindowsPath subclasses:

psx = PosixPath(pth)
win = WindowsPath(pth)

(platform specific Path variants would still accept posix-formatted path 
strings & other Path variants as inputs in addition to path strings formatted 
for the current platform)

# Formatting of the prefix is platform dependent
assert pth.prefix == str(pth.basepath)
assert psx.prefix == "/home/nick/"
assert win.prefix == "C:\\Documents and Settings\\Nick\\"

# Formatting of the directory path is platform dependent
assert pth.dir == os.sep.join(pth.dirparts + ('',))
assert psx.dir == "foo/bar/"
assert win.dir == "foo\\bar\\"

# Formatting of the file name is the same on both platforms
assert pth.name == os.extsep.join(pth.nameparts)
assert psx.name == "baz.tar.gz"
assert win.name == "baz.tar.gz"

# The full path name will then also be platform dependent
assert str(pth) == pth.prefix + pth.dir + pth.name
assert str(psx) == "/home/nick/foo/bar/baz.tar.gz"
assert str(win) == r"C:\Documents and Settings\Nick\foo\bar\baz.tar.gz"

The separation of the directory parts and the name parts also allows the
difference between "/foo/bar/" and "foo/bar" to be represented properly:

pth1 = PosixPath("/foo/bar/")
assert pth1.prefix == '/'
assert pth1.dir == "foo/bar/"
assert pth1.name == ""
assert str(pth1) == "/foo/bar/"

pth2 = PosixPath("/foo/bar")
assert pth2.prefix == '/'
assert pth2.dir == "foo/"
assert pth2.name == "bar"
assert str(pth2) == "/foo/bar"
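
To make the formatting rules above concrete, here is a minimal sketch of
how such an object might be laid out. It is not the proposed pathlib
itself: PartsPath and its sep/extsep keyword arguments are names invented
for this illustration, and parsing of path strings is left out entirely.

```python
class PartsPath(object):
    """Toy path holding (basepath, dirparts, nameparts).

    Separators are never stored; they are applied only when a string
    is produced.
    """

    def __init__(self, basepath, dirparts, nameparts, sep="/", extsep="."):
        self.basepath = basepath
        self.dirparts = tuple(dirparts)
        self.nameparts = tuple(nameparts)
        self.sep = sep
        self.extsep = extsep

    @property
    def prefix(self):
        return str(self.basepath)

    @property
    def dir(self):
        # One trailing separator after every directory element.
        return "".join(part + self.sep for part in self.dirparts)

    @property
    def name(self):
        return self.extsep.join(self.nameparts)

    def __str__(self):
        return self.prefix + self.dir + self.name


psx = PartsPath("/home/nick/", ("foo", "bar"), ("baz", "tar", "gz"))
win = PartsPath("C:\\Documents and Settings\\Nick\\", ("foo", "bar"),
                ("baz", "tar", "gz"), sep="\\")
```

The same three tuples produce both the POSIX and the Windows spelling;
only the separator chosen at formatting time differs.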


Operations such as pth.walk() would then return path objects where basepath 
was set to the original path object, and only the relative path was included 
in "dirparts".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ryanf at cs.uoregon.edu  Sat May  6 08:28:43 2006
From: ryanf at cs.uoregon.edu (Ryan Forsythe)
Date: Fri, 05 May 2006 23:28:43 -0700
Subject: [Python-Dev] Yet another type system -- request for comments on a
	SoC proposal
Message-ID: <445C421B.1040800@cs.uoregon.edu>

I'm currently revising a proposal for the Google Summer of Code, and it
was suggested that I start a thread here to get input. Apologies for the
length, but I wanted this to be more than just a link to my proposal.

The short version of my proposal is: The module would provide a
system by which the attributes of function parameters are inferred
through the testing process, and those attributes would be used to check
the arguments to function calls at runtime. In addition to a form of
runtime type checking, this information would also be used to
automatically generate up-to-date documentation.

I've built a *very* simple proof of concept here:
http://www.cs.uoregon.edu/~ryanf/verify.py

My full proposal is here:
http://www.cs.uoregon.edu/~ryanf/proposal.txt

It's been suggested that this could, instead, be an extension to one of
the lint tools; while I'm open to that suggestion it seems to me like
that would seriously change the character of the modified lint tool, as
it would be running tests rather than simply analyzing source.

So why have something automatically generate signatures, rather than
explicitly declare them? Currently, the knowledge about a function's
signature may be stored in four places (other than the function's calls):

1. The function itself: the programmer expects to be passed a group of
   objects which implement certain functions, and uses them as such.
2. The tests run against the function.
3. A sequence of checks on the passed objects, though the author may
   trust the function user to pass the right thing.
4. The documentation for the function, though, again, the author may
   leave that out.

The proposed module would use the knowledge embedded in the system by
the programmer in the first two points to generate the other
information. As the function changes over time, these generated items
will be kept up to date simply by re-running the new test cases.

A sample program using this module might look like the following:

    import sigcheck

    verify = sigcheck.decorator('./verify.pickle')      # note 1
    sigcheck.mode('learn')

    @verify                                             # note 2
    def foo(file, b, c):
        # ... do stuff ...

    class A:
        @verify
        def foo(self, a, b):
            return a * b

        # and, the worst case:
        @verify(('a', sigcheck.Int),                    # note 3.1
                ('b', ('m1', 'm2', 'm3')),              #      3.2
                ('c', sigcheck.Int, sigcheck.String),   #      3.3
                returns=sigcheck.String,                #      3.4
                raises=(MyException, MyOtherExc))       #      3.5
        def bar(self, a, b, c):
            # ... do some stuff...

1. The inferred information needs to be stored somewhere. A file could
   hold the pickled data structure between testing runs.

2. This line shows the most basic use of the decorator. The system
   uses inference for all the parameters.

3. There may be cases where the inferences are incorrect, the user
   wishes to be totally explicit about certain parameters, or wishes
   to fold the inferences into their code explicitly. By passing a
   data structure to the decorator they may partially or completely
   bypass the inferences.
   1. The system will come with some common 'types' built in.
   2. The user may also require attribute sets, free-form.
   3. If a parameter may have several, very different attribute sets,
      they may simply be listed.
   4. The return type may also be specified.
   5. The function may raise exceptions which are expected; these
      should be specified so the inference system doesn't treat them
      as errors.
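
To give a feel for the learn/check cycle, here is a deliberately tiny
sketch. It is not the verify.py proof of concept linked above: the
Recorder class and its attribute names are invented for this example, it
is written for current Python 3, it handles positional arguments only,
and it keeps the learned data in memory rather than in a pickle file.

```python
import functools
import inspect


class Recorder(object):
    """Hypothetical stand-in for the proposed module."""

    def __init__(self):
        self.mode = "learn"
        self.learned = {}   # (function name, parameter name) -> set of attribute names

    def verify(self, func):
        names = inspect.getfullargspec(func).args

        @functools.wraps(func)
        def wrapper(*args):
            for name, value in zip(names, args):
                key = (func.__name__, name)
                seen = set(dir(value))
                if self.mode == "learn":
                    # Keep only attributes common to every value observed so far.
                    self.learned[key] = self.learned.get(key, seen) & seen
                else:
                    missing = self.learned.get(key, set()) - seen
                    if missing:
                        raise TypeError("%s: argument %r lacks %s"
                                        % (func.__name__, name, sorted(missing)))
            return func(*args)
        return wrapper


recorder = Recorder()


@recorder.verify
def shout(text):
    return text.upper()


# "Test run": the recorder learns what a valid `text` looks like.
shout("spam")
recorder.mode = "check"
```

After the switch to check mode, shout("eggs") still works, while
shout(42) raises TypeError because an int lacks the string attributes
recorded during the test run.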


From ncoghlan at gmail.com  Sat May  6 09:21:49 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 06 May 2006 17:21:49 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <6e9196d20605052329s444e7a06i2850b6be16c7dbc9@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>	
	<4459FA21.6030508@gmail.com> <445A017D.4080207@ofai.at>	
	<445A07D6.403@gmail.com>	
	<79990c6b0605040718s10b6e177s95906387806e2508@mail.gmail.com>	
	<6e9196d20605050026o6675b2d4jc97380bb4dde4ced@mail.gmail.com>	
	<445B1167.5000401@gmail.com>
	<6e9196d20605052329s444e7a06i2850b6be16c7dbc9@mail.gmail.com>
Message-ID: <445C4E8D.1040507@gmail.com>

Mike Orr wrote:
> I think you meant to say Perl in that sentence.  In Python there
> should be one way to do it, and beautiful is better than ugly.  The
> os.path functions are bad enough, but shutil.copy2 is just plain evil.

I agree a decent pathlib may make many of the low-level filesystem 
manipulation modules obsolete - just not necessarily all of them. Until we get 
an actual module to experiment with, we won't really know.

>   Is it that much of a step to translate:
> 
>    Y="$(dirname $(dirname $X))/lib"
> 
> as
> 
>    y = Path(x).parent.parent + "lib"
>    y = Path(x).parent.parent.join("lib")
>    or whatever the syntax du jour is

I'd suggest something like:

y = Path(x).parent(2) + "lib/"

Where the parent function works like (based on the three-part internal data 
approach I posted recently):

   def parent(self, generation=1):
       return type(self).from_parts(
                         self.basepath, self.dirparts[:-generation], ())

And path addition is equivalent to:

   def __add__(self, other):
       if isinstance(other, basestring):
           other = Path(other)
       if self.nameparts:
           raise ValueError("Cannot add to path with non-empty name")
       if other.basepath:
           raise ValueError("Cannot append non-relative path")
       dirparts = self.dirparts + other.dirparts
       return type(self).from_parts(
                     self.basepath, dirparts, other.nameparts)
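
For anyone who wants to try these two methods without waiting for a real
module, here is a self-contained sketch. MiniPath and its very naive
'/'-only string parsing are assumptions made for this example (and it
uses str where the code above uses basestring, so that it runs on
Python 3).

```python
class MiniPath(object):
    """Toy three-part path; parsing is '/'-only and deliberately naive."""

    def __init__(self, text):
        # A leading '/' becomes the base path; a trailing '/' means "no name".
        self.basepath = "/" if text.startswith("/") else ""
        head, _, name = text.lstrip("/").rpartition("/")
        self.dirparts = tuple(p for p in head.split("/") if p)
        self.nameparts = tuple(name.split(".")) if name else ()

    @classmethod
    def from_parts(cls, basepath, dirparts, nameparts):
        self = cls.__new__(cls)
        self.basepath = basepath
        self.dirparts = tuple(dirparts)
        self.nameparts = tuple(nameparts)
        return self

    def parent(self, generation=1):
        return type(self).from_parts(
            self.basepath, self.dirparts[:-generation], ())

    def __add__(self, other):
        if isinstance(other, str):
            other = MiniPath(other)
        if self.nameparts:
            raise ValueError("Cannot add to path with non-empty name")
        if other.basepath:
            raise ValueError("Cannot append non-relative path")
        return type(self).from_parts(
            self.basepath, self.dirparts + other.dirparts, other.nameparts)

    def __str__(self):
        dirs = "".join(part + "/" for part in self.dirparts)
        return self.basepath + dirs + ".".join(self.nameparts)


y = MiniPath("/opt/app/bin/tools/").parent(2) + "lib/"
```

Here str(y) comes out as "/opt/app/lib/", and adding an absolute path
such as "/etc/" raises ValueError.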


Anyway, I'll try to find some time to look at this on the Wiki. It'll be 
easier to bat code back and forth over there.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From sluggoster at gmail.com  Sat May  6 09:37:05 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Sat, 6 May 2006 00:37:05 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <4459FA21.6030508@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com>
Message-ID: <6e9196d20605060037i159f3b7fp5acb6b1eb08cf2c0@mail.gmail.com>

There's something that bothers me about putting the path in an
attribute rather than subclassing tuple.  I prefer it that way but I
don't see how you'd do directory slicing and joining.  If the path is
a tuple it's easy:

p = Path("/a/b/c")
p[:-1]         # Path("/a/b")

If the directory components are on an attribute

p.path[:-1]       # ("a", "b")

How do you do slicing and joining?  If Path subclasses object, it
could be done there like in the first example.  But if Path subclasses
string, that API is taken:

p[:-1]                # "/a/b"


--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From martin at v.loewis.de  Sat May  6 09:57:23 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 06 May 2006 09:57:23 +0200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <6e9196d20605060037i159f3b7fp5acb6b1eb08cf2c0@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>	<4459FA21.6030508@gmail.com>
	<6e9196d20605060037i159f3b7fp5acb6b1eb08cf2c0@mail.gmail.com>
Message-ID: <445C56E3.5030101@v.loewis.de>

Mike Orr wrote:
> There's something that bothers me about putting the path in an
> attribute rather than subclassing tuple.
[...]
> How do you do slicing and joining?  If Path subclasses object, it
> could be done there like in the first example.  But if Path subclasses
> string, that API is taken:

I think Path should neither subclass string nor tuple. It might behave
like a string or a tuple, but it shouldn't subclass, as that
unnecessarily constrains the implementation strategy.

In particular, it shouldn't subclass str, as you then could not
represent Unicode file names in a reasonable way anymore.
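
To illustrate "behave like a tuple without subclassing": slicing only
requires a __getitem__ method. The class below is a sketch with invented
names (SlicePath, the absolute flag), not a proposal for the actual API.

```python
class SlicePath(object):
    """Illustrative only: stores components privately, subclasses nothing."""

    def __init__(self, parts, absolute=True):
        self._parts = tuple(parts)
        self._absolute = absolute

    def __len__(self):
        return len(self._parts)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice that drops leading components is no longer absolute.
            start, _, _ = index.indices(len(self._parts))
            return SlicePath(self._parts[index],
                             self._absolute and start == 0)
        return self._parts[index]

    def __str__(self):
        return ("/" if self._absolute else "") + "/".join(self._parts)


p = SlicePath(["a", "b", "c"])
```

With this, str(p[:-1]) gives "/a/b", str(p[1:]) gives "b/c", and p[-1]
gives "c", while the internal representation stays free to change.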

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Sat May  6 10:47:11 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 06 May 2006 20:47:11 +1200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
Message-ID: <445C628F.3050801@canterbury.ac.nz>

Tim Peters wrote:
> Instead it would make best sense for each
> sprint project to work in its own branch, something SVN makes very
> easy, but only for those who _can_ commit.

There's no way of restricting commit privileges to
a particular branch?

--
Greg

From fperez.net at gmail.com  Sat May  6 11:00:49 2006
From: fperez.net at gmail.com (Fernando Perez)
Date: Sat, 06 May 2006 03:00:49 -0600
Subject: [Python-Dev] Python sprint mechanics
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<445AE48B.1070508@v.loewis.de>
Message-ID: <e3hok2$o4m$1@sea.gmane.org>

"Martin v. Löwis" wrote:

> Tim Peters wrote:
>> Since I hope we see a lot more of these problems in the future, what
>> can be done to ease the pain?  I don't know enough about SVN admin to
>> know what might be realistic.  Adding a pile of  "temporary
>> committers" comes to mind, but wouldn't really work since people will
>> want to keep working on their branches after the sprint ends.  Purely
>> local SVN setups wouldn't work either, since sprint projects will
>> generally have more than one worker bee, and they need to share code
>> changes.
> 
> I think Fredrik Lundh points to svk at such occasions.

Allow me to make a suggestion.  A few months ago, Robert Kern (scipy dev)
taught me a very neat trick for this kind of problem; we were going to hack
on ipython together and wanted a quick way to share code without touching
the main SVN repo (though both of us have commit rights).  His solution was
very simple, involving just twisted and bazaar-ng (Robert had some bad
experiences with SVK, though I suppose you could use that as well).

I later had the occasion to test it on a multi-day sprint with other
developers, and was extremely happy with the results.  For that sprint, I
documented the process here:

http://projects.scipy.org/neuroimaging/ni/wiki/SprintCodeSharing

You might find this useful.  In combination with IP aliasing over a local
subnet to create stable IPs, and a few named aliases in a little hosts
file, a group of people can find each other's data in an extremely
efficient way over several days, sync back and forth as needed, etc.  At
the end of the sprint, the 'real' committers can evaluate how much of what
got done they want to push to the upstream SVN servers.

I hope this helps, feel free to ask for further details.

Cheers,

f


From vys at renet.ru  Sat May  6 11:10:06 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Sat, 06 May 2006 13:10:06 +0400
Subject: [Python-Dev] binary trees.
In-Reply-To: <20060505105630.67BE.JCARLSON@uci.edu>
References: <20060502222033.678D.JCARLSON@uci.edu> <445B03B4.6070600@renet.ru>
	<20060505105630.67BE.JCARLSON@uci.edu>
Message-ID: <445C67EE.1030704@renet.ru>

Josiah Carlson wrote:
> "Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
>> Josiah Carlson wrote:
>>> However, as I just said, people usually don't remove items from
>>> just-sorted lists, they tend to iterate over them via 'for i in list:' .
>>>   
>> Such problem arises at creation of the list of timers.
>
> I've never seen that particular use-case.
>
> Please understand something.  I believe that trees can be useful in some
> cases. However, I don't believe that they are generally useful
> enough in Python, given that we already have key,value dictionaries and
> sortable lists.  They become even less worthwhile in Python, given the
> total ordering problem that I describe later in this message.

I am being tiresome, it seems :)
Sorry for that.

>> Here is what I found through Google for "order statistic binary tree":
>>
>>     http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic20/
>
> Yes, that link does describe what an 'order statistic tree' is.  One
> thing to note is that you can add order statistics to any tree with one
> more field on each tree node.  That is, you can have your AVL or
> Red-Black tree, and add order statistics to it.  Allowing tree['x'] to
> return the associated value for the key 'x' (the same as a dictionary),
> as well as offer the ability to do tree.select(10) to get the 10th (or
> 11th if one counts from 0) smallest key (or key,value), or get the 'rank'
> of a key with tree.rank('x').

Thanks.

> This problem has nothing to do with dictionaries and hashing, it has to
> do with the fact that there may not be a total ordering on the elements
> of a sequence.

That is sad. I did not know that, which is why I did not understand.

I searched Google for "python dev total ordering", but found nothing about
plans to fix this problem. Has it been discussed before?

--
Vladimir Stepanov

From vys at renet.ru  Sat May  6 11:31:54 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Sat, 06 May 2006 13:31:54 +0400
Subject: [Python-Dev] binary trees.
In-Reply-To: <445C402A.1090809@v.loewis.de>
References: <20060426094148.66EE.JCARLSON@uci.edu>
	<44508866.60503@renet.ru>	<20060502222033.678D.JCARLSON@uci.edu>
	<445B03B4.6070600@renet.ru> <445C402A.1090809@v.loewis.de>
Message-ID: <445C6D0A.9070400@renet.ru>

Martin v. Löwis wrote:
> Vladimir 'Yu' Stepanov wrote:
>   
>> Yes. I understood it when resulted a set example.
>>     
>>> However, as I just said, people usually don't remove items from
>>> just-sorted lists, they tend to iterate over them via 'for i in list:' .
>>>   
>>>       
>> Such problem arises at creation of the list of timers.
>>     
>
> For a list of timers/timeout events, a priority queue is often the best
> data structure, which is already available through the heapq module.
>   

That is true in most cases. heapq is very well suited to these purposes.
The only problem is with removing an arbitrary element from the list.

--
Vladimir Stepanov

From jcarlson at uci.edu  Sat May  6 11:58:53 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 06 May 2006 02:58:53 -0700
Subject: [Python-Dev] binary trees.
In-Reply-To: <445C6D0A.9070400@renet.ru>
References: <445C402A.1090809@v.loewis.de> <445C6D0A.9070400@renet.ru>
Message-ID: <20060506025215.67D7.JCARLSON@uci.edu>


"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> 
> Martin v. Löwis wrote:
> > Vladimir 'Yu' Stepanov wrote:
> >   
> >> Yes. I understood it when resulted a set example.
> >>     
> >>> However, as I just said, people usually don't remove items from
> >>> just-sorted lists, they tend to iterate over them via 'for i in list:' .
> >>>   
> >>>       
> >> Such problem arises at creation of the list of timers.
> >>     
> >
> > For a list of timers/timeout events, a priority queue is often the best
> > data structure, which is already available through the heapq module.
> >   
> 
> That is true in most cases. heapq is very well suited to these purposes.
> The only problem is with removing an arbitrary element from the list.

For this particular problem, usually you don't want to say, "remove time
4.34252 from the heap", you want to say "remove the (time,event) pair
given the event X, from the heap", where you have in your heap (time,
event) pairs.

For hashable events, you can create a class which implements enough
list-like semantics to allow it to work like a heap for event timing
(__getitem__, __setitem__, __len__, .pop(-1) is about it, if I remember
correctly), but also allows you to remove arbitrary events (as items are
shifted around in the heap, you update pointers from a dictionary).

The details are a bit annoying to get right, but it is possible.
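
Here is a rough sketch of the idea. Note that, for brevity, it does its
own sifting instead of wrapping a list-like object for heapq to operate
on, and the class and method names (EventHeap, push, remove, pop) are
just made up for this message. Events must be hashable and unique.

```python
class EventHeap(object):
    """Min-heap of (time, event) pairs that supports removal by event."""

    def __init__(self):
        self._heap = []     # list of (time, event) pairs in heap order
        self._pos = {}      # event -> current index in self._heap

    def __len__(self):
        return len(self._heap)

    def _swap(self, i, j):
        heap = self._heap
        heap[i], heap[j] = heap[j], heap[i]
        # Keep the dictionary pointers in step with every move.
        self._pos[heap[i][1]] = i
        self._pos[heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self._heap[i][0] >= self._heap[parent][0]:
                break
            self._swap(i, parent)
            i = parent

    def _sift_down(self, i):
        n = len(self._heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self._heap[child][0] < self._heap[smallest][0]:
                    smallest = child
            if smallest == i:
                break
            self._swap(i, smallest)
            i = smallest

    def push(self, time, event):
        self._heap.append((time, event))
        self._pos[event] = len(self._heap) - 1
        self._sift_up(len(self._heap) - 1)

    def remove(self, event):
        """Remove and return the (time, event) pair for the given event."""
        i = self._pos.pop(event)
        removed = self._heap[i]
        last = self._heap.pop()
        if i < len(self._heap):
            # Fill the hole with the former last item, then restore order.
            self._heap[i] = last
            self._pos[last[1]] = i
            self._sift_up(i)
            self._sift_down(i)
        return removed

    def pop(self):
        """Remove and return the earliest (time, event) pair."""
        return self.remove(self._heap[0][1])


timers = EventHeap()
for when, name in [(4.0, "a"), (1.0, "b"), (3.0, "c"), (2.0, "d")]:
    timers.push(when, name)
cancelled = timers.remove("c")
order = [timers.pop() for _ in range(len(timers))]
```

Only the times are compared, so the events themselves never need to be
orderable, and removing an arbitrary event costs O(log n).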


 - Josiah


From jcarlson at uci.edu  Sat May  6 12:12:11 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 06 May 2006 03:12:11 -0700
Subject: [Python-Dev] binary trees.
In-Reply-To: <445C67EE.1030704@renet.ru>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
Message-ID: <20060506025902.67DA.JCARLSON@uci.edu>


"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> Josiah Carlson wrote:
> > This problem has nothing to do with dictionaries and hashing, it has to
> > do with the fact that there may not be a total ordering on the elements
> > of a sequence.
> 
> That is sad. I did not know that, which is why I did not understand.
> 
> I searched Google for "python dev total ordering", but found nothing about
> plans to fix this problem. Has it been discussed before?

Not really.  It has to do with how non-comparable instances compare to
each other.  Generally speaking, when instances aren't comparable,
Python compares their type, which is actually comparing the names of
types. This wouldn't be a big deal, except that:

>>> str < tuple < unicode
True

And you can actually compare str and unicode, so, if you have a str that
is greater than the unicode, you run into this issue.  With unicode
becoming str in Py3k, we may not run into this issue much then, unless
bytes are comparable to str, in which case we end up with the same
problems.

Actually, according to my google of "python dev total ordering", it
gives me...

http://mail.python.org/pipermail/python-dev/2003-March/034169.html
http://mail.python.org/pipermail/python-list/2003-March/154142.html

These are earlier discussions of this topic, and they are quite relevant.
The ultimate solution is to choose a total ordering on types and
consider the problem solved.  Then list.sort() and binary trees will
work properly.
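
As a sketch of what "choose a total ordering on types" could look like
from user code, one can sort with a key that orders by type name first
and by value second. This is only an illustration, not a proposal for
the interpreter; total_key is a name I made up, and the example is
written for Python 3, where comparing a str with an int raises TypeError.

```python
def total_key(value):
    """Order first by type name, then by value within a type."""
    return (type(value).__name__, value)


mixed = [3, "pear", 1, (2, "b"), "apple", (1, "z")]
ordered = sorted(mixed, key=total_key)
```

Values of different types are never compared with each other, because
the type names already decide the order. One limitation of this
particular key is that it also separates int from float, which a real
solution would have to treat as mutually comparable.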


 - Josiah


From greg.ewing at canterbury.ac.nz  Sat May  6 13:02:28 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 06 May 2006 23:02:28 +1200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445C44F2.2090005@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
Message-ID: <445C8244.4060607@canterbury.ac.nz>

Nick Coghlan wrote:

> So I suggest splitting the internal data into 'path elements separated 
> by os.sep', 'name elements separated by os.extsep'

What bothers me about that is that in many systems
there isn't any formal notion of an "extension",
just a convention used by some applications.

Just because I have a "." in my filename doesn't
necessarily mean I intend what follows to be
treated as an extension.

> assert pth.basepath == HOMEDIR
> assert pth.dirparts == ('foo', 'bar')
> assert pth.nameparts == ('baz', 'tar', 'gz')

What if one of the dirparts contains a character
happening to match os.extsep? When you do
pth2 = pth[:-1], does it suddenly get split up
into multiple nameparts, even though it's actually
naming a directory rather than a file?

(This is not hypothetical -- it's a common convention
in some unix systems to use names like "spam.d" for
directories of configuration files.)

--
Greg

From greg.ewing at canterbury.ac.nz  Sat May  6 13:08:31 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 06 May 2006 23:08:31 +1200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <6e9196d20605060037i159f3b7fp5acb6b1eb08cf2c0@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com>
	<6e9196d20605060037i159f3b7fp5acb6b1eb08cf2c0@mail.gmail.com>
Message-ID: <445C83AF.5000403@canterbury.ac.nz>

Mike Orr wrote:

> How do you do slicing and joining?  If Path subclasses object, it
> could be done there like in the first example.  But if Path subclasses
> string,

Er, hang on, I thought the idea was that it wouldn't
subclass either tuple *or* str, but would be a new
class all of its own.

That seems like the best idea, because it gives the
most flexibility for dealing with the quirks of
different systems' pathname semantics.

--
Greg

From rasky at develer.com  Sat May  6 14:37:22 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 6 May 2006 14:37:22 +0200
Subject: [Python-Dev] Alternative path suggestion
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
	<445C8244.4060607@canterbury.ac.nz>
Message-ID: <17c201c67109$d256dc30$8d472597@bagio>

Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

>> So I suggest splitting the internal data into 'path elements
>> separated by os.sep', 'name elements separated by os.extsep'
>
> What bothers me about that is that in many systems
> there isn't any formal notion of an "extension",
> just a convention used by some applications.
>
> Just because I have a "." in my filename doesn't
> necessarily mean I intend what follows to be
> treated as an extension.

This is up to the application to find out or deduce. Python still needs
proper support for this widespread convention. I would say that *most* of
the files on my own hard disk follow this convention (let's see: *.html, *.py,
*.sh, *.cpp, *.c, *.h, *.conf, *.jpg, *.mp3, *.png, *.txt, *.zip, *.tar, *.gz,
*.pl, *.odt, *.mid, *.so... hey, I can't find a file which doesn't follow
this convention). Even if you can have a Python file without an extension, it
doesn't mean that an application should manually extract "foo" out of "foo.py"
just because you could also name it only "foo".

>> assert pth.basepath == HOMEDIR
>> assert pth.dirparts == ('foo', 'bar')
>> assert pth.nameparts == ('baz', 'tar', 'gz')
>
> What if one of the dirparts contains a character
> happening to match os.extsep? When you do
> pth2 = pth[:-1], does it suddenly get split up
> into multiple nameparts, even though it's actually
> naming a directory rather than a file?
>
> (This is not hypothetical -- it's a common convention
> in some unix systems to use names like "spam.d" for
> directories of configuration files.)

Yes. And there is absolutely nothing wrong with it. Do you have in mind any
real-world case in which an algorithm is broken by the fact that splitext()
doesn't magically guess what's an extension and what is not? People coding
Python applications *know* that splitext() will split the string using the
rightmost dot (if any). They *know* it doesn't do anything more than that; and
that it does that on any path string, even those where the rightmost dot
doesn't indicate a real extension. They know it's not magical, in that it can't
try to guess whether it is a real extension or not, and so does the simple,
clear, mechanical thing: it splits on the right-most dot.

And even if they know this "limitation" (if you want to call it so, I call it
"clear, consistent behaviour which applies to a not-always-consistently-used
convention"), the function is still useful.
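
For reference, this is the mechanical behaviour I am talking about. The
sample paths are invented, and I use posixpath directly so the results
do not depend on the platform running the snippet:

```python
import posixpath

examples = ["foo.tar.gz", "foo.tar", "foo", "/etc/spam.d", "/etc/spam.d/rc"]
results = [posixpath.splitext(path) for path in examples]

# results:
#   ("foo.tar", ".gz")
#   ("foo", ".tar")
#   ("foo", "")
#   ("/etc/spam", ".d")        the "spam.d" directory name is split too
#   ("/etc/spam.d/rc", "")     a dot in a parent directory is ignored
```

Only a dot in the last path component counts, and it is always the
rightmost one.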

Giovanni Bajo


From benji at benjiyork.com  Sat May  6 15:48:12 2006
From: benji at benjiyork.com (Benji York)
Date: Sat, 06 May 2006 09:48:12 -0400
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445C3E11.8090009@v.loewis.de>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	<79990c6b0605050113j5b05396dle6b467325633fa1@mail.gmail.com>	<445B5795.60104@benjiyork.com>
	<445BC8CD.7060909@v.loewis.de> <445BCF3F.5040909@benjiyork.com>
	<445C3E11.8090009@v.loewis.de>
Message-ID: <445CA91C.3010908@benjiyork.com>

Martin v. Löwis wrote:
> Benji York wrote:
>>Subversion 1.3 added a path-based authorization feature to svnserve.
> 
> That's what I mean by "does not work fine": I would need to update
> to subversion 1.3.

I noted that in my original message.  I thought you meant that it wasn't 
possible at all with svnserve.  A simple misunderstanding.
--
Benji York

From ryan at cs.uoregon.edu  Sat May  6 04:17:29 2006
From: ryan at cs.uoregon.edu (Ryan Forsythe)
Date: Fri, 05 May 2006 19:17:29 -0700
Subject: [Python-Dev] Yet another type system -- request for comments on a
	SoC proposal
Message-ID: <445C0739.5080404@cs.uoregon.edu>

I'm currently revising a proposal for the Google Summer of Code, and it
was suggested that I start a thread here to get input. Apologies for the
length, but I wanted this to be more than just a link to my proposal.

The short version of my proposal is: The module would provide a
system by which the manifests* of function parameters are inferred
through the testing process, and those manifests would be used to check
the arguments to function calls at runtime. In addition to a form of
runtime type checking, this information would also be used to
automatically generate up-to-date documentation.

*: I'm using the word 'manifest' to refer to the list of methods /
attributes provided by an object -- that is, what you get from `dir(obj)`.

I've built a *very* simple proof of concept here:
http://www.cs.uoregon.edu/~ryanf/verify.py

My full proposal is here:
http://www.cs.uoregon.edu/~ryanf/proposal.txt

It's been suggested that this could, instead, be an extension to one of
the lint tools; while I'm open to that suggestion it seems to me like
that would seriously change the character of the modified lint tool, as
it would be running tests rather than simply analyzing source.

So why have something automatically generate signatures, rather than
explicitly declare them? Currently, the knowledge about a function's
signature may be stored in four places (other than the function's calls):

1. The function itself: the programmer expects to be passed a group of
   objects which implement certain functions, and uses them as such.
2. The tests run against the function.
3. A sequence of checks on the passed objects, though the author may
   trust the function user to pass the right thing.
4. The documentation for the function, though, again, the author may
   leave that out.

The proposed module would use the knowledge embedded in the system by
the programmer in the first two points to generate the other
information. As the function changes over time, these generated items
will be kept up to date simply by re-running the new test cases.

A sample program using this module might look like the following:

    import sigcheck

    verify = sigcheck.decorator('./verify.pickle')      # note 1
    sigcheck.mode('learn')

    @verify                                             # note 2
    def foo(file, b, c):
        # ... do stuff ...

    class A:
        @verify
        def foo(self, a, b):
            return a * b

        # and, the worst case:
        @verify(('a', sigcheck.Int),                    # note 3.1
                ('b', ('m1', 'm2', 'm3')),              #      3.2
                ('c', sigcheck.Int, sigcheck.String),   #      3.3
                returns=sigcheck.String,                #      3.4
                raises=(MyException, MyOtherExc))       #      3.5
        def bar(self, a, b, c):
            # ... do some stuff...

1. The inferred information needs to be stored somewhere. A file could
   hold the pickled data structure between testing runs.

2. This line shows the most basic use of the decorator. The system
   uses inference for all the parameters.

3. There may be cases where the inferences are incorrect, the user
   wishes to be totally explicit about certain parameters, or wishes
   to fold the inferences into their code explicitly. By passing a
   data structure to the decorator they may partially or completely
   bypass the inferences.
   1. The system will come with some common 'types' built in.
   2. The user may also require manifests, free-form.
   3. If a parameter may have several, very different manifests, they
      may simply be listed.
   4. The return type may also be specified.
   5. The function may raise exceptions which are expected; these
      should be specified so the inference system doesn't treat them
      as errors.
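
To make the learn/check cycle concrete, here is a rough sketch of how
such a decorator could behave. It is not the code in verify.py: the
SigCheck class, its 'mode' attribute and the manifest() helper are
hypothetical names, nothing is pickled, only positional arguments are
handled, and intersecting the manifests seen while learning is just one
assumed inference policy.

```python
import functools


def manifest(obj):
    """Public names provided by obj, i.e. a filtered dir(obj)."""
    return frozenset(name for name in dir(obj) if not name.startswith('_'))


class SigCheck:
    """Hypothetical stand-in for the proposed module (positional args only)."""

    def __init__(self):
        self.mode = 'learn'
        self.learned = {}    # (function, parameter) -> names every argument had

    def verify(self, func):
        code = func.__code__
        params = code.co_varnames[:code.co_argcount]

        @functools.wraps(func)
        def wrapper(*args):
            for param, arg in zip(params, args):
                key = (func.__qualname__, param)
                seen = manifest(arg)
                if self.mode == 'learn':
                    # Assumed policy: keep only names common to all test runs.
                    self.learned[key] = self.learned.get(key, seen) & seen
                else:
                    missing = self.learned.get(key, frozenset()) - seen
                    if missing:
                        raise TypeError('%s(): %r lacks %s' % (
                            func.__name__, param, sorted(missing)))
            return func(*args)
        return wrapper


checker = SigCheck()


@checker.verify
def shout(text):
    return text.upper()


shout('abc')              # a "test run": learns the manifest of str
checker.mode = 'check'
shout('xyz')              # passes: the argument has the learned manifest
# shout(42) would now raise TypeError, because int lacks the str methods
```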

From ncoghlan at gmail.com  Sat May  6 16:26:31 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 07 May 2006 00:26:31 +1000
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445C8244.4060607@canterbury.ac.nz>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
	<445C8244.4060607@canterbury.ac.nz>
Message-ID: <445CB217.8010802@gmail.com>

Greg Ewing wrote:
> Nick Coghlan wrote:
> 
>> So I suggest splitting the internal data into 'path elements separated 
>> by os.sep', 'name elements separated by os.extsep'
> 
> What bothers me about that is that in many systems
> there isn't any formal notion of an "extension",
> just a convention used by some applications.
> 
> Just because I have a "." in my filename doesn't
> necessarily mean I intend what follows to be
> treated as an extension.

This is an interesting point - I've been tinkering with this a bit (despite my 
best intentions about not getting distracted by the topic ;), and it's 
starting to look like treating directory paths and file paths identically 
poses significant problems for an object-oriented API.

For example, appending a filename or a relative directory path to an existing 
directory path is fine (giving a filename or directory path as the result), 
but you can't do either to a filename.

Listing the entries or walking a file doesn't make any sense either.

And extension information is typically significant only for filenames, not for 
directories.

When paths are just strings, the distinction doesn't matter because the 
instances have no inherent behaviour - it is the application that carries the 
state regarding whether something is meant to be a directory name or a file name.

So it makes a lot more sense to have separate Dirpath and Filepath (and 
possibly Linkpath) objects - from an OO point of view, the methods provided 
for each of these kinds of filesystem entities will be significantly 
different. A common parent Path object may still make sense if there's 
sufficient overlap, though (e.g. retrieving status related information).
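
A rough sketch of that split, purely for illustration: the method names
(file, subdir, entries, extension) are hypothetical and not part of any
proposed API. The point is only that a Filepath simply has no method for
appending or listing, while status information lives on the shared
parent.

```python
import os


class Path:
    """Common parent: behaviour shared by every kind of filesystem entry."""

    def __init__(self, *parts):
        self.parts = tuple(parts)       # elements only; os.sep is never stored

    def __str__(self):
        return os.sep.join(self.parts)  # separator added only on output

    def stat(self):
        return os.stat(str(self))       # status information applies to any entry


class Dirpath(Path):
    def file(self, name):
        return Filepath(*(self.parts + (name,)))

    def subdir(self, *names):
        return Dirpath(*(self.parts + names))

    def entries(self):
        return os.listdir(str(self))


class Filepath(Path):
    def extension(self):
        return os.path.splitext(self.parts[-1])[1]


report = Dirpath('docs', 'drafts').file('report.txt')
```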

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From sluggoster at gmail.com  Sat May  6 18:05:43 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Sat, 6 May 2006 09:05:43 -0700
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445CB217.8010802@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
	<445C8244.4060607@canterbury.ac.nz> <445CB217.8010802@gmail.com>
Message-ID: <6e9196d20605060905w993b4c5sea33ceafc4bb933@mail.gmail.com>

On 5/6/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Greg Ewing wrote:
> > Nick Coghlan wrote:
> >
> >> So I suggest splitting the internal data into 'path elements separated
> >> by os.sep', 'name elements separated by os.extsep'
> >
> > What bothers me about that is that in many systems
> > there isn't any formal notion of an "extension",
> > just a convention used by some applications.
> >
> > Just because I have a "." in my filename doesn't
> > necessarily mean I intend what follows to be
> > treated as an extension.

Good points, Nick and Greg.  Splitting directories is unambiguous and
makes the API much simpler.  But filename extensions are just a
convention.  The user may want all, some, or no extensions split. 
With os.splitext() the user is explicitly saying, "Split this number
of extensions on this particular file."  We should not presume to
automate this.

How about:

    .splitext(max_exts=1)    =>   (filename, list_of_extensions)

Another good idea yesterday:

    p.parent(2)                 # Fulfills my .ancestor() proposal.
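
As a sketch of what these two calls might do, written as plain functions
on strings rather than as Path methods (the function names and the
'levels' parameter are only assumptions for illustration):

```python
import os


def splitext(name, max_exts=1):
    """Split at most max_exts trailing extensions off name."""
    extensions = []
    while len(extensions) < max_exts:
        name, ext = os.path.splitext(name)
        if not ext:
            break                      # nothing left that looks like an extension
        extensions.insert(0, ext)      # keep extensions in their original order
    return name, extensions


def parent(path, levels=1):
    """Go up the given number of directory levels."""
    for _ in range(levels):
        path = os.path.dirname(path)
    return path
```

The caller decides how many dots count as extensions, so
splitext('archive.tar.gz') and splitext('archive.tar.gz', 2) give
different answers, and neither is guessed automatically.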

> This is an interesting point - I've been tinkering with this a bit (despite my
> best intentions about not getting distracted by the topic ;), and it's
> starting to look like treating directory paths and file paths identically
> poses significant problems for an object-oriented API.
>
> For example, appending a filename or a relative directory path to an existing
> directory path is fine (giving a filename or directory path as the result),
> but you can't do either to a filename.
>
> Listing the entries or walking a file doesn't make any sense either.
>
> And extension information is typically significant only for filenames, not for
> directories.
>
> When paths are just strings, the distinction doesn't matter because the
> instances have no inherent behaviour - it is the application that carries the
> state regarding whether something is meant to be a directory name or a file name.
>
> So it makes a lot more sense to have separate Dirpath and Filepath (and
> possibly Linkpath) objects - from an OO point of view, the methods provided
> for each of these kinds of filesystem entities will be significantly
> different. A common parent Path object may still make sense if there's
> sufficient overlap, though (e.g. retrieving status related information).

Very interesting.  The user knows whether he wants a file or
directory, so why not force him to specify?  Then we can have a
standard exception if the actual filesystem object is inconsistent
with the spec.

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From edloper at gradient.cis.upenn.edu  Sat May  6 18:11:59 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Sat, 6 May 2006 12:11:59 -0400
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <445C44F2.2090005@gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com><6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com><4459FA21.6030508@gmail.com>	<445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
Message-ID: <070566C2-0A27-4F63-A43A-971961E344A4@gradient.cis.upenn.edu>

On May 6, 2006, at 2:40 AM, Nick Coghlan wrote:
> Remember, the idea with portable path information is to *never*  
> store os.sep
> and os.extsep anywhere in the internal data - those should only be  
> added when
> needed to produce strings to pass to OS-specific functions or to  
> display to users.

If one of the path segments contained a path-splitting character,  
should it automatically get quoted?  (Is that even possible on all  
platforms?)  E.g., what would the following give me on windows?

str(Path('a', 'b\\c'))

-Edward

From phd at mail2.phd.pp.ru  Sat May  6 18:13:43 2006
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Sat, 6 May 2006 20:13:43 +0400
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <070566C2-0A27-4F63-A43A-971961E344A4@gradient.cis.upenn.edu>
References: <445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
	<070566C2-0A27-4F63-A43A-971961E344A4@gradient.cis.upenn.edu>
Message-ID: <20060506161342.GD3768@phd.pp.ru>

On Sat, May 06, 2006 at 12:11:59PM -0400, Edward Loper wrote:
> If one of the path segments contained a path-splitting character,  
> should it automatically get quoted?  (Is that even possible on all  
> platforms?)  E.g., what would the following give me on windows?
> 
> str(Path('a', 'b\\c'))

   AFAIK there is no way to escape path-splitting characters (directory
separators, actually) on any major platform.

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From ryanf at cs.uoregon.edu  Sat May  6 18:28:17 2006
From: ryanf at cs.uoregon.edu (Ryan Forsythe)
Date: Sat, 06 May 2006 09:28:17 -0700
Subject: [Python-Dev] Yet another type system -- request for comments on
 a	SoC proposal
In-Reply-To: <445C0739.5080404@cs.uoregon.edu>
References: <445C0739.5080404@cs.uoregon.edu>
Message-ID: <445CCEA1.6010104@cs.uoregon.edu>

Apologies for the double post. Had my mail client misconfigured.

From noamraph at gmail.com  Sun May  7 00:21:47 2006
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 7 May 2006 01:21:47 +0300
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
Message-ID: <b348a0850605061521w3c06e181hed0ae3a9aa49fe25@mail.gmail.com>

Hello all,

I just wanted to say thanks for your encouraging comments and
participation, and to say that I'm sorry that I haven't replied yet -
unfortunately, I don't have an Internet connection where I stay most
evenings. I now read all your replies, but I want to reply seriously
and the time is getting late, so I hope I'll be able to reply
tomorrow.

Anyway, don't make my poor participation stop you from developing the
idea. (I know it doesn't, I just hope you forgive me.)

Have a good day,
Noam

From greg.ewing at canterbury.ac.nz  Sun May  7 03:22:47 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 07 May 2006 13:22:47 +1200
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <070566C2-0A27-4F63-A43A-971961E344A4@gradient.cis.upenn.edu>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<6e9196d20605040113x2a2d563ah23fabf02c49a0778@mail.gmail.com>
	<4459FA21.6030508@gmail.com> <445AAA6B.7030108@canterbury.ac.nz>
	<152b01c67099$e3a89020$8d472597@bagio> <445C44F2.2090005@gmail.com>
	<070566C2-0A27-4F63-A43A-971961E344A4@gradient.cis.upenn.edu>
Message-ID: <445D4BE7.6020308@canterbury.ac.nz>

Edward Loper wrote:

> If one of the path segments contained a path-splitting character,  
> should it automatically get quoted?

No, an exception should be raised if you try to construct
a Path object containing such a name. No such object could
exist in the file system, so there's no point in being
able to name it.
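
Something along these lines, as a sketch only (this Path class and its
'segments' attribute are hypothetical, and ValueError is just an assumed
choice of exception):

```python
import os


class Path:
    """Hypothetical Path that stores segments and rejects embedded separators."""

    def __init__(self, *segments):
        # os.altsep is None on platforms with a single separator.
        separators = [sep for sep in (os.sep, os.altsep) if sep]
        for segment in segments:
            for sep in separators:
                if sep in segment:
                    raise ValueError('path segment %r contains separator %r'
                                     % (segment, sep))
        self.segments = segments

    def __str__(self):
        return os.sep.join(self.segments)
```

With this check, Path('a', 'b\\c') fails on Windows but is accepted on
POSIX systems, where a backslash is an ordinary filename character.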

--
Greg

From tim.peters at gmail.com  Sun May  7 03:28:39 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 6 May 2006 21:28:39 -0400
Subject: [Python-Dev] binary trees.
In-Reply-To: <20060506025902.67DA.JCARLSON@uci.edu>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
Message-ID: <1f7befae0605061828n798f6020xfbc3ca9b7111e31b@mail.gmail.com>

[Josiah Carlson]
> ...
> >>> str < tuple < unicode
> True
>
> And you can actually compare str and unicode, so, if you have a str that
> is greater than the unicode, you run into this issue.

Oh dear -- I didn't realize we still had holes like that:

>>> 'b' < () < u'a' < 'b'
True

We used to have a bunch of those with numeric types (like int < list <
long), and then hacked comparison to artificially lump all "the
numeric types" together (by pretending that their type names are the
empty string -- see default_3way_compare() in object.c:

	/* different type: compare type names; numbers are smaller */
	if (PyNumber_Check(v))
		vname = "";

).
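
The effect of that hack can be imitated in Python 3, where a mixed-type
'<' raises TypeError instead of falling back. The sketch below is not
the CPython code; less() and its lump_numbers flag are made-up names,
and float/frozenset/int stand in for the int/list/long case because
their type names sort the same awkward way.

```python
import numbers


def less(a, b, lump_numbers=True):
    """Imitate the old fallback: compare values if possible, else type names."""
    try:
        return a < b
    except TypeError:
        def type_name(obj):
            # The hack: every number pretends its type name is "".
            if lump_numbers and isinstance(obj, numbers.Number):
                return ''
            return type(obj).__name__
        return type_name(a) < type_name(b)


# Without the hack there is a cycle: 1 < 5.0 < frozenset() < 1,
# because "float" < "frozenset" < "int" as strings.
cycle = (less(1, 5.0, lump_numbers=False)
         and less(5.0, frozenset(), lump_numbers=False)
         and less(frozenset(), 1, lump_numbers=False))

# With the hack both numbers sort before the frozenset, so no cycle.
consistent = (less(1, frozenset()) and less(5.0, frozenset())
              and not less(frozenset(), 1))
```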

That ensures, e.g., that int < list and long < list, despite that
"int" < "list" < "long".

> With unicode becoming str in Py3k, we may not run into this issue much
> then, unless bytes are comparable to str, in which case we end up with
> the same problems.

If Guido thinks about it in time :-), I expect it's more likely that
P3K will never fall back to comparing by type name; that non-equality
comparisons that currently fall back to looking at the type name will
raise exceptions instead.  That's already the case for the types most
recently added to Python, like

>>> {} < set()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can only compare to a set
>>> import datetime
>>> {} < datetime.datetime.now()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can't compare datetime.datetime to dict
>>> {} == set()
False
>>> {} == datetime.datetime.now()
False

From talin at acm.org  Sun May  7 03:54:31 2006
From: talin at acm.org (Talin)
Date: Sat, 06 May 2006 18:54:31 -0700
Subject: [Python-Dev] PEP 3101 Update
Message-ID: <445D5357.9060409@acm.org>

I've updated PEP 3101 based on the feedback collected so far.
---------------------
PEP: 3101
Title: Advanced String Formatting
Version: $Revision: 45928 $
Last-Modified: $Date: 2006-05-06 18:49:43 -0700 (Sat, 06 May 2006) $
Author: Talin <talin at acm.org>
Status: Draft
Type: Standards
Content-Type: text/plain
Created: 16-Apr-2006
Python-Version: 3.0
Post-History: 28-Apr-2006, 6-May-2006


Abstract

     This PEP proposes a new system for built-in string formatting
     operations, intended as a replacement for the existing '%' string
     formatting operator.


Rationale

     Python currently provides two methods of string interpolation:

     - The '%' operator for strings. [1]

     - The string.Template module. [2]

     The scope of this PEP will be restricted to proposals for built-in
     string formatting operations (in other words, methods of the
     built-in string type).

     The '%' operator is primarily limited by the fact that it is a
     binary operator, and therefore can take at most two arguments.
     One of those arguments is already dedicated to the format string,
     leaving all other variables to be squeezed into the remaining
     argument.  The current practice is to use either a dictionary or a
     tuple as the second argument, but as many people have commented
     [3], this lacks flexibility.  The "all or nothing" approach
     (meaning that one must choose between only positional arguments,
     or only named arguments) is felt to be overly constraining.

     While there is some overlap between this proposal and
     string.Template, it is felt that each serves a distinct need,
     and that one does not obviate the other.  In any case,
     string.Template will not be discussed here.


Specification

     The specification will consist of 4 parts:

     - Specification of a new formatting method to be added to the
       built-in string class.

     - Specification of a new syntax for format strings.

     - Specification of a new set of class methods to control the
       formatting and conversion of objects.

     - Specification of an API for user-defined formatting classes.


String Methods

     The built-in string class will gain a new method, 'format',
     which takes an arbitrary number of positional and keyword
     arguments:

         "The story of {0}, {1}, and {c}".format(a, b, c=d)

     Within a format string, each positional argument is identified
     with a number, starting from zero, so in the above example, 'a' is
     argument 0 and 'b' is argument 1.  Each keyword argument is
     identified by its keyword name, so in the above example, 'c' is
     used to refer to the third argument.

     The result of the format call is an object of the same type
     (string or unicode) as the format string.
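
     As an illustration of the difference from '%', the sketch below
     runs the call above with made-up values and then shows the two
     forms that '%' forces a choice between.  It assumes an
     interpreter in which str.format is available.

```python
a, b, d = 'Alice', 'Bob', 'Dana'   # illustrative values only

# One call mixes positional ({0}, {1}) and keyword ({c}) arguments.
mixed = "The story of {0}, {1}, and {c}".format(a, b, c=d)

# With '%' the single right-hand argument is either a tuple...
by_position = "The story of %s, %s, and %s" % (a, b, d)

# ...or a dictionary, but not both at once.
by_name = "The story of %(a)s, %(b)s, and %(c)s" % {'a': a, 'b': b, 'c': d}
```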


Format Strings

     Brace characters ('curly braces') are used to indicate a
     replacement field within the string:

         "My name is {0}".format('Fred')

     The result of this is the string:

         "My name is Fred"

     Braces can be escaped using a backslash:

         "My name is {0} :-\{\}".format('Fred')

     Which would produce:

         "My name is Fred :-{}"

     The element within the braces is called a 'field'.  Fields consist
     of a 'field name', which can either be simple or compound, and an
     optional 'conversion specifier'.

     Simple field names are either names or numbers. If numbers, they
     must be valid base-10 integers; if names, they must be valid
     Python identifiers.  A number is used to identify a positional
     argument, while a name is used to identify a keyword argument.

     Compound names are a sequence of simple names separated by
     periods:

         "My name is {0.name} :-\{\}".format(dict(name='Fred'))

     Compound names can be used to access specific dictionary entries,
     array elements, or object attributes.  In the above example, the
     '{0.name}' field refers to the dictionary entry 'name' within
     positional argument 0.

     Each field can also specify an optional set of 'conversion
     specifiers' which can be used to adjust the format of that field.
     Conversion specifiers follow the field name, with a colon (':')
     character separating the two:

         "My name is {0:8}".format('Fred')

     The meaning and syntax of the conversion specifiers depend on the
     type of object that is being formatted; however, many of the
     built-in types will recognize a standard set of conversion
     specifiers.

     The conversion specifier consists of a sequence of zero or more
     characters, each of which can consist of any printable character
     except for a non-escaped '}'.

     Conversion specifiers can themselves contain replacement fields;
     this will be described in a later section.  Except for this
     replacement, the format() method does not attempt to interpret the
     conversion specifiers in any way; it merely passes all of the
     characters between the first colon ':' and the matching right
     brace ('}') to the various underlying formatters (described
     later.)


Standard Conversion Specifiers

     For most built-in types, the conversion specifiers will be the
     same or similar to the existing conversion specifiers used with
     the '%' operator.  Thus, instead of '%02.2x', you will say
     '{0:02.2x}'.

     There are a few differences however:

     - The trailing letter is optional - you don't need to say '2.2d',
       you can instead just say '2.2'.  If the letter is omitted, a
       default will be assumed based on the type of the argument.
       The defaults will be as follows:

         string or unicode object: 's'
         integer: 'd'
         floating-point number: 'f'
         all other types: 's'

     - Variable field width specifiers use a nested version of the {}
       syntax, allowing the width specifier to be either a positional
       or keyword argument:

         "{0:{1}.{2}d}".format(a, b, c)

     - The support for length modifiers (which are ignored by Python
       anyway) is dropped.
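
     A short sketch of the first two points, assuming an interpreter
     in which str.format is available.  The values are made up, and
     the nested example uses 'f' rather than 'd' because str.format as
     eventually released rejects a precision on integers.

```python
value, width, precision = 3.14159, 8, 2   # illustrative values only

# Width and precision are taken from positional arguments 1 and 2.
padded = "{0:{1}.{2}f}".format(value, width, precision)

# The width can equally come from a keyword argument.
named = "{0:{w}}".format('Fred', w=8)

# No trailing letter: the default is chosen from the argument's type.
default_int = "{0:4}".format(42)
```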

     For non-built-in types, the conversion specifiers will be specific
     to that type.  An example is the 'datetime' class, whose
     conversion specifiers are identical to the arguments to the
     strftime() function:

         "Today is: {0:%a %b %d %H:%M:%S %Y}".format(datetime.now())


Controlling Formatting

     A class that wishes to implement a custom interpretation of its
     conversion specifiers can implement a __format__ method:

         class AST:
             def __format__(self, specifiers):
                 ...

     The 'specifiers' argument will be either a string object or a
     unicode object, depending on the type of the original format
     string.  The __format__ method should test the type of the
     specifiers parameter to determine whether to return a string or
     unicode object.  It is the responsibility of the __format__ method
     to return an object of the proper type.

     string.format() will format each field using the following steps:

      1) See if the value to be formatted has a __format__ method.  If
         it does, then call it.

      2) Otherwise, check the internal formatter within string.format
         that contains knowledge of certain builtin types.

      3) Otherwise, call str() or unicode() as appropriate.
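
     A minimal sketch of step 1, assuming an interpreter in which
     str.format is available.  The Temperature class and its 'C' and
     'F' specifiers are invented for the example.

```python
class Temperature:
    """Hypothetical type with its own conversion specifiers: 'C' or 'F'."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, specifiers):
        # The text after the colon arrives here unparsed.
        if specifiers == 'F':
            return '%.1fF' % (self.celsius * 9.0 / 5.0 + 32.0)
        return '%.1fC' % self.celsius


boiling = "Water boils at {0:F}".format(Temperature(100))
room = "Room temperature is {0}".format(Temperature(21.5))
```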


User-Defined Formatting Classes

     There will be times when customizing the formatting of fields
     on a per-type basis is not enough.  An example might be an
     accounting application, which displays negative numbers in
     parentheses rather than using a negative sign.

     The string formatting system facilitates this kind of application-
     specific formatting by allowing user code to directly invoke
     the code that interprets format strings and fields.  User-written
     code can intercept the normal formatting operations on a per-field
     basis, substituting their own formatting methods.

     For example, in the aforementioned accounting application, there
     could be an application-specific number formatter, which reuses
     the string.format templating code to do most of the work. The
     API for such an application-specific formatter is up to the
     application; here are several possible examples:

         cell_format( "The total is: {0}", total )

         TemplateString( "The total is: {0}" ).format( total )

     Creating an application-specific formatter is relatively straight-
     forward.  The string and unicode classes will have a class method
     called 'cformat' that does all the actual work of formatting; the
     built-in format() method is just a wrapper that calls cformat.

     The parameters to the cformat function are:

         -- The format string (or unicode; the same function handles
            both.)
         -- A callable 'format hook', which is called once per field
         -- A tuple containing the positional arguments
         -- A dict containing the keyword arguments

     The cformat function will parse all of the fields in the format
     string, and return a new string (or unicode) with all of the
     fields replaced with their formatted values.

     The format hook is a callable object supplied by the user, which
     is invoked once per field, and which can override the normal
     formatting for that field.  For each field, the cformat function
     will attempt to call the field format hook with the following
     arguments:

        format_hook(value, conversion, buffer)

     The 'value' field corresponds to the value being formatted, which
     was retrieved from the arguments using the field name.

     The 'conversion' argument is the conversion spec part of the
     field, which will be either a string or unicode object, depending
     on the type of the original format string.

     The 'buffer' argument is a Python array object, either a byte
     array or unicode character array.  The buffer object will contain
     the partially constructed string; the field hook is free to modify
     the contents of this buffer if needed.

     The format_hook will be called once per field. The format_hook may
     take one of two actions:

         1) Return False, indicating that the format_hook will not
            process this field and the default formatting should be
            used.  This decision should be based on the type of the
            value object, and the contents of the conversion string.

         2) Append the formatted field to the buffer, and return True.
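
     The protocol above can be sketched in pure Python.  This is not
     the prototype mentioned under "Sample Implementation": it is a
     stand-alone function rather than a class method, it borrows
     string.Formatter from the standard library purely as a field
     parser, and it uses a plain list of pieces in place of the array
     buffer.  The accounting_hook name is likewise only illustrative.

```python
import string


def cformat(template, format_hook, args, kwargs):
    """Format template, offering every field to format_hook first."""
    parser = string.Formatter()
    buffer = []                      # stand-in for the array buffer
    for literal, field, spec, _ in parser.parse(template):
        buffer.append(literal)
        if field is None:            # trailing literal text, no field
            continue
        value, _ = parser.get_field(field, args, kwargs)
        if not format_hook(value, spec, buffer):
            # The hook declined: fall back to the default formatting.
            buffer.append(format(value, spec))
    return ''.join(buffer)


def accounting_hook(value, conversion, buffer):
    """Show negative numbers in parentheses; decline everything else."""
    if isinstance(value, (int, float)) and value < 0:
        buffer.append('(%s)' % format(-value, conversion))
        return True
    return False


loss = cformat("The total is: {0:.2f}", accounting_hook, (-42.5,), {})
gain = cformat("The total is: {0:.2f}", accounting_hook, (42.5,), {})
```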


Alternate Syntax

     Naturally, one of the most contentious issues is the syntax of the
     format strings, and in particular the markup conventions used to
     indicate fields.

     Rather than attempting to exhaustively list all of the various
     proposals, I will cover the ones that are most widely used
     already.

     - Shell variable syntax: $name and $(name) (or in some variants,
       ${name}).  This is probably the oldest convention out there, and
       is used by Perl and many others.  When used without the braces,
       the length of the variable is determined by lexically scanning
       until an invalid character is found.

       This scheme is generally used in cases where interpolation is
       implicit - that is, in environments where any string can contain
       interpolation variables, and no special substitution function
       need be invoked.  In such cases, it is important to prevent the
       interpolation behavior from occurring accidentally, so the '$'
       (which is otherwise a relatively uncommonly-used character) is
       used to signal when the behavior should occur.

       It is the author's opinion, however, that in cases where the
       formatting is explicitly invoked, less care needs to be
       taken to prevent accidental interpolation, in which case a
       lighter and less unwieldy syntax can be used.

     - Printf and its cousins ('%'), including variations that add a
       field index, so that fields can be interpolated out of order.

     - Other bracket-only variations.  Various MUDs (Multi-User
       Dungeons) such as MUSH have used brackets (e.g. [name]) to do
       string interpolation.  The Microsoft .Net libraries use braces
       ({}), and a syntax which is very similar to the one in this
       proposal, although the syntax for conversion specifiers is quite
       different. [4]

     - Backquoting.  This method has the benefit of minimal syntactical
       clutter, however it lacks many of the benefits of a function
       call syntax (such as complex expression arguments, custom
       formatters, etc.).

     - Other variations include Ruby's #{}, PHP's {$name}, and so
       on.

     Some specific aspects of the syntax warrant additional comments:

     1) The use of the backslash character for escapes.  A few people
     suggested doubling the brace characters to indicate a literal
     brace rather than using backslash as an escape character.  This is
     also the convention used in the .Net libraries.  Here's how the
     previously-given example would look with this convention:

         "My name is {0} :-{{}}".format('Fred')

     One problem with this syntax is that it conflicts with the use of
     nested braces to allow parameterization of the conversion
     specifiers:

         "{0:{1}.{2}}".format(a, b, c)

     (There are alternative solutions, but they are too long to go
     into here.)

     2) The use of the colon character (':') as a separator for
     conversion specifiers.  This was chosen simply because that's
     what .Net uses.


Sample Implementation

     A rough prototype of the underlying 'cformat' function has been
     coded in Python, however it needs much refinement before being
     submitted.


Backwards Compatibility

     Backwards compatibility can be maintained by leaving the existing
     mechanisms in place.  The new system does not collide with any of
     the method names of the existing string formatting techniques, so
     both systems can co-exist until it comes time to deprecate the
     older system.


References

     [1] Python Library Reference - String formatting operations
     http://docs.python.org/lib/typesseq-strings.html

     [2] Python Library References - Template strings
     http://docs.python.org/lib/node109.html

     [3] [Python-3000] String formating operations in python 3k
         http://mail.python.org/pipermail/python-3000/2006-April/000285.html

     [4] Composite Formatting - [.Net Framework Developer's Guide]
         http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp?frame=true


Copyright

     This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:

From martin at v.loewis.de  Sun May  7 09:15:55 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 07 May 2006 09:15:55 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445C628F.3050801@canterbury.ac.nz>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<445C628F.3050801@canterbury.ac.nz>
Message-ID: <445D9EAB.7010901@v.loewis.de>

Greg Ewing wrote:
> Tim Peters wrote:
>> Instead it would make best sense for each
>> sprint project to work in its own branch, something SVN makes very
>> easy, but only for those who _can_ commit.
> 
> There's no way of restricting commit privileges to
> a particular branch?

In the current setup, not easily. However, sprinters would certainly
restrain themselves to the branches they are told to use. IOW, technical
mechanisms for fine-tuned write access are often beside the point:
you first need to give these people any kind of write access at all,
and then you find that further restricting that can readily be done
without any technical enforcement.

Or, to put it yet in a different way: whether or not commit privileges
are restricted, you need to add the sprinters to the committers list
first, unless you want to allow anonymous commits to these branches.

Just so I'm not misunderstood: it is technically fairly easy to add
somebody to the committers list. So technically, there is no problem
adding all sprinters.

Regards,
Martin

From root at renet.ru  Sat May  6 22:59:24 2006
From: root at renet.ru (Vladimir Yu. Stepanov)
Date: Sun, 7 May 2006 00:59:24 +0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <20060506025902.67DA.JCARLSON@uci.edu>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
Message-ID: <20060506205924.GA14684@fox.renet.ru>

On Sat, May 06, 2006 at 03:12:11AM -0700, Josiah Carlson wrote:
> 
> "Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> > Josiah Carlson wrote:
> > > This problem has nothing to do with dictionaries and hashing, it has to
> > > do with the fact that there may not be a total ordering on the elements
> > > of a sequence.
> > 
> > It is sad. I did not know it. Therefore and have not understood.
> > 
> > I have looked in Google on "python dev total ordering". On intentions to
> > correct this problem I have not found anything. It already earlier was
> > discussed ?
> 
> Not really.  It has to do with how non-comparable instances compare to
> each other.  Generally speaking, when instances aren't comparable,
> Python compares their type, which is actually comparing the names of
> types. This wouldn't be a big deal, except that:
> 
> >>> str < tuple < unicode
> True
> 
> And you can actually compare str and unicode, so, if you have a str that
> is greater than the unicode, you run into this issue.  With unicode
> becoming str in Py3k, we may not run into this issue much then, unless
> bytes are comparable to str, in which case we end up with the same
> problems.
> 
> Actually, according to my google of "python dev total ordering", it
> gives me...
> 
> http://mail.python.org/pipermail/python-dev/2003-March/034169.html
> http://mail.python.org/pipermail/python-list/2003-March/154142.html
> 
> Which were earlier discussions on this topic, which are quite topical. 
> The ultimate solution is to choose a total ordering on types and
> consider the problem solved.  Then list.sort() and binary trees will
> work properly.
> 

Thanks for such a detailed answer.

The discussions at those links describe how things are, rather than how
to fix them. So I would like to raise the question of how to fix it.

When types cannot be compared, it may make sense to compare identifiers
assigned to each concrete type. More precisely, a weight would be
calculated from them, and that weight would decide the ordering whenever
the types themselves cannot be compared.

One possible solution to this problem is given below.

Add three fields to each data type in Python:
..
        int     tp_id;
        int     tp_qualifier;
        int     tp_weight;
..

Assign fixed identifiers (tp_id) to some built-in and external
types. The others receive free identifiers when they are
initialized. In addition, types should be classified by
their basic properties (tp_qualifier). A type can be
classified as:
0 = None
1 = integer (bool, int, long, etc..)
2 = float (decimal, float, etc..)
3 = math (complex, matrix may be ?? )
4 = string (str, unicode, splice, etc..)
5 = sequence (iterators, deque, xrange, enumerate, etc..)
6 = array (tuple, list, bytes, array, etc..)
7 = dictionary (dict, defdict, etc..)
8 = set (set, etc..)
9 = file (file, socket.socket, cStringIO.StringIO)

..
127 = no qualifier (sre.*, datetime.*, ClassType, InstanceType, etc..)

This is an approximate list by which it would be convenient
to classify types (in my opinion).

The weight can be set explicitly in the structure or, if it equals -1,
it should be calculated by the formula:

    ob_type->tp_weight = (ob_type->tp_qualifier<<24) + (ob_type->tp_id<<8)

Thus objects with similar properties will sort next to one
another. External types may pose a problem, but if the
classification is developed further, no problems should arise.
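
The weight formula above can be sketched in Python to see what ordering
it would produce. This is only an illustration of the arithmetic: the
qualifier codes come from the list above, while the tp_id values and the
type_weight helper are hypothetical and not part of CPython.

```python
# Qualifier codes taken from the proposed list; the tp_id values below
# are made up for illustration and do not exist in CPython.
QUALIFIER = {"none": 0, "integer": 1, "float": 2, "string": 4, "array": 6}


def type_weight(tp_qualifier, tp_id, tp_weight=-1):
    """Return the explicit weight, or derive it when tp_weight is -1."""
    if tp_weight != -1:
        return tp_weight
    return (tp_qualifier << 24) + (tp_id << 8)


# Derived weights for a few types; the ids are arbitrary small integers.
weights = {
    "tuple": type_weight(QUALIFIER["array"], 1),
    "str": type_weight(QUALIFIER["string"], 1),
    "long": type_weight(QUALIFIER["integer"], 2),
    "float": type_weight(QUALIFIER["float"], 1),
    "int": type_weight(QUALIFIER["integer"], 1),
}

# Types in the same qualifier class end up adjacent in the ordering.
ordered = sorted(weights, key=weights.get)
print(ordered)
```

Because the qualifier occupies the high bits, the class of a type
dominates the ordering and tp_id only breaks ties within a class.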


-- 
SY. Vladimir Stepanov


From bjourne at gmail.com  Sun May  7 13:29:53 2006
From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=)
Date: Sun, 7 May 2006 13:29:53 +0200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <44598128.8060506@canterbury.ac.nz>
References: <4453B025.3080100@acm.org> <4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz>
Message-ID: <740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>

> > would have thought that the one obvious way to get rid of
> > the wanky feeling would have been to write:
> >
> > def make_person(name, age, phone, location): ...
> >
> > make_person(name, age, phone, location)
>
> This doesn't fly in something like PyGUI, where there
> are literally dozens of potential arguments to many
> of the constructors. The only sane way to deal with
> that is for them to be keyword-only, at least
> conceptually if not in actual implementation.

Probably. I don't know enough about library design in Python to
meaningfully criticize the Keyword-only arguments suggestion. However,
I do know enough about Python to know that the make_person function is
a really bad example. The one obvious way to write make_person is to
use four positional arguments, name, age, phone, location and not
complicate the function by throwing in required keyword arguments. It
would be nice to instead see some real examples of the usefulness of
the required keyword-only arguments. Like from your PyGUI or maybe
from the standard library? But IMHO, your design is broken if you need
to send dozens of arguments to any function or method.

--
mvh Björn

From greg.ewing at canterbury.ac.nz  Sun May  7 13:55:01 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 07 May 2006 23:55:01 +1200
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
References: <4453B025.3080100@acm.org> <4455A0CF.3080008@v.loewis.de>
	<e35ds3$p77$1@sea.gmane.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz>
	<740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
Message-ID: <445DE015.3000707@canterbury.ac.nz>

BJörn Lindqvist wrote:

> But IMHO, your design is broken if you need
> to send dozens of arguments to any function or method.

My design allows property values to be specified using
keywords in the constructor. You typically only use
a few of them in any given call, but there are a large
number of potential keywords that you could use.
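
A minimal sketch of that kind of constructor, assuming the bare `*`
marker that Python 3 later adopted for keyword-only parameters. The
Button class and its property names are hypothetical and are not
PyGUI's actual API.

```python
# Hypothetical widget; class and property names are invented for
# illustration only.
class Button:
    def __init__(self, title, *, width=80, height=24, enabled=True,
                 color="gray"):
        self.title = title
        self.width = width
        self.height = height
        self.enabled = enabled
        self.color = color


# A typical call names only the few properties it cares about.
ok = Button("OK", width=120)

# Passing a property positionally is rejected at call time.
rejected = False
try:
    Button("OK", 120)
except TypeError:
    rejected = True
```

New properties can then be added in any position after the `*` without
breaking existing callers.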

--
Greg

From edloper at gradient.cis.upenn.edu  Sun May  7 14:47:05 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Sun, 07 May 2006 08:47:05 -0400
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <445D5357.9060409@acm.org>
References: <445D5357.9060409@acm.org>
Message-ID: <445DEC49.6050709@gradient.cis.upenn.edu>

Talin wrote:
>      Braces can be escaped using a backslash:
> 
>          "My name is {0} :-\{\}".format('Fred')
> 
>      Which would produce:
> 
>          "My name is Fred :-{}"

Do backslashes also need to be backslashed then?  If not, then what is 
the translation of this:?

     r'abc\{%s\}' % 'x'

I guess the only sensible translation if backslashes aren't backslashed 
would be:

     r'abc\\{{0}\\}'.format('x')

But the parsing of that format string seems fairly unintuitive to me. 
If backslashes do need to be backslashed, then that fact needs to be 
mentioned.

-Edward

From guido at python.org  Sun May  7 17:25:14 2006
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 May 2006 08:25:14 -0700
Subject: [Python-Dev] total ordering.
In-Reply-To: <20060506205924.GA14684@fox.renet.ru>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
	<20060506205924.GA14684@fox.renet.ru>
Message-ID: <ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>

On 5/6/06, Vladimir Yu. Stepanov <root at renet.ru> wrote:
[proposing a total ordering between types]

It Ain't Gonna Happen. (From now on, I'll write this as IAGH.)

In Python 3000, we'll actually *remove* ordering between arbitrary
types as a feature; only types that explicitly care to be ordered with
respect to one another will be ordered. Equality tests are unaffected,
x==y will simply return False if x and y are of incomparable types;
but x<y (etc.) will raise an exception.
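
A short sketch of the behaviour described above, assuming a Python 3
interpreter, where this change is in effect:

```python
# Equality between unrelated types still works and is simply False.
equal = (1 == "one")

# Ordering between unrelated types raises instead of guessing.
try:
    1 < "one"
    raised = False
except TypeError:
    raised = True

# Sorting a mixed list fails for the same reason.
try:
    sorted([3, "a", 2.0])
    sort_failed = False
except TypeError:
    sort_failed = True
```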

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From lcaamano at gmail.com  Sun May  7 17:35:26 2006
From: lcaamano at gmail.com (Luis P Caamano)
Date: Sun, 7 May 2006 11:35:26 -0400
Subject: [Python-Dev] possible use of __decorates__ in functools.decorator
Message-ID: <c56e219d0605070835y28cef932odde47906546ffcac@mail.gmail.com>

In a previous post related to the functools.decorator function, I
think Nick was wondering if the  __decorator__ and __decorates__
attributes were useful and Guido was tempted to call YAGNI on them.

Coincidentally, I've run into a situation where I had to use the
__decorates__ attribute, which I'd like to show to you because it
might be a good example of why __decorates__ is useful or perhaps
there's another way to do what I'm doing without it that some of you
could hint at.

We have a tracing decorator that automatically logs enter/exits
to/from functions and methods but it also figures out by itself the
arguments values of the function call and the class or module the
function/method is defined on.  I thought this was trivial until I ran
into methods extended in subclasses, at which point we had to deal
with bases and the mro.

The part that figures out the name of the class where the wrapped
method was defined has to start with the assumption that the first
argument is self and then find the defining class from it.  This code
fragment is in the wrapper function of the decorator and it looks like
this:

                if numargs > 0:
                    # at definition time, class methods are not methods
                    # yet because the class doesn't exist when the
                    # decorators get called and thus, we have to figure
                    # out classname at runtime via self, which is then
                    # an instance of a class.
                    #
                    # assume first arg is self, see if f.__name__ is there
                    # as a method and if so, then grab its class name
                    #
                    self = args[0]

                    if type(self) == types.InstanceType:
                        # getattr will find the method anywhere in the
                        # class tree so start from the top
                        bases = list(inspect.getmro(self.__class__))
                        bases.reverse()
                        for c in bases:
                            # f was given to us in the deco_func
                            meth = getattr(c, f.__name__, None)

                            # we found a method with that name, which
                            # is probably this same wrapper function
                            # we wrapped the original method with.

                            ofunc = getattr(meth, '__decorates__', False)

                            if ofunc and ofunc.func_code == f.func_code:
                                # got it
                                clsname = meth.im_class.__name__
                                break
                        del c, meth
                    del self

This tracing code will correctly show calls to foomethod with the
appropriate class name.

    def testMethodInBothClasses(self):
        class Base:
            @tracelevel(1)
            def foomethod(self, x, y=5, **kwds):
                return x+y

        class TClass(Base):
            @tracelevel(1)
            def foomethod(self, x, y=1, **kwds):
                return Base.foomethod(self, x, **kwds) + x + y

        t = TClass()
        rc = t.foomethod(4, d=1)
        self.failUnless(rc == 14)
        return

Is there a way to do this without the __decorates__ attribute?
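
One possible sketch of the same lookup in current Python, where
functools.wraps records the wrapped function as `__wrapped__` (playing
the role `__decorates__` plays above). The `trace` decorator, the
`defining_class` helper and the `calls` list are hypothetical stand-ins
for the real tracing code, not the decorator described in this message.

```python
import functools

calls = []  # collected "Class.method" names, standing in for a real log


def defining_class(instance, func):
    """Find the class in the MRO whose own attribute wraps func."""
    for cls in type(instance).__mro__:
        candidate = cls.__dict__.get(func.__name__)
        if getattr(candidate, "__wrapped__", None) is func:
            return cls
    return None


def trace(func):
    @functools.wraps(func)  # sets wrapper.__wrapped__ = func
    def wrapper(*args, **kwargs):
        owner = defining_class(args[0], func) if args else None
        prefix = owner.__name__ if owner is not None else func.__module__
        calls.append("%s.%s" % (prefix, func.__name__))
        return func(*args, **kwargs)
    return wrapper


class Base:
    @trace
    def foomethod(self, x, y=5, **kwds):
        return x + y


class TClass(Base):
    @trace
    def foomethod(self, x, y=1, **kwds):
        return Base.foomethod(self, x, **kwds) + x + y


rc = TClass().foomethod(4, d=1)
```

Looking in each class's own `__dict__` rather than using getattr avoids
finding an inherited wrapper, so the MRO does not need to be reversed.
The sketch only follows one level of `__wrapped__`, so stacked
decorators would need the chain walked.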

--
Luis P Caamano
Atlanta, GA USA

From steven.bethard at gmail.com  Sun May  7 20:25:07 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 7 May 2006 12:25:07 -0600
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
References: <4453B025.3080100@acm.org> <e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de> <e36f6e$pns$1@sea.gmane.org>
	<44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz>
	<740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
Message-ID: <d11dcfba0605071125v46bd69bbo70dd69d6c0327dd1@mail.gmail.com>

On 5/7/06, BJörn Lindqvist <bjourne at gmail.com> wrote:
> I do know enough about Python to know that the make_person function is
> a really bad example.

Totally agreed.  I've been ignoring most of that discussion because it
seemed really irrelevant.

> would be nice to instead see some real examples of the usefulness of
> the required keyword-only arguments.

The most obvious one to me is the optparse module, where add_option
takes all kinds of different keyword arguments, and there's really no
intention of these ever being specified as positional arguments:
    http://docs.python.org/lib/module-optparse.html


STeVe
--
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From unknown_kev_cat at hotmail.com  Sun May  7 21:43:02 2006
From: unknown_kev_cat at hotmail.com (Joe Smith)
Date: Sun, 7 May 2006 15:43:02 -0400
Subject: [Python-Dev] PEP 3101 Update
References: <445D5357.9060409@acm.org>
	<445DEC49.6050709@gradient.cis.upenn.edu>
Message-ID: <e3like$r9o$1@sea.gmane.org>


"Edward Loper" <edloper at gradient.cis.upenn.edu> wrote in message 
news:445DEC49.6050709 at gradient.cis.upenn.edu...
> Talin wrote:
>>      Braces can be escaped using a backslash:
>>
>>          "My name is {0} :-\{\}".format('Fred')
>>
>>      Which would produce:
>>
>>          "My name is Fred :-{}"
>
> Do backslashes also need to be backslashed then?  If not, then what is
> the translation of this:?
>
>     r'abc\{%s\}' % 'x'
>
> I guess the only sensible translation if backslashes aren't backslashed
> would be:
>
>     r'abc\\{{0}\\}'.format('x')
>
> But the parsing of that format string seems fairly unintuitive to me.
> If backslashes do need to be backslashed, then that fact needs to be
> mentioned.

AFAICT there would be no way to use raw strings with that method. That
method would be using the escape sequence "\{" to indicate a bracket.
Raw strings cannot have escape sequences. Additional backslashes are
added to raw strings to remove anything that resembles an escape
sequence. So either an additional layer of escaping is required (like
with regexes) or ".format()" would simply not be usable with raw strings.

The problem of having an additional level of escaping is shown very
well by regexes. A single backslash as the end value is either "\\\\"
or r"\\".
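
The double level of escaping in regexes can be checked directly. A small
sketch using only the standard re module:

```python
import re

# Both spellings denote the same two-character pattern: two backslashes.
pattern_plain = "\\\\"
pattern_raw = r"\\"

# That pattern matches exactly one literal backslash in the subject.
single_backslash = "\\"
match = re.fullmatch(pattern_raw, single_backslash)
```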



From sluggoster at gmail.com  Sun May  7 22:48:18 2006
From: sluggoster at gmail.com (Mike Orr)
Date: Sun, 7 May 2006 13:48:18 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <d11dcfba0605071125v46bd69bbo70dd69d6c0327dd1@mail.gmail.com>
References: <4453B025.3080100@acm.org> <445644D1.3050200@v.loewis.de>
	<e36f6e$pns$1@sea.gmane.org> <44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz>
	<740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
	<d11dcfba0605071125v46bd69bbo70dd69d6c0327dd1@mail.gmail.com>
Message-ID: <6e9196d20605071348w68cdfe3dh35b49f3d4371e1dc@mail.gmail.com>

On 5/7/06, Steven Bethard <steven.bethard at gmail.com> wrote:
> The most obvious one to me is the optparse module, where add_option
> takes all kinds of different keyword arguments, and there's really no
> intention of these ever being specified as positional arguments:
>     http://docs.python.org/lib/module-optparse.html

Or MySQLdb, which specifically recommends keyword arguments for the
constructor.  Mainly because the underlying library is controlled by a
third party, so it's unpredictable when and where new arguments will
be added.

(Not that I'm in favor of keyword-only arguments, for the reasons
Terry Reedy has mentioned.)

--
Mike Orr <sluggoster at gmail.com>
(mso at oz.net address is semi-reliable)

From noamraph at gmail.com  Mon May  8 00:07:09 2006
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 8 May 2006 01:07:09 +0300
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <b348a0850605061521w3c06e181hed0ae3a9aa49fe25@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<b348a0850605061521w3c06e181hed0ae3a9aa49fe25@mail.gmail.com>
Message-ID: <b348a0850605071507s5376186fk93fd3fefad67c602@mail.gmail.com>

Hello all again!

Thanks to Mike's suggestion, I have now opened a new wiki page,
AlternativePathDiscussion, at
http://wiki.python.org/moin/AlternativePathDiscussion

The idea is to organize the discussion by dividing it into multiple
sections, and seeing what is agreed and what should be further
discussed. I have written there what I think, and quoted a few opinions
from other posts. The posts by others are only a minority - what's
written there currently is mostly what I think. I'm sorry for the
inconvenience, but please go there and post your opinions (you can
copy and paste from your emails, of course).

I apologize first for not replying to each post, and second for only
writing my opinions in the wiki. I simply write pretty slowly in
English, and am already developing a growing sleep-hours deficit. I
hope you forgive me.

Have a good day,
Noam

From greg.ewing at canterbury.ac.nz  Mon May  8 03:03:21 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 08 May 2006 13:03:21 +1200
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <e3like$r9o$1@sea.gmane.org>
References: <445D5357.9060409@acm.org>
	<445DEC49.6050709@gradient.cis.upenn.edu> <e3like$r9o$1@sea.gmane.org>
Message-ID: <445E98D9.8090004@canterbury.ac.nz>

Joe Smith wrote:

> AFAICT there would be no way to use raw strings with that method. 
 > ...
> Additional backslashes are added to raw strings to remove anything that 
> resembles an escape sequence.

You seem to be very confused about the way strings work. If
you look at the repr() of a string containing a backslash,
you will see two backslashes, but they're only in the repr(),
*not* in the string itself.

Some things to ponder:

 >>> "\{" == r"\{"
True
 >>> len("\{")
2
 >>> len(r"\{")
2

--
Greg

From steven.bethard at gmail.com  Mon May  8 03:14:59 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 7 May 2006 19:14:59 -0600
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <445DEC49.6050709@gradient.cis.upenn.edu>
References: <445D5357.9060409@acm.org>
	<445DEC49.6050709@gradient.cis.upenn.edu>
Message-ID: <d11dcfba0605071814sab07943sc2865053f3f31a9f@mail.gmail.com>

On 5/7/06, Edward Loper <edloper at gradient.cis.upenn.edu> wrote:
> Talin wrote:
> >      Braces can be escaped using a backslash:
> >
> >          "My name is {0} :-\{\}".format('Fred')
> >
> >      Which would produce:
> >
> >          "My name is Fred :-{}"
>
> Do backslashes also need to be backslashed then?  If not, then what is
> the translation of this:?
>
>      r'abc\{%s\}' % 'x'

I believe the proposal is taking advantage of the fact that '\{' is
not interpreted as an escape sequence -- it is interpreted as a
literal backslash followed by an open brace:

>>> '\{'
'\\{'
>>> '\\{'
'\\{'
>>> r'\{'
'\\{'

Thus, for 'abc\{0\}'.format('x'), you should get an error because
there are no replacement fields in the format string.

STeVe
--
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From steven.bethard at gmail.com  Mon May  8 03:37:10 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 7 May 2006 19:37:10 -0600
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <445D5357.9060409@acm.org>
References: <445D5357.9060409@acm.org>
Message-ID: <d11dcfba0605071837k71e64b81p5709c5bfcb035a5c@mail.gmail.com>

On 5/6/06, Talin <talin at acm.org> wrote:
> I've updated PEP 3101 based on the feedback collected so far.
[snip]
>      Compound names are a sequence of simple names separated by
>      periods:
>
>          "My name is {0.name} :-\{\}".format(dict(name='Fred'))
>
>      Compound names can be used to access specific dictionary entries,
>      array elements, or object attributes.  In the above example, the
>      '{0.name}' field refers to the dictionary entry 'name' within
>      positional argument 0.

I'm still not a big fan of mixing together getitem-style access and
getattribute-style access.  That makes classes that support both
ambiguous in this context.  You either need to specify the order in
which these are checked (e.g. attribute then item or item then
attribute), or, preferably, you need to extend the syntax to allow
getitem-style access too.

Just to be clear, I'm not suggesting that you support anything more
than items and attributes.  So this is *not* a request to allow
arbitrary expressions.  In fact, the only use-case I see in the PEP
needs only item access, not attribute access, so maybe you could drop
attribute access?

Can't you just extend the syntax for *only* item access?  E.g. something like:

    "My name is {0[name]} :-\{\}".format(dict(name='Fred'))


STeVe
--
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From tim.peters at gmail.com  Mon May  8 03:58:08 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 7 May 2006 21:58:08 -0400
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445D9EAB.7010901@v.loewis.de>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<445C628F.3050801@canterbury.ac.nz> <445D9EAB.7010901@v.loewis.de>
Message-ID: <1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>

[Martin v. Löwis]
> ...
> Or, to put it in yet another way: whether or not commit privileges
> are restricted, you need to add the sprinters to the committers list
> first, unless you want to allow anonymous commits to these branches.
>
> Just so I am not misunderstood: it is technically fairly easy to add
> somebody to the committers list. So technically, there is no problem
> adding all sprinters.

The more realistically ;-) I try to picture the suggested
alternatives, the more sensible this one sounds.  Some people at the
sprint (like me, wrt the Iceland sprint) could volunteer to be
responsible for checking checkins for appropriateness, and in any case
everyone subscribed to python-checkins would see what's going on. 
That's a major goodness all by itself, for "more eyeballs" reasons. 
It should be possible to quickly stop anyone abusing the privilege.

However ... speaking as a PSF Director, I wouldn't be comfortable with
this unless sprinters signed a PSF contribution form beforehand.  Else
we get code in the PSF repository with clear-as-mud copyright and
licensing issues.  Having contribution forms in place would also ease
fears that sprint output might be "subverted" to non-open status.

From martin at v.loewis.de  Mon May  8 07:23:21 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 08 May 2006 07:23:21 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	
	<445C628F.3050801@canterbury.ac.nz> <445D9EAB.7010901@v.loewis.de>
	<1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>
Message-ID: <445ED5C9.8070903@v.loewis.de>

Tim Peters wrote:
> The more realistically ;-) I try to picture the suggested
> alternatives, the more sensible this one sounds.  Some people at the
> sprint (like me, wrt the Iceland sprint) could volunteer to be
> responsible for checking checkins for appropriateness, and in any case
> everyone subscribed to python-checkins would see what's going on. That's
> a major goodness all by itself, for "more eyeballs" reasons. It should
> be possible to quickly stop anyone abusing the privilege.

I completely agree. Just make sure you master the mechanics of adding
committers.

> However ... speaking as a PSF Director, I wouldn't be comfortable with
> this unless sprinters signed a PSF contribution form beforehand.  Else
> we get code in the PSF repository with clear-as-mud copyright and
> licensing issues.  Having contribution forms in place would also ease
> fears that sprint output might be "subverted" to non-open status.

That would be desirable, indeed. It shouldn't be too difficult to
collect the forms "on site".

Regards,
Martin


From talin at acm.org  Mon May  8 09:14:03 2006
From: talin at acm.org (Talin)
Date: Mon, 8 May 2006 07:14:03 +0000 (UTC)
Subject: [Python-Dev] PEP 3101 Update
References: <445D5357.9060409@acm.org>
	<d11dcfba0605071837k71e64b81p5709c5bfcb035a5c@mail.gmail.com>
Message-ID: <loom.20060508T091227-325@post.gmane.org>

Steven Bethard <steven.bethard <at> gmail.com> writes:

> I'm still not a big fan of mixing together getitem-style access and
> getattribute-style access.  That makes classes that support both
> ambiguous in this context.  You either need to specify the order in
> which these are checked (e.g. attribute then item or item then
> attribute), or, preferably, you need to extend the syntax to allow
> getitem-style access too.
> 
> Just to be clear, I'm not suggesting that you support anything more
> than items and attributes.  So this is *not* a request to allow
> arbitrary expressions.  In fact, the only use-case I see in the PEP
> needs only item access, not attribute access, so maybe you could drop
> attribute access?
> 
> Can't you just extend the syntax for *only* item access?  E.g. something like:
> 
>     "My name is {0[name]} :-\{\}".format(dict(name='Fred'))

I'm not opposed to the idea of adding item access, although I believe that
attribute access is also useful. In either case, it's not terribly hard to
implement.
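
For comparison, a sketch of how the two kinds of access look when kept
syntactically distinct, which is how str.format behaves in current
Python. The SimpleNamespace object is just a convenient stand-in for any
object with a `name` attribute.

```python
from types import SimpleNamespace

# Item access uses brackets, attribute access uses a dot.
by_item = "My name is {0[name]}".format(dict(name="Fred"))
by_attr = "My name is {0.name}".format(SimpleNamespace(name="Fred"))

# A plain dict has no 'name' attribute, so dotted access does not fall
# back to item lookup.
try:
    "My name is {0.name}".format(dict(name="Fred"))
    fell_back = True
except AttributeError:
    fell_back = False
```

With separate spellings, a class that supports both getitem and getattr
is no longer ambiguous in a format string.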

I'd like to hear what other people have to say on this issue.

-- Talin



From talin at acm.org  Mon May  8 09:14:44 2006
From: talin at acm.org (Talin)
Date: Mon, 8 May 2006 07:14:44 +0000 (UTC)
Subject: [Python-Dev] PEP 3101 Update
References: <445D5357.9060409@acm.org>
	<445DEC49.6050709@gradient.cis.upenn.edu>
	<d11dcfba0605071814sab07943sc2865053f3f31a9f@mail.gmail.com>
Message-ID: <loom.20060508T091424-845@post.gmane.org>

Steven Bethard <steven.bethard <at> gmail.com> writes:

> I believe the proposal is taking advantage of the fact that '\{' is
> not interpreted as an escape sequence -- it is interpreted as a
> literal backslash followed by an open brace:

This is exactly correct.

-- Talin



From edloper at gradient.cis.upenn.edu  Mon May  8 13:58:55 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Mon, 08 May 2006 07:58:55 -0400
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <d11dcfba0605071814sab07943sc2865053f3f31a9f@mail.gmail.com>
References: <445D5357.9060409@acm.org>	<445DEC49.6050709@gradient.cis.upenn.edu>
	<d11dcfba0605071814sab07943sc2865053f3f31a9f@mail.gmail.com>
Message-ID: <445F327F.3020104@gradient.cis.upenn.edu>

Steven Bethard wrote:
> On 5/7/06, Edward Loper <edloper at gradient.cis.upenn.edu> wrote:
>> Talin wrote:
>>>      Braces can be escaped using a backslash:
>>>
>>>          "My name is {0} :-\{\}".format('Fred')
>>>
>>>      Which would produce:
>>>
>>>          "My name is Fred :-{}"
>> Do backslashes also need to be backslashed then?  If not, then what is
>> the translation of this:?
>>
>>      r'abc\{%s\}' % 'x'
> 
> I believe the proposal is taking advantage of the fact that '\{' is
> not interpreted as an escape sequence -- it is interpreted as a
> literal backslash followed by an open brace:

Yes, I knew that fact, but it's not related to my question.  The basic 
issue is, that if you have a quote character ('\\'), then you usually 
need to be able to quote that quote character ('\\\\') in at least some 
contexts.

So I'll try again with a different example.  I'll avoid using raw 
strings, just because doing so seems to have confused a couple people 
about what my question is about.  Let's say I have a program now that 
contains the following line:

   print 'Foo\\%s' % x

And I'd like to translate that to use str.format.  How do I do it?  If I 
just replace %s with {0} then I get:

   print 'Foo\\{0}'.format(x)

but this will presumably raise an exception, since the '\\{' (which is 
identical to '\{') gets treated as a quoted brace, and the '}' is 
unmatched.  If it were possible to backslash backslashes, then I could do:

   print 'Foo\\\\{0}'.format(x)

(which can also be spelled r'Foo\\{0}' -- i.e., the string now contains 
two backslash characters).

But I haven't seen any mention of backslashing backslashes so far.
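
For readers on a current Python: str.format as it eventually shipped
escapes braces by doubling them rather than with a backslash, so the
backslash question above does not arise there. A sketch of the
translations under that assumption (this is not the syntax of the PEP
draft discussed here):

```python
x = "bar"

# Backslash has no special meaning to str.format, so the old and new
# spellings produce the same text.
old_style = 'Foo\\%s' % x
new_style = 'Foo\\{0}'.format(x)

# Literal braces are written by doubling them.
smiley = "My name is {0} :-{{}}".format("Fred")

# The raw-string example from earlier in the thread translates the
# same way: '{{' and '}}' are literal braces, '{0}' is the field.
raw_old = r'abc\{%s\}' % 'x'
raw_new = r'abc\{{{0}\}}'.format('x')
```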

-Edward

From tzot at mediconsa.com  Mon May  8 14:25:16 2006
From: tzot at mediconsa.com (Christos Georgiou)
Date: Mon, 8 May 2006 15:25:16 +0300
Subject: [Python-Dev] Visual studio 2005 express now free
References: <ca471dc20604210237h797824ccu796146dae44a4689@mail.gmail.com>
	<44491299.9040607@v.loewis.de>
Message-ID: <e3ndbn$fm5$1@sea.gmane.org>


"Martin v. Löwis" <martin at v.loewis.de> wrote in message 
news:44491299.9040607 at v.loewis.de...

> - Paul Moore has contributed a Python build procedure for the
>  free version of the 2003 compiler. This one is without IDE,
>  but still, it should allow people without a VS 2003 license
>  to work on Python itself; it should also be possible to develop
>  extensions with that compiler (although I haven't verified
>  that distutils would pick that up correctly).

AFAIR Paul Moore's procedure was based on some other instructions posted
previously somewhere on the net, which I followed at least a year back; it
was a little hard to fix all the variables etc., but I managed to build an
extension of mine (haar transformation for image comparisons) for 2.4
successfully, using just the setup.py of my module.



From root at renet.ru  Sat May  6 22:59:24 2006
From: root at renet.ru (Vladimir Yu. Stepanov)
Date: Sun, 7 May 2006 00:59:24 +0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <20060506025902.67DA.JCARLSON@uci.edu>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
Message-ID: <20060506205924.GA14684@fox.renet.ru>

On Sat, May 06, 2006 at 03:12:11AM -0700, Josiah Carlson wrote:
> 
> "Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> > Josiah Carlson wrote:
> > > This problem has nothing to do with dictionaries and hashing, it has to
> > > do with the fact that there may not be a total ordering on the elements
> > > of a sequence.
> > 
> > It is sad. I did not know it. Therefore and have not understood.
> > 
> > I have looked in Google on "python dev total ordering". On intentions to
> > correct this problem I have not found anything. It already earlier was
> > discussed ?
> 
> Not really.  It has to do with how non-comparable instances compare to
> each other.  Generally speaking, when instances aren't comparable,
> Python compares their type, which is actually comparing the names of
> types. This wouldn't be a big deal, except that:
> 
> >>> str < tuple < unicode
> True
> 
> And you can actually compare str and unicode, so, if you have a str that
> is greater than the unicode, you run into this issue.  With unicode
> becoming str in Py3k, we may not run into this issue much then, unless
> bytes are comparable to str, in which case we end up witht the same
> problems.
> 
> Actually, according to my google of "python dev total ordering", it
> gives me...
> 
> http://mail.python.org/pipermail/python-dev/2003-March/034169.html
> http://mail.python.org/pipermail/python-list/2003-March/154142.html
> 
> Which were earlier discussions on this topic, which are quite topical. 
> The ultimate solution is to choose a total ordering on types and
> consider the problem solved.  Then list.sort() and binary trees will
> work properly.
> 

Thanks for such a detailed answer.

The discussions at those links describe how things are, rather than
how to fix them. So I would like to raise the question of how to
fix this.

When two types are not comparable, it may make sense to compare
identifiers assigned to each concrete type. More precisely, a weight
would be calculated from them, and that weight would decide the result
when the types cannot be compared.

One possible solution to this problem is given below.

Add three fields to each data type in Python:
...
        int     tp_id;
        int     tp_qualifier;
        int     tp_weight;
...

Assign fixed identifiers (tp_id) to some built-in and external
types. The others receive free identifiers when they are
initialized. In addition, types should be classified by
their basic properties (tp_qualifier). A type can be
classified as:
0 = None
1 = integer (bool, int, long, etc..)
2 = float (decimal, float, etc..)
3 = math (complex, matrix may be ?? )
4 = string (str, unicode, splice, etc..)
5 = sequence (iterators, deque, xrange, enumerate, etc..)
6 = array (tuple, list, bytes, array, etc..)
7 = dictionary (dict, defdict, etc..)
8 = set (set, etc..)
9 = file (file, socket.socket, cStringIO.StringIO)

...
127 = non qualifier (sre.*, datetime.*, ClassType, InstanceType, etc..)

This is an approximate list by which, in my opinion, it would be
convenient to classify types.

The weight can be set explicitly in the structure or, if it is equal
to -1, calculated by the formula:

    ob_type->tp_weight = (ob_type->tp_qualifier<<24) + (ob_type->tp_id<<8)

Thus objects with similar properties will sort next to one
another. External types may pose a problem, but if the
classification is developed further, no problems should arise.
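
A minimal sketch of the weight formula above, written in Python rather
than as C struct fields. The qualifier table, the id counter, and the
names QUALIFIER, type_id, type_weight and weight_key are hypothetical
stand-ins for illustration, not part of any real type structure. The
sketch also sorts by weight alone, whereas the proposal would fall back
on the weight only when two types cannot be compared directly.

```python
# Hypothetical qualifier table, following the classification listed above.
QUALIFIER = {
    type(None): 0,
    bool: 1, int: 1,
    float: 2,
    complex: 3,
    str: 4,
    tuple: 6, list: 6,
    dict: 7,
    set: 8,
}
NON_QUALIFIED = 127

# Hypothetical per-type identifiers, handed out on first use.
_type_ids = {}


def type_id(tp):
    """Return a small stable identifier for a type (assumed < 65536)."""
    return _type_ids.setdefault(tp, len(_type_ids) + 1)


def type_weight(tp):
    """Combine qualifier and identifier as in the formula above."""
    qualifier = QUALIFIER.get(tp, NON_QUALIFIED)
    return (qualifier << 24) + (type_id(tp) << 8)


def weight_key(obj):
    """Sort key that looks only at the weight of the object's type."""
    return type_weight(type(obj))


values = ["b", 3, None, (1, 2), 2.5, {"k": 1}, "a", 1]
# Only integer weights are compared, so mixed types never raise TypeError.
ordered = sorted(values, key=weight_key)
```

Because the qualifier occupies the high bits, all types in one class
sort together regardless of the order in which identifiers were
assigned.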


-- 
SY. Vladimir Stepanov

From engelbert.gruber at ssg.co.at  Mon May  8 16:35:47 2006
From: engelbert.gruber at ssg.co.at (engelbert.gruber at ssg.co.at)
Date: Mon, 8 May 2006 16:35:47 +0200 (CEST)
Subject: [Python-Dev] rest2latex - pydoc writer - tables
In-Reply-To: <e2tbfe$akr$1@sea.gmane.org>
References: <ee2a432c0604182110w49a84a38iace62e9d9b3bf424@mail.gmail.com>
	<200604191457.24195.anthony@interlink.com.au>
	<1145424790.21740.86.camel@geddy.wooz.org>
	<ee2a432c0604182255t4bda0170g14b84c4ddb6b8a85@mail.gmail.com>
	<4445FADF.90307@python.net> <4451DE1B.2090804@ghaering.de>
	<4451E6C1.4040506@python.net>
	<Pine.LNX.4.64.0604281216430.5662@lx3.local>
	<e2tbfe$akr$1@sea.gmane.org>
Message-ID: <Pine.LNX.4.64.0605081626570.11462@lx3.local>

On Fri, 28 Apr 2006, Thomas Heller wrote:

> I must work on the docs themselves, so I currently have only two things:
>
> - the table in the ctypes tutorial has a totally different look than the other
>   tables in the docs.  Compare
>   http://docs.python.org/dev/lib/ctypes-simple-data-types.html
>   with
>   http://docs.python.org/dev/lib/module-struct.html

could you try docutils.sf.net sandbox pydoc-writer checkout
(svn or the snapshot or tomorrow
  http://docutils.sourceforge.net/sandbox/pydoc-writer/mkpydoc.py)

> - feature request: it would be very nice if it were possible to generate links
>   into the index for functions and types from the rest sources.

into the index means what ? into the module index ?

cheers

-- 
  --- Engelbert Gruber -------+
   SSG Fintl,Gruber,Lassnig  /
   A6170 Zirl   Innweg 5b   /
   Tel. ++43-5238-93535 ---+

From mcherm at mcherm.com  Mon May  8 18:25:06 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Mon, 08 May 2006 09:25:06 -0700
Subject: [Python-Dev] PEP 3101 Update
Message-ID: <20060508092506.wf13rak25334k8ko@login.werra.lunarpages.com>

One small comment:

>     The conversion specifier consists of a sequence of zero or more
>     characters, each of which can consist of any printable character
>     except for a non-escaped '}'.

"Escaped"? How are they escaped? (with '\'?) If so, how are backslashes
escaped (with '\\'?) And does the mechanism un-escape these for you
before passing them to __format__?


Later...
>     - Variable field width specifiers use a nested version of the {}
>       syntax, allowing the width specifier to be either a positional
>       or keyword argument:
>
>         "{0:{1}.{2}d}".format(a, b, c)

This violates the specification given above... it has non-escaped '}'
characters. Make up one rule and be consistent.

-- Michael Chermside


From talin at acm.org  Mon May  8 21:36:32 2006
From: talin at acm.org (Talin)
Date: Mon, 8 May 2006 19:36:32 +0000 (UTC)
Subject: [Python-Dev] PEP 3101 Update
References: <20060508092506.wf13rak25334k8ko@login.werra.lunarpages.com>
Message-ID: <loom.20060508T213350-595@post.gmane.org>

Michael Chermside <mcherm <at> mcherm.com> writes:

> One small comment:
> 
> >     The conversion specifier consists of a sequence of zero or more
> >     characters, each of which can consist of any printable character
> >     except for a non-escaped '}'.
> 
> "Escaped"? How are they escaped? (with '\'?) If so, how are backslashes
> escaped (with '\\'?) And does the mechanism un-escape these for you
> before passing them to __format__?
> 
> Later...
> >     - Variable field width specifiers use a nested version of the {}
> >       syntax, allowing the width specifier to be either a positional
> >       or keyword argument:
> >
> >         "{0:{1}.{2}d}".format(a, b, c)
> 
> This violates the specification given above... it has non-escaped '}'
> characters. Make up one rule and be consistent.

What would you suggest? I'd be interested in hearing what kinds of
ideas people have for fixing these problems.

---------------------------------------------
-- Talin




From mcherm at mcherm.com  Mon May  8 22:06:41 2006
From: mcherm at mcherm.com (Michael Chermside)
Date: Mon, 08 May 2006 13:06:41 -0700
Subject: [Python-Dev] PEP 3101 Update
Message-ID: <20060508130641.3sw6c13hh4tw8www@login.werra.lunarpages.com>

I wrote:
> >     - Variable field width specifiers use a nested version of the {}
> >       syntax, allowing the width specifier to be either a positional
> >       or keyword argument:
> >
> >         "{0:{1}.{2}d}".format(a, b, c)
>
> This violates [the rule that '}' must be escaped]

Talin writes:
> What would you suggest? I'd be interested in hearing what kinds of
> ideas people have for fixing these problems.

Honestly? I propose that the PEP not provide any means to pass
format specifiers as arguments. After all, one PEP need not support
everything anyone would ever want to do with string interpolation.
Fixed format attributes are powerful enough for nearly all
applications, and those needing a more powerful tool can either use
multiple interpolation passes (simple and obvious, but a bit ugly)
or define a custom formatting class.
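
As a rough illustration of the "multiple interpolation passes" idea,
here is a sketch using the str.format method that later shipped with
Python, whose details may differ from the draft under discussion. The
function name format_two_pass is made up for this example. The first
pass builds a fixed format string from the variable width and
precision; the second pass applies it to the value.

```python
def format_two_pass(value, width, precision):
    # Pass 1: build the fixed template. Doubled braces are literal
    # braces, so this yields something like "{0:8.2f}".
    template = "{{0:{0}.{1}f}}".format(width, precision)
    # Pass 2: apply the now-fixed template to the value.
    return template.format(value)


result = format_two_pass(3.14159, 8, 2)
```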

In fact, if _I_ were writing the PEP, I'd probably have opted for
simplicity in a few more areas. For one thing, where you have:

>    string.format() will format each field using the following steps:
>
>     1) See if the value to be formatted has a __format__ method.  If
>        it does, then call it.
>
>     2) Otherwise, check the internal formatter within string.format
>        that contains knowledge of certain builtin types.
>
>     3) Otherwise, call str() or unicode() as appropriate.

I would have left out (2) by adding __format__ methods to those
built-in types just to keep the description (and the implementation)
as simple as possible.
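
A small sketch of what "every type has a __format__ method" looks like
in practice, again using the protocol as it later shipped rather than
the draft text quoted above. The Celsius class is hypothetical; it
simply delegates to the __format__ of the float it wraps.

```python
class Celsius:
    """Hypothetical value type that formats itself."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # Reuse float's own __format__ for the number, then add a unit.
        # An empty spec falls back to one decimal place.
        return format(self.degrees, spec or ".1f") + " C"


with_spec = "{0:.2f}".format(Celsius(21.456))
without_spec = "{0}".format(Celsius(20))
# Built-in types answer the same protocol, so no special case is needed.
builtin_direct = int.__format__(42, "5d")
```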

And probably I would have omitted the whole ability to support custom
formatting classes. Not that they're not a nice idea, but they are a
separate feature which can be built on top of the basic format() method
and I would have tried to take one small step at a time.

But please don't read this as saying I object to the above ideas...
I wouldn't have said anything if you hadn't asked what my approach
would be.

-- Michael Chermside


From tim.peters at gmail.com  Mon May  8 23:09:16 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 May 2006 17:09:16 -0400
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445ED5C9.8070903@v.loewis.de>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<445C628F.3050801@canterbury.ac.nz> <445D9EAB.7010901@v.loewis.de>
	<1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>
	<445ED5C9.8070903@v.loewis.de>
Message-ID: <1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>

[Tim Peters, on giving commit privs to a sprinter after
 extracting a signed PSF contributor form]
>> The more realistically ;-) I try to picture the suggested
>> alternatives, the more sensible this one sounds. ...

[Martin v. Löwis]
> I completely agree. Just make sure you master the mechanics of adding
> committers.

So there's a practical problem:  is that procedure documented
somewhere?  And who is _able_ to do this?  For example, taking the
"you" literally, I doubt I have access to the machine(s) on which it
would need to be done.  I'd be happy to learn how to do it, and to
document the procedure (if it's not already documented -- I couldn't
find it), if someone can teach me the ropes.

>> However ... speaking as a PSF Director, I wouldn't be comfortable with
>> this unless sprinters signed a PSF contribution form beforehand.  Else
>> we get code in the PSF repository with clear-as-mud copyright and
>> licensing issues.  Having contribution forms in place would also ease
>> fears that sprint output might be "subverted" to non-open status.

> That would be desirable, indeed. It shouldn't be too difficult to
> collect the forms "on site".

More like herding cows than herding cats ;-)

From aahz at pythoncraft.com  Mon May  8 23:15:08 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 8 May 2006 14:15:08 -0700
Subject: [Python-Dev] PEP 3102: Keyword-only arguments
In-Reply-To: <740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
References: <4455A0CF.3080008@v.loewis.de> <e35ds3$p77$1@sea.gmane.org>
	<445644D1.3050200@v.loewis.de> <e36f6e$pns$1@sea.gmane.org>
	<44573D75.40007@canterbury.ac.nz>
	<ca471dc20605020744m333c855cqbf9016340ea5d699@mail.gmail.com>
	<445805EA.7000501@canterbury.ac.nz>
	<740c3aec0605031621v48f33dd0g591cb12a131764e8@mail.gmail.com>
	<44598128.8060506@canterbury.ac.nz>
	<740c3aec0605070429l6efc5884v34ada379e7690177@mail.gmail.com>
Message-ID: <20060508211507.GA19278@panix.com>

On Sun, May 07, 2006, BJörn Lindqvist wrote:
>
> I do know enough about Python to know that the make_person function is
> a really bad example. The one obvious way to write make_person is to
> use four positional arguments, name, age, phone, location and not
> complicate the function by throwing in required keyword arguments. It
> would be nice to instead see some real examples of the usefulness of
> the required keyword-only arguments. 

See previous discussion about super() -- I don't have time to look up
examples right now.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Argue for your limitations, and sure enough they're yours."  --Richard Bach

From thomas at python.org  Mon May  8 23:20:29 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 8 May 2006 23:20:29 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	basics (1)
In-Reply-To: <20060508201217.GA23430@python.psfb.org>
References: <20060508201217.GA23430@python.psfb.org>
Message-ID: <9e804ac0605081420g186b456fqe6cd9bf5fd03c270@mail.gmail.com>

On 5/8/06, Neal Norwitz <neal at metaslash.com> wrote:

> test_ctypes
> test test_ctypes failed -- Traceback (most recent call last):
>   File "/home/neal/python/trunk/Lib/ctypes/test/test_python_api.py", line 41, in test_PyInt_Long
>     self.failUnlessEqual(grc(42), ref42)
> AssertionError: 336 != 337
>

We've been seeing this error for a while now, and given the test, it isn't
entirely surprising. The test tries to do what regrtest -R:: also does:
check for refcount leakage. I'm not entirely sure why it's failing as I
can't reproduce it (although it could be because 42's refcount is actually
42 :) but it does seem to me that this kind of refleak-checking is somewhat
redundant in the Python testsuite, and it is obvious to me that the breakage
is precisely because it's being run in the Python testsuite (which is a lot
less reliable an environment than ctypes' own, standalone testsuite.)

Thomas, given the refcount-leakage-coverage the ctypes tests are getting in
the Python distribution, do you want to keep running these tests? Is there a
way to not run them in the Python testsuite, but still keep them for the
standalone tests? (If you want that, that is.) Alternatively, we could try
to fix the test. If the problem is indeed what I think it is (42's refcount
being 42 or 41 or 43 at the first grc() call there; I'm not sure) we could
perhaps work around it, using something like:

Index: Lib/ctypes/test/test_python_api.py
===================================================================
--- Lib/ctypes/test/test_python_api.py  (revision 45940)
+++ Lib/ctypes/test/test_python_api.py  (working copy)
@@ -35,6 +35,11 @@

     def test_PyInt_Long(self):
         ref42 = grc(42)
+        if ref42 == 42:
+            # 42's refcount was 42, but since grc() returned 42, it's now 43.
+            # We want to adjust ref42 without tossing our own reference to 42
+            # (or the refcount will go back down to 42.)
+            old42, ref42 = ref42, ref42 + 1
         pythonapi.PyInt_FromLong.restype = py_object
         self.failUnlessEqual(pythonapi.PyInt_FromLong(42), 42)

(Untested, since I can't reproduce, but I think I have it right.)

--
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From rhettinger at ewtllc.com  Mon May  8 23:29:22 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Mon, 08 May 2006 14:29:22 -0700
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	<445C628F.3050801@canterbury.ac.nz>
	<445D9EAB.7010901@v.loewis.de>	<1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>	<445ED5C9.8070903@v.loewis.de>
	<1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>
Message-ID: <445FB832.3060308@ewtllc.com>

Tim Peters wrote:

> [Tim Peters, on giving commit privs to a sprinter after
>  extracting a signed PSF contributor form]
>>> The more realistically ;-) I try to picture the suggested
>>> alternatives, the more sensible this one sounds. ...
>
> [Martin v. Löwis]
>> I completely agree. Just make sure you master the mechanics of adding
>> committers.

Part of the mechanics also involves getting the users set-up on their 
own machines.  For me, it was a complete PITA because of the 
Tortoise/Putty/Pageant/SSH2 dance and then trying to get Python to 
compile with the only compiler I had (MSVC++6).  The advantage of a 
separate sprint repository is that we can essentially leave it unsecured 
and make it easy for everyone to freely checkin / checkout and experiment. 

For the Iceland sprint and bug days, we need a procedure that 
minimizes the time lost for getting everyone set.  For bug days, it is 
especially critical because a half-day lost may eat-up most of the time 
available for contributing.  For the Iceland sprint, it is important 
because this is just one of many set-up tasks (others include 
downloading and compiling psyco, pypy, etc).


Raymond




From tim.peters at gmail.com  Mon May  8 23:55:30 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 8 May 2006 17:55:30 -0400
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445FB832.3060308@ewtllc.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>
	<445C628F.3050801@canterbury.ac.nz> <445D9EAB.7010901@v.loewis.de>
	<1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>
	<445ED5C9.8070903@v.loewis.de>
	<1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>
	<445FB832.3060308@ewtllc.com>
Message-ID: <1f7befae0605081455r57405da9sb51c325ce44454ff@mail.gmail.com>

[Raymond Hettinger]
> Part of the mechanics also involves getting the users set-up on their
> own machines.

Yes.

> For me, it was a complete PITA because of the
> Tortoise/Putty/Pageant/SSH2 dance and then trying to get Python to
> compile with the only compiler I had (MSVC++6).  The advantage of a
> separate sprint repository is that we can essentially leave it unsecured
> and make it easy for everyone to freely checkin / checkout and experiment.

This seems to be mixing up unrelated problems.  For example, the
repository in use has nothing to do with the fact that Python became
hard to compile under MSVC 6, right?

For a new library module, sure, any repository will do for development
(& if it contains new C code, then compiler setup is required
regardless of repository).  If, OTOH, someone is looking to tweak
existing code in the Python core, then svn.python.org is "the obvious"
place to work from.

You may (unsure) have been a unique case wrt
Tortoise/Putty/Pageant/SSH2:  I expect most open-source programmers
working on Windows already have those installed (e.g., I did, and long
before Python required them).  Tortoise isn't necessary, but the PuTTY
tools are.  I expect (but don't know) that learning how to use SVN (or
SVK) correctly would be a bigger setup cost for those who haven't
already used them.

Signing a PSF contrib form and getting an SSH2 key installed for
svn.python.org are definitely hassles.  OTOH, new code can't ever get
into Python without signing that form, and collecting forms at a
sprint is a lot easier for most people than hassling with snail mail
or faxing.

So I'd like to collect forms at the sprint regardless of which
repository is in use.

> For the Iceland sprint and bug days, we need a procedure that
> minimizes the time lost for getting everyone set.  For bug days, it is
> especially critical because a half-day lost may eat-up most of the time
> available for contributing.  For the Iceland sprint, it is important
> because this is just one of many set-up tasks (others include
> downloading and compiling psyco, pypy, etc).

I'm far less hopeful about easing the pain for bug days -- that's
overwhelmingly a repeated "one person generates a small patch, then
bugs someone else to review and commit" dance.  Sprints are more about
people actively collaborating, and across days.

From martin at v.loewis.de  Tue May  9 00:20:09 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 09 May 2006 00:20:09 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	
	<445C628F.3050801@canterbury.ac.nz> <445D9EAB.7010901@v.loewis.de>	
	<1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>	
	<445ED5C9.8070903@v.loewis.de>
	<1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>
Message-ID: <445FC419.8020505@v.loewis.de>

Tim Peters wrote:
> [Martin v. Löwis]
>> I completely agree. Just make sure you master the mechanics of adding
>> committers.
> 
> So there's a practical problem:  is that procedure documented
> somewhere?

I once posted the procedure to all current pydotorg project admins.
I hesitate to post the precise instructions to a public mailing list
(security by obscurity - but you hackers should all know there is
real security in place also).

>  And who is _able_ to do this?  For example, taking the
> "you" literally, I doubt I have access to the machine(s) on which it
> would need to be done.

Actually, you (tim.peters) have the necessary permissions.

> I'd be happy to learn how to do it, and to
> document the procedure (if it's not already documented -- I couldn't
> find it), if someone can teach me the ropes.

Sure will do (in private mail). Maybe I'm just paranoid - if you
think the procedure should be published, I wouldn't object.

Regards,
Martin


From martin at v.loewis.de  Tue May  9 00:23:41 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 09 May 2006 00:23:41 +0200
Subject: [Python-Dev] Python sprint mechanics
In-Reply-To: <445FB832.3060308@ewtllc.com>
References: <1f7befae0605042216t6da65be1v736899a11de4a719@mail.gmail.com>	<445C628F.3050801@canterbury.ac.nz>
	<445D9EAB.7010901@v.loewis.de>	<1f7befae0605071858r3a52c837q44e842b6918fc234@mail.gmail.com>	<445ED5C9.8070903@v.loewis.de>
	<1f7befae0605081409i78eb8f4t56e7ba86ff51d4ea@mail.gmail.com>
	<445FB832.3060308@ewtllc.com>
Message-ID: <445FC4ED.1030200@v.loewis.de>

Raymond Hettinger wrote:
> Part of the mechanics also involves getting the users set-up on their
> own machines.  For me, it was a complete PITA because of the
> Tortoise/Putty/Pageant/SSH2 dance and then trying to get Python to
> compile with the only compiler I had (MSVC++6).  The advantage of a
> separate sprint repository is that we can essentially leave it unsecured
> and make it easy for everyone to freely checkin / checkout and experiment.
> For the Iceland sprint and bug days, we need a procedure that
> minimizes the time lost for getting everyone set.  For bug days, it is
> especially critical because a half-day lost may eat-up most of the time
> available for contributing.  For the Iceland sprint, it is important
> because this is just one of many set-up tasks (others include
> downloading and compiling psyco, pypy, etc).

If manual merging-back of patches is acceptable, sprinters (and bug-day
people) could just operate on the 2.5a2 source release, or (if they
manage to master Subversion) on an anonymous subversion checkout.

Of course, putting either the trunk or the 2.5aX sources into another
repository might also work - except that anybody doing the merge-back
would have to come up with sensible commit messages, and (ideally)
sensible attributions.

Regards,
Martin

From theller at python.net  Tue May  9 08:53:46 2006
From: theller at python.net (Thomas Heller)
Date: Tue, 09 May 2006 08:53:46 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	basics (1)
In-Reply-To: <9e804ac0605081420g186b456fqe6cd9bf5fd03c270@mail.gmail.com>
References: <20060508201217.GA23430@python.psfb.org>
	<9e804ac0605081420g186b456fqe6cd9bf5fd03c270@mail.gmail.com>
Message-ID: <e3pe9q$35h$1@sea.gmane.org>

Thomas Wouters wrote:
> On 5/8/06, Neal Norwitz <neal at metaslash.com> wrote:
> 
>> test_ctypes
>> test test_ctypes failed -- Traceback (most recent call last):
>>   File "/home/neal/python/trunk/Lib/ctypes/test/test_python_api.py", line 41, in test_PyInt_Long
>>     self.failUnlessEqual(grc(42), ref42)
>> AssertionError: 336 != 337
>>
> 
> We've been seeing this error for a while now, and given the test, it isn't
> entirely surprising. The test tries to do what regrtest -R:: also does:
> check for refcount leakage. I'm not entirely sure why it's failing as I
> can't reproduce it (although it could be because 42's refcount is actually
> 42 :) but it does seem to me that this kind of refleak-checking is somewhat
> redundant in the Python testsuite, and it is obvious to me that the breakage
> is precisely because it's being run in the Python testsuite (which is a lot
> less reliable an environment than ctypes' own, standalone testsuite.)

I'm not sure I understand this analysis - how can the refcount be 42 and then,
a little bit later, be 336 or 337?

> Thomas, given the refcount-leakage-coverage the ctypes tests are getting in
> the Python distribution, do you want to keep running these tests? Is there a
> way to not run them in the Python testsuite, but still keep them for the
> standalone tests? (If you want that, that is.)

Yes there is.

> Alternatively, we could try
> to fix the test. If the problem is indeed what I think it is (42's refcount
> being 42 or 41 or 43 at the first grc() call there; I'm not sure) we could
> perhaps work around it, using something like:

Hm, an easier way would be to use an integer other than 42 which cannot be its
own refcount, say, a negative value, or a much larger value like 12345678.

Thomas (H.)


From thomas at python.org  Tue May  9 10:35:32 2006
From: thomas at python.org (Thomas Wouters)
Date: Tue, 9 May 2006 10:35:32 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	basics (1)
In-Reply-To: <e3pe9q$35h$1@sea.gmane.org>
References: <20060508201217.GA23430@python.psfb.org>
	<9e804ac0605081420g186b456fqe6cd9bf5fd03c270@mail.gmail.com>
	<e3pe9q$35h$1@sea.gmane.org>
Message-ID: <9e804ac0605090135x16d8236cka9e366062cdaf8be@mail.gmail.com>

On 5/9/06, Thomas Heller <theller at python.net> wrote:
>
> Thomas Wouters wrote:
> > On 5/8/06, Neal Norwitz <neal at metaslash.com> wrote:
> >
> >> test_ctypes
> >> test test_ctypes failed -- Traceback (most recent call last):
> >>   File "/home/neal/python/trunk/Lib/ctypes/test/test_python_api.py", line 41, in test_PyInt_Long
> >>     self.failUnlessEqual(grc(42), ref42)
> >> AssertionError: 336 != 337
> >>
> >
> > We've been seeing this error for a while now, and given the test, it isn't
> > entirely surprising. The test tries to do what regrtest -R:: also does:
> > check for refcount leakage. I'm not entirely sure why it's failing as I
> > can't reproduce it (although it could be because 42's refcount is actually
> > 42 :) but it does seem to me that this kind of refleak-checking is somewhat
> > redundant in the Python testsuite, and it is obvious to me that the breakage
> > is precisely because it's being run in the Python testsuite (which is a lot
> > less reliable an environment than ctypes' own, standalone testsuite.)
>
> I'm not sure I understand this analysis - how can the refcount be 42 and then,
> a little bit later, be 336 or 337?


Eh, yes,  obviously, it can't, and I wasn't paying enough attention. I guess
something else is getting a reference to 42 somewhere, which'd have to be in
the unittest internals. And since those can change at any time, there's no
way to run the test reliably unless there are no external calls between the
two getrefcount calls.
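
To make the "no external calls between the two getrefcount calls" point
concrete, here is a minimal sketch that measures a refcount change on a
private object, with nothing but the operation under test between the
two readings. It relies on CPython's reference counting, the helper
name refcount_delta is made up for this example, and it is not the
ctypes test itself.

```python
import sys


def refcount_delta(obj):
    """Measure the refcount change caused by taking one extra reference."""
    before = sys.getrefcount(obj)
    extra = obj  # the only operation between the two readings
    after = sys.getrefcount(obj)
    del extra
    return after - before


# A fresh object() is referenced by nobody else, unlike a shared small int
# such as 42, which the interpreter and libraries may hold at any time.
delta = refcount_delta(object())
```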

> > Thomas, given the refcount-leakage-coverage the ctypes tests are getting in
> > the Python distribution, do you want to keep running these tests? Is there a
> > way to not run them in the Python testsuite, but still keep them for the
> > standalone tests? (If you want that, that is.)
>
> Yes there is.


I'd suggest doing that, then.

> > Alternatively, we could try
> > to fix the test. If the problem is indeed what I think it is (42's refcount
> > being 42 or 41 or 43 at the first grc() call there; I'm not sure) we could
> > perhaps work around it, using something like:
>
> Hm, an easier way would be to use an integer other than 42 which cannot be its
> own refcount, say, a negative value, or a much larger value like 12345678.


I don't think that'll work, because they aren't interned. The refcount of
the constant 12345678 and of the result of pythonapi.PyInt_FromLong(12345678)
will be unrelated.
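
The difference can be seen from plain Python. This sketch assumes
CPython, where a small range of integers is cached and shared while
larger values are created fresh each time; that is an implementation
detail rather than a language guarantee. Building the values with int()
from strings keeps the compiler from merging equal constants.

```python
# Small integers come from a shared cache, so both names refer to the
# same object and therefore share one refcount.
small_a = int("42")
small_b = int("42")
small_shared = small_a is small_b

# Large integers are separate objects, each with its own refcount, so
# counting references to one says nothing about the other.
big_a = int("12345678")
big_b = int("12345678")
big_shared = big_a is big_b
```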

--
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From theller at python.net  Tue May  9 20:23:08 2006
From: theller at python.net (Thomas Heller)
Date: Tue, 09 May 2006 20:23:08 +0200
Subject: [Python-Dev] rest2latex - pydoc writer - tables
In-Reply-To: <Pine.LNX.4.64.0605081626570.11462@lx3.local>
References: <ee2a432c0604182110w49a84a38iace62e9d9b3bf424@mail.gmail.com>	<200604191457.24195.anthony@interlink.com.au>	<1145424790.21740.86.camel@geddy.wooz.org>	<ee2a432c0604182255t4bda0170g14b84c4ddb6b8a85@mail.gmail.com>	<4445FADF.90307@python.net>
	<4451DE1B.2090804@ghaering.de>	<4451E6C1.4040506@python.net>	<Pine.LNX.4.64.0604281216430.5662@lx3.local>	<e2tbfe$akr$1@sea.gmane.org>
	<Pine.LNX.4.64.0605081626570.11462@lx3.local>
Message-ID: <4460DE0C.40501@python.net>

engelbert.gruber at ssg.co.at wrote:
> On Fri, 28 Apr 2006, Thomas Heller wrote:
> 
>> I must work on the docs themselves, so I currently have only two things:
>>
>> - the table in the ctypes tutorial has a totally different look than the other
>>   tables in the docs.  Compare
>>   http://docs.python.org/dev/lib/ctypes-simple-data-types.html
>>   with
>>   http://docs.python.org/dev/lib/module-struct.html
> 
> could you try docutils.sf.net sandbox pydoc-writer checkout
> (svn or the snapshot or tomorrow
>   http://docutils.sourceforge.net/sandbox/pydoc-writer/mkpydoc.py)

The table looks perfect now - thanks!

>> - feature request: it would be very nice if it were possible to generate links
>>   into the index for functions and types from the rest sources.
> 
> into the index means what ? into the module index ?

Into the library index.  I assume that requires definitions contained in
\begin{funcdesc}...\end{funcdesc} blocks (or datadesc, methoddesc, and so on).

Thomas


From theller at python.net  Tue May  9 22:22:09 2006
From: theller at python.net (Thomas Heller)
Date: Tue, 09 May 2006 22:22:09 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	basics (1)
In-Reply-To: <9e804ac0605090135x16d8236cka9e366062cdaf8be@mail.gmail.com>
References: <20060508201217.GA23430@python.psfb.org>	<9e804ac0605081420g186b456fqe6cd9bf5fd03c270@mail.gmail.com>	<e3pe9q$35h$1@sea.gmane.org>
	<9e804ac0605090135x16d8236cka9e366062cdaf8be@mail.gmail.com>
Message-ID: <e3qtlh$bio$1@sea.gmane.org>

Thomas Wouters wrote:
> On 5/9/06, Thomas Heller <theller at python.net> wrote:
>>
>> Thomas Wouters wrote:
>> > On 5/8/06, Neal Norwitz <neal at metaslash.com> wrote:
>> >
>> >> test_ctypes
>> >> test test_ctypes failed -- Traceback (most recent call last):
>> >>   File "/home/neal/python/trunk/Lib/ctypes/test/test_python_api.py", line 41, in test_PyInt_Long
>> >>     self.failUnlessEqual(grc(42), ref42)
>> >> AssertionError: 336 != 337
>> >>

>> > Thomas, given the refcount-leakage-coverage the ctypes tests are getting in
>> > the Python distribution, do you want to keep running these tests? Is there a
>> > way to not run them in the Python testsuite, but still keep them for the
>> > standalone tests? (If you want that, that is.)
>>
>> Yes there is.
> 
> 
> I'd suggest doing that, then.

This is done, now.  And thanks for the analysis.

Thomas


From blais at furius.ca  Tue May  9 23:31:36 2006
From: blais at furius.ca (Martin Blais)
Date: Tue, 9 May 2006 17:31:36 -0400
Subject: [Python-Dev] Alternative path suggestion
In-Reply-To: <b348a0850605071507s5376186fk93fd3fefad67c602@mail.gmail.com>
References: <b348a0850605021602h6a3f0e5dq7ec847a9b99f964f@mail.gmail.com>
	<b348a0850605061521w3c06e181hed0ae3a9aa49fe25@mail.gmail.com>
	<b348a0850605071507s5376186fk93fd3fefad67c602@mail.gmail.com>
Message-ID: <8393fff0605091431y5e446fackb17c38c2776478ba@mail.gmail.com>

Unfortunately, I have no time to read all the details on this
interesting thread (I wish), but I also have an alternative path
implementation, feel free to scavenge anything from it you might find
useful.   I put it here:

http://furius.ca/pubcode/pub/conf/common/lib/python/ipath.py.html

I think it is similar in spirit to what you proposed, a path being
treated as a tuple, without the separators.  This only makes sense. 
Using this class solved some really annoying problems while working on
a project where I developed on my Linux laptop and released on the
client's Windows machines.

(I wish I could contribute in a more participatory way, but I'm really
really swamped.)

Anyway, +1 for a path encapsulation that is not a string, and that
abstracts case sensitivity and path separators automatically.

cheers,




On 5/7/06, Noam Raphael <noamraph at gmail.com> wrote:
> Hello all again!
>
> Thanks to Mike's suggestion, I now opened a new wiki page,
> AlternativePathDiscussion, in
> http://wiki.python.org/moin/AlternativePathDiscussion
>
> The idea is to organize the discussion by dividing it into multiple
> sections, and seeing what is agreed and what should be further
> discussed. I now wrote there what I think, and quoted a few opinions
> from other posts. The posts by others are only a minority - what's
> written there currently is mostly what I think. I'm sorry for the
> inconvenience, but please go there and post your opinions (you can
> copy and paste from your emails, of course).
>
> I apologize first for not replying to each post, and second for only
> writing my opinions in the wiki. I simply write pretty slowly in
> English, and am already developing a growing sleep-hours deficit. I
> hope you forgive me.
>
> Have a good day,
> Noam
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/blais%40furius.ca
>


--
Martin Blais
Furius Python Training -- http://furius.ca/training/

From lcaamano at gmail.com  Wed May 10 03:30:06 2006
From: lcaamano at gmail.com (Luis P Caamano)
Date: Tue, 9 May 2006 21:30:06 -0400
Subject: [Python-Dev] possible use of __decorates__ in
	functools.decorator
In-Reply-To: <c56e219d0605070835y28cef932odde47906546ffcac@mail.gmail.com>
References: <c56e219d0605070835y28cef932odde47906546ffcac@mail.gmail.com>
Message-ID: <c56e219d0605091830y1cbac0bam50e9dae885d8a466@mail.gmail.com>

Turns out I didn't need to use the __decorates__ attribute to get the
name of the class.  I posted on clp (comp.lang.python, which is what I
should've done in the first place) and got an answer back pretty
quickly.  After a "duh!", a slap on the forehead, and a couple of
"of course" comments, it was as simple as saving the original caller
via a sys._getframe(2) call in the decorator factory.
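
A minimal sketch of that trick, for anyone who hits the same problem.
The names tagged, defined_in and Widget are made up for illustration,
and the frame depth is an assumption: with the factory called directly
in the class body, depth 1 already reaches the class-body frame, while
a setup with one more call layer would need depth 2.  Note that
sys._getframe is a CPython implementation detail.

```python
import sys


def tagged():
    """Hypothetical decorator factory that records the enclosing class name."""
    # The factory runs while the class body is executing, so the caller's
    # frame is the class body; in CPython its code object is named after
    # the class being defined.
    class_name = sys._getframe(1).f_code.co_name

    def decorator(func):
        func.defined_in = class_name  # illustrative attribute name
        return func

    return decorator


class Widget:
    @tagged()
    def draw(self):
        return "drawn"
```

The class name is captured when the factory is called, before the
class object itself exists, which is why a frame lookup is needed at
all.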

Sorry for the noise.

--
Luis P Caamano
Atlanta, GA USA

From arigo at tunes.org  Wed May 10 13:08:24 2006
From: arigo at tunes.org (Armin Rigo)
Date: Wed, 10 May 2006 13:08:24 +0200
Subject: [Python-Dev] New methods for weakref.Weak*Dictionary types
In-Reply-To: <1f7befae0605011357s6427602cnc6ca6a8bfa442ec8@mail.gmail.com>
References: <200605011614.30613.fdrake@acm.org>
	<1f7befae0605011357s6427602cnc6ca6a8bfa442ec8@mail.gmail.com>
Message-ID: <20060510110824.GA3892@code0.codespeak.net>

Hi Tim,

On Mon, May 01, 2006 at 04:57:06PM -0400, Tim Peters wrote:
> """
>     # Return a list of weakrefs to all the objects in the collection.
>     # Because a weak dict is used internally, iteration is dicey (the
>     # underlying dict may change size during iteration, due to gc or
>     # activity from other threads).

But then, isn't the real problem the fact that applications cannot
safely iterate over weak dicts?  This fact could be viewed as a bug, and
fixed without API changes.  For example, I can imagine returning to the
client an iterator that "locks" the dictionary.  Upon exhaustion, or via
the __del__ of the iterator, or even in the 'finally:' part of the
generator if that's how iteration is implemented, the dict is unlocked.
Here "locking" means that weakrefs going away during this time are not
eagerly removed from the dict; they will be removed only when the dict
is unlocked.


A bientot,

Armin.

From tim.peters at gmail.com  Wed May 10 19:08:11 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 10 May 2006 13:08:11 -0400
Subject: [Python-Dev] New methods for weakref.Weak*Dictionary types
In-Reply-To: <20060510110824.GA3892@code0.codespeak.net>
References: <200605011614.30613.fdrake@acm.org>
	<1f7befae0605011357s6427602cnc6ca6a8bfa442ec8@mail.gmail.com>
	<20060510110824.GA3892@code0.codespeak.net>
Message-ID: <1f7befae0605101008g42190d60nc163d53eb3095c76@mail.gmail.com>

[Tim Peters]
>> """
>>     # Return a list of weakrefs to all the objects in the collection.
>>     # Because a weak dict is used internally, iteration is dicey (the
>>     # underlying dict may change size during iteration, due to gc or
>>     # activity from other threads).

[Armin Rigo]
> But then, isn't the real problem the fact that applications cannot
> safely iterate over weak dicts?

Well, in the presence of threads, iterating over any dict (weak or
not) may blow up:  while some thread is iterating over the dict, other
threads can change the size of the dict, and then the dict iterator
will blow up the next time it's invoked.  In the context from which
that comment was extracted, that's a potential problem (which is why
the comment mentions "activity from other threads" :-)).

> This fact could be viewed as a bug, and fixed without API changes.
>  For example, I can imagine returning to the client an iterator that "locks" the
> dictionary.  Upon exhaustion, or via the __del__ of the iterator, or even in the
> 'finally:' part of the generator if that's how iteration is implemented, the dict is
> unlocked.
>
> Here "locking" means that weakrefs going away during this time are not
> eagerly removed from the dict; they will be removed only when the dict
> is unlocked.

That could remove one source of potential iteration surprises unique
to weak dicts, due to "magical" removal of dict entries (note that it
would probably need a thread-safe count of outstanding iterators, and
not "unlock the dict" until the count fell to 0).  Other threads could
still change the dict's size _seemingly_ "by magic" (from the dict
iterator's POV).  I don't know whether fixing "magical" weak-dict
removal without fixing "seemingly magical" weak-dict removal or
addition via other threads would be worth the bother.  Anyone burned
by either now has learned to avoid the iter{keys,values,items}()
methods.
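
To make the "locking" idea concrete, here is a rough single-threaded
sketch.  Everything in it is hypothetical (the class name
GuardedWeakValueDict, the attribute names, the itervalues method); it
is not how weakref is implemented, and the plain integer counter is
not the thread-safe count that would really be needed.

```python
import weakref


class GuardedWeakValueDict:
    """Hypothetical weak-value mapping that defers removals while iterating."""

    def __init__(self):
        self._data = {}      # key -> weakref to the value
        self._iterators = 0  # number of outstanding iterators ("lock" depth)
        self._pending = []   # keys whose values died while "locked"

    def __setitem__(self, key, value):
        owner_ref = weakref.ref(self)

        def on_death(ref, key=key):
            owner = owner_ref()
            if owner is None:
                return
            if owner._iterators:
                # "Locked": remember the key instead of resizing the dict.
                owner._pending.append(key)
            elif owner._data.get(key) is ref:
                del owner._data[key]

        self._data[key] = weakref.ref(value, on_death)

    def __len__(self):
        return len(self._data)

    def itervalues(self):
        self._iterators += 1
        try:
            for ref in self._data.values():
                value = ref()
                if value is not None:
                    yield value
        finally:
            # Runs on exhaustion, on close(), or when the generator is
            # garbage collected.
            self._iterators -= 1
            if self._iterators == 0:
                self._commit_removals()

    def _commit_removals(self):
        # "Unlock": drop the entries whose values died during iteration.
        while self._pending:
            key = self._pending.pop()
            ref = self._data.get(key)
            if ref is not None and ref() is None:
                del self._data[key]
```

Dead entries are skipped during iteration and only deleted once the
last iterator finishes, so the underlying dict never changes size
under the iterator because of garbage collection.  Changes made by
other threads are not covered by this.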

Without more support in the dict implementation (and support that
would probably be difficult to add), the only thoroughly safe strategy
is to atomically materialize a hidden collection of the keys, values,
or items to be iterated over, and have the iterator march over those. 
In effect, apps do that themselves now by iterating over
.keys()/values()/items() instead of their .iterXYZ() versions.  Many
apps can get away without that, though, so there's value in keeping
the current "obvious" dict iterators.
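
A small sketch of that do-it-yourself strategy.  It is written for
Python 3, where values() no longer returns a list, so the snapshot has
to be spelled list(...); Node, registry and snapshot_names are
illustrative names only.

```python
import weakref


class Node:
    """Placeholder value type; weak references need a class instance."""

    def __init__(self, name):
        self.name = name


def snapshot_names(weak_dict):
    # list(...) takes strong references to the values that are alive
    # right now, so entries vanishing afterwards cannot disturb the loop.
    snapshot = list(weak_dict.values())
    return sorted(node.name for node in snapshot)


nodes = [Node("a"), Node("b"), Node("c")]
registry = weakref.WeakValueDictionary()
for node in nodes:
    registry[node.name] = node
```

The cost is one extra list per iteration, which is the trade-off
against the lazy iterators.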

From martin at v.loewis.de  Wed May 10 21:24:53 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 10 May 2006 21:24:53 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <4461EB0A.5000800@egenix.com>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de> <4461EB0A.5000800@egenix.com>
Message-ID: <44623E05.4080700@v.loewis.de>

M.-A. Lemburg wrote:
> I've tried to find documentation on _dosmaperr() but there's
> nothing on MSDN. Is this an official API ?

No. It always was undocumented, and apparently intentionally so
(both the API, and the precise details of the mapping).

> FWIW, this is the official system error code table:
> 
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/debug/base/system_error_codes.asp

This is actually the contents of winerror.h (in a different presentation
form).

> A complete table would be quite large...

Actually, no. _dosmaperr maps the majority of the codes to EINVAL
(invalid argument). Only a small subset is special-cased.
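
The shape of such a mapping can be sketched as a small table with a
default.  The entries below are only an illustrative subset chosen
from well-known winerror.h values; the real table is undocumented and
is not reproduced here, and win32_to_errno is a made-up name.

```python
import errno

# Illustrative subset only; not the actual table used by the C runtime.
_WIN32_TO_ERRNO = {
    2: errno.ENOENT,    # ERROR_FILE_NOT_FOUND
    3: errno.ENOENT,    # ERROR_PATH_NOT_FOUND
    5: errno.EACCES,    # ERROR_ACCESS_DENIED
    80: errno.EEXIST,   # ERROR_FILE_EXISTS
    183: errno.EEXIST,  # ERROR_ALREADY_EXISTS
}


def win32_to_errno(winerror):
    """Map a Win32 error code to an errno value, defaulting to EINVAL."""
    return _WIN32_TO_ERRNO.get(winerror, errno.EINVAL)
```

Because several Win32 codes collapse onto one errno value, and most
collapse onto EINVAL, the mapping loses information in one direction.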

> Right, but the APIs you changed used to raise IOError
> using the DOS error codes - this is where there's an
> incompatibility, since they now raise WindowsErrors
> with completely different error codes.
> 
> The discussion
> 
> http://mail.python.org/pipermail/python-dev/2006-January/060243.html
> 
> suggests that you are aware of this.

I fully understand that this is an incompatible change; sure.

I'm saying that changing WindowsError to set errno
to DOS error codes would *also* be an incompatible change.

> Since code catching IOError will also see any WindowsError
> exception, I'd opt for making .errno always return the
> DOS/BSD error codes and have WindowsError grow an
> additional .winerrno attribute which contains the
> more fine-grained Win32 error code.

Ok. I'll try to come up with a patch, but let me repeat
this: even though this would nearly restore the behavior
for the functions that I changed, it would be *a different*
incompatible change, since it would change the behavior
of the places that *currently* (i.e. in 2.4)
raise WindowsError.

Regards,
Martin


From mal at egenix.com  Wed May 10 21:51:24 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 10 May 2006 21:51:24 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <44623E05.4080700@v.loewis.de>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>
Message-ID: <4462443C.7010608@egenix.com>

Martin v. Löwis wrote:
> I'm saying that changing WindowsError to set errno
> to DOS error codes would *also* be an incompatible change.
> 
>> Since code catching IOError will also see any WindowsError
>> exception, I'd opt for making .errno always return the
>> DOS/BSD error codes and have WindowsError grow an
>> additional .winerrno attribute which contains the
>> more fine-grained Win32 error code.
> 
> Ok. I'll try to come up with a patch, but let me repeat
> this: even though this would nearly restore the behavior
> for the functions that I changed, it would be *a different*
> incompatible change, since it would change the behavior
> of the places that *currently* (i.e. in 2.4)
> raise WindowsError.

Thanks !

I know that this will be a different incompatible change,
but only for the better, I think.

The only way around this breakage I see would be to introduce
the ErrorCode integer sub-class I posted on the checkins list,
which allows .errno to compare equal to the .winerrno values
as well:

# Error code objects
class ErrorCode(int):
    values = ()
    def __new__(cls, basevalue, *aliasvalues):
        return int.__new__(cls, basevalue)
    def __init__(self, basevalue, *aliasvalues):
        self.values = (basevalue,) + aliasvalues
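    def __cmp__(self, other):
        # Equal to the base value or to any of its aliases.
        if other in self.values:
            return 0
        # Otherwise order by the base value (comparing against the
        # tuple self.values would always take the same branch).
        if other < int(self):
            return 1
        else:
            return -1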

EEXISTS = ErrorCode(17, 183, 80)

I'm not sure whether it's worth the trouble.
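
For readers on Python 3, where __cmp__ is gone, the same idea could be
sketched with rich comparisons.  This is an adaptation under that
assumption, not the class as posted, and it only covers equality.

```python
class ErrorCode(int):
    """Python 3 sketch of an int that also equals its alias values."""

    def __new__(cls, basevalue, *aliasvalues):
        self = super().__new__(cls, basevalue)
        self.values = (basevalue,) + aliasvalues
        return self

    def __eq__(self, other):
        if isinstance(other, int):
            return int(other) in self.values
        return NotImplemented

    def __ne__(self, other):
        # int defines its own __ne__, so it has to be overridden too,
        # otherwise == and != would disagree for alias values.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ clears __hash__; restore hashing by base value.
    # Caveat: an alias compares equal but hashes differently, so these
    # objects are unreliable as dict keys or set members.
    __hash__ = int.__hash__


EEXISTS = ErrorCode(17, 183, 80)
```

With this, a check such as e.errno == 183 and a check such as
e.errno == 17 would both succeed for the same exception, which is the
compatibility effect being discussed.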

BTW, and intended as offer for compromise, should we instead
add the Win32 codes to the errno module (or a new winerrno
module) ?! I can write a parser that takes winerror.h and
generates the module code.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 10 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Wed May 10 22:13:23 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 10 May 2006 22:13:23 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <4462443C.7010608@egenix.com>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de> <4462443C.7010608@egenix.com>
Message-ID: <44624963.4020604@v.loewis.de>

M.-A. Lemburg wrote:
> BTW, and intended as offer for compromise, should we instead
> add the Win32 codes to the errno module (or a new winerrno
> module) ?! I can write a parser that takes winerror.h and
> generates the module code.

Instead won't help: the breakage will still occur. It would
be possible to put symbolic constants into the except clauses,
instead of using the numeric values, but you still have to
add all these "except WindowsError,e" clauses, even if
the constants were available.

However, in addition would be useful: people will want to
check for specific Win32 error codes, just because they are
more descriptive. I propose to call the module "winerror"
(in parallel to winerror.h, just as the errno module
parallels errno.h)
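
A generator for such a module could be quite small.  A rough sketch,
assuming the header uses plain lines of the form
"#define ERROR_NAME 123L"; the function names and the SAMPLE text are
made up for illustration, and anything more complicated in the real
header (hex values, macros) is simply skipped.

```python
import re

# Assumed header layout: "#define ERROR_SOME_NAME   123L"
_DEFINE_RE = re.compile(r"^\s*#define\s+(ERROR_\w+)\s+(\d+)L?\b")


def parse_winerror(header_text):
    """Return {name: code} for the simple numeric ERROR_* defines."""
    codes = {}
    for line in header_text.splitlines():
        match = _DEFINE_RE.match(line)
        if match:
            codes[match.group(1)] = int(match.group(2))
    return codes


def generate_module(codes):
    """Render the parsed codes as Python source, ordered by value."""
    lines = ["# Generated from winerror.h (sketch)."]
    for name, value in sorted(codes.items(), key=lambda item: item[1]):
        lines.append("%s = %d" % (name, value))
    return "\n".join(lines) + "\n"


SAMPLE = """\
#define ERROR_SUCCESS                    0L
#define ERROR_FILE_NOT_FOUND             2L
#define ERROR_ACCESS_DENIED              5L
"""
```

Calling generate_module(parse_winerror(SAMPLE)) yields source text
with one assignment per constant, which could be written out as the
module file.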

Adding them all to the errno would work for most cases,
except that you get conflicts for errno.errcode.

Regards,
Martin


From mwh at python.net  Thu May 11 01:46:10 2006
From: mwh at python.net (Michael Hudson)
Date: Thu, 11 May 2006 00:46:10 +0100
Subject: [Python-Dev] 2.5 open issues
In-Reply-To: <ee2a432c0604282005l574b7472u88a586b7e5ca3a76@mail.gmail.com>
	(Neal Norwitz's message of "Fri, 28 Apr 2006 20:05:26 -0700")
References: <ee2a432c0604272258t480f527fm5dfe93f5cea53d42@mail.gmail.com>
	<20060428122045.GB8336@localhost.localdomain>
	<ee2a432c0604282005l574b7472u88a586b7e5ca3a76@mail.gmail.com>
Message-ID: <2m8xp9e94t.fsf@starship.python.net>

"Neal Norwitz" <nnorwitz at gmail.com> writes:

> Here's what's left for 2.5 after the most recent go around.
>
> There's no owner for these items.  If no one takes them, they won't
> get done and I will move them to deferred within a week:
>
>   * Add @decorator decorator to functional, rename to functools?
>     What's the benefit of @decorator?  Who made functional?  It's new
> in 2.5, right?  If so, move it or it will be functional for all 2.x.
>
>   * Remove the fpectl module?
>    Does anyone use this?  It can probably be removed, but someone
> needs to do the work.

This was my idea; I'm not really offering to do the work now and don't
think that it's terribly urgent, so I propose leaving this until 2.6.

Cheers,
mwh

-- 
  <dash> it's like a bicycle
  <dash> but with internet                      -- from Twisted.Quotes

From nnorwitz at gmail.com  Thu May 11 08:42:23 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 10 May 2006 23:42:23 -0700
Subject: [Python-Dev] nag, nag -- 2.5 open issues
Message-ID: <ee2a432c0605102342g2c950874hc32d4656ddb79f8c@mail.gmail.com>

If you are addressed on this message, it means you have open issues
that need to be resolved for 2.5.  Some of these issues are
documentation, others are code issues.  This information comes from
PEP 356.  The best way to get me to stop bugging you is to close out
these tasks. :-)

Note the next alpha is coming up in a week or two, so now is a good
time to finish these.

Who is the owner for getting new icons installed with the new logo?
Georg did some work, but it seems like a lot more work needs to be
done to get them installed on all platforms.  This will be deferred
unless someone makes it happen.

Code issues:
+++++++++++
Fred, Fredrik, Martin:  xmlplus/xmlcore situation re: ElementTree
Jeremy: AST compiler issues
Philip, AMK:  add wsgiref to stdlib -- is this happening?

We should really try to get these resolved soon, especially the XML
issue since that could result in some sort of API change.  The AST
issues should just be bug fixes AFAIK.

Documentation missing:
+++++++++++++++++++
Fredrik: ElementTree
Gerhard: pysqlite
Martin: msilib  -- Martin/Andrew is this done?
Thomas: ctypes

I think we heard from Gerhard and Thomas about their doc plans.
Fredrik, are you planning to get ET doc into the python docs?

Let me know if there's anything I missed.

If you don't expect to have time to finish these tasks, then it's your
job to find someone else who will (and it would be good to reply to
this message).  Update the PEP or tell me and I'll update the PEP.

n

From mal at egenix.com  Thu May 11 13:26:11 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 11 May 2006 13:26:11 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <44624963.4020604@v.loewis.de>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de>
Message-ID: <44631F53.1010507@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> BTW, and intended as offer for compromise, should we instead
>> add the Win32 codes to the errno module (or a new winerrno
>> module) ?! I can write a parser that takes winerror.h and
>> generates the module code.
> 
> Instead won't help: the breakage will still occur. It would
> be possible to put symbolic constants into the except clauses,
> instead of using the numeric values, but you still have to
> add all these "except WindowsError,e" clauses, even if
> the constants were available.

Right, I meant "instead of using ErrorCode approach".

> However, in addition would be useful: people will want to
> check for specific Win32 error codes, just because they are
> more descriptive. I propose to call the module "winerror"
> (in parallel to winerror.h, just as the errno module
> parallels errno.h)

Ok.

> Adding them all to the errno would work for most cases,
> except that you get conflicts for errno.errcode.

I think it's better to separate the two - you wouldn't
want to load all the winerror codes on Unix.

-- 
Marc-Andre Lemburg
eGenix.com


From vys at renet.ru  Thu May 11 14:04:54 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Thu, 11 May 2006 16:04:54 +0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>
References: <20060505105630.67BE.JCARLSON@uci.edu>
	<445C67EE.1030704@renet.ru>	<20060506025902.67DA.JCARLSON@uci.edu>	<20060506205924.GA14684@fox.renet.ru>
	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>
Message-ID: <44632866.4090906@renet.ru>

Guido van Rossum wrote:
> On 5/6/06, Vladimir Yu. Stepanov <root at renet.ru> wrote:
> [proposing a total ordering between types]
>
> It Ain't Gonna Happen. (From now on, I'll write this as IAGH.)
>
> In Python 3000, we'll actually *remove* ordering between arbitrary
> types as a feature; only types that explicitly care to be ordered with
> respect to one another will be ordered. Equality tests are unaffected,
> x==y will simply return False if x and y are of incomparable types;
> but x<y (etc.) will raise an exception.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>   
For a direct comparison such as ("something" < 123), raising an
exception is necessary. For sorting, or for binary trees, it is less
obvious: the behavior wanted from the comparison function depends on
the needs of the user, and at present it is complicated to substitute
a redefined comparison function.

Here is an example. Calling the sort method of a list can raise
completely different exceptions, depending on the order of the values
being sorted (thanks to David Mertz and Josiah Carlson):
-----------------------------------------------------------------
>>> [chr(255),1j,1,u't'].sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers
>>> [chr(255),1j,u't',1].sort()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
-----------------------------------------------------------------

If in Python 3000 something similar happens for the types str(),
int(), complex() and so on, and the exception types vary widely,
it will be hard to redefine the behavior of the sorting function.

It would be quite good to create one more exception class that
unifies the errors raised when comparing non-comparable types or
data.


From vys at renet.ru  Thu May 11 14:30:38 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Thu, 11 May 2006 16:30:38 +0400
Subject: [Python-Dev] binary trees.
In-Reply-To: <20060506025902.67DA.JCARLSON@uci.edu>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
Message-ID: <44632E6E.8030207@renet.ru>

Josiah Carlson wrote:
> And you can actually compare str and unicode, so, if you have a str that
> is greater than the unicode, you run into this issue.  With unicode
> becoming str in Py3k, we may not run into this issue much then, unless
> bytes are comparable to str, in which case we end up with the same
> problems.
>
> Actually, according to my google of "python dev total ordering", it
> gives me...
>
> http://mail.python.org/pipermail/python-dev/2003-March/034169.html
> http://mail.python.org/pipermail/python-list/2003-March/154142.html
>
> Which were earlier discussions on this topic, which are quite topical. 
> The ultimate solution is to choose a total ordering on types and
> consider the problem solved.  Then list.sort(
>   

The second reference raises a question: can complex be partially
comparable with int, long and float? After all, 1 == 1+0j.

From ronaldoussoren at mac.com  Thu May 11 15:13:50 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 11 May 2006 15:13:50 +0200
Subject: [Python-Dev] python 2.4 and universal binaries
Message-ID: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>

Hi,

I'd like to backport the patches I've done to the trunk regarding  
universal binary support for OSX and endian issues in Mac specific  
modules.

The last set seems easy enough, all of those are clearly bugfixes.  
I'm not sure if the universal binary patches are acceptable for  
backport (and specifically the change to pyconfig.h.in)

The rationale for this is simple: Apple seems to pick up a recent  
copy of python for every new major release of OSX (Python 2.2.x for  
Jaguar, Python 2.3.0 for Panther, Python 2.3.5 for Tiger) and is  
therefore likely to use Python 2.4.x for the next release of the OS.   
The official python 2.4 tree currently doesn't support building
universal binaries. Given that Apple ships a broken universal build of
python 2.3 with the new intel macs I'm worrying that they won't fix
this for their next OS release, which would increase the support load
on the pythonmac-sig list. By adding support for universal binaries
to python 2.4 we'd reduce the chance of a broken python in the next
OSX release.

Ronald

From guido at python.org  Thu May 11 16:27:54 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 May 2006 07:27:54 -0700
Subject: [Python-Dev] python 2.4 and universal binaries
In-Reply-To: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>
References: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>
Message-ID: <ca471dc20605110727s1ea75908ldd79f03bf4365987@mail.gmail.com>

Sounds like an all-round good plan to me.

On 5/11/06, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> Hi,
>
> I'd like to backport the patches I've done to the trunk regarding
> universal binary support for OSX and endian issues in Mac specific
> modules.
>
> The last set seems easy enough, all of those are clearly bugfixes.
> I'm not sure if the universal binary patches are acceptable for
> backport (and specifically the change to pyconfig.h.in)
>
> The rationale for this is simple: Apple seems to pick up a recent
> copy of python for every new major release of OSX (Python 2.2.x for
> Jaguar, Python 2.3.0 for Panther, Python 2.3.5 for Tiger) and is
> therefore likely to use Python 2.4.x for the next release of the OS.
> The official python 2.4 tree currently doesn't support building
> universal binaries. Given that Apple ships a broken universal build of
> python 2.3 with the new intel macs I'm worrying that they won't fix
> this for their next OS release, which would increase the support load
> on the pythonmac-sig list. By adding support for universal binaries
> to python 2.4 we'd reduce the chance of a broken python in the next
> OSX release.
>
> Ronald


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu May 11 16:36:36 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 May 2006 07:36:36 -0700
Subject: [Python-Dev] total ordering.
In-Reply-To: <44632866.4090906@renet.ru>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
	<20060506205924.GA14684@fox.renet.ru>
	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>
	<44632866.4090906@renet.ru>
Message-ID: <ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>

On 5/11/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
> If for Python-3000 similar it will be shown concerning types
> str(), int(), complex() and so on, and the type of exceptions
> will strongly vary, it will make problematic redefinition of
> behavior of function of sorting.

Not really. We'll just document that sort() should only be used on a
list of objects that implement a total ordering. The behavior
otherwise will simply be undefined; it will raise whatever exception
is first raised by an unsupported comparison (most likely TypeError).
In practice this won't be a problem.
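
A short illustration of those rules, written against Python 3, which
behaves this way.  The helper names compare_less and sort_outcome are
made up for the example.

```python
def compare_less(a, b):
    """Return a < b, or the name of the exception the comparison raises."""
    try:
        return a < b
    except TypeError as exc:
        return type(exc).__name__


def sort_outcome(items):
    """Return the sorted list, or the name of the exception sorting raises."""
    try:
        return sorted(items)
    except TypeError as exc:
        return type(exc).__name__


# Equality between unrelated types is simply False ...
int_equals_str = (1 == "a")
# ... while ordering them raises, and sort() passes that exception on.
int_less_str = compare_less(1, "a")
mixed_sort = sort_outcome([3, "x", 1])
```

Sorting a homogeneous list is unaffected; only lists that mix
incomparable types hit the exception.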

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From edloper at gradient.cis.upenn.edu  Thu May 11 18:06:31 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Thu, 11 May 2006 12:06:31 -0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>
References: <20060505105630.67BE.JCARLSON@uci.edu>
	<445C67EE.1030704@renet.ru>	<20060506025902.67DA.JCARLSON@uci.edu>	<20060506205924.GA14684@fox.renet.ru>	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>	<44632866.4090906@renet.ru>
	<ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>
Message-ID: <44636107.7010606@gradient.cis.upenn.edu>

Guido van Rossum wrote:
> On 5/11/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
>> If for Python-3000 similar it will be shown concerning types
>> str(), int(), complex() and so on, and the type of exceptions
>> will strongly vary, it will make problematic redefinition of
>> behavior of function of sorting.
> 
> Not really. We'll just document that sort() should only be used on a
> list of objects that implement a total ordering. [...]

It might be useful in some cases to have a keyword argument to 
sort/sorted that says to ignore exceptions arising from comparing 
elements, and leaves the ordering of non-comparable values undefined.

-Edward

From martin at v.loewis.de  Thu May 11 22:59:14 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 11 May 2006 22:59:14 +0200
Subject: [Python-Dev] nag, nag -- 2.5 open issues
In-Reply-To: <ee2a432c0605102342g2c950874hc32d4656ddb79f8c@mail.gmail.com>
References: <ee2a432c0605102342g2c950874hc32d4656ddb79f8c@mail.gmail.com>
Message-ID: <4463A5A2.8030603@v.loewis.de>

Neal Norwitz wrote:
> Martin: msilib  -- Martin/Andrew is this done?

That's done, yes.

Martin

From martin at v.loewis.de  Thu May 11 23:10:04 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 11 May 2006 23:10:04 +0200
Subject: [Python-Dev] python 2.4 and universal binaries
In-Reply-To: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>
References: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>
Message-ID: <4463A82C.9070208@v.loewis.de>

Ronald Oussoren wrote:
> The rationale for this is simple: Apple seems to pick up a recent  
> copy of python for every new major release of OSX (Python 2.2.x for  
> Jaguar, Python 2.3.0 for Panther, Python 2.3.5 for Tiger) and is  
> therefore likely to use Python 2.4.x for the next release of the OS.   
> The official python 2.4 tree currently doesn't support building
> universal binaries. Given that Apple ships a broken universal build of
> python 2.3 with the new intel macs I'm worrying that they won't fix
> this for their next OS release, which would increase the support load
> on the pythonmac-sig list. By adding support for universal binaries
> to python 2.4 we'd reduce the chance of a broken python in the next
> OSX release.

It's fine with me to backport this. Please be aware, though, that the
release of 2.4.4 is likely to occur *after* the release of 2.5.0.
Not sure when Apple decides to freeze the Python version for the next
OS release, but if they really choose the most recent one "by policy",
that will be either 2.5.0 or 2.4.3.

Regards,
Martin

From tdelaney at avaya.com  Fri May 12 00:39:59 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Fri, 12 May 2006 08:39:59 +1000
Subject: [Python-Dev] total ordering.
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E6E1@au3010avexu1.global.avaya.com>

Edward Loper wrote:

> It might be useful in some cases to have a keyword argument to
> sort/sorted that says to ignore exceptions arising from comparing
> elements, and leaves the ordering of non-comparable values undefined.

Why? Far better to use a key (or cmp if you really want) that imposes a
total (or at least consistent) ordering. Otherwise sorting is random.
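
One possible key of that kind, as a sketch: group values by type name
first, then order within each type.  The name type_then_value is
illustrative, and the approach assumes values of the same type can be
compared with each other (so it would still fail for complex numbers).

```python
def type_then_value(obj):
    """Hypothetical sort key: group by type name, then order within the type."""
    return (type(obj).__name__, obj)


mixed = [3, "pear", 1, "apple", 2.5]
ordered = sorted(mixed, key=type_then_value)
```

The result is consistent from run to run, at the price of an
arbitrary order between the type groups (here float, int, str by
name).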

Tim Delaney

From terry at jon.es  Fri May 12 00:55:16 2006
From: terry at jon.es (Terry Jones)
Date: Fri, 12 May 2006 00:55:16 +0200
Subject: [Python-Dev] Efficient set complement and operation on
	large/infinite sets.
Message-ID: <17507.49364.735663.587038@terry.jones.tc>

I'm about to write some code to manage sets, and wanted to float a few
thoughts here because I have various ideas about how to implement what I
want to do, and I think one of them could be done by changing Python's set
type in a useful and backward-compatible way.

Apologies if this is discussed in the archives. I didn't see it.

My original motivation is a need to work efficiently with very large sets,
let's say 10M elements. In some cases, you can operate on a set without
needing to explicitly represent all its elements. For example, if my set is
the Universal (U) set, then X in U, should return True. If my set is U and
I union it with any other set, it is unchanged, etc. By also maintaining a
finite set of things that are _not_ in my large set, I can do other
operations efficiently. E.g., calling U.remove(X) would _add_ X to the set
of elements that were known not to be in the big set that you were trying
to represent efficiently. In most cases, such a set class would simply do
the opposite/inverse operation on its finite exclusion set.

While it's convenient to warm up by talking about the infinite Universal set,
we could just as easily be talking about arbitrarily large finite sets.
There are some examples below.

Another way to discuss the same thing is to talk about set complements. It
would be nice to be able to complement (or invert) a set. Once you did so,
you might have an infinite set on your hands, but as the last paragraph
argues, you can still operate on an infinite set efficiently. Naturally,
you can't fully enumerate it, or take its size.

I think that giving Python sets the ability to handle complementarity would
have some nice uses - and that these go well beyond the particular thing
that I need right now.

For example, suppose you want to represent the set of all floating point
numbers. You just create an empty set and complement it. Then you can test
it as usual, and begin to exclude things from it. Or if you have a system
with 10M objects and you need to support operations on these objects via a
query language that lets the user type "not X" where X is a set of objects,
there is a natural and efficient way (constant time) to execute the query -
you just mark the set as being inverted (complemented).

If I were going to implement the above by changing Python's built in set
type, here's what I think I'd do:

Add four optional keyword arguments to the set constructor:

  - invert=False
  - universalSet=None
  - universalSetSize=None
  - universeExclusionFunc=None

invert is just a flag, indicating the sense of the set.

If invert is False:

  the set behaves exactly as a normal Python set and the other three
  new arguments are ignored.

If invert is True:

  universalSet represents the universal set. If it is None, then the
  universal set is infinite. Otherwise, universalSet is an iterable. The
  implementation should not call this iterable unless it's unavoidable, on
  the presumption that if the programmer wanted to operate directly on the
  contents of this iterable, they'd be doing it in a conventional fashion
  (e.g., with a normal set). The universalSetSize, used for len()
  calculations, is the number of elements in universalSet, if known &
  if finite.

  The universeExclusionFunc can be called with a single element argument to
  see if the element should be considered excluded from the universalSet.
  This may seem like a weird idea, but it's the key to flexibility and
  efficiency. In an invert=True set, the code would use the normal set
  content as a mutable set of objects that are not in the universe, as
  described above, but would, in addition, use the universeExclusionFunc
  (if defined) to identify elements not in the set (because they are not in
  the universe), and thus avoid the use of the expensive (or infinite)
  universalSet.

Note that an inverted (or normal) set can be inverted by simply flipping
its invert flag, so this requires a new invert() method (which could be
triggered by the use of something like 'not' or '!' or '~'). Inverting an
inverted set gives back a normal Python set. The elements represented by
universeExclusionFunc remain invalid - they are simply not in the universe
(and, if deemed sensible, this could be sanity checked in add()).

If it's not clear, when a set is inverted, any iterable given to __init__,
(i.e., the iterable argument in the normal case of constructing a Python
set), is just treated as usual (but in this case will be taken to
represent things not in the set initially).
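
The behaviour described above can be sketched as a small wrapper class.
This is only an illustration under stated assumptions, not a patch to
the built-in type: the class name InvertibleSet, the snake_case keyword
universe_exclusion_func, and the inverted() method are hypothetical, the
universalSet and universalSetSize arguments are left out, and the
universe check is applied in both states for simplicity.

```python
class InvertibleSet:
    """Wrapper around a built-in set with an 'invert' flag (illustrative only)."""

    def __init__(self, iterable=(), invert=False, universe_exclusion_func=None):
        # When invert is False these are the members;
        # when invert is True they are the elements known NOT to be members.
        self._items = set(iterable)
        self._invert = invert
        self._outside = universe_exclusion_func

    def _in_universe(self, x):
        return self._outside is None or not self._outside(x)

    def __contains__(self, x):
        if not self._in_universe(x):
            return False
        # Membership in the stored set is flipped when the set is inverted.
        return (x in self._items) != self._invert

    def add(self, x):
        if not self._in_universe(x):
            raise ValueError("element is not in the universe")
        if self._invert:
            self._items.discard(x)   # adding means "no longer excluded"
        else:
            self._items.add(x)

    def discard(self, x):
        if self._invert:
            self._items.add(x)       # removing means "now excluded"
        else:
            self._items.discard(x)

    def inverted(self):
        # Constant-time apart from copying the (small) stored set.
        return InvertibleSet(self._items, not self._invert, self._outside)

    def __len__(self):
        if self._invert:
            raise TypeError("len() of an inverted set is not defined in this sketch")
        return len(self._items)


# Hypothetical universe: all non-negative numbers, initially all present.
s = InvertibleSet(invert=True, universe_exclusion_func=lambda x: x < 0)
s.discard(5)
t = s.inverted()
```

Every operation touches only the stored finite set, which is the point
of the proposal.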


Here are some examples of usage:

1. Suppose you want to work with the set of integers [0, 100000000) and
   that initially your set is all such integers. You could create this set
   via:

       S = set(invert=True,
               universalSet=xrange(100000000),
               universalSetSize=100000000,
               universeExclusionFunc=(lambda x: not 0 <= x < 100000000))

   This has the intended effect, is efficient, and no-one need call the
   iterator. You can (I think) do everything you could do with a normal set
   (including inverting it). If the user happens to want to list all the
   elements in the set, that's their business and they can do it, except in
   the case where universalSet=None, in which case I would raise an
   exception.

2. Suppose you wanted to represent the set of all floating point numbers
   > 0.0 that were not integers. You'd do this via:

       import math

       def excludeFunc(f):
           try:
               f = float(f)
           except (TypeError, ValueError):
               # Anything that cannot be converted is outside the universe.
               return True
           if f <= 0.0:
               return True
           if f - math.floor(f) == 0.0:
               return True
           return False

       S = set(invert=True, universeExclusionFunc=excludeFunc)

3. Let's say you wanted to work with the set of all IP addresses, as
   represented by lists of 4 integers, but that you want to exclude
   127.0.0.1, and all the reserved RFC1918 address blocks:

       def excludeFunc(ip):
           # Insert code to check type and length.
           # Lists and tuples do not compare element-wise with each other,
           # so convert before doing the range checks.
           ip = tuple(ip)
           if ip == (127, 0, 0, 1):
               return True
           if (ip[0] == 10 or
               (172, 16, 0, 0) <= ip <= (172, 31, 255, 255) or
               (192, 168, 0, 0) <= ip <= (192, 168, 255, 255)):
               return True
           # Insert code to check all bytes in [0..255], return True if not.
           return False

       S = set(invert=True, universeExclusionFunc=excludeFunc)

It's pretty easy to think of other examples of wanting to efficiently use
big sets, or to use small sets and then needing to complement them and
operate efficiently on their large complements.

One reaction to these examples might be that it's no big deal to write your
own code to, say, manage sets of IP addresses. I think that's not so easy
to do efficiently if the set is massive (and you'd end up keeping track of
the complement or using offline storage). Doing it as suggested above makes
it simple, and once you call the constructor, you just do normal Python set
operations.

This suggestion goes a little way towards giving "continuous" sets. One way
to categorize set support is:

  A. Only support finite sets.

  B. Provide category A but also support infinite sets where at least
     one of the set and its complement is finite.

  C. Provide category B but also allow for the case where a set and its
     complement may both be infinite.

Thus I am proposing something that would take set support in Python from
category A to category B.  Implementing B efficiently is provably possible
since all operations are done on either a set or its complement, whichever
is known to be finite, and Python already handles those operations (via its
category A support).
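
To make the category B claim concrete, here is a rough sketch of the
set algebra, assuming a set is stored as a flag plus a finite frozenset.
The names CoSet, complement, union, intersection and contains are
invented for this example and are not part of any existing module.

```python
from collections import namedtuple

# cofinite=False: the set is exactly `items`.
# cofinite=True:  the set is everything EXCEPT `items`.
CoSet = namedtuple("CoSet", ["cofinite", "items"])


def complement(a):
    # Complementing only flips the flag; the finite part is untouched.
    return CoSet(not a.cofinite, a.items)


def union(a, b):
    if not a.cofinite and not b.cofinite:
        return CoSet(False, a.items | b.items)
    if a.cofinite and b.cofinite:
        # Excluded from the union only if excluded from both operands.
        return CoSet(True, a.items & b.items)
    finite, cofin = (a, b) if b.cofinite else (b, a)
    # Anything the finite operand contains is no longer excluded.
    return CoSet(True, cofin.items - finite.items)


def intersection(a, b):
    # De Morgan: A & B == ~(~A | ~B)
    return complement(union(complement(a), complement(b)))


def contains(a, x):
    return (x in a.items) != a.cofinite


small = CoSet(False, frozenset({2, 4}))
all_but = CoSet(True, frozenset({1, 2, 3}))
both = intersection(small, all_but)
either = union(small, all_but)
```

Each result is again either finite or cofinite, so only ordinary finite
set operations are ever performed.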


The downsides that I see at the moment are:

1. Two new exceptions that can arise. These only happen on sets that are in
   the inverted state, and only when the universal set is infinite:

     - You try to enumerate an infinite set.
     - You call len() on an infinite set.

2. Operations on regular sets get slightly slower because virtually all set
   operations would need to test the invert flag. So that's an extra
   Boolean test on every set operation.

I don't think the implementation is too hard - the Python sets.py code is
very clean - though there may be gotchas.

Anyway, I wanted to see what people think of this before I go ahead and
solve my actual problem. If the above is considered potentially worthwhile
to Python, I can work on adapting sets.py. If not, I'll do the faster and
simpler thing and write my own class.

Regards,
Terry

From guido at python.org  Fri May 12 01:29:15 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 May 2006 16:29:15 -0700
Subject: [Python-Dev] Efficient set complement and operation on
	large/infinite sets.
In-Reply-To: <17507.49364.735663.587038@terry.jones.tc>
References: <17507.49364.735663.587038@terry.jones.tc>
Message-ID: <ca471dc20605111629l2a497295i1f86da7a8579feb9@mail.gmail.com>

Hm...  Without reading through all this, I expect that you'd be better
off implementing this for yourself without attempting to pull the
standard library sets into the picture (especially since sets.py is
obsolete as of 2.4; set and frozenset are now built-in types).  You're
really after rather specialized set representations. I've done this
myself and as long as you stick to solving *just* the problem at hand,
it's easy. If you want a general solution, it's hard...

--Guido



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From terry at jon.es  Fri May 12 01:33:10 2006
From: terry at jon.es (Terry Jones)
Date: Fri, 12 May 2006 01:33:10 +0200
Subject: [Python-Dev] Efficient set complement and operation
	on	large/infinite sets.
In-Reply-To: Your message at 00:55:16 on Friday, 12 May 2006
References: <17507.49364.735663.587038@terry.jones.tc>
Message-ID: <17507.51638.270879.768063@terry.jones.tc>

A quick followup to my own posting:

I meant to say something about implementing __rand__() and pop(). One
option is to add another optional function argument to the constructor,
which would return a random element from the universe. Then for __rand__()
and pop(), you'd call it until it (hopefully!) returned something not
excluded. Or, do something non-random, like return a random (non-excluded)
integer. Or, just raise an exception.

I think I prefer the extra argument approach, where the docs state clearly
that you can expect to wait longer and longer for random elements as you
empty a finite inverted set. I prefer this approach because getting a
random element from a set is something you really should be able to
do. Just raising an exception is the cleanest and clearest choice.
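
The extra-argument approach amounts to rejection sampling. A minimal
sketch, assuming a caller-supplied sampler; the function name
random_member and the max_tries cut-off are additions made for this
example, not part of the proposal:

```python
import random


def random_member(sample_universe, excluded, max_tries=1000):
    """Draw from the universe until the draw is not in the excluded set."""
    for _ in range(max_tries):
        candidate = sample_universe()
        if candidate not in excluded:
            return candidate
    # Give up rather than loop forever on an (almost) empty inverted set.
    raise LookupError("no non-excluded element found in %d tries" % max_tries)


# Universe: integers 0..9, with a few of them already excluded.
pick = random_member(lambda: random.randrange(10), excluded={0, 1, 2})
```

The expected number of draws grows as the inverted set empties, which is
the waiting-time caveat mentioned above.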

One thing I certainly would not consider is trying to mess around with the
excluded set (which may in any case be empty) to figure out a suitable
return type.

And yes, I agree in advance, adding 5 new optional arguments to the set()
constructor isn't pretty. Is the added functionality worth it?

Terry

From terry at jon.es  Fri May 12 01:51:12 2006
From: terry at jon.es (Terry Jones)
Date: Fri, 12 May 2006 01:51:12 +0200
Subject: [Python-Dev] Efficient set complement and operation on
	large/infinite sets.
In-Reply-To: Your message at 16:29:15 on Thursday, 11 May 2006
References: <17507.49364.735663.587038@terry.jones.tc>
	<ca471dc20605111629l2a497295i1f86da7a8579feb9@mail.gmail.com>
Message-ID: <17507.52720.787656.535086@terry.jones.tc>

>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:
Guido> Hm...  Without reading through all this, I expect that you'd be
Guido> better off implementing this for yourself without attempting to pull
Guido> the standard library sets into the picture (especially since sets.py
Guido> is obsolete as of 2.4; set and frozenset are now built-in types).
Guido> You're really after rather specialized set representations. I've
Guido> done this myself and as long as you stick to solving *just* the
Guido> problem at hand, it's easy. If you want a general solution, it's
Guido> hard...

I certainly would be better off in the short term, and probably the long
term too. It's likely what I'll do in any case as it's much, much quicker,
I only need a handful of the set operations, and I don't need to talk to
anyone :-)

I don't think I'm proposing something specialized. Set complement is
something one learns in primary school. It's just difficult to provide in
general, as you say.

Aside from my own itch, which I know how to scratch, my question is whether
it's worth trying to work this into Python itself. Sounds like not.

Terry

From rhettinger at ewtllc.com  Fri May 12 02:34:16 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 11 May 2006 17:34:16 -0700
Subject: [Python-Dev] Efficient set complement and operation
 on	large/infinite sets.
In-Reply-To: <17507.52720.787656.535086@terry.jones.tc>
References: <17507.49364.735663.587038@terry.jones.tc>	<ca471dc20605111629l2a497295i1f86da7a8579feb9@mail.gmail.com>
	<17507.52720.787656.535086@terry.jones.tc>
Message-ID: <4463D808.4030101@ewtllc.com>


>Guido> Hm...  Without reading through all this, I expect that you'd be
>Guido> better off implementing this for yourself without attempting to pull
>Guido> the standard library sets into the picture (especially since sets.py
>Guido> is obsolete as of 2.4; set and frozenset are now built-in types).
>Guido> You're really after rather specialized set representations. I've
>Guido> done this myself and as long as you stick to solving *just* the
>Guido> problem at hand, it's easy. If you want a general solution, it's
>Guido> hard...
>
>I certainly would be better off in the short term, and probably the long
>term too. It's likely what I'll do in any case as it's much, much quicker,
>I only need a handful of the set operations, and I don't need to talk to
>anyone :-)
>
>I don't think I'm proposing something specialized. Set complement is
>something one learns in primary school. It's just difficult to provide in
>general, as you say.
>
>Aside from my own itch, which I know how to scratch, my question is whether
>it's worth trying to work this into Python itself. Sounds like not.
>  
>
There's room in the world for alternate implementations of sets, each
with its own strengths and weaknesses. For example, bitsets potentially
offer some speed and space economies for certain apps. Alternative
implementations will most likely start off as third-party extension
modules, and if they prove essential, they may make it to the
collections module.

As for the built-in set types, I recommend leaving those alone and
keeping the API as simple as possible.

The __rand__ idea is interesting but I don't see how to implement an 
equiprobable hash table selection in O(1) time -- keep in mind that a 
set may be very sparse at the time of the selection.
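
For comparison, a separate container can offer equiprobable O(1)
selection by keeping a dense list alongside a dict of positions. This is
a different data structure, not a change to the built-in hash table, and
the class name SampleableSet is made up for this sketch:

```python
import random


class SampleableSet:
    """Set-like container with O(1) add, discard and uniform random choice."""

    def __init__(self):
        self._values = []   # dense list of members, no holes
        self._index = {}    # member -> its position in _values

    def add(self, x):
        if x not in self._index:
            self._index[x] = len(self._values)
            self._values.append(x)

    def discard(self, x):
        pos = self._index.pop(x, None)
        if pos is None:
            return
        last = self._values.pop()
        if pos < len(self._values):
            # Move the former last element into the vacated slot.
            self._values[pos] = last
            self._index[last] = pos

    def __contains__(self, x):
        return x in self._index

    def __len__(self):
        return len(self._values)

    def choice(self):
        # The list is dense, so every member is equally likely.
        return random.choice(self._values)


bag = SampleableSet()
for n in (1, 2, 3):
    bag.add(n)
bag.discard(1)
```

The cost is the extra list and the loss of the plain set's memory
layout, which is the kind of trade-off better made in a third-party
type.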


Raymond

From anthony at interlink.com.au  Fri May 12 04:04:18 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 12 May 2006 12:04:18 +1000
Subject: [Python-Dev] python 2.4 and universal binaries
In-Reply-To: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>
References: <6DB21F1F-5D54-4A50-9516-B143F4AA2C37@mac.com>
Message-ID: <200605121204.23636.anthony@interlink.com.au>

This is fine with me.

Note that 2.4.4 won't be out until after 2.5.0, so it's a couple of 
months off yet.


From vmalloc at gmail.com  Tue May  9 15:39:52 2006
From: vmalloc at gmail.com (Rotem Yaari)
Date: Tue, 9 May 2006 16:39:52 +0300
Subject: [Python-Dev] pthreads, fork, import, and execvp
Message-ID: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>

Hello everyone!

We have been encountering several deadlocks in a threaded Python
application which calls subprocess.Popen (i.e. fork()) in some of its
threads.

This has occurred on Python 2.4.1 on a 2.4.27 Linux kernel.

Preliminary analysis of the hang shows that the child process blocks
upon entering the execvp function, in which the import_lock is acquired
due to the following line:

def _execvpe(file, args, env=None):
    from errno import ENOENT, ENOTDIR
    ...

It is known that when forking from a pthreaded application, acquisition
attempts on locks which were already locked by other threads while
fork() was called will deadlock.

Due to these oddities we were wondering if it would be better to extract
the above import line from the execvpe call, to prevent lock
acquisition attempts in such cases.
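
As a sketch of the pattern only (this is not the actual code in os.py,
and the helper name is_missing_path_error is invented), hoisting the
import to module level means the import lock is taken once at load time
rather than on every call made after fork():

```python
from errno import ENOENT, ENOTDIR   # imported once, when the module loads


def is_missing_path_error(exc):
    """Return True for the 'no such file or directory' style errors."""
    # No import statement runs here, so calling this in a freshly forked
    # child never needs to acquire the import lock.
    return getattr(exc, "errno", None) in (ENOENT, ENOTDIR)


print(is_missing_path_error(OSError(ENOENT, "No such file or directory")))
```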

Another workaround could be re-assigning a new lock to import_lock
(such a thing is done with the global interpreter lock) at PyOS_AfterFork or
pthread_atfork.

We'd appreciate any opinions you might have on the subject.

Thanks in advance,

Yair and Rotem

From gabriel.becedillas at gmail.com  Thu May 11 16:46:20 2006
From: gabriel.becedillas at gmail.com (Gabriel Becedillas)
Date: Thu, 11 May 2006 11:46:20 -0300
Subject: [Python-Dev] PyThreadState_SetAsyncExc,
	PyErr_Clear and native extensions
Message-ID: <e350afd50605110746p3c542055n6e1bcb0e6e3f46bf@mail.gmail.com>

I use PyThreadState_SetAsyncExc to stop a python thread but there are
situations when the thread doesn't stop and continues executing
normally. After some debugging, I realized that the problem is that
PyThreadState_SetAsyncExc was called when the thread was inside a
native extension, that for some reason calls PyErr_Clear. That code
happens to be inside boost::python.
I do need to stop the thread from executing Python code as soon as
possible (as soon as it returns from a native function is also
acceptable).
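
For readers unfamiliar with the call, here is a small sketch of how
PyThreadState_SetAsyncExc is commonly reached from pure Python through
ctypes. It assumes CPython 3.7 or later (where the thread id is an
unsigned long); the names StopThread, async_raise and worker are made up
for the example, and the worker here runs only Python code, so the
PyErr_Clear problem described in this message does not arise.

```python
import ctypes
import threading
import time


class StopThread(Exception):
    """Hypothetical exception type used to ask a thread to stop."""


def async_raise(thread_id, exc_type):
    # Returns the number of thread states modified (1 on success).
    return ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread_id), ctypes.py_object(exc_type))


stopped = threading.Event()


def worker():
    try:
        while True:
            time.sleep(0.01)   # exception is delivered between bytecodes
    except StopThread:
        stopped.set()


t = threading.Thread(target=worker, daemon=True)
t.start()
time.sleep(0.05)
modified = async_raise(t.ident, StopThread)
t.join(timeout=5)
```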
Because we have embedded Python's VM in our product, I'm thinking of
modifying PyErr_Clear() to return immediately if the thread was
stopped (we determine if the thread should stop using our own
functions).
Example:

void PyErr_Clear(void)
{
        if (!stop_executing_this_thread())
                PyErr_Restore(NULL, NULL, NULL);

}

Does anybody see any problem with this approach? Does anybody have a
cleaner/better solution?
I already posted this in comp.lang.python a couple of days ago but
got no answers, which is why I'm posting here: I think the only ones
who could give me a solution/workaround are Python core developers.
Thanks a lot.

From greg.ewing at canterbury.ac.nz  Fri May 12 04:35:16 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 12 May 2006 14:35:16 +1200
Subject: [Python-Dev] PyThreadState_SetAsyncExc,
 PyErr_Clear and native extensions
In-Reply-To: <e350afd50605110746p3c542055n6e1bcb0e6e3f46bf@mail.gmail.com>
References: <e350afd50605110746p3c542055n6e1bcb0e6e3f46bf@mail.gmail.com>
Message-ID: <4463F464.70307@canterbury.ac.nz>

Gabriel Becedillas wrote:
> PyThreadState_SetAsyncExc was called when the thread was inside a
> native extension, that for some reason calls PyErr_Clear.

Maybe PyThreadState_SetAsyncExc should set a flag that
says "this is an async exception, don't clear it", and
have PyErr_Clear take notice of that flag.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | Carpe post meridiem!          	  |
Christchurch, New Zealand	   | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Fri May 12 06:48:35 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 May 2006 21:48:35 -0700
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
Message-ID: <ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>

Yeah, I think imports inside functions are overused.



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Fri May 12 07:16:09 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 12 May 2006 07:16:09 +0200
Subject: [Python-Dev] PyThreadState_SetAsyncExc,
 PyErr_Clear and native extensions
In-Reply-To: <e350afd50605110746p3c542055n6e1bcb0e6e3f46bf@mail.gmail.com>
References: <e350afd50605110746p3c542055n6e1bcb0e6e3f46bf@mail.gmail.com>
Message-ID: <44641A19.3000807@v.loewis.de>

Gabriel Becedillas wrote:
> Does anybody see any problem with this approach? Does anybody have a
> cleaner/better solution?

I don't think there *is* a solution: asynchronous exceptions and thread
cancellation just cannot work.

In the specific case, the caller of PyErr_Clear will continue its
computation, instead of immediately leaving the function. To solve
this specific problem, you would also have to give PyErr_Clear
an int return code (whether or not the exception was cleared),
and then you need to change all uses of PyErr_Clear to check for
failure, and return immediately (after performing local cleanup,
of course).

You then need to come up with a protocol to determine whether
an exception is "clearable"; the new exception hierarchy suggests
that one should "normally" only catch Exception, and let any
other BaseException through. So PyErr_Clear should grow a flag
indicating whether you want to clear just all Exceptions, or
indeed all BaseExceptions.
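
At the Python level the same distinction looks roughly like this. It is
only a sketch of the Exception/BaseException split, with made-up helper
names, and says nothing about how PyErr_Clear itself would be changed:

```python
def call_ignoring_errors(func):
    """Swallow ordinary errors, but let other BaseExceptions propagate."""
    try:
        return func()
    except Exception:
        # KeyboardInterrupt and SystemExit derive from BaseException
        # only, so they are not caught here.
        return None


def raise_value_error():
    raise ValueError("ordinary failure")


def raise_interrupt():
    raise KeyboardInterrupt()


result = call_ignoring_errors(raise_value_error)

try:
    call_ignoring_errors(raise_interrupt)
    propagated = False
except KeyboardInterrupt:
    propagated = True
```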

Regards,
Martin

From nnorwitz at gmail.com  Fri May 12 09:30:11 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 12 May 2006 00:30:11 -0700
Subject: [Python-Dev] [Python-3000] Questions on optional type
	annotations
In-Reply-To: <44635388.8080600@gradient.cis.upenn.edu>
References: <43aa6ff70605102217r76720ec0l5158837dcba612e4@mail.gmail.com>
	<ca471dc20605102250n25323018ha658acff1af3ef3a@mail.gmail.com>
	<ee2a432c0605102322x5834898dk2a60c7ad72640e6e@mail.gmail.com>
	<44635388.8080600@gradient.cis.upenn.edu>
Message-ID: <ee2a432c0605120030i20a5a72ah1239fe322d6f38b1@mail.gmail.com>

On 5/11/06, Edward Loper <edloper at gradient.cis.upenn.edu> wrote:
> Neal Norwitz wrote:
> > Another benefit of this is the ability to get more info through
> > introspection.  Right now, you can't even find the number of arguments
> > of a function implemented in C.  You only know if it takes 0, 1, or
> > variable # of arguments and if it accepts keywords.
>
> By this, do you mean that it's currently possible to distinguish these
> cases (builtins that take 0 args, builtins that take 1 arg, builtins
> that take varargs, builtins that take keywords) via introspection in
> python?  If so, could you let me know how this is done?  For epydoc,
> this would at least let me give a little more info when no signature is
> present on the first line of the docstring.

Hmm, well I only said the info was available.  I didn't say it was
available from Python code. :-)

Actually, with the patch below the flags are available:

>>> len.__flags__
8

The flags are METH_XXX flags defined in Include/methodobject.h:

#define METH_OLDARGS  0x0000
#define METH_VARARGS  0x0001
#define METH_KEYWORDS 0x0002
#define METH_NOARGS   0x0004
#define METH_O        0x0008
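
As a rough illustration of how such an integer could be read from Python:
this is a sketch only, decode_meth_flags is a made-up helper name, and the
constants are copied by hand from the list above rather than taken from any
existing module.

```python
# Flag values copied from the METH_* list above (Include/methodobject.h).
METH_FLAGS = {
    "METH_VARARGS": 0x0001,
    "METH_KEYWORDS": 0x0002,
    "METH_NOARGS": 0x0004,
    "METH_O": 0x0008,
}


def decode_meth_flags(flags):
    """Return the names of the METH_* bits set in flags (hypothetical helper)."""
    if flags == 0:
        # METH_OLDARGS is defined as 0, i.e. no bit set at all.
        return ["METH_OLDARGS"]
    ordered = sorted(METH_FLAGS.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if flags & bit]


print(decode_meth_flags(8))
```

With the value 8 shown for len above, this reports METH_O, i.e. a function
taking exactly one argument.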

If you want this patch included in 2.5, feel free to ask on
python-dev.  I guess I'm +0 on it.  This could be a trivial extension
module for older versions of python.  Aaaa, screw it, I copied
python-dev, we'll see if anyone cares.

n
--
Index: Objects/methodobject.c
===================================================================
--- Objects/methodobject.c      (revision 45961)
+++ Objects/methodobject.c      (working copy)
@@ -146,6 +146,12 @@
        return PyString_FromString(m->m_ml->ml_name);
 }

+static PyObject *
+meth_get__flags__(PyCFunctionObject *m, void *closure)
+{
+       return PyInt_FromLong(m->m_ml->ml_flags);
+}
+
 static int
 meth_traverse(PyCFunctionObject *m, visitproc visit, void *arg)
 {
@@ -174,6 +180,7 @@
        {"__doc__",  (getter)meth_get__doc__,  NULL, NULL},
        {"__name__", (getter)meth_get__name__, NULL, NULL},
        {"__self__", (getter)meth_get__self__, NULL, NULL},
+       {"__flags__", (getter)meth_get__flags__, NULL, NULL},
        {0}
 };

From terry at jon.es  Fri May 12 09:57:39 2006
From: terry at jon.es (Terry Jones)
Date: Fri, 12 May 2006 09:57:39 +0200
Subject: [Python-Dev] Efficient set complement and operation
 on	large/infinite sets.
In-Reply-To: Your message at 17:34:16 on Thursday, 11 May 2006
References: <17507.49364.735663.587038@terry.jones.tc>
	<ca471dc20605111629l2a497295i1f86da7a8579feb9@mail.gmail.com>
	<17507.52720.787656.535086@terry.jones.tc>
	<4463D808.4030101@ewtllc.com>
Message-ID: <17508.16371.358755.841172@terry.jones.tc>

>>>>> "Raymond" == Raymond Hettinger <rhettinger at ewtllc.com> writes:
Raymond> There's room in the world for alternate implementations of sets,
Raymond> each with its own strengths and weaknesses. 
...
Raymond> Alternative implementations will most likely start off as
Raymond> third-party extension modules

Raymond> As for the built-in set types, I recommend leaving those alone and 
Raymond> keeping the API as simple as possible.

Agreed. I implemented most of what I need last night - not surprisingly in
less time than it took to write the original mail.

Raymond> The __rand__ idea is interesting but I don't see how to implement
Raymond> an equiprobable hash table selection in O(1) time -- keep in mind
Raymond> that a set may be very sparse at the time of the selection.

I think the rand issue is enough to make the proposal questionable.

Another thing I realized which I think adds to the argument that this
should be separate, is that it's not so straightforward to deal with
operations on sets that have different universes. You can do it, but it's
messy, not elegant, and not fun (and no-one is asking for it, not even me).

This leads to a much nicer approach in which you have a separate module
that provides a UniversalSet class.  When you call the constructor, you
tell it whatever you can about what the universe looks like: how big it is,
how to generate random elements, how to exclude things, etc. This class
provides a simple (a la Python set.__init__) method that hands back a set
within this universe.  You're then free to operate on that set, and other
instances provided by UniversalSet, as usual, and you get to do complement.
It's easy to imagine subclasses such as InfiniteSet, FiniteSet, etc.
Python's sets are the case where nothing is known about the universe (aside
from implementation details, like elements being hashable). The various
different kinds of UniversalSet subclasses would provide methods according
to what was known about their universes.
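
A minimal sketch of that shape, assuming a finite universe that is known up
front. Only the name UniversalSet comes from the description above; BoundedSet
and all method names are invented for illustration.

```python
class UniversalSet:
    """Hypothetical factory for sets drawn from one known, finite universe."""

    def __init__(self, universe):
        self.universe = frozenset(universe)

    def set(self, elements=()):
        """Hand back a set that lives inside this universe."""
        return BoundedSet(self, elements)


class BoundedSet:
    """A set that remembers its universe, so it can be complemented."""

    def __init__(self, factory, elements=()):
        elements = frozenset(elements)
        if not elements <= factory.universe:
            raise ValueError("elements outside the universe")
        self.factory = factory
        self.elements = elements

    def _same_universe(self, other):
        # Mixing universes is the messy case; this sketch simply refuses.
        if other.factory is not self.factory:
            raise TypeError("sets come from different universes")

    def complement(self):
        return BoundedSet(self.factory, self.factory.universe - self.elements)

    def __or__(self, other):
        self._same_universe(other)
        return BoundedSet(self.factory, self.elements | other.elements)

    def __and__(self, other):
        self._same_universe(other)
        return BoundedSet(self.factory, self.elements & other.elements)

    def __contains__(self, item):
        return item in self.elements


digits = UniversalSet(range(10))
evens = digits.set([0, 2, 4, 6, 8])
odds = evens.complement()
print(sorted(odds.elements))
```

An infinite universe would need a different representation (for example
storing the excluded elements instead), which is where subclasses would come
in.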

That would clearly be a 3rd party extension.

Terry

From ncoghlan at gmail.com  Fri May 12 11:04:15 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 12 May 2006 19:04:15 +1000
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
Message-ID: <44644F8F.3060800@gmail.com>

Rotem Yaari wrote:
> This has occurred on Python 2.4.1 on a 2.4.27 Linux kernel.
> 
> Preliminary analysis of the hang shows that the child process blocks
> upon entering the execvp function, in which the import_lock is acquired
> due to the following line:
> 
> def _execvpe(file, args, env=None):
>     from errno import ENOENT, ENOTDIR
>     ...

While I agree there's no good reason to embed this import inside the function, 
I also don't see how you could be getting a deadlock unless you have modules 
that either spawn new threads or fork the process as a side effect of being 
imported.

Neither of those is guaranteed to work properly due to the fact that the 
original thread holds the import lock, leading to the possibility of deadlock 
if the spawned thread tries to import anything (as seems to be happening here) 
and the original thread then waits for a result from the spawned thread.

Spawning new threads or processes can really only be safely done when invoked 
directly or indirectly from the __main__ module (as that is run without 
holding the import lock). When done as a side effect of module import, the 
module needs to ensure that the result of the spawned thread or process is 
only waited for after the original module import is complete (and the import 
lock released as a result).
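
A minimal sketch of the safe pattern, under the assumption that the worker is
only started from code that runs after the module has been imported. The
function names are invented for illustration.

```python
import threading  # imported at module level, not inside the worker


def _work(results):
    # Everything the worker needs was imported before the thread started,
    # so the worker does not have to import anything itself.
    results.append(sum(range(10)))


def run_worker():
    """Start the worker and wait for it; call this after import, not during it."""
    results = []
    worker = threading.Thread(target=_work, args=(results,))
    worker.start()
    worker.join()  # waiting is fine here: no module import is in progress
    return results[0]


if __name__ == "__main__":
    # Spawning from __main__ (directly or indirectly) is the safe entry point.
    print(run_worker())
```

The point of the layout is that nothing at module level starts or waits for a
thread; importing the module only defines functions.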

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From mal at egenix.com  Fri May 12 23:22:55 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 12 May 2006 23:22:55 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <44631F53.1010507@egenix.com>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de> <44631F53.1010507@egenix.com>
Message-ID: <4464FCAF.5080708@egenix.com>

M.-A. Lemburg wrote:
> Martin v. Löwis wrote:
>> M.-A. Lemburg wrote:
>>> BTW, and intended as offer for compromise, should we instead
>>> add the Win32 codes to the errno module (or a new winerrno
>>> module) ?! I can write a parser that takes winerror.h and
>>> generates the module code.
>> Instead won't help: the breakage will still occur. It would
>> be possible to put symbolic constants into the except clauses,
>> instead of using the numeric values, but you still have to
>> add all these "except WindowsError,e" clauses, even if
>> the constants were available.
> 
> Right, I meant "instead of using ErrorCode approach".
> 
>> However, in addition would be useful: people will want to
>> check for specific Win32 error codes, just because they are
>> more descriptive. I propose to call the module "winerror"
>> (in parallel to winerror.h, just as the errno module
>> parallels errno.h)
> 
> Ok.
> 
>> Adding them all to the errno would work for most cases,
>> except that you get conflicts for errno.errcode.
> 
> I think it's better to separate the two - you wouldn't
> want to load all the winerror codes on Unix.

I've written a parser which only includes the ERROR_*
definitions from the winerror.h file.

The generated file is quite large (160kB) and it's
currently a Python module defining the symbols and an
errorcode dictionary (just like the errno module does).

Generating a C extension wouldn't really decrease the
size of the mapping tables, since they would also end
up being dictionaries on the heap.

An alternative would be to use a lookup object which
then searches in a static C array of data (e.g. a
large switch statement which the compiler then
optimizes).

This would have the advantage of not
using up too much process memory and could be shared
between Python processes.

The downside is that the approach would not be compatible
with the errno module interface, since all lookups would
have to go through the lookup object, e.g.

winerror.ERROR.ACCESS_DENIED

instead of:

winerror.ERROR_ACCESS_DENIED
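
A small Python sketch of the two interface styles side by side. It is only an
illustration: the lookup object here is backed by a plain dict rather than a
static C array, only two sample codes are included, and _CODES and
_ErrorLookup are invented names.

```python
import types

# Two sample codes with their standard Win32 values from winerror.h.
_CODES = {"FILE_NOT_FOUND": 2, "ACCESS_DENIED": 5}

# Style 1: errno-like module namespace plus a reverse "errorcode" dict.
winerror = types.ModuleType("winerror")  # stand-in for a generated module
winerror.errorcode = {}
for _name, _value in _CODES.items():
    setattr(winerror, "ERROR_" + _name, _value)
    winerror.errorcode[_value] = "ERROR_" + _name


# Style 2: a lookup object that resolves names on demand.
class _ErrorLookup:
    def __getattr__(self, name):
        try:
            return _CODES[name]
        except KeyError:
            raise AttributeError(name)


ERROR = _ErrorLookup()

print(winerror.ERROR_ACCESS_DENIED, ERROR.ACCESS_DENIED)
```

Both spellings give the same number; the difference is only whether the names
live in the module namespace or behind an object.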

What do you think ?

-- 
Marc-Andre Lemburg
eGenix.com


From gh at ghaering.de  Sat May 13 02:54:09 2006
From: gh at ghaering.de (=?ISO-8859-1?Q?Gerhard_H=E4ring?=)
Date: Sat, 13 May 2006 02:54:09 +0200
Subject: [Python-Dev] Status: sqlite3 module docs
Message-ID: <44652E31.8000208@ghaering.de>

Hello,

I've now moved over all the content from the pysqlite reference manual to
the sqlite3 module docs.

It would be nice if now somebody with more experience with Python
documentation could look over it.

A few things I know could be changed:

- Just like with the pysqlite docs, the reference manual tries to have many
examples. I've put these under Doc/lib/sqlite/*.py. All other modules have
the .py files directly in Doc/lib/ - I don't know if this is necessary for
some reason unknown to me. For building HTML docs, having them in a
subdirectory works fine.

- Some (or some more) of the examples could probably be moved into a
directory in Demo/ instead.

- There's no tutorial for the sqlite3 module right now; everything is in
reference-manual style.

I'll continue to work on this if somebody tells me what to do, but for now
it would be nice if somebody could clean it up (and maybe also tell me why
certain clean-ups are done - I don't know how indentation is handled for
text and examples, e.g.).

-- Gerhard

From martin at v.loewis.de  Sat May 13 08:28:53 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 May 2006 08:28:53 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <4464FCAF.5080708@egenix.com>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de> <44631F53.1010507@egenix.com>
	<4464FCAF.5080708@egenix.com>
Message-ID: <44657CA5.4030708@v.loewis.de>

M.-A. Lemburg wrote:
> What do you think ?

I think the size could be further reduced by restricting the set of
error codes. For example, if the COM error codes are left out, I only
get a Python file with 60k source size (although the bytecode size
is then 130k). I'm not sure whether GetLastError can ever return a COM
error code, but in my experience, it never did.

I wouldn't want to introduce a clumsy interface just to achieve space
savings. If people consider the space consumption of 200k disk
size unacceptable (or would never use the module because of the runtime
size effects), the entire idea should be dropped (IMO). OTOH, I don't
find such a size increase unacceptable myself.

Regards,
Martin

From mal at egenix.com  Sat May 13 11:48:47 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 13 May 2006 11:48:47 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <44657CA5.4030708@v.loewis.de>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de>	<44631F53.1010507@egenix.com>
	<4464FCAF.5080708@egenix.com> <44657CA5.4030708@v.loewis.de>
Message-ID: <4465AB7F.7030308@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> What do you think ?
> 
> I think the size could be further reduced by restricting the set of
> error codes. For example, if the COM error codes are left out, I only
> get a Python file with 60k source size (although the bytecode size
> is then 130k). I'm not sure whether GetLastError can ever return a COM
> error code, but in my experience, it never did.

I was leaving those out already - only the codes named 'ERROR_*'
get included (see attached parser and generator).

> I wouldn't want to introduce a clumsy interface just to achieve space
> savings. If people consider the space consumption of 200k disk
> size unacceptable (or would never use the module because of the runtime
> size effects), the entire idea should be dropped (IMO). OTOH, I don't
> find such a size increase unacceptable myself.

Using a lookup object is not really clumsy - you can still access
all the values by attribute access. The only difference is that
they don't live in the module namespace, but get accessed via
an object.

I'm not worried about the disk space being used. The heap
memory usage is what's worrying: the import of the module lets
the non-shared memory size of the process grow by 700kB
on my AMD64 box.

Regards,
-- 
Marc-Andre Lemburg
eGenix.com

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: makewinerror.py
Url: http://mail.python.org/pipermail/python-dev/attachments/20060513/3e74d7d3/attachment.asc 

From martin at v.loewis.de  Sat May 13 14:15:34 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 May 2006 14:15:34 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <4465AB7F.7030308@egenix.com>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de>	<44631F53.1010507@egenix.com>
	<4464FCAF.5080708@egenix.com> <44657CA5.4030708@v.loewis.de>
	<4465AB7F.7030308@egenix.com>
Message-ID: <4465CDE6.9060501@v.loewis.de>

M.-A. Lemburg wrote:
> I was leaving those out already - only the codes named 'ERROR_*'
> get included (see attached parser and generator).

Right. One might debate whether DNS_INFO_AXFR_COMPLETE (9751L)
or WSAEACCES (10013L) should be included as well.

I got a smaller source file as I included only forward mappings,
and used a loop to create the backwards mappings.

> Using a lookup object is not really clumsy - you can still access
> all the values by attribute access. The only difference is that
> they don't live in the module namespace, but get accessed via
> an object.

So how much space would that save?

> I'm not worried about the disk space being used. The heap
> memory usage is what's worrying: the import of the module lets
> the non-shared memory size of the process grow by 700kB
> on my AMD64 box.

That number must be misleading somehow. There are 1510 strings,
with a total length of 39972. There are 1510 integers also,
and they all get added into three dictionaries.

On a 32-bit machine, these should consume 76968 bytes for the
strings (*), 18120 bytes for the integers, and 100000 bytes
for the dict entries (**), for a total of 200000 bytes
at run-time.

On a 64-bit machine, the strings should consume 101128 bytes (***),
the integers 24160, and the dict entries 200000 bytes,
for a total of 325000 bytes.

>From that, I would conclude that one should avoid 64-bit machines
if one is worried about memory usage :-)

Regards,
Martin

(*) assuming 20 bytes string header, 1 byte null-termination,
and a rounding-up to the next multiple of 8
(**) assuming 12 bytes per dict entry in three dictionaries
(winerror.__dict__, winerror.errorcode, interning dict),
and assuming an average fill ratio of the dicts of 50%
(***) assuming 40 bytes string header, provided long is
a 64-bit type on that platform
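
The 32-bit arithmetic can be re-run as a rough check. This is a sketch under
the footnoted assumptions; the 12 bytes per integer are implied by
18120 / 1510, and because the individual string lengths are not given, the
string total can only be bounded rather than reproduced exactly.

```python
N = 1510             # number of names (and of integers)
TOTAL_CHARS = 39972  # summed length of all names

# Integers: 12 bytes each on a 32-bit build.
int_bytes = N * 12

# Strings: 20-byte header + characters + 1 NUL byte, rounded up to a
# multiple of 8. Rounding adds between 0 and 7 bytes per string.
str_low = N * (20 + 1) + TOTAL_CHARS
str_high = str_low + 7 * N

# Dict entries: 12 bytes each, three dicts, tables assumed half full.
dict_bytes = 3 * N * 12 * 2

print(int_bytes, (str_low, str_high), dict_bytes)
```

The quoted 76968 bytes for strings falls inside the computed bounds, and the
dict figure comes out slightly above the rounded 100000.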

From g.brandl at gmx.net  Sat May 13 15:35:46 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 13 May 2006 15:35:46 +0200
Subject: [Python-Dev] Py_ssize_t formatting
Message-ID: <e44mav$jcr$1@sea.gmane.org>

Hi,

which formatting code should be used for a Py_ssize_t value in e.g.
PyString_FromFormat?

'%zd' won't work since my value can be negative, and the 'z' modifier
converts the argument to size_t.

Georg


From martin at v.loewis.de  Sat May 13 15:23:44 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 May 2006 15:23:44 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <e44mav$jcr$1@sea.gmane.org>
References: <e44mav$jcr$1@sea.gmane.org>
Message-ID: <4465DDE0.5090004@v.loewis.de>

Georg Brandl wrote:
> which formatting code should be used for a Py_ssize_t value in e.g.
> PyString_FromFormat?

%zd

> '%zd' won't work since my value can be negative, and the 'z' modifier
> converts the argument to size_t.

That's a bug. It should print it signed. If unsigned printing of size_t
is desired, %zu should be used.
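
The difference between the two conversions can be illustrated from Python
with ctypes. This is only an analogy for what the C formatting code does, not
the PyString_FromFormat implementation.

```python
import ctypes

value = -1
bits = 8 * ctypes.sizeof(ctypes.c_size_t)

# Signed view: what %zd is meant to print for a Py_ssize_t.
as_signed = ctypes.c_ssize_t(value).value

# Unsigned view: the same bit pattern reinterpreted as size_t,
# which is what an unsigned conversion such as %zu would show.
as_unsigned = ctypes.c_size_t(value).value

print(as_signed, as_unsigned)
```

On a 64-bit build the second number is 2**64 - 1, which is why a negative
value printed through an unsigned conversion looks like a huge size.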

Regards,
Martin

From g.brandl at gmx.net  Sat May 13 16:02:45 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 13 May 2006 16:02:45 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <4465DDE0.5090004@v.loewis.de>
References: <e44mav$jcr$1@sea.gmane.org> <4465DDE0.5090004@v.loewis.de>
Message-ID: <e44nti$n92$1@sea.gmane.org>

Martin v. Löwis wrote:
> Georg Brandl wrote:
>> which formatting code should be used for a Py_ssize_t value in e.g.
>> PyString_FromFormat?
> 
> %zd
> 
>> '%zd' won't work since my value can be negative, and the 'z' modifier
>> converts the argument to size_t.
> 
> That's a bug. It should print it signed. If unsigned printing of size_t
> is desired, %zu should be used.

I saw you fixed it, thanks.

Similarly, there's no way for a structmember to be of type Py_ssize_t...

Georg


From martin at v.loewis.de  Sat May 13 16:24:37 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 May 2006 16:24:37 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <e44nti$n92$1@sea.gmane.org>
References: <e44mav$jcr$1@sea.gmane.org> <4465DDE0.5090004@v.loewis.de>
	<e44nti$n92$1@sea.gmane.org>
Message-ID: <4465EC25.8040800@v.loewis.de>

Georg Brandl wrote:
> Similarly, there's no way for a structmember to be of type Py_ssize_t...

Right. At least, not with changing structmember.[ch].

Regards,
Martin

From martin at v.loewis.de  Sat May 13 18:49:06 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 May 2006 18:49:06 +0200
Subject: [Python-Dev] cleaned windows icons
Message-ID: <44660E02.6030206@v.loewis.de>

Somebody has contributed an edited version of the current icons,
at

https://sourceforge.net/tracker/index.php?func=detail&aid=1481304&group_id=5470&atid=305470

He claims that his version is prettier in small sizes; I'm
uncertain whether this is the case.

What do you think?

Regards,
Martin

From g.brandl at gmx.net  Sat May 13 19:44:49 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 13 May 2006 19:44:49 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <4465EC25.8040800@v.loewis.de>
References: <e44mav$jcr$1@sea.gmane.org>
	<4465DDE0.5090004@v.loewis.de>	<e44nti$n92$1@sea.gmane.org>
	<4465EC25.8040800@v.loewis.de>
Message-ID: <e454u0$qcv$1@sea.gmane.org>

Martin v. Löwis wrote:
> Georg Brandl wrote:
>> Similarly, there's no way for a structmember to be of type Py_ssize_t...
> 
> Right. At least, not with changing structmember.[ch].

Did you mean "without"? Can I submit a patch?

Georg


From nnorwitz at gmail.com  Sat May 13 20:16:21 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 13 May 2006 11:16:21 -0700
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <4465DDE0.5090004@v.loewis.de>
References: <e44mav$jcr$1@sea.gmane.org> <4465DDE0.5090004@v.loewis.de>
Message-ID: <ee2a432c0605131116t7a81f0b3ifbeef7e6ce47bec2@mail.gmail.com>

On 5/13/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
> > '%zd' won't work since my value can be negative, and the 'z' modifier
> > converts the argument to size_t.
>
> That's a bug. It should print it signed. If unsigned printing of size_t
> is desired, %zu should be used.

Looking in stringobject.c, I don't see how %zu (or %lu) can be used
with PyString_FromFormatV.  Here's the code around line 260 (note
ssize_t in the comment really means Py_ssize_t):

                        /* handle the long flag, but only for %ld.  others
                           can be added when necessary. */
                        if (*f == 'l' && *(f+1) == 'd') {
                                longflag = 1;
                                ++f;
                        }
                        /* handle the ssize_t flag. */
                        if (*f == 'z' && *(f+1) == 'd') {
                                size_tflag = 1;
                                ++f;
                        }

n

From martin at v.loewis.de  Sat May 13 23:29:36 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 13 May 2006 23:29:36 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <ee2a432c0605131116t7a81f0b3ifbeef7e6ce47bec2@mail.gmail.com>
References: <e44mav$jcr$1@sea.gmane.org> <4465DDE0.5090004@v.loewis.de>
	<ee2a432c0605131116t7a81f0b3ifbeef7e6ce47bec2@mail.gmail.com>
Message-ID: <44664FC0.4090006@v.loewis.de>

Neal Norwitz wrote:
>> That's a bug. It should print it signed. If unsigned printing of size_t
>> is desired, %zu should be used.
> 
> Looking in stringobject.c, I don't see how %zu (or %lu) can be used
> with PyString_FromFormatV.

Right. It currently cannot be used. So if it is desired, it needs to
be added first, and then should be used.

Looking at the checkin messages, I understand that the change from "d"
to "u" was to make some compiler stop warning - that is the wrong
motivation for such a behavioral change.

Regards,
Martin

From tim.peters at gmail.com  Sun May 14 01:28:39 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 13 May 2006 19:28:39 -0400
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <44664FC0.4090006@v.loewis.de>
References: <e44mav$jcr$1@sea.gmane.org> <4465DDE0.5090004@v.loewis.de>
	<ee2a432c0605131116t7a81f0b3ifbeef7e6ce47bec2@mail.gmail.com>
	<44664FC0.4090006@v.loewis.de>
Message-ID: <1f7befae0605131628n1d44af94h810607d7f180dc77@mail.gmail.com>

[Neal]
>> Looking in stringobject.c, I don't see how %zu (or %lu) can be used
>> with PyString_FromFormatV.

[Martin]
> Right. It currently cannot be used. So if it is desired, it needs to
> be added first, and then should be used.

I added it:  %u, %lu, and %zu can be used now in PyString_FromFormat,
PyErr_Format, and PyString_FromFormatV.

Since PyString_FromFormat and PyErr_Format have exactly the same rules
(both inherited from PyString_FromFormatV), but their docs were way
out of synch, it would be good if someone with more LaTeX Fu changed
one of them to just point to the other.  I simply did a massive
copy+paste job to get them in synch again.

From martin at v.loewis.de  Sun May 14 08:43:59 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 May 2006 08:43:59 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <e454u0$qcv$1@sea.gmane.org>
References: <e44mav$jcr$1@sea.gmane.org>	<4465DDE0.5090004@v.loewis.de>	<e44nti$n92$1@sea.gmane.org>	<4465EC25.8040800@v.loewis.de>
	<e454u0$qcv$1@sea.gmane.org>
Message-ID: <4466D1AF.90300@v.loewis.de>

Georg Brandl wrote:
>> Right. At least, not with changing structmember.[ch].
> 
> Did you mean "without"? 

Oops, right.

> Can I submit a patch?

I personally don't mind having more types added to structmember,
so I'm +0 on adding Py_ssize_t to the list of types supported.
I wonder what the specific application is that you have in mind,
though.

Regards,
Martin

From thomas at python.org  Sun May 14 12:38:44 2006
From: thomas at python.org (Thomas Wouters)
Date: Sun, 14 May 2006 12:38:44 +0200
Subject: [Python-Dev] Py_ssize_t formatting
In-Reply-To: <4466D1AF.90300@v.loewis.de>
References: <e44mav$jcr$1@sea.gmane.org> <4465DDE0.5090004@v.loewis.de>
	<e44nti$n92$1@sea.gmane.org> <4465EC25.8040800@v.loewis.de>
	<e454u0$qcv$1@sea.gmane.org> <4466D1AF.90300@v.loewis.de>
Message-ID: <9e804ac0605140338v5496fb7cv1dbc846f38e2255e@mail.gmail.com>

On 5/14/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
> Georg Brandl wrote:
> >> Right. At least, not with changing structmember.[ch].
> >
> > Did you mean "without"?
>
> Oops, right.
>
> > Can I submit a patch?
>
> I personally don't mind having more types added to structmember,
> so I'm +0 on adding Py_ssize_t to the list of types supported.
> I wonder what the specific application is that you have in mind,
> though.


I ended up needing T_SSIZE_T or T_SIZE_T (I forget which) in my first
attempt to fully Py_ssize_t-ify ctypes (but I was learning a lot about
ctypes at the same time, and I ended up breaking a lot of stuff.) I'm now
not sure whether it's really necessary for ctypes, but it was trivial to
add, not to mention symmetric and completely logical. However, I also have a
C extension of my own that exposes something quite like 'length' as an
attribute, and it really ought to be a Py_ssize_t struct member (instead of
the long it is now.)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From draconux at gmail.com  Sat May 13 17:30:40 2006
From: draconux at gmail.com (draconux)
Date: Sat, 13 May 2006 17:30:40 +0200
Subject: [Python-Dev] correction of a bug
Message-ID: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>

Hello all,


I found a bug in Python.
I'm using Python 2.4 with Debian etch.

string.lstrip("source/old_prog","source/") returns "ld_prog" instead of
"old_prog".
So I wanted to create a patch in order to contribute to Python, but I have
some questions:

- grep 'def lstrip' shows that this function is defined in string.py.
In this file I read:

# NOTE: Everything below here is deprecated.  Use string methods instead.
# This stuff will go away in Python 3.0.

So lstrip is deprecated, and we are advised to use string methods instead.
But isn't string.py the file that defines the string methods?

- The function lstrip is:
  def lstrip(s, chars=None):
    """lstrip(s [,chars]) -> string

    Return a copy of the string s with leading whitespace removed.
    If chars is given and not None, remove characters in chars instead.

    """
    return s.lstrip(chars)

But where is this lstrip(chars) method defined?
I can't find it with grep.

Sorry, I'm a C programmer and I'm not used to Python.
For me, it's easier to debug a C program than Python :)

From skip at pobox.com  Sun May 14 22:06:03 2006
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 14 May 2006 15:06:03 -0500
Subject: [Python-Dev] correction of a bug
In-Reply-To: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>
References: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>
Message-ID: <17511.36267.1967.963807@montanaro.dyndns.org>


    draconux> I found a bug in python I'm using python 2.4 with debian etch

    draconux> string.lstrip("source/old_prog","source/") returns "ld_prog"
    draconux> instead of "old_prog"

The above is the same as

    "source/old_prog".lstrip("source/")

String methods are defined in the Objects/stringobject.c file of the source
distribution.

Skip

From me+python-dev at modelnine.org  Sun May 14 22:22:10 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Sun, 14 May 2006 22:22:10 +0200
Subject: [Python-Dev] correction of a bug
In-Reply-To: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>
References: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>
Message-ID: <200605142222.10525.me+python-dev@modelnine.org>

Am Samstag 13 Mai 2006 17:30 schrieb draconux:
> I found a bug in python
> I'm using python 2.4 with debian etch
>
> string.lstrip("source/old_prog","source/") returns "ld_prog" instead of
> "old_prog"

This is not a bug, but rather expected behaviour. Read the specification of 
lstrip() correctly.

--- Heiko.

From edloper at gradient.cis.upenn.edu  Sun May 14 22:27:50 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Sun, 14 May 2006 16:27:50 -0400
Subject: [Python-Dev] correction of a bug
In-Reply-To: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>
References: <60d29d630605130830qa9a5497lc2bd18ae8f825f87@mail.gmail.com>
Message-ID: <446792C6.4060102@gradient.cis.upenn.edu>

draconux wrote:
> 
> Hello all ,
> string.lstrip("source/old_prog","source/") returns "ld_prog" instead of 
> "old_prog"

You are misunderstanding what the second argument to lstrip does.  It is 
interpreted as a set of characters, not as a prefix; lstrip will remove the 
maximal prefix of the string that consists of these characters.  E.g.:

 >>> 'aaabbbcccaax'.lstrip('abc')
'x'

The first character in your string that is not one of the characters 
's', 'o', 'u', 'r', 'c', 'e', or '/' is 'l', so it strips all characters 
up to that one.
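
A minimal sketch of the difference, assuming the intent was to drop the
literal prefix "source/" rather than a set of characters. The helper name
remove_prefix is made up for illustration and is not a library function:

```python
def remove_prefix(text, prefix):
    """Return text without a leading prefix, or text unchanged if absent."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

# lstrip() treats its argument as a set of characters to strip.
stripped_chars = "source/old_prog".lstrip("source/")

# Slicing after a startswith() check removes the exact prefix only.
stripped_prefix = remove_prefix("source/old_prog", "source/")

print(stripped_chars)   # ld_prog
print(stripped_prefix)  # old_prog
```

The leading 'o' of "old_prog" is in the character set, so lstrip() eats it
too; the prefix check stops exactly at the end of "source/".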

-Edward

From ncoghlan at gmail.com  Mon May 15 11:26:48 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 15 May 2006 19:26:48 +1000
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
Message-ID: <44684958.9010206@gmail.com>

Guido van Rossum wrote:
> Yeah, I think imports inside functions are overused.

Rotem took the time to explain the problem to me more fully in private email, 
and it appears that using import statements inside functions is currently 
fundamentally unsafe in a posix environment where both multiple threads and 
fork() are used. If any thread other than the one invoking fork() holds the 
import lock at the time of the fork then import statements in the spawned 
process will deadlock.

So I believe fixing this properly would indeed require assigning a new lock 
object in one of the two places Rotem suggested.

Cheers,
Nick.

> 
> On 5/9/06, Rotem Yaari <vmalloc at gmail.com> wrote:
>> Hello everyone!
>>
>> We have been encountering several deadlocks in a threaded Python
>> application which calls subprocess.Popen (i.e. fork()) in some of its
>> threads.
>>
>> This has occurred on Python 2.4.1 on a 2.4.27 Linux kernel.
>>
>> Preliminary analysis of the hang shows that the child process blocks
>> upon entering the execvp function, in which the import_lock is acquired
>> due to the following line:
>>
>> def _execvpe(file, args, env=None):
>>     from errno import ENOENT, ENOTDIR
>>     ...
>>
>> It is known that when forking from a pthreaded application, acquisition
>> attempts on locks which were already locked by other threads while
>> fork() was called will deadlock.
>>
>> Due to these oddities we were wondering if it would be better to extract
>> the above import line from the execvpe call, to prevent lock
>> acquisition attempts in such cases.
>>
>> Another workaround could be re-assigning a new lock to import_lock
>> (such a thing is done with the global interpreter lock) at PyOS_AfterFork or
>> pthread_atfork.
>>
>> We'd appreciate any opinions you might have on the subject.
>>
>> Thanks in advance,
>>
>> Yair and Rotem
> 
> 


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From mal at egenix.com  Mon May 15 13:43:13 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 15 May 2006 13:43:13 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <4465CDE6.9060501@v.loewis.de>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de>	<44631F53.1010507@egenix.com>
	<4464FCAF.5080708@egenix.com>	<44657CA5.4030708@v.loewis.de>
	<4465AB7F.7030308@egenix.com> <4465CDE6.9060501@v.loewis.de>
Message-ID: <44686951.50504@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> I was leaving those out already - only the codes named 'ERROR_*'
>> get included (see attached parser and generator).
> 
> Right. One might debate whether DNS_INFO_AXFR_COMPLETE (9751L)
> or WSAEACCES (10013L) should be included as well.

The WSA codes are already included in the errno module.
Not sure about the DNS codes, but it would be easy
to add them as well.

> I got a smaller source file as I included only forward mappings,
> and used a loop to create the backwards mappings.
> 
>> Using a lookup object is not really clumsy - you can still access
>> all the values by attribute access. The only difference is that
>> they don't live in the module namespace, but get accessed via
>> an object.
> 
> So how much space would that save?

I'll have to write the lookup code first.

I expect savings of a factor of 2 to 3, since the data will
be stored in C static data. That data is only swapped in as
needed and can be shared between processes, so quantifying
the savings is difficult.

>> I'm not worried about the disk space being used. The heap
>> memory usage is what's worrying: the import of the module lets
>> the non-shared memory size of the process grow by 700kB
>> on my AMD64 box.
> 
> That number must be misleading somehow. 

The number does look a bit high - it is possible that the
process has to swap in some of the shared memory. But then
again, the shared memory also increases:

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND

Before import:

lemburg  26515  0.0  0.4 19936 4620 pts/3    S+   13:37   0:00 python

After:

lemburg  26515  0.0  0.5 20780 5320 pts/3    S+   13:37   0:00 python

> There are 1510 strings,
> with a total length of 39972. There are 1510 integers also,
> and they all get added into three dictionaries.

Well, the strings and integers count twice: once in the module
namespace and once in the errorcode dictionary.

> On a 32-bit machine, these should consume 76968 bytes for the
> strings (*), 18120 bytes for the integers, and 100000 bytes
> for the dict entries (**), for a total of 200000 bytes
> at run-time.
> 
> On a 64-bit machine, the strings should consume 101128 bytes (***),
> the integers 24160, and the dict entries 200000 bytes,
> for a total of 325000 bytes.

Given that the code strings and integers are created
twice in my version of the module, the numbers sound about
right.

I agree that creating only one dictionary statically
and the other mapping dynamically will already be a
saving of 50% simply by sharing the string and integer
objects.

> From that, I would conclude that one should avoid 64-bit machines
> if one is worried about memory usage :-)
> 
> Regards,
> Martin
> 
> (*) assuming 20 bytes string header, 1 byte null-termination,
> and a rounding-up to the next multiple of 8
> (**) assuming 12 bytes per dict entry in three dictionaries
> (winerror.__dict__, winerror.errorcode, interning dict),
> and assuming an average fill ratio of the dicts of 50%
> (***) assuming 40 bytes string header, provided long is
> a 64-bit type on that platform
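
The integer and dict figures in the quoted estimate can be re-derived from
the assumptions in its footnotes. A rough sketch: the per-object sizes used
here (12 and 16 bytes per integer, 12 and 24 bytes per dict entry) are
inferred from the quoted totals and footnote (**), not measured, and the
string totals are taken as given:

```python
N_CODES = 1510  # number of error codes in the quoted estimate

def estimate(string_bytes, int_size, entry_size, n_dicts=3, fill=0.5):
    """Rough heap estimate in bytes: strings + ints + dict slots."""
    int_bytes = N_CODES * int_size
    # A dict at 50% fill allocates two slots per stored entry.
    dict_bytes = int(N_CODES * n_dicts * entry_size / fill)
    return int_bytes, dict_bytes, string_bytes + int_bytes + dict_bytes

# 32-bit assumptions: 12-byte ints, 12-byte dict entries.
ints32, dicts32, total32 = estimate(76968, int_size=12, entry_size=12)
# 64-bit assumptions: 16-byte ints, 24-byte dict entries.
ints64, dicts64, total64 = estimate(101128, int_size=16, entry_size=24)

print(ints32, dicts32, total32)  # 18120 108720 203808
print(ints64, dicts64, total64)  # 24160 217440 342728
```

The unrounded dict figures (108720 and 217440 bytes) sit a little above the
rounded 100000 and 200000 in the quoted message, so the totals come out
slightly higher as well.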

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 15 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From rccall at optonline.net  Mon May 15 17:14:29 2006
From: rccall at optonline.net (R. Christian Call)
Date: Mon, 15 May 2006 11:14:29 -0400
Subject: [Python-Dev] Building with VS 2003 .NET
Message-ID: <44689AD5.3000605@optonline.net>

I recently checked out the 2006-02-04 python trunk, but I can't get it 
to build in Visual Studio 2003 .NET.

When I open up the PCbuild\pcbuild.sln file in VS2003 .NET and then try 
to build the solution, I get the following errors:

fatal error C1083:  Cannot open include file: 'bzlib.h': No such file or 
directory
fatal error C1083:  Cannot open include file: 'tcl.h': No such file or 
directory
fatal error C1083:  Cannot open include file: 'tcl.h': No such file or 
directory
error PRJ0019: A tool returned an error code from "Performing Makefile 
project actions"
fatal error C1083:  Cannot open include file: 'dbl.h': No such file or 
directory

I'm surprised, and mystified, by this--since it seems to me that the 
.sln file ought to build properly.

Is there something else I need to check out from subversion in order to 
get the code to build?

I'd appreciate any help that anyone can offer.

Thanks,

Chris


From p.f.moore at gmail.com  Mon May 15 17:39:17 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 15 May 2006 16:39:17 +0100
Subject: [Python-Dev] Building with VS 2003 .NET
In-Reply-To: <44689AD5.3000605@optonline.net>
References: <44689AD5.3000605@optonline.net>
Message-ID: <79990c6b0605150839s89b6b36v7f348915d380bf86@mail.gmail.com>

On 5/15/06, R. Christian Call <rccall at optonline.net> wrote:
> I recently checked out the 2006-02-04 python trunk, but I can't get it
> to build in Visual Studio 2003 .NET.
>
> When I open up the PCbuild\pcbuild.sln file in VS2003 .NET and then try
> to build the solution, I get the following errors:
>
> fatal error C1083:  Cannot open include file: 'bzlib.h': No such file or
> directory
> fatal error C1083:  Cannot open include file: 'tcl.h': No such file or
> directory
> fatal error C1083:  Cannot open include file: 'tcl.h': No such file or
> directory
> error PRJ0019: A tool returned an error code from "Performing Makefile
> project actions"
> fatal error C1083:  Cannot open include file: 'dbl.h': No such file or
> directory
>
> I'm surprised, and mystified, by this--since it seems to me that the
> .sln file ought to build properly.
>
> Is there something else I need to check out from subversion in order to
> get the code to build?
>
> I'd appreciate any help that anyone can offer.

Python depends on a number of external libraries (TCL, BZip2, Berkeley
DB). You need to obtain and set up these libraries, or disable these
extensions. Full details are in the "readme.txt" file in the PCBuild
directory. You should read and follow the instructions there.

You should probably post questions like this to the comp.lang.python newsgroup.

Paul.

From tim.peters at gmail.com  Mon May 15 18:51:25 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 15 May 2006 12:51:25 -0400
Subject: [Python-Dev] Building with VS 2003 .NET
In-Reply-To: <44689AD5.3000605@optonline.net>
References: <44689AD5.3000605@optonline.net>
Message-ID: <1f7befae0605150951h5e39623arf477dcf01f2da571@mail.gmail.com>

[R. Christian Call]
> I recently checked out the 2006-02-04 python trunk, but I can't get it
> to build in Visual Studio 2003 .NET.
>
> When I open up the PCbuild\pcbuild.sln file in VS2003 .NET and then try
> to build the solution, I get the following errors:
>
> fatal error C1083:  Cannot open include file: 'bzlib.h': No such file or
> directory
[etc]

> I'm surprised, and mystified, by this--since it seems to me that the
> .sln file ought to build properly.
>
> Is there something else I need to check out from subversion in order to
> get the code to build?

Read PCbuild/readme.txt.  Then do what it says ;-)

From tim.peters at gmail.com  Mon May 15 22:15:44 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 15 May 2006 16:15:44 -0400
Subject: [Python-Dev] [Python-checkins] r46005 - in python/trunk:
	Lib/tarfile.py Lib/test/test_tarfile.py Misc/NEWS
In-Reply-To: <20060515193036.3A7BF1E4015@bag.python.org>
References: <20060515193036.3A7BF1E4015@bag.python.org>
Message-ID: <1f7befae0605151315j67bfbc58j20bdb138235f0885@mail.gmail.com>

> Author: georg.brandl
> Date: Mon May 15 21:30:35 2006
> New Revision: 46005
>
> Modified:
>    python/trunk/Lib/tarfile.py
>    python/trunk/Lib/test/test_tarfile.py
>    python/trunk/Misc/NEWS
> Log:
> [ 1488881 ] tarfile.py: support for file-objects and bz2 (cp. #1488634)

"Something went wrong" here on Windows.  The Windows buildbot slaves
other than mine eventually died with:

"""
...
test_tarfile

command timed out: 1200 seconds without output
""":

I was working on my box at the time, and it became dramatically
unusable:  took about two minutes just to swap enough of the OS back
in to _bring up_ Task Manager to see what was going on.  The Python
test process was close to 2GB in size, and of course the box was
swapping madly.  So I killed python_d.exe and waited to see how the
other buildbots fared.  Sure appears to be a Windows-specific problem
-- or maybe all the Linux boxes come with a terabyte of RAM these days
;-)

I'll take a peek later, but won't be disappointed if someone else
figures it out first.

From tim.peters at gmail.com  Mon May 15 23:21:10 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 15 May 2006 17:21:10 -0400
Subject: [Python-Dev] [Python-checkins] r46005 - in python/trunk:
	Lib/tarfile.py Lib/test/test_tarfile.py Misc/NEWS
In-Reply-To: <1f7befae0605151315j67bfbc58j20bdb138235f0885@mail.gmail.com>
References: <20060515193036.3A7BF1E4015@bag.python.org>
	<1f7befae0605151315j67bfbc58j20bdb138235f0885@mail.gmail.com>
Message-ID: <1f7befae0605151421x70f475ddpbeec9734df0722f6@mail.gmail.com>

[Tim]
> "Something went wrong" here on Windows.  The Windows buildbot slaves
> other than mine eventually died with:
>
> """
> ...
> test_tarfile
>
> command timed out: 1200 seconds without output
> """:
>
> I was working on my box at the time, and it became dramatically
> unusable:  took about two minutes just to swap enough of the OS back
> in to _bring up_ Task Manager to see what was going on. ...

OK, I fixed that, but unfortunately the checkin comment was correct:

    The Windows buildbot slaves shouldn't swap themselves to death
    anymore.  However, test_tarfile may still fail because of a
    temp directory left behind from a previous failing run.
    Windows buildbot owners may need to remove that directory
    by hand.

I did that on my box, and test_tarfile passes there now, but
test_tarfile is still failing on other Windows slaves; e.g.,

http://www.python.org/dev/buildbot/all/x86%20W2k%20trunk/builds/694/step-test/0

...

WindowsError: [Error 183] Cannot create a file when that file already exists:
 'c:\\docume~1\\trentm~1.act\\locals~1\\temp\\testtar.dir\\directory'

From brett at python.org  Mon May 15 23:23:39 2006
From: brett at python.org (Brett Cannon)
Date: Mon, 15 May 2006 14:23:39 -0700
Subject: [Python-Dev] [Python-checkins] r46005 - in python/trunk:
	Lib/tarfile.py Lib/test/test_tarfile.py Misc/NEWS
In-Reply-To: <1f7befae0605151421x70f475ddpbeec9734df0722f6@mail.gmail.com>
References: <20060515193036.3A7BF1E4015@bag.python.org>
	<1f7befae0605151315j67bfbc58j20bdb138235f0885@mail.gmail.com>
	<1f7befae0605151421x70f475ddpbeec9734df0722f6@mail.gmail.com>
Message-ID: <bbaeab100605151423x1807eb99x5db98a5f672c465d@mail.gmail.com>

On 5/15/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> [Tim]
> > "Something went wrong" here on Windows.  The Windows buildbot slaves
> > other than mine eventually died with:
> >
> > """
> > ...
> > test_tarfile
> >
> > command timed out: 1200 seconds without output
> > """:
> >
> > I was working on my box at the time, and it became dramatically
> > unusable:  took about two minutes just to swap enough of the OS back
> > in to _bring up_ Task Manager to see what was going on. ...
>
> OK, I fixed that, but unfortunately the checkin comment was correct:
>
>     The Windows buildbot slaves shouldn't swap themselves to death
>     anymore.  However, test_tarfile may still fail because of a
>     temp directory left behind from a previous failing run.
>     Windows buildbot owners may need to remove that directory
>     by hand.
>
> I did that on my box, and test_tarfile passes there now, but
> test_tarfile is still failing on other Windows slaves; e.g.,
>
>
> http://www.python.org/dev/buildbot/all/x86%20W2k%20trunk/builds/694/step-test/0
>
> ...
>
> WindowsError: [Error 183] Cannot create a file when that file already
> exists:
> 'c:\\docume~1\\trentm~1.act\\locals~1\\temp\\testtar.dir\\directory'
>
Ignorant question probably, but why can't the test just check for the
directory first, and remove it if it exists?

-Brett

From tim.peters at gmail.com  Mon May 15 23:35:12 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 15 May 2006 17:35:12 -0400
Subject: [Python-Dev] [Python-checkins] r46005 - in python/trunk:
	Lib/tarfile.py Lib/test/test_tarfile.py Misc/NEWS
In-Reply-To: <bbaeab100605151423x1807eb99x5db98a5f672c465d@mail.gmail.com>
References: <20060515193036.3A7BF1E4015@bag.python.org>
	<1f7befae0605151315j67bfbc58j20bdb138235f0885@mail.gmail.com>
	<1f7befae0605151421x70f475ddpbeec9734df0722f6@mail.gmail.com>
	<bbaeab100605151423x1807eb99x5db98a5f672c465d@mail.gmail.com>
Message-ID: <1f7befae0605151435x26ca40f0nd566eed898f9c626@mail.gmail.com>

[Brett Cannon]
>  Ignorant question probably, but why can't the test just check for the
> directory first, and remove it if it exists?

Because it's a stupid hack that should never be necessary:  the test
cleans up after itself in a "finally" clause.  It only looks
attractive right now because an earlier bug caused the Windows boxes
to swap themselves to death, and the "finally" clause never executed.
So, what the heck, sure, I checked that in :-)
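
The pre-check, combined with the existing "finally" cleanup, might look
roughly like the following. This is a sketch only: the helper names are
hypothetical and it is not the actual test_tarfile code.

```python
import os
import shutil
import tempfile

def make_fresh_dir(path):
    """Create an empty directory at path, removing any stale one first."""
    if os.path.isdir(path):
        shutil.rmtree(path)  # leftover from a run whose cleanup never ran
    os.mkdir(path)
    return path

def run_with_scratch_dir(path, body):
    """Run body(path) in a fresh directory and always clean up afterwards."""
    make_fresh_dir(path)
    try:
        return body(path)
    finally:
        shutil.rmtree(path, ignore_errors=True)

# Simulate a directory left behind by an earlier, killed test run.
base = tempfile.mkdtemp()
scratch = os.path.join(base, "testtar.dir")
os.mkdir(scratch)
open(os.path.join(scratch, "stale"), "w").close()

# The stale contents are gone before the body runs.
seen = run_with_scratch_dir(scratch, os.listdir)
print(seen)
shutil.rmtree(base)
```

The "finally" clause remains the real cleanup; the pre-check only matters
when a previous run died before reaching it.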

From richard at commonground.com.au  Tue May 16 02:15:11 2006
From: richard at commonground.com.au (Richard Jones)
Date: Tue, 16 May 2006 10:15:11 +1000
Subject: [Python-Dev] FYI: building on OS X
Message-ID: <DA03C540-C05F-4D11-B8EC-F0F9B5ED0AB8@commonground.com.au>

I just built 2.5 SVN on my OS X 10.4 laptop. I ran into the missing  
sys/statvfs.h problem that has been reported here in the past. To  
solve the problem I needed to upgrade my XCode install from 1.1 to  
2.2.1 - the missing header files (there was actually a whole lot of  
them) were then installed.

I don't know whether this needs to be mentioned somewhere in the  
build docs.


     Richard


From brett at python.org  Tue May 16 02:17:15 2006
From: brett at python.org (Brett Cannon)
Date: Mon, 15 May 2006 17:17:15 -0700
Subject: [Python-Dev] FYI: building on OS X
In-Reply-To: <DA03C540-C05F-4D11-B8EC-F0F9B5ED0AB8@commonground.com.au>
References: <DA03C540-C05F-4D11-B8EC-F0F9B5ED0AB8@commonground.com.au>
Message-ID: <bbaeab100605151717l721a3a72v6e4ee4c4e8200970@mail.gmail.com>

On 5/15/06, Richard Jones <richard at commonground.com.au> wrote:
>
> I just built 2.5 SVN on my OS X 10.4 laptop. I ran into the missing
> sys/statvfs.h problem that has been reported here in the past. To
> solve the problem I needed to upgrade my XCode install from 1.1 to
> 2.2.1 - the missing header files (there was actually a whole lot of
> them) were then installed.
>
> I don't know whether this needs to be mentioned somewhere in the
> build docs.



Wouldn't hurt.

-Brett

From martin at v.loewis.de  Tue May 16 07:34:15 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 May 2006 07:34:15 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <44686951.50504@egenix.com>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de>	<44631F53.1010507@egenix.com>
	<4464FCAF.5080708@egenix.com>	<44657CA5.4030708@v.loewis.de>
	<4465AB7F.7030308@egenix.com> <4465CDE6.9060501@v.loewis.de>
	<44686951.50504@egenix.com>
Message-ID: <44696457.8020906@v.loewis.de>

M.-A. Lemburg wrote:
> Well, the strings and integers count twice: once in the module
> namespace and once in the errorcode dictionary.

That shouldn't be the case: the strings are interned (as they
are identifier-like), so you have the very same string object
in both dictionaries.

The numbers shouldn't be duplicated because they occur
in the co_consts array of the global code object, and because
the compiler should share them there.
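
A small check of the string-sharing claim, written for present-day
CPython 3 rather than the 2.4/2.5 interpreters under discussion. It relies
on a CPython implementation detail (interning of identifier-like string
constants), and the three-line module source is a made-up stand-in for the
generated winerror module:

```python
SOURCE = """
ERROR_SUCCESS = 0
ERROR_INVALID_FUNCTION = 1
errorcode = {0: 'ERROR_SUCCESS', 1: 'ERROR_INVALID_FUNCTION'}
"""

def names_are_shared(source):
    """Return True if every errorcode value is the very same string object
    as the matching key in the module namespace."""
    namespace = {}
    exec(compile(source, "<winerror-sketch>", "exec"), namespace)
    # Map each namespace key to itself so the key object can be looked up.
    key_objects = {key: key for key in namespace}
    return all(key_objects[name] is name
               for name in namespace["errorcode"].values())

shared = names_are_shared(SOURCE)
print(shared)
```

If the result is True, each name exists once on the heap and both
dictionaries merely hold references to it.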

> Given that the code strings and integers are created
> twice in my version of the module, the numbers sound about
> right.

If they are indeed created twice, something is wrong.

> I agree that creating only one dictionary statically
> and the other mapping dynamically will already be a
> saving of 50% simply by sharing the string and integer
> objects.

No, they should be shared already, so that shouldn't save
anything.

Regards,
Martin


From martin at v.loewis.de  Tue May 16 08:40:48 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 May 2006 08:40:48 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <44684958.9010206@gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
	<44684958.9010206@gmail.com>
Message-ID: <446973F0.3010007@v.loewis.de>

Nick Coghlan wrote:
> Rotem took the time to explain the problem to me more fully in private email, 
> and it appears that using import statements inside functions is currently 
> fundamentally unsafe in a posix environment where both multiple threads and 
> fork() are used. If any thread other than the one invoking fork() holds the 
> import lock at the time of the fork then import statements in the spawned 
> process will deadlock.
> 
> So I believe fixing this properly would indeed require assigning a new lock 
> object in one of the two places Rotem suggested.

I still don't understand the problem fully. According to POSIX, fork does:

"A process is created with a single thread. If a multi-threaded process
calls fork(), the new process contains a replica of the calling thread
and its entire address space, possibly including the states of mutexes
and other resources."

So if the import lock was held during the fork call, the new thread
will hold the import lock of the new process, and subsequent imports
will block.

However, then the problem is not with the execve implementation, but
with the fact that the import lock was held when the process forked.

Rotem should simply avoid calling fork() in the toplevel code of a module.

This problem should be fixed in Python 2.5, whose PyOS_AfterFork now
reads

void
PyOS_AfterFork(void)
{
#ifdef WITH_THREAD
        PyEval_ReInitThreads();
        main_thread = PyThread_get_thread_ident();
        main_pid = getpid();
        _PyImport_ReInitLock();
#endif
}

This was added in

------------------------------------------------------------------------
r39522 | gvanrossum | 2005-09-14 20:09:42 +0200 (Wed, 14 Sep 2005) | 4 lines

- Changes donated by Elemental Security to make it work on AIX 5.3
  with IBM's 64-bit compiler (SF patch #1284289).  This also closes SF
  bug #105470: test_pwd fails on 64bit system (Opteron).

------------------------------------------------------------------------

Unfortunately, that fix would apply to AIX only, though:

void
_PyImport_ReInitLock(void)
{
#ifdef _AIX
        if (import_lock != NULL)
                import_lock = PyThread_allocate_lock();
#endif
}

If we want to support fork while an import is going on, we should
release the import lock if it is held. Alternatively, the code
could throw away the old import lock on all systems; that seems
like a waste of resources to me, though. One should also reset
import_lock_thread and import_lock_level.

I'd be curious if Elemental Security added this code only to solve
the very same problem, i.e. a fork that happened with the import
lock held.

Regards,
Martin


From martin at v.loewis.de  Tue May 16 09:09:20 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 16 May 2006 09:09:20 +0200
Subject: [Python-Dev] FYI: building on OS X
In-Reply-To: <DA03C540-C05F-4D11-B8EC-F0F9B5ED0AB8@commonground.com.au>
References: <DA03C540-C05F-4D11-B8EC-F0F9B5ED0AB8@commonground.com.au>
Message-ID: <44697AA0.40805@v.loewis.de>

Richard Jones wrote:
> I just built 2.5 SVN on my OS X 10.4 laptop. I ran into the missing  
> sys/statvfs.h problem that has been reported here in the past. To  
> solve the problem I needed to upgrade my XCode install from 1.1 to  
> 2.2.1 - the missing header files (there was actually a whole lot of  
> them) were then installed.
> 
> I don't know whether this needs to be mentioned somewhere in the  
> build docs.

I think adding a remark to the build docs is NOT the best solution.
Instead, it would be good if the problem was fixed.

Looking at earlier communication, it appears that configure detects
the presence of fstatvfs(), and posixmodule.c infers from that
that sys/statvfs.h must be present. This is apparently misleading,
as you can manage to install OSX so that the header files installed
don't match the system that is running.

I added some additional autoconf tests in 46010 and 46011;
unfortunately, I cannot test these changes as I don't have a system
where the build was broken.

Regards,
Martin


From ncoghlan at gmail.com  Tue May 16 11:28:39 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 16 May 2006 19:28:39 +1000
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <446973F0.3010007@v.loewis.de>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
Message-ID: <44699B47.4030504@gmail.com>

Martin v. Löwis wrote:
> So if the import lock was held during the fork call, the new thread
> will hold the import lock of the new process, and subsequent imports
> will block.

> However, then the problem is not with the execve implementation, but
> with the fact that the import lock was held when the process forked.
> 
> Rotem should simply avoid calling fork() in the toplevel code of a module.

That's what I originally thought, but Rotem was able to clarify for me that 
the problem occurs even if the import lock is held by a *different* thread at 
the time fork() is called. So a deadlock can be triggered by perfectly 
legitimate code that has the misfortune to fork the process while another 
thread is doing a module import.
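
The underlying hazard can be shown with an ordinary lock. The sketch below
is written for present-day Python 3 on a POSIX system and uses a plain
threading.Lock as a stand-in for the import lock, so it illustrates the
mechanism rather than reproducing the 2.4 deadlock itself; the function
name is made up for illustration:

```python
import os
import threading

def child_finds_lock_held(timeout=1.0):
    """Fork while another thread holds a lock; report whether the child
    sees that lock as held with nobody left to release it."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            holding.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    holding.wait()              # the other thread now owns the lock
    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied, the holder was not.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)
    _, status = os.waitpid(pid, 0)
    release.set()               # let the parent's holder thread finish
    thread.join()
    return os.WEXITSTATUS(status) == 1

result = child_finds_lock_held()
print(result)
```

The timeout is there only so the child can report back; a plain acquire()
in the child would block forever, which is the deadlock described above.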

> Unfortunately, that fix would apply to AIX only, though:
> 
> void
> _PyImport_ReInitLock(void)
> {
> #ifdef _AIX
>         if (import_lock != NULL)
>                 import_lock = PyThread_allocate_lock();
> #endif
> }
> 
> If we want to support fork while an import is going on, we should
> release the import lock if it is held. Alternatively, the code
> could throw away the old import lock on all systems; that seems
> like a waste of resources to me, though. One should also reset
> import_lock_thread and import_lock_level.

Broadening the ifdef to cover all posix systems rather than just AIX might be 
the right thing to do. However, the first thing would be to reproduce the 
deadlock reliably.

A repeatable test case would look something like this:

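import imp
import subprocess
import sys
from threading import Thread, Event

class ForkingThread(Thread):
     def run(self):
         # create a subprocess using subprocess.Popen
         # and ensure it tries to do an import operation
         # (i.e. something similar to what Rotem & Yair's code is doing)
         # Popen forks; the child then calls os.execvp(), whose
         # function-level import needs the import lock
         subprocess.Popen([sys.executable, "-c", "import errno"]).wait()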

class ImportingThread(Thread):
     def __init__(self):
         Thread.__init__(self)
         self.importing = Event()
         self.cleanup = Event()
     def run(self):
         imp.acquire_lock()
         try:
             self.importing.set()
             self.cleanup.wait()
         finally:
             imp.release_lock()

def test_forked_import_deadlock():
     import_thread = ImportingThread()
     fork_thread = ForkingThread()
     import_thread.start()
     try:
         import_thread.importing.wait()
         fork_thread.start()   # Fork while the import thread holds the lock
         fork_thread.join(10)  # Give the subprocess time to finish
         assert not fork_thread.isAlive() # A timeout means it deadlocked
     finally:
         import_thread.cleanup.set() # Release the import lock

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From mal at egenix.com  Tue May 16 13:40:01 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 16 May 2006 13:40:01 +0200
Subject: [Python-Dev] [Python-checkins] r45925 - in python/trunk:
 Lib/tempfile.py	Lib/test/test_os.py Misc/NEWS Modules/posixmodule.c
In-Reply-To: <44696457.8020906@v.loewis.de>
References: <20060506163302.CA3D71E4004@bag.python.org>	<445CE018.7040609@egenix.com>	<445D9B0C.5080808@v.loewis.de>	<445DF658.7050609@egenix.com>
	<445E1E09.1040908@v.loewis.de>	<445E3615.7000108@egenix.com>
	<445E433C.10908@v.loewis.de>	<445F23BD.5040906@egenix.com>
	<445FC265.6060508@v.loewis.de>	<4461EB0A.5000800@egenix.com>
	<44623E05.4080700@v.loewis.de>	<4462443C.7010608@egenix.com>
	<44624963.4020604@v.loewis.de>	<44631F53.1010507@egenix.com>
	<4464FCAF.5080708@egenix.com>	<44657CA5.4030708@v.loewis.de>
	<4465AB7F.7030308@egenix.com>	<4465CDE6.9060501@v.loewis.de>
	<44686951.50504@egenix.com> <44696457.8020906@v.loewis.de>
Message-ID: <4469BA11.90109@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> Well, the strings and integers count twice: once in the module
>> namespace and once in the errorcode dictionary.
> 
> That shouldn't be the case: the strings are interned (as they
> are identifier-like), so you have the very same string object
> in both dictionaries.
> 
> The numbers shouldn't be duplicated because they occur
> in the co_consts array of the global code object, and because
> the compiler should share them there.

Hmm, you're right.

>> Given that the code strings and integers are created
>> twice in my version of the module, the numbers sound about
>> right.
> 
> If they are indeed created twice, something is wrong.
>
>> I agree that creating only one dictionary statically
>> and the other mapping dynamically will already be a
>> saving of 50% simply by sharing the string and integer
>> objects.
> 
> No, they should be shared already, so that shouldn't save
> anything.

I changed the generator to now use only one dictionary
and create the module symbols at import time. Here's
the new data:

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND

Before:

lemburg  29980  0.0  0.4 19936 4620 pts/3    S+   13:27   0:00 python

After the import:

lemburg  29980  0.0  0.5 20756 5260 pts/3    S+   13:27   0:00 python

The RSS changed by 640kB - only 60kB less than with the
static approach.

I also tried this on a 32-bit machine: the RSS changes by 392kB.

Next to come: the C extension version...

-- 
Marc-Andre Lemburg
eGenix.com


From coder_infidel at hotmail.com  Tue May 16 14:52:14 2006
From: coder_infidel at hotmail.com (Luke Dunstan)
Date: Tue, 16 May 2006 20:52:14 +0800
Subject: [Python-Dev] MSVC 6.0 patch
Message-ID: <BAY109-DAV67DE2A3E328C13F68400F8AA00@phx.gbl>

Hi,

This patch submitted in March allows Python trunk to be built using MSVC 6:

http://sourceforge.net/tracker/index.php?func=detail&aid=1457736&group_id=5470&atid=305470

It is not my patch but I am quite interested because the code changes are 
also required to build with MS eMbedded Visual C++ 4.0, the Windows CE 4.x 
compiler. I'm guessing that the person assigned to the patch is too busy so 
I hope somebody else can take over?

Thanks,
Luke

From vys at renet.ru  Tue May 16 18:11:08 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Tue, 16 May 2006 20:11:08 +0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>	
	<20060506025902.67DA.JCARLSON@uci.edu>	
	<20060506205924.GA14684@fox.renet.ru>	
	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>	
	<44632866.4090906@renet.ru>
	<ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>
Message-ID: <4469F99C.6090305@renet.ru>

Guido van Rossum wrote:
> On 5/11/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
>> If for Python-3000 similar it will be shown concerning types
>> str(), int(), complex() and so on, and the type of exceptions
>> will strongly vary, it will make problematic redefinition of
>> behavior of function of sorting.
>
> Not really. We'll just document that sort() should only be used on a
> list of objects that implement a total ordering. The behavior
> otherwise will simply be undefined; it will raise whatever exception
> is first raised by an unsupported comparison (most likely TypeError).
> In practice this won't be a problem.
>

In my opinion, comparison functions lack a dedicated kind of
exception. Unless the operation is a type conversion, a TypeError
more often than not signals a programming mistake. With an exception
specific to comparison, the `.__r(eq|ne|le|lt|ge|gt|cmp)__()' methods
could be invoked without fear of hiding a TypeError that stems from
a programming mistake (sorry for the tautology).

It could conveniently be used as a control-flow exception, to
indicate that a `.__r(eq|ne|le|lt|ge|gt|cmp)__()' method needs to
be invoked. A class of this kind could play a role similar to that
of StopIteration for `.next()'. When a comparison fails, the user
can simply define the behaviour they need in the `cmp' function of
the .sort() method, including the ability to sort heterogeneous
elements.

At present a similar function is performed by the NotImplemented
exception. This exception is raised in a number of mathematical
operations. For this reason I ask that the creation of a new class
be considered.

From jason.orendorff at gmail.com  Tue May 16 19:16:22 2006
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 16 May 2006 13:16:22 -0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <4469F99C.6090305@renet.ru>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
	<20060506205924.GA14684@fox.renet.ru>
	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>
	<44632866.4090906@renet.ru>
	<ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>
	<4469F99C.6090305@renet.ru>
Message-ID: <bb8868b90605161016kda546eet2a4e79da8d5837a5@mail.gmail.com>

On 5/11/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
> If for Python-3000 similar it will be shown concerning types
> str(), int(), complex() and so on, and the type of exceptions
> will strongly vary, it will make problematic redefinition of
> behavior of function of sorting.

I don't see what you mean by "redefinition of behavior of function of
sorting".  Is this something a Python programmer might want to do?
Can you give an example?


On 5/16/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
> It could conveniently be used as a control-flow exception, to
> indicate that a `.__r(eq|ne|le|lt|ge|gt|cmp)__()' method needs to
> be invoked. A class of this kind could play a role similar to that
> of StopIteration for `.next()'.

There are no .__r(eq|ne|le|lt|ge|gt|cmp)__() methods, for a logical
reason which you might enjoy deducing yourself...

> At present a similar function is performed by the NotImplemented
> exception. This exception is raised in a number of mathematical
> operations. For this reason I ask that the creation of a new class
> be considered.

Can you explain this?  NotImplemented isn't an exception.
(NotImplementedError is, but that's something quite different.)
NotImplemented has exactly one purpose in Python, as far as I can
tell.  What mathematical operations do you mean?

-j
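
The distinction drawn above can be checked directly. The sketch below
assumes Python 3 semantics (in the Python 2 of this thread a mixed-type
`<` would fall back to a default ordering instead of raising); the class
Metres and the helper try_less_than are made up for illustration.

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if not isinstance(other, Metres):
            # A plain return value: nothing is raised here.
            return NotImplemented
        return self.value < other.value


def try_less_than(a, b):
    """Return the result of a < b, or the TypeError the interpreter raised."""
    try:
        return a < b
    except TypeError as exc:
        return exc


same_type = try_less_than(Metres(1), Metres(2))
mixed = try_less_than(Metres(1), "two")
print(same_type, type(mixed).__name__)
```

NotImplemented is a singleton that a method hands back to the
interpreter; only after the reflected operation has also declined does
the interpreter itself raise TypeError.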

From tim.peters at gmail.com  Tue May 16 19:21:36 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 16 May 2006 13:21:36 -0400
Subject: [Python-Dev] [Python-checkins] r46002 - in
	python/branches/release24-maint: Misc/ACKS Misc/NEWS
	Objects/unicodeobject.c
In-Reply-To: <4469C1E5.5080906@egenix.com>
References: <20060515072224.229C81E4013@bag.python.org>
	<44683D6C.8010101@egenix.com> <44696CE8.3040709@v.loewis.de>
	<4469C1E5.5080906@egenix.com>
Message-ID: <1f7befae0605161021v373e6ee3j66f46d0f45b84e3d@mail.gmail.com>

[M.-A. Lemburg]
>>> Could you please make this fix apply only on Solaris,
>>> e.g. using an #ifdef ?!

[Martin v. Löwis]
>> That shouldn't be done. The code, as it was before, had
>> undefined behaviour in C. With the fix, it is now correct.

[Marc-Andre]
> I don't understand - what's undefined in:
>
> const char *s;
> Py_UNICODE *p;
> ...
> *p = *(Py_UNICODE *)s;

The pointer cast:

    A pointer to an object or incomplete type may be converted to a pointer
    to a different object or incomplete type. If the resulting pointer is not
    correctly aligned for the pointed-to type, the behavior is undefined.

Since Py_UNICODE has a stricter alignment requirement than char,
there's no guarantee that _the content_ of p is correctly aligned for
Py_UNICODE after the cast.  Indeed, that's why the code segfaulted on
the Solaris box.  On other architectures it may not segfault but
"just" take much longer for the HW and SW to hide improperly aligned
access.

>> If you want to drop usage of memcpy on systems where you
>> think it isn't needed, you should make a positive list of
>> such systems, e.g. through an autoconf test (although such
>> a test is difficult to formulate).

> I don't want to drop memcpy() - just keep the existing
> working code on platforms where the memcpy() is not
> needed.

There's no clear way I know of to guess which platforms that may be.
Is it possible to fiddle _PyUnicode_DecodeUnicodeInternal's _callers_
so that the char* `s` argument passed to it is always properly aligned
for Py_UNICODE?  Then the pointer cast would be fine.

> ...
> A modern compiler should know the alignment requirements
> of Py_UNICODE* on the platform and generate appropriate
> code.

The trend in modern compilers and architectures is to be less
forgiving of standard violations, not more.

> AFAICT, only 64-bit platforms are subject to any
> such problems due to their requirement to have pointers
> aligned on 8-byte boundaries.

It's not the alignment of the pointer but of what the pointer points
_at_ that's at issue here.  While the effect of the pointer cast is
undefined, it's not the pointer cast that blows up.  It's
dereferencing the _result_ of the pointer cast that blows up:  it was
trying to read up a Py_UNICODE from an address that wasn't properly
aligned for Py_UNICODE.  That can blow up (or be very slow, or return
gibberish -- it's undefined) even if Py_UNICODE has an alignment
requirement of "just" 2 (which I expect was actually the case on the
Solaris box).

From martin at v.loewis.de  Tue May 16 23:13:29 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 16 May 2006 23:13:29 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <d2115f4f0605160048l6fddeb8bna362e96ca3ec147b@mail.gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>	
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
	<d2115f4f0605160048l6fddeb8bna362e96ca3ec147b@mail.gmail.com>
Message-ID: <446A4079.5070808@v.loewis.de>

Rotem Yaari wrote:
>> Rotem should simply avoid to fork() in the toplevel code of a module.
>>
> 
> That's exactly the problem - the fork is not done at the module level,
> and the reason for the deadlock is imports inside functions performed
> by various library functions.
> This can happen virtually in every possible step of the program as
> long as the python builtins keep doing it.

Well, Python *builtins* (in the strict sense of the word) usually never
directly import anything, and most of them do not even import
indirectly (there are some exceptions, of course, like the __import__
builtin).

So if you are certain that the fork doesn't happen *while* an import
is in progress, then I think the specific problem must be analyzed in more
detail before an attempt is made to fix it.

Regards,
Martin

From martin at v.loewis.de  Tue May 16 23:17:18 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 16 May 2006 23:17:18 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <2ae470100605160109s7ff19ab4id5f1bd90987d67e5@mail.gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>	
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
	<2ae470100605160109s7ff19ab4id5f1bd90987d67e5@mail.gmail.com>
Message-ID: <446A415E.8070503@v.loewis.de>

Yair Chuchem wrote:
>     Rotem should simply avoid to fork() in the toplevel code of a module.
> 
> 
> this is not what happened.
> we called subprocess which itself called fork.

Ok - but are you calling subprocess in the context of code that is just
being imported? Or is the caller of the code that calls subprocess just
being imported? I.e. if you would trace back the stack at the point
of fork, would there be an import statement on any frame of the stack
(no matter how deeply nested)?

> as a subprocess bug this could be easily fixed by moving the import in
> "os._execve".
> we have made that fix locally and it fixes the problem.

I can readily believe that the problem went away. However, you did not
fix it - you worked around it. The import lock *ought* to be available
inside the execve code, so the import *ought* to work. By moving the
import outside execve, you have only hidden the problem, instead of
solving it.

Regards,
Martin

From martin at v.loewis.de  Tue May 16 23:20:02 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 16 May 2006 23:20:02 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <44699B47.4030504@gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
	<44699B47.4030504@gmail.com>
Message-ID: <446A4202.6070608@v.loewis.de>

Nick Coghlan wrote:
> Broadening the ifdef to cover all posix systems rather than just AIX
> might be the right thing to do. 

As I said: I doubt it is the right thing to do. Instead, the lock
should get released in the child process if it is held at the
point of the fork.

I agree that we should find a repeatable test case first.
Contributions are welcome.

Regards,
Martin

From yairchu at gmail.com  Tue May 16 10:09:18 2006
From: yairchu at gmail.com (Yair Chuchem)
Date: Tue, 16 May 2006 11:09:18 +0300
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <446973F0.3010007@v.loewis.de>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
Message-ID: <2ae470100605160109s7ff19ab4id5f1bd90987d67e5@mail.gmail.com>

On 5/16/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
>
> Rotem should simply avoid to fork() in the toplevel code of a module.
>
>
this is not what happened.
we called subprocess which itself called fork.

as a subprocess bug this could be easily fixed by moving the import in
"os._execve".
we have made that fix locally and it fixes the problem.

cheers

From vys at renet.ru  Wed May 17 14:06:50 2006
From: vys at renet.ru (Vladimir 'Yu' Stepanov)
Date: Wed, 17 May 2006 16:06:50 +0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <bb8868b90605161016kda546eet2a4e79da8d5837a5@mail.gmail.com>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>	
	<20060506025902.67DA.JCARLSON@uci.edu>	
	<20060506205924.GA14684@fox.renet.ru>	
	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>	
	<44632866.4090906@renet.ru>	
	<ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>	
	<4469F99C.6090305@renet.ru>
	<bb8868b90605161016kda546eet2a4e79da8d5837a5@mail.gmail.com>
Message-ID: <446B11DA.1080201@renet.ru>

Jason Orendorff wrote:
> On 5/11/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
>> If for Python-3000 similar it will be shown concerning types
>> str(), int(), complex() and so on, and the type of exceptions
>> will strongly vary, it will make problematic redefinition of
>> behavior of function of sorting.
>
> I don't see what you mean by "redefinition of behavior of function of
> sorting".  Is this something a Python programmer might want to do?
> Can you give an example?
>

It can arise with external types, when each of them can be compared
with any built-in type but not with each other. This concerns
functions and methods implemented through the CPython API. For
example, the big-integer types of different mathematical libraries,
or the use of several different representations of character arrays.

One more example, working with a database, which is more
realistic. In Python 3000 it will be possible to compare only
compatible types. Sorting data of types that are incompatible
for comparison can then be a problem.
------------------------------------------------------------
import MySQLdb

conn = MySQLdb.connect(...)
cu = conn.cursor()
cu.execute("""
CREATE TABLE bigtable (
        id INT NOT NULL AUTO_INCREMENT,
        group_id INT NULL,
        value VARCHAR(255) NOT NULL,
        PRIMARY KEY (id)
)""")
for query in (
"INSERT INTO bigtable VALUES (NULL, 50, 'i am lazy')",
"INSERT INTO bigtable VALUES (NULL, 20, 'i am very lazy')",
"INSERT INTO bigtable VALUES (NULL, 19, 'i am very very lazy')",
"INSERT INTO bigtable VALUES (NULL, 40, 'i am very very very lazy :)')",
"INSERT INTO bigtable VALUES (NULL, NULL, 'Something different')"
):
        cu.execute(query)
cu.execute("SELECT * FROM bigtable ORDER BY id")
lst = [ r for r in cu ]
# ... do something in the given order ...
# Sort by group_id (column 1), whose values mix integers and None.
lst.sort(cmp=lambda a, b: cmp(a[1], b[1]))
------------------------------------------------------------

If the records have to be sorted by the group_id column, that will
hardly be possible in Python 3000, since the numbers in the group_id
column are interleaved with None values. Of course, the list can be
filtered beforehand, but that will not be efficient, especially if
we assume that the values being sorted can have several different
incompatible types (which is already beyond the scope of this
example).
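
One way to sort such rows without filtering, assuming Python 3 semantics
in which comparing None with an integer raises TypeError, is a key
function that never lets None meet a number. The rows below only imitate
the (id, group_id, value) tuples of the example; group_key is a made-up
name.

```python
# Rows shaped like the (id, group_id, value) tuples in the example above.
rows = [
    (1, 50, "i am lazy"),
    (2, 20, "i am very lazy"),
    (3, 19, "i am very very lazy"),
    (4, 40, "i am very very very lazy :)"),
    (5, None, "Something different"),
]


def group_key(row):
    """Order rows by group_id, placing rows whose group_id is None first."""
    group_id = row[1]
    # (False, 0) sorts before every (True, n), so None is never
    # compared with an integer.
    return (group_id is not None, group_id if group_id is not None else 0)


ordered = sorted(rows, key=group_key)
print([row[0] for row in ordered])  # -> [5, 3, 2, 4, 1]
```

Putting the None rows first is a choice made for this sketch; negating
the first element of the key would place them last.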

The essence of my proposal is first to let types be checked by the
fast CPython mechanisms of the methods, and to use a fallback
comparison model only when the data types turn out to be
incompatible, that is, only if it is necessary. Having a single
exception should make it easier to write the try-except construct.
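
The try-except construct described above might look roughly like the
following in Python 3. It is a sketch under stated assumptions: no
comparison-specific exception class exists, so TypeError is caught
instead, and ordering by type name is an arbitrary fallback chosen only
for this illustration. fallback_cmp is a made-up name.

```python
from functools import cmp_to_key


def fallback_cmp(a, b):
    """Compare natively; on TypeError, fall back to ordering by type name."""
    try:
        # Fast path for types that know how to compare with each other.
        return (a > b) - (a < b)
    except TypeError:
        # Fallback model, reached only when the native comparison is refused.
        ta, tb = type(a).__name__, type(b).__name__
        return (ta > tb) - (ta < tb)


mixed = [50, None, 20, "x", 19]
result = sorted(mixed, key=cmp_to_key(fallback_cmp))
print(result)  # -> [None, 19, 20, 50, 'x']
```

Catching TypeError here has exactly the drawback raised in this message:
a TypeError caused by a bug inside a comparison method would be swallowed
by the same except clause.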

The TypeError exception class is used very widely. Comparison
functions and methods need a specialized class. After all, nobody
uses the StopIteration class to generate error messages :)

Another example is when classes from different modules are used and
have to be sorted (see the attached "example1.py").

> On 5/16/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
>> It could conveniently be used as a control-flow exception, to
>> indicate that a `.__r(eq|ne|le|lt|ge|gt|cmp)__()' method needs to
>> be invoked. A class of this kind could play a role similar to that
>> of StopIteration for `.next()'.
>
> There are no .__r(eq|ne|le|lt|ge|gt|cmp)__() methods, for a logical
> reason which you might enjoy deducing yourself...

Oops :) I was simply too hasty. I once skimmed something about
__rcmp__, and it surfaced in my memory only now, when everybody
has long since stopped thinking about it. How untimely.

>
>> At present a similar function is performed by the NotImplemented
>> exception. This exception is raised in a number of mathematical
>> operations. For this reason I ask that the creation of a new class
>> be considered.
>
> Can you explain this?  NotImplemented isn't an exception.
> (NotImplementedError is, but that's something quite different.)

I was mistaken. Sorry. It is indeed not an exception class.
> NotImplemented has exactly one purpose in Python, as far as I can
> tell.  What mathematical operations do you mean?

In multiplication/division/addition/subtraction/binary operations/etc.,
when the types are incompatible (binary_op1 in abstract.c). It is a very
universal element. The CPython API turns a returned NotImplemented into
a raised TypeError exception. Is that good?

In my opinion this can hide a programming mistake. For an example, see
the attached file example2.py.

Thanks.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: example1.py
Url: http://mail.python.org/pipermail/python-dev/attachments/20060517/e78e4455/attachment.asc 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: example2.py
Url: http://mail.python.org/pipermail/python-dev/attachments/20060517/e78e4455/attachment.pot 

From theller at python.net  Wed May 17 16:56:02 2006
From: theller at python.net (Thomas Heller)
Date: Wed, 17 May 2006 16:56:02 +0200
Subject: [Python-Dev] MSVC 6.0 patch
In-Reply-To: <BAY109-DAV67DE2A3E328C13F68400F8AA00@phx.gbl>
References: <BAY109-DAV67DE2A3E328C13F68400F8AA00@phx.gbl>
Message-ID: <e4fdi2$qrf$1@sea.gmane.org>

Luke Dunstan wrote:
> Hi,
> 
> This patch submitted in March allows Python trunk to be built using MSVC 6:
> 
> http://sourceforge.net/tracker/index.php?func=detail&aid=1457736&group_id=5470&atid=305470
> 
> It is not my patch but I am quite interested because the code changes are 
> also required to build with MS eMbedded Visual C++ 4.0, the Windows CE 4.x 
> compiler. I'm guessing that the person assigned to the patch is too busy so 
> I hope somebody else can take over?

Since nobody else replied, let me answer this:
The reason that this patch has not been applied is probably that none
of the core developers is interested in it any longer.

IMO the best would be if you integrated this patch, or the subset that you need,
into your WinCE patch.

Thomas


From dcinege-mlists-dated-1148357217.fc23eb at psychosis.com  Thu May 18 06:06:51 2006
From: dcinege-mlists-dated-1148357217.fc23eb at psychosis.com (Dave Cinege)
Date: Thu, 18 May 2006 00:06:51 -0400
Subject: [Python-Dev] New string method - splitquoted
Message-ID: <200605180006.51461.dcinege-mlists@psychosis.com>

Very often....make that very very very very very very very very very often,
I find myself processing text in python that  when .split()'ing a line, I'd 
like to exclude the split for a 'quoted' item...quoted because it contains 
whitespace or the sep char.

For example:

s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'

If I want to yank the essid in the above example, it's a pain. But with my new 
dandy split quoted method, we have a 3rd argument to .split() that we can 
spec the quote delimiter where no splitting will occur, and the quote char 
will be dropped:

	s.split(None,-1,'"')[5]
	'Spaced Out Wifi'

Attached is a proof of concept patch against 
Python-2.4.1/Objects/stringobject.c  that implements this. It is limited to 
whitespace splitting only. (sep == None)

As implemented the quote delimiter also doubles as an additional separator for 
splitting out a substr. 

For example:
	'There is"no whitespace before these"quotes'.split(None,-1,'"')
	['There', 'is', 'no whitespace before these', 'quotes']

This is useful, but possibly better put into practice as a separate method??

Comments please.

Dave
-------------- next part --------------
A non-text attachment was scrubbed...
Name: splitquoted-20060516_stringobject.c.diff
Type: text/x-diff
Size: 3169 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060518/ae6f5a63/attachment.bin 

From nnorwitz at gmail.com  Thu May 18 08:43:10 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 17 May 2006 23:43:10 -0700
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <446A4202.6070608@v.loewis.de>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
	<44699B47.4030504@gmail.com> <446A4202.6070608@v.loewis.de>
Message-ID: <ee2a432c0605172343h30b1f469ydb5dff9435ca371f@mail.gmail.com>

On 5/16/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Nick Coghlan wrote:
> > Broadening the ifdef to cover all posix systems rather than just AIX
> > might be the right thing to do.
>
> As I said: I doubt it is the right thing to do. Instead, the lock
> should get released in the child process if it is held at the
> point of the fork.
>
> I agree that we should find a repeatable test case first.

I'm not sure if this bug can be reproduced now, but the comment in
warnings.py points to:

    http://python.org/sf/683658

I didn't investigate it further.  That might allow someone to create a
reproducible test case.

n

From me+python-dev at modelnine.org  Thu May 18 09:00:00 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 18 May 2006 09:00:00 +0200
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605180006.51461.dcinege-mlists@psychosis.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
Message-ID: <200605180900.00583.me+python-dev@modelnine.org>

Am Donnerstag 18 Mai 2006 06:06 schrieb Dave Cinege:
> This is useful, but possibly better put into practice as a separate
> method??

I personally don't think it's particularly useful, at least not in the 
special case that your patch tries to address.

1) Generally, you won't only have one character that does quoting, but 
several. Think of the Python syntax, where you have ", ', """ and ''', which 
all behave slightly differently. The logic for " and ' is simple enough to 
implement (basically that's what your patch does, and I'm sure it's easy 
enough to extend it to accept a range of characters as splitters), but if you 
have more complicated quoting operators (such as """), are you sure it's 
sensible to implement the logic in split()?

2) What should the result of "this is a \"test string".split(None,-1,'"') be? 
An exception (ParseError)? Silently ignoring the missing delimiter, and 
returning ['this','is','a','test string']? Ignoring the delimiter altogether, 
returning ['this','is','a','"test','string']? I don't think there's one case 
to satisfy all here...

3) What about escapes of the delimiter? Your current patch doesn't address 
them at all (AFAICT) at the moment, but what should the escaping character 
be? Should "escape processing" take place, i.e. what should the result 
of "this is a \\\"delimiter \\test".split(None,-1,'"') be?
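
For comparison, the standard library's shlex.split() already takes a
position on points 2 and 3. The sketch below only shows what that one
function does with the two strings from those points; it says nothing
about what the proposed split() argument should do.

```python
import shlex

# Point 2: an unterminated quote is reported as an error.
try:
    shlex.split('this is a "test string')
    unbalanced = None
except ValueError as exc:
    unbalanced = str(exc)

# Point 3: outside quotes, a backslash escapes the following character.
escaped = shlex.split("this is a \\\"delimiter \\test")

print(unbalanced)
print(escaped)  # -> ['this', 'is', 'a', '"delimiter', 'test']
```

The backslash before the t of "test" is consumed as an escape, which is
shell behaviour and may not be what every input format wants.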

Don't get me wrong, I personally find this functionality very, very 
interesting (I'm +0.5 on adding it in some way or another), especially as a 
part of the standard library (not necessarily as an extension to .split()).

But there's quite a lot of semantic stuff to get right before you can 
implement it properly; see the complexity of the csv module, where you have 
to define pretty much all of this in the dialect you use to parse the csv 
file...

Why not write up a PEP?

--- Heiko.

From martin at v.loewis.de  Thu May 18 09:28:17 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 May 2006 09:28:17 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <ee2a432c0605172343h30b1f469ydb5dff9435ca371f@mail.gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>	
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>	
	<44699B47.4030504@gmail.com> <446A4202.6070608@v.loewis.de>
	<ee2a432c0605172343h30b1f469ydb5dff9435ca371f@mail.gmail.com>
Message-ID: <446C2211.5060408@v.loewis.de>

Neal Norwitz wrote:
>> As I said: I doubt it is the right thing to do. Instead, the lock
>> should get released in the child process if it is held at the
>> point of the fork.
>>
> I'm not sure if this bug can be reproduced now, but the comment in
> warnings.py points to:
> 
>    http://python.org/sf/683658
> 
> I didn't investigate it further.  That might allow someone to create a
> reproducible test case.

That's a different problem, though. Here, the main thread waits for a
child *thread* while holding the import lock. I don't quite understand
the test case given, and I can't make it deadlock anymore (probably
because of Just's fix).

Regards,
Martin

From nnorwitz at gmail.com  Thu May 18 09:33:32 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 18 May 2006 00:33:32 -0700
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605180006.51461.dcinege-mlists@psychosis.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
Message-ID: <ee2a432c0605180033n7cf4a681i4f896405978a2e97@mail.gmail.com>

On 5/17/06, Dave Cinege
<dcinege-mlists-dated-1148357217.fc23eb at psychosis.com> wrote:
> Very often....make that very very very very very very very very very often,
> I find myself processing text in python that  when .split()'ing a line, I'd
> like to exclude the split for a 'quoted' item...quoted because it contains
> whitespace or the sep char.
>
> For example:
>
> s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
>
> If I want to yank the essid in the above example, it's a pain. But with my new
> dandy split quoted method, we have a 3rd argument to .split() that we can
> spec the quote delimiter where no splitting will occur, and the quote char
> will be dropped:
>
>         s.split(None,-1,'"')[5]
>         'Spaced Out Wifi'
>
> Attached is a proof of concept patch against
> Python-2.4.1/Objects/stringobject.c  that implements this. It is limited to
> whitespace splitting only. (sep == None)
>
> As implemented the quote delimiter also doubles as an additional separator for
> splitting out a substr.
>
> For example:
>         'There is"no whitespace before these"quotes'.split(None,-1,'"')
>         ['There', 'is', 'no whitespace before these', 'quotes']
>
> This is useful, but possibly better put into practice as a separate method??
>
> Comments please.

What's wrong with:  re.findall(r'"[^"]*"|[^"\s]+', s)

YMMV,
n
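
Applied to the string from the original message, the regular expression
above behaves as sketched here. Note that quoted matches keep their quote
characters, so a strip() step is added; the variable names are chosen for
this illustration only.

```python
import re

s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'

tokens = re.findall(r'"[^"]*"|[^"\s]+', s)
# Quoted matches keep their quote characters; strip them afterwards.
fields = [t.strip('"') for t in tokens]

print(tokens[5])  # -> "Spaced Out Wifi"
print(fields[5])  # -> Spaced Out Wifi
```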

From rasky at develer.com  Thu May 18 10:21:07 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 18 May 2006 10:21:07 +0200
Subject: [Python-Dev] New string method - splitquoted
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605180900.00583.me+python-dev@modelnine.org>
Message-ID: <0f0c01c67a54$0371f7c0$b14c2a97@bagio>

Heiko Wundram <me+python-dev at modelnine.org> wrote:

> Don't get me wrong, I personally find this functionality very, very
> interesting (I'm +0.5 on adding it in some way or another),
> especially as a
> part of the standard library (not necessarily as an extension to
> .split()).


It's already there. It's called shlex.split(), and follows the semantics of a
standard UNIX shell, including escaping and other things.

>>> import shlex
>>> shlex.split(r"""Hey I\'m a "bad guy" for you""")
['Hey', "I'm", 'a', 'bad guy', 'for', 'you']

Giovanni Bajo
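
Run on the two strings from the original proposal, shlex.split() gives
the results sketched below. The first matches the proposed behaviour; the
second does not, because a quote in the middle of a word is not treated
as a separator.

```python
import shlex

s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
essid = shlex.split(s)[5]

# Unlike the proposed patch, a quote in the middle of a word does not
# start a new token; the quoted text is joined to its neighbours.
joined = shlex.split('There is"no whitespace before these"quotes')

print(essid)   # -> Spaced Out Wifi
print(joined)  # -> ['There', 'isno whitespace before thesequotes']
```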


From me+python-dev at modelnine.org  Thu May 18 11:50:50 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 18 May 2006 11:50:50 +0200
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <0f0c01c67a54$0371f7c0$b14c2a97@bagio>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605180900.00583.me+python-dev@modelnine.org>
	<0f0c01c67a54$0371f7c0$b14c2a97@bagio>
Message-ID: <200605181150.50902.me+python-dev@modelnine.org>

Am Donnerstag 18 Mai 2006 10:21 schrieb Giovanni Bajo:
> Heiko Wundram <me+python-dev at modelnine.org> wrote:
> > Don't get me wrong, I personally find this functionality very, very
> > interesting (I'm +0.5 on adding it in some way or another),
> > especially as a
> > part of the standard library (not necessarily as an extension to
> > .split()).
>
> It's already there. It's called shlex.split(), and follows the semantics of
> a standard UNIX shell, including escaping and other things.

I knew about *nix shell escaping, but that isn't necessarily what I find in 
input I have to process (although generally it's what you see, yeah). That's 
why I said that it would be interesting to have a generalized method, sort of 
like the csv module but only for string "interpretation", which takes a 
dialect, and parses a string for the specified dialect.

Remember, there's also escaping by doubling the end of string marker (for 
example, '""this is not a single argument""'.split() should be parsed as 
['"this','is','not','a',....]), and I know programs that use exactly this 
format for file storage.

Maybe, one could simply export the function the csv module uses to parse the 
actual data fields as a more prominent method, which accepts keyword 
arguments, instead of a Dialect-derived class.

--- Heiko.
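
As a point of reference, csv.reader() accepts the format parameters of a
dialect as keyword arguments, so a single line can be parsed without
defining a Dialect subclass. The input line below is made up for this
sketch, with single spaces between fields and a doubled quote inside the
quoted field.

```python
import csv

line = 'Chan: 11 ESSID: "Spaced ""Out"" Wifi" Enc: On'

# Format parameters are passed as keyword arguments; no Dialect
# subclass is defined.
fields = next(csv.reader([line], delimiter=" ", quotechar='"', doublequote=True))

print(fields)  # -> ['Chan:', '11', 'ESSID:', 'Spaced "Out" Wifi', 'Enc:', 'On']
```

With these settings a run of several spaces yields empty fields, which
differs from whitespace splitting in str.split().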

From rasky at develer.com  Thu May 18 12:26:05 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 18 May 2006 12:26:05 +0200
Subject: [Python-Dev] New string method - splitquoted
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605180900.00583.me+python-dev@modelnine.org>
	<0f0c01c67a54$0371f7c0$b14c2a97@bagio>
	<200605181150.50902.me+python-dev@modelnine.org>
Message-ID: <109801c67a65$78958bf0$b14c2a97@bagio>

Heiko Wundram <me+python-dev at modelnine.org> wrote:

>>> Don't get me wrong, I personally find this functionality very, very
>>> interesting (I'm +0.5 on adding it in some way or another),
>>> especially as a
>>> part of the standard library (not necessarily as an extension to
>>> .split()).
>>
>> It's already there. It's called shlex.split(), and follows the
>> semantics of a standard UNIX shell, including escaping and other
>> things.
>
> I knew about *nix shell escaping, but that isn't necessarily what I
> find in input I have to process (although generally it's what you
> see, yeah). That's why I said that it would be interesting to have a
> generalized method, sort of like the csv module but only for string
> "interpretation", which takes a dialect, and parses a string for the
> specified dialect.
>
> Remember, there also escaping by doubling the end of string marker
> (for example, '""this is not a single argument""'.split() should be
> parsed as ['"this','is','not','a',....]), and I know programs that
> use exactly this format for file storage.

I have never met this one. Anyway, I don't think it's harder than:

>>> import shlex
>>> def mysplit(s):
...     """Allow a doubled double quote to escape a quote"""
...     return shlex.split(s.replace(r'""', r'\"'))
...
>>> mysplit('""This is not a single argument""')
['"This', 'is', 'not', 'a', 'single', 'argument"']


> Maybe, one could simply export the function the csv module uses to
> parse the actual data fields as a more prominent method, which
> accepts keyword arguments, instead of a Dialect-derived class.


I think you're over-generalizing a very simple problem. I believe that
str.split, shlex.split, and some simple variation like the one above (maybe
using regular expressions to do the substitution if you have slightly more
complex cases) can handle 99.99% of the splitting cases. They surely handle
100% of those I myself had to parse.

I believe the standard library already covers common usage. There will surely
be cases where a custom lexer/splitter will have to be written, but that's life
:)

Giovanni Bajo


From me+python-dev at modelnine.org  Thu May 18 12:38:29 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 18 May 2006 12:38:29 +0200
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <109801c67a65$78958bf0$b14c2a97@bagio>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605181150.50902.me+python-dev@modelnine.org>
	<109801c67a65$78958bf0$b14c2a97@bagio>
Message-ID: <200605181238.29969.me+python-dev@modelnine.org>

On Thursday 18 May 2006 12:26, Giovanni Bajo wrote:
> I believe the standard library already covers common usage. There will
> surely be cases where a custom lexer/splitter will have to be written, but
> that's life

The csv data field parser handles all common usage I have encountered so far, 
yes. ;-) But, generally, you can't (easily) get at the method that parses a 
data field directly, that's why I proposed to publish that method with 
keyword arguments. (actually, I've only tried getting at it when the csv 
module was still plain-python, I wouldn't even know whether the "method" is 
exported now that the module is written in C).

I've had the need to write a custom lexer time and again, and generally, I'd 
love to have a little more general string interpretation facility available 
to spare me from writing a state automaton... But as I said before, 
the "simple" patch that was proposed here won't do for my case. But I don't 
know if it's worth the trouble to actually write a more general version, 
because there are quite a few different pitfalls that have to be overcome... I 
still remain +0.5 for adding something like this to the stdlib, but only if 
it's general enough that it can handle all cases the csv module can 
handle.

--- Heiko.

From ncoghlan at gmail.com  Thu May 18 14:59:53 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 18 May 2006 22:59:53 +1000
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <446C2211.5060408@v.loewis.de>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>	
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>	
	<44699B47.4030504@gmail.com> <446A4202.6070608@v.loewis.de>
	<ee2a432c0605172343h30b1f469ydb5dff9435ca371f@mail.gmail.com>
	<446C2211.5060408@v.loewis.de>
Message-ID: <446C6FC9.5060205@gmail.com>

Martin v. Löwis wrote:
> Neal Norwitz wrote:
>>> As I said: I doubt it is the right thing to do. Instead, the lock
>>> should get released in the child process if it is held at the
>>> point of the fork.
>>>
>> I'm not sure if this bug can be reproduced now, but the comment in
>> warnings.py points to:
>>
>>    http://python.org/sf/683658
>>
>> I didn't investigate it further.  That might allow someone to create a
>> reproducible test case.
> 
> That's a different problem, though. Here, the main thread waits for a
> child *thread* while holding the import lock. I don't quite understand
> the test case given, and I can't make it deadlock anymore (probably
> because of Just's fix).

And if I understand it correctly, it falls under the category that waiting for 
another thread while holding the import lock is a *really* bad idea from a 
thread safety point of view.

The thing with the import-after-fork deadlock is that you can trigger it 
without even doing anything that's known not to be thread-safe.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From ncoghlan at gmail.com  Thu May 18 15:20:52 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 18 May 2006 23:20:52 +1000
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605180006.51461.dcinege-mlists@psychosis.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
Message-ID: <446C74B4.6030506@gmail.com>

Dave Cinege wrote:
> Very often....make that very very very very very very very very very often,
> I find myself processing text in python that  when .split()'ing a line, I'd 
> like to exclude the split for a 'quoted' item...quoted because it contains 
> whitespace or the sep char.
> 
> For example:
> 
> s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'

Even if you don't like Neal's more efficient regex-based version, the 
necessary utility function to do a two-pass split operation really isn't that 
tricky:

def split_quoted(text, sep=None, quote='"'):
     sections = text.split(quote)
     result = []
     for idx, unquoted_text in enumerate(sections[::2]):
         result.extend(unquoted_text.split(sep))
         quoted = 2*idx+1
         quoted_text = sections[quoted:quoted+1]
         result.extend(quoted_text)
     return result

 >>> split_quoted('      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On')
['Chan:', '11', 'SNR:', '22', 'ESSID:', 'Spaced Out Wifi', 'Enc:', 'On']

Given that this function (or a regex based equivalent) is easy enough to add 
if you do need it, I don't find the idea of increasing the complexity of the 
basic split API particularly compelling.
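
For comparison, a regex-based equivalent might look like the sketch below. The
name split_quoted_re and the pattern are assumptions made for illustration
(this is not Neal's version). It only handles whitespace separators and a
double-quote quote character, and a stray unmatched quote is silently skipped.

```python
import re

# Either a double-quoted run (group 1) or a run of characters that are
# neither whitespace nor a double quote (group 2).
_TOKEN = re.compile(r'"([^"]*)"|([^\s"]+)')


def split_quoted_re(text):
    """Split on whitespace, keeping double-quoted runs together."""
    tokens = []
    for match in _TOKEN.finditer(text):
        quoted, bare = match.groups()
        # group 1 is None when the bare alternative matched; an empty
        # quoted string ("") still counts as a token.
        tokens.append(quoted if quoted is not None else bare)
    return tokens


print(split_quoted_re('      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'))
```

Like the two-pass version, this treats the quote character as an extra
separator, so 'is"no whitespace"quotes' yields three tokens.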

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From guido at python.org  Thu May 18 17:11:31 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 18 May 2006 08:11:31 -0700
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605180006.51461.dcinege-mlists@psychosis.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
Message-ID: <ca471dc20605180811m1cec4860pab52731352df57da@mail.gmail.com>

This is not an appropriate function to add as a string method. There
are too many conventions for quoting and too many details to get
right. One method can't possibly handle them all without an enormous
number of weird options. It's better to figure out how to do this with
regexps or use some of the other approaches that have been suggested.
(Did anyone mention the csv module yet? It deals with this too.)
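
A minimal sketch of the csv route, assuming the input is space-separated and
uses double quotes. The variable names are illustrative only, and the dialect
options were picked for this particular example line rather than as a general
recipe.

```python
import csv

line = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'

# delimiter=' ' splits on single spaces; skipinitialspace=True swallows the
# extra spaces that would otherwise show up as empty fields.
fields = next(csv.reader([line], delimiter=' ', quotechar='"',
                         skipinitialspace=True))
print(fields[5])

# Inside a quoted field, csv also understands the doubled-quote escape.
doubled = next(csv.reader(['"say ""hi"" twice" next'], delimiter=' '))
print(doubled)
```

Trailing spaces on the line would still produce an empty last field, so real
input may need an rstrip() first.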

--Guido

On 5/17/06, Dave Cinege
<dcinege-mlists-dated-1148357217.fc23eb at psychosis.com> wrote:
> Very often....make that very very very very very very very very very often,
> I find myself processing text in python that  when .split()'ing a line, I'd
> like to exclude the split for a 'quoted' item...quoted because it contains
> whitespace or the sep char.
>
> For example:
>
> s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
>
> If I want to yank the essid in the above example, it's a pain. But with my new
> dandy split quoted method, we have a 3rd argument to .split() that we can
> spec the quote delimiter where no splitting will occur, and the quote char
> will be dropped:
>
>         s.split(None,-1,'"')[5]
>         'Spaced Out Wifi'
>
> Attached is a proof of concept patch against
> Python-2.4.1/Objects/stringobject.c  that implements this. It is limited to
> whitespace splitting only. (sep == None)
>
> As implemented, the quote delimiter also doubles as an additional separator
> for splitting out a substr.
>
> For example:
>         'There is"no whitespace before these"quotes'.split(None,-1,'"')
>         ['There', 'is', 'no whitespace before these', 'quotes']
>
> This is useful, but possibly better put into practice as a separate method??
>
> Comments please.
>
> Dave


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ronaldoussoren at mac.com  Thu May 18 17:18:05 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 18 May 2006 17:18:05 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <446A4202.6070608@v.loewis.de>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>
	<44699B47.4030504@gmail.com> <446A4202.6070608@v.loewis.de>
Message-ID: <1DF6B4D3-A20C-4402-AA76-40EBB6B951B8@mac.com>


On 16 May 2006, at 23:20, Martin v. Löwis wrote:

> Nick Coghlan wrote:
>> Broadening the ifdef to cover all posix systems rather than just AIX
>> might be the right thing to do.
>
> As I said: I doubt it is the right thing to do. Instead, the lock
> should get released in the child process if it is held at the
> point of the fork.
>
> I agree that we should find a repeatable test case first.
> Contributions are welcome.

As I understand it, the problem is that when a thread other than
the thread calling fork is holding the import lock, the child process
ends up with an import lock that is locked by a thread that no longer
exists.

The attached script reproduces the problem for me. It creates one
thread that imports a module and another thread that forks. Run
'forkimport.py' to demonstrate the problem. If you set 'gHang' at the
start of the script to 0, the child process doesn't perform an import
and terminates normally.

Wouldn't this specific problem be fixed if os.fork were to acquire  
the import lock before calling fork(2)?

Ronald
-------------- next part --------------
A non-text attachment was scrubbed...
Name: forklock.tgz
Type: application/octet-stream
Size: 544 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060518/cfd0407e/attachment.obj 
-------------- next part --------------


>
> Regards,
> Martin


From me+python-dev at modelnine.org  Thu May 18 18:19:37 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Thu, 18 May 2006 18:19:37 +0200
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <ca471dc20605180811m1cec4860pab52731352df57da@mail.gmail.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<ca471dc20605180811m1cec4860pab52731352df57da@mail.gmail.com>
Message-ID: <200605181819.38110.me+python-dev@modelnine.org>

On Thursday 18 May 2006 17:11, Guido van Rossum wrote:
> (Did anyone mention the csv module yet? It deals with this too.)

Yes, mentioned it thrice. ;-)

--- Heiko.

From martin at v.loewis.de  Thu May 18 20:02:06 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 May 2006 20:02:06 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <446C6FC9.5060205@gmail.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	
	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>	
	<44684958.9010206@gmail.com> <446973F0.3010007@v.loewis.de>	
	<44699B47.4030504@gmail.com> <446A4202.6070608@v.loewis.de>
	<ee2a432c0605172343h30b1f469ydb5dff9435ca371f@mail.gmail.com>
	<446C2211.5060408@v.loewis.de> <446C6FC9.5060205@gmail.com>
Message-ID: <446CB69E.4090609@v.loewis.de>

Nick Coghlan wrote:
> And if I understand it correctly, it falls under the category that
> waiting for another thread while holding the import lock is a *really*
> bad idea from a thread safety point of view.
> 
> The thing with the import-after-fork deadlock is that you can trigger it
> without even doing anything that's known not to be thread-safe.

Right. With some googling, I found that one solution is pthread_atfork:
a pthread_atfork handler is a triple (before, in_parent, in_child).
You set it to (acquire, release, release). When somebody forks,
the pthread library will first acquire the import lock in the thread
that wants to fork. Then the fork occurs, and the import lock is
then released both in the parent and in the child.

I would like to see this approach implemented, but I agree with you
that a test case should be created first.

Regards,
Martin

From dcinege-mlists-dated-1148408503.c286b8 at psychosis.com  Thu May 18 20:21:34 2006
From: dcinege-mlists-dated-1148408503.c286b8 at psychosis.com (Dave Cinege)
Date: Thu, 18 May 2006 14:21:34 -0400
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605180900.00583.me+python-dev@modelnine.org>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605180900.00583.me+python-dev@modelnine.org>
Message-ID: <200605181421.34344.dcinege-mlists@psychosis.com>

On Thursday 18 May 2006 03:00, Heiko Wundram wrote:
> On Thursday 18 May 2006 06:06, Dave Cinege wrote:
> > This is useful, but possibly better put into practice as a separate
> > method??
>
> I personally don't think it's particularly useful, at least not in the
> special case that your patch tries to address.

Well I'm thinking along the lines of a method to extract only quoted substr's:
' this is "something" and"nothing else"but junk'.splitout('"')
['something', 'nothing else']

Useful? I dunno....
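
For what it's worth, that particular extraction can already be sketched with
the re module. The function name splitout is borrowed from the example above
and is purely hypothetical; it handles a single quote character and no
escapes.

```python
import re


def splitout(text, quote='"'):
    """Return only the substrings enclosed in a pair of `quote` characters."""
    q = re.escape(quote)
    # Capture everything between one quote and the next one.
    return re.findall(q + '([^' + q + ']*)' + q, text)


print(splitout(' this is "something" and"nothing else"but junk'))
```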

> splitters), but if you have more complicated quoting operators (such as
> """), are you sure it's sensible to implement the logic in split()?

Probably not. See below...

> 2) What should the result of "this is a \"test string".split(None,-1,'"')
> be? An exception (ParseError)?

I'd probably vote for that. However my current patch will simply play dumb and stop split'ing the rest of the line, dropping the first quote.

'this is a "test string'.split(None,-1,'"')
['this', 'is', 'a', 'test string']

> Silently ignoring the missing delimiter, and 
> returning ['this','is','a','test string']? Ignoring the delimiter
> altogether, returning ['this','is','a','"test','string']? I don't think
> there's one case to satisfy all here...

Well the point to the patch is a KISS approach to extending the split() method just slightly to exclude a range of substr from split'ing by delimiter, not to engage in further text processing. 

I'm dealing with this ALL the time while processing output from other programs: (Windope) filenames, (poorly considered) wifi network names, etc. For me it's always some element with whitespace in it and double quotes surrounding it; without the whitespace I could just use a slice to dump the quotes for the needed element:

'filename: "/root/tmp.txt"'.split()[1] [1:-1]
'/root/tmp.txt'
OK

'filename: "/root/is a bit slow.txt"'.split()[1] [1:-1]
'/root/i'
NOT OK

This exact bug just zapped me in a product I have; I didn't foresee whitespace turning up in that element.....

Thus my patch:
'filename: "/root/is a bit slow.txt"'.split(None,-1,'"')[1]
'/root/is a bit slow.txt'
LIFE IS GOOD

> 3) What about escapes of the delimiter? Your current patch doesn't address
> them at all (AFAICT) at the moment, 

And it wouldn't, just like the current split doesn't.
'this is a \ test string'.split()
['this', 'is', 'a', '\\', 'test', 'string']

> Don't get me wrong, I personally find this functionality very, very
> interesting (I'm +0.5 on adding it in some way or another), especially as a
> part of the standard library (not necessarily as an extension to .split()).

I'd be happy to have this in as .splitquoted(), but once you use it, it seems more to me like a natural 'ought to be there' extension to split itself.
>
> Why not write up a PEP?

Because I have no idea of the procedure.   : )  URL?

Dave

From dcinege-mlists-dated-1148408543.14cf06 at psychosis.com  Thu May 18 20:22:17 2006
From: dcinege-mlists-dated-1148408543.14cf06 at psychosis.com (Dave Cinege)
Date: Thu, 18 May 2006 14:22:17 -0400
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <0f0c01c67a54$0371f7c0$b14c2a97@bagio>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605180900.00583.me+python-dev@modelnine.org>
	<0f0c01c67a54$0371f7c0$b14c2a97@bagio>
Message-ID: <200605181422.17307.dcinege-mlists@psychosis.com>

On Thursday 18 May 2006 04:21, Giovanni Bajo wrote:

> It's already there. It's called shlex.split(), and follows the semantic of
> a standard UNIX shell, including escaping and other things.

Not quite. As I said in my other post, simple is the idea for this, just like 
the split method itself.  (no escaping, etc.....just recognizing delimiters 
as an exception to the split separation) 

shlex.split() does not let one choose the separator or use a maxsplit, nor is 
it a pure method to strings.
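
For reference, the separator can be changed by going through the shlex.shlex
class instead of shlex.split(). A minimal sketch follows; the helper name
split_on is hypothetical, and there is still no maxsplit equivalent.

```python
import shlex


def split_on(text, sep):
    """Split `text` on the characters in `sep`, honouring shell-style quotes."""
    lexer = shlex.shlex(text, posix=True)
    lexer.whitespace = sep          # characters that separate tokens
    lexer.whitespace_split = True   # split only on those characters
    lexer.commenters = ''           # do not treat '#' as a comment start
    return list(lexer)


print(split_on('a,"b,c",d', ','))
```

Backslashes and single quotes keep their shell meaning here, so this is not
the "no escaping" behaviour described above.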

Dave


From martin at v.loewis.de  Thu May 18 20:24:33 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 18 May 2006 20:24:33 +0200
Subject: [Python-Dev] pthreads, fork, import, and execvp
In-Reply-To: <1DF6B4D3-A20C-4402-AA76-40EBB6B951B8@mac.com>
References: <d2115f4f0605090639h72570784k1f4ef0802b5a8d51@mail.gmail.com>	<ca471dc20605112148s8414d85j6c14bf2ed622cccd@mail.gmail.com>	<44684958.9010206@gmail.com>
	<446973F0.3010007@v.loewis.de>	<44699B47.4030504@gmail.com>
	<446A4202.6070608@v.loewis.de>
	<1DF6B4D3-A20C-4402-AA76-40EBB6B951B8@mac.com>
Message-ID: <446CBBE1.1090803@v.loewis.de>

Ronald Oussoren wrote:
> Wouldn't this specific problem be fixed if os.fork were to acquire the
> import lock before calling fork(2)?

Right. Of course, you need to release the lock then both in the parent
and the child. Also, all places that currently do PyOS_AfterFork should
get modified (i.e. fork1, and forkpty also).

Regards,
Martin

From rasky at develer.com  Thu May 18 20:30:46 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Thu, 18 May 2006 20:30:46 +0200
Subject: [Python-Dev] New string method - splitquoted
References: <200605180006.51461.dcinege-mlists@psychosis.com><200605180900.00583.me+python-dev@modelnine.org><0f0c01c67a54$0371f7c0$b14c2a97@bagio>
	<200605181422.17307.dcinege-mlists@psychosis.com>
Message-ID: <056f01c67aa9$2dd25aa0$bf03030a@trilan>

Dave Cinege wrote:

>> It's already there. It's called shlex.split(), and follows the
>> semantics of a standard UNIX shell, including escaping and other
>> things.
>
> Not quite. As I said in my other post, simple is the idea for this,
> just like the split method itself.  (no escaping, etc.....just
> recognizing delimiters as an exception to the split separation)

And what's the actual problem? You either have a syntax which does not
support escaping or one that does. If it can't be escaped, there won't be
any weird characters in the way, and shlex.split() will do it. If it does
support escaping in a decent way, you can either use shlex.split() directly
or modify the string before (like I've shown in the other message). In any
case, you get your job done.

Do you have any real-world case where you are still not able to split a
string? And if you do, are they really so many to warrant a place in the
standard library? As I said before, I think that split() and shlex.split()
cover the majority of real world usage cases.
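
As a quick check, here is shlex.split() applied to the sample strings posted
earlier in this thread. The variable names are illustrative. One caveat worth
hedging: shlex also gives meaning to backslashes and single quotes, so input
such as an ESSID containing an apostrophe raises ValueError instead of being
split.

```python
import shlex

wifi = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
path = 'filename: "/root/is a bit slow.txt"'

essid = shlex.split(wifi)[5]
filename = shlex.split(path)[1]
print(essid)
print(filename)

# An unbalanced single quote is treated as an unterminated quoted string.
try:
    shlex.split("ESSID: Dave's Wifi")
    apostrophe_ok = True
except ValueError:
    apostrophe_ok = False
print(apostrophe_ok)
```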

> shlex.split() does not let one choose the separator
> or use a maxsplit

Real-world use case? Show me what you need to parse, and I assume this weird
format is generated by a program you have not written yourself (or you could
just change it to generate a more standard and simple format!)

> , nor is it a pure method to strings.

This is a totally different problem. It doesn't make it less useful, nor does
it create a need for adding a new method to the string.
-- 
Giovanni Bajo


From jason.orendorff at gmail.com  Thu May 18 21:34:30 2006
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Thu, 18 May 2006 15:34:30 -0400
Subject: [Python-Dev] total ordering.
In-Reply-To: <446B11DA.1080201@renet.ru>
References: <20060505105630.67BE.JCARLSON@uci.edu> <445C67EE.1030704@renet.ru>
	<20060506025902.67DA.JCARLSON@uci.edu>
	<20060506205924.GA14684@fox.renet.ru>
	<ca471dc20605070825k24c81178p1a2fb1123211f98d@mail.gmail.com>
	<44632866.4090906@renet.ru>
	<ca471dc20605110736n588f7548wf9cea8d5f79e3564@mail.gmail.com>
	<4469F99C.6090305@renet.ru>
	<bb8868b90605161016kda546eet2a4e79da8d5837a5@mail.gmail.com>
	<446B11DA.1080201@renet.ru>
Message-ID: <bb8868b90605181234sb9ddd3er2170339f5694125@mail.gmail.com>

Vladimir,

Your examples seem to indicate that you've misunderstood the change
that's proposed for Python 3000.  Especially this:

On 5/17/06, Vladimir 'Yu' Stepanov <vys at renet.ru> wrote:
>                 # BEGIN: Emulation python3000
>                 if type(a) is not type(b) and (
>                                 not operator.isNumberType(a) or
>                                 not operator.isNumberType(b)
>                         ):
>                         raise TypeError("python3000: not-comparable types", (a,b))
>                 # END: Emulation python3000

Python 3000 will not do anything like this.  It'll try a.__cmp__(b),
and failing that b.__cmp__(a) (but imagine this using tp_ slots
instead of actual Python method calls), and if both return
NotImplemented, it'll throw a TypeError (rather than guess, which is
what it does now).
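
Python 3 as eventually released has no __cmp__, but the same rule (try one
side, then the other, then raise TypeError) applies to the rich comparison
methods. A minimal sketch with two made-up classes, Meters and Label:

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # Decline to compare with unrelated types instead of guessing.
        if not isinstance(other, Meters):
            return NotImplemented
        return self.value < other.value


class Label:
    """Defines no ordering methods at all."""


def can_order(a, b):
    """Return True if `a < b` is defined, False if it raises TypeError."""
    try:
        a < b
        return True
    except TypeError:
        return False


print(can_order(Meters(1), Meters(2)))   # same type: comparison is defined
print(can_order(Meters(1), Label()))     # both sides decline: TypeError
```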

There's a lot of legacy oddness in PyObject_RichCompare() and its many
helper functions; presumably they'll delete some of that, but it won't
be anything you care about.

Comparison with None should also continue to work as it does now,
unless I missed something.

-j

From pedronis at strakt.com  Fri May 19 00:28:40 2006
From: pedronis at strakt.com (Samuele Pedroni)
Date: Fri, 19 May 2006 00:28:40 +0200
Subject: [Python-Dev] Reminder: call for proposals "Python Language and
 Libraries Track" for Europython 2006
Message-ID: <446CF518.6040404@strakt.com>

Registration for Europython (3-5 July) at CERN in Geneva is now open.
If you feel like submitting a talk proposal, there's still time until
the 31st of May.

If you want to talk about a library you developed, or you know well
and want to share your knowledge, or about how you are making the best
out of Python through inventive/elegant idioms and patterns (or if you
are a language guru willing to disseminate your wisdom),

you can submit a proposal for the Python Language and Libraries
track

"""

A track about Python the Language, all batteries included. Talks about
the language, language evolution, patterns and idioms, implementations
(CPython, IronPython, Jython, PyPy ...) and implementation issues belong
to the track. So do talks about the standard library or interesting
3rd-party libraries (and frameworks), unless the gravitational pull of
other tracks is stronger.
"""

The full call and submission links are at:

http://www.europython.org/sections/tracks_and_talks/call-for-proposals

Samuele Pedroni, Python Language and Libraries Track Chair


From nnorwitz at gmail.com  Fri May 19 07:06:56 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 18 May 2006 22:06:56 -0700
Subject: [Python-Dev] 2.5 schedule
Message-ID: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>

I moved up the 2.5 release date by a bit.  Ideally, I wanted to have
the final on July 27 which corresponds to OSCON.  This seems too
aggressive since I haven't updated the schedule publicly.  So the new
schedule has rc1 slated for July 27.

    http://www.python.org/dev/peps/pep-0356/

So far we've only had very few 2.5 bug reports for the first 2 alphas.
 Hopefully people really are testing. :-)

I'll continue to bug people that have action items.

n

From guido at python.org  Fri May 19 08:37:49 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 18 May 2006 23:37:49 -0700
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <445D5357.9060409@acm.org>
References: <445D5357.9060409@acm.org>
Message-ID: <ca471dc20605182337h66c8ce61o867a137db768b710@mail.gmail.com>

On 5/6/06, Talin <talin at acm.org> wrote:
> I've updated PEP 3101 based on the feedback collected so far.
[http://www.python.org/dev/peps/pep-3101/]

I think this is a step in the right direction.

I wonder if we shouldn't borrow more from .NET. I read this URL that
you referenced:

http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp

They have special syntax to support field width, e.g. {0,10} formats
item 0 in a field of (at least) 10 positions wide, right-justified;
{0,-10} does the same left-aligned. This is done independently from
the type-specific formatting. (I'm not proposing that we use .NET's
format specifiers after the colon, but I'm also no big fan of keeping
the C-specific stuff we have now; we should put some work in designing
something with the same power as the current %-based system for floats
and ints, that would cover it.)

.NET's solution for quoting { and } as {{ and }} respectively also
sidesteps the issue of how to quote \ itself -- since '\\{' is a
2-char string containing one \ and one {, you'd have to write either
'\\\\{0}' or r'\\{0}' to produce a single literal \ followed by
formatted item 0. Any time there's the need to quadruple a backslash I
think we've lost the battle. (Or you might search the web for Tcl
quoting hell. :-)
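
The doubled-brace escape can be tried out with str.format as it eventually
shipped, which adopted this rule. A small sketch; the variable names are
illustrative only:

```python
# '{{' and '}}' produce literal braces; '{0}' is a replacement field.
braced = '{{{0}}}'.format(5)
mixed = '{{literal}} {0}'.format('x')

# A backslash has no special meaning to format(), so a raw string with a
# single backslash is enough to get one literal backslash in the output.
backslash = r'\{0}'.format('x')

print(braced)
print(mixed)
print(backslash)
```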

I'm fine with not having a solution for doing variable substitution
within the format parameters. That could be done instead by building
up the format string with an extra formatting step: instead of
"{x:{y}}".format(x=whatever, y=3) you could write
"{{x,{y}}}".format(y=3).format(x=whatever). (Note that this is subtle:
the final }}} are parsed as } followed by }}. Once the parser has seen
a single {, the first } it sees is the matching closing } and adding
another } after it won't affect it. The specifier cannot contain { or
} at all.)
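
Both spellings can be compared using str.format as it eventually shipped. Two
assumptions in this sketch: it uses ':' for the width rather than the
.NET-style ',' discussed here, and it relies on the released implementation
accepting a nested field inside the specifier, which goes beyond what this
message asks for.

```python
value = 42

# Two-step version: the first pass fills in the width and turns the doubled
# braces into single ones, leaving an ordinary template for the second pass.
template = '{{x:{y}}}'.format(y=3)
two_step = template.format(x=value)

# One-step version with a nested replacement field in the specifier.
one_step = '{x:{y}}'.format(x=value, y=3)

print(repr(template))
print(repr(two_step), repr(one_step))
```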

I like having a way to reuse the format parsing code while
substituting something else for the formatting itself.

The PEP appears silent on what happens if there are too few or too
many positional arguments, or if there are missing or unused keywords.
Missing ones should be errors; I'm not sure about redundant (unused)
ones. On the one hand complaining about those gives us more certainty
that the format string is correct. On the other hand there are some
use cases for passing lots of keyword parameters (e.g. simple web
templating could pass a fixed set of variables using **dict). Even in
i18n (translation) apps I could see the usefulness of allowing unused
parameters.
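
For reference, str.format as it eventually shipped treats missing arguments as
errors and silently ignores unused ones. The helper try_format below is
hypothetical and exists only to make the outcomes easy to print:

```python
def try_format(template, *args, **kwargs):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return template.format(*args, **kwargs)
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


print(try_format('{0} {1}', 'only one'))        # too few positional args
print(try_format('{name}'))                     # missing keyword
print(try_format('{0}', 'a', 'b'))              # extra positional arg
print(try_format('{user}', **{'user': 'guido', 'unused': 1}))
```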

On the issue of {a.b.c}: like several correspondents, I don't like the
ambiguity of attribute vs. key refs much, even though it appears
useful enough in practice in web frameworks I've used. It seems to
violate the Zen of Python: "In the face of ambiguity, refuse the
temptation to guess."

Unfortunately I'm pretty lukewarm about the proposal to support
{a[b].c} since b is not a variable reference but a literal string 'b'.
It is also relatively cumbersome to parse. I wish I could propose
{a+b.c} for this case but that's so arbitrary...
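
A small illustration of the {a[b].c} spelling, using str.format as it
eventually shipped, where attribute and key access are kept distinct. The
object built here is made up for the example.

```python
from types import SimpleNamespace

# 'b' inside the brackets is the literal string key 'b', not a variable.
a = {'b': SimpleNamespace(c='value')}
by_key = '{a[b].c}'.format(a=a)
print(by_key)

# Attribute syntax does not fall back to key lookup.
try:
    '{a.b}'.format(a=a)
    attr_fallback = True
except AttributeError:
    attr_fallback = False
print(attr_fallback)
```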

Even more unfortunately, I expect that dict key access is a pretty
important use case so we'll have to address it somehow. I *don't*
think there's an important use case for the ambiguity -- in any
particular situation I expect that the programmer will know whether
they are expecting a dict or an object with attributes.

Hm, perhaps {a at b.c} might work? It's not an existing binary operator.
Or perhaps # or !.

It's too late to think straight so this will have to be continued...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From talin at acm.org  Fri May 19 10:41:33 2006
From: talin at acm.org (Talin)
Date: Fri, 19 May 2006 01:41:33 -0700
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <ca471dc20605182337h66c8ce61o867a137db768b710@mail.gmail.com>
References: <445D5357.9060409@acm.org>
	<ca471dc20605182337h66c8ce61o867a137db768b710@mail.gmail.com>
Message-ID: <446D84BD.8080208@acm.org>

Guido van Rossum wrote:
> On 5/6/06, Talin <talin at acm.org> wrote:
> 
>> I've updated PEP 3101 based on the feedback collected so far.
> 
> [http://www.python.org/dev/peps/pep-3101/]
> 
> I think this is a step in the right direction.

Cool, and thanks for the very detailed feedback.

> I wonder if we shouldn't borrow more from .NET. I read this URL that
> you referenced:
> 
> http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp 
> 
> 
> They have special syntax to support field width, e.g. {0,10} formats
> item 0 in a field of (at least) 10 positions wide, right-justified;
> {0,-10} does the same left-aligned. This is done independently from

We already have that now, don't we? If you look at the docs for "String 
Formatting Operations" in the library reference, it shows that a 
negative sign on a field width indicates left justification.
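
For example, with the existing % operator (the variable names are illustrative
only):

```python
# A plain width right-justifies; a '-' flag before the width left-justifies.
right = '%10s|' % 'ab'
left = '%-10s|' % 'ab'
number = '%-6d|' % 42

print(right)
print(left)
print(number)
```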

> the type-specific formatting. (I'm not proposing that we use .NET's
> format specifiers after the colon, but I'm also no big fan for keeping
> the C specific stuff we have now; we should put some work in designing
> something with the same power as the current %-based system for floats
> and ints, that would cover it.)

Agreed. As you say, the main work is in handling floats and ints, and 
everything else can either be formatted as plain str(), or use a custom 
format specifier syntax (as in my strftime example.)

> .NET's solution for quoting { and } as {{ and }} respectively also
> sidesteps the issue of how to quote \ itself -- since '\\{' is a
> 2-char string containing one \ and one {, you'd have to write either
> '\\\\{0}' or r'\\{0}' to produce a single literal \ followed by
> formatted item 0. Any time there's the need to quadruple a backslash I
> think we've lost the battle. (Or you might search the web for Tcl
> quoting hell. :-)
> 
> I'm fine with not having a solution for doing variable substitution
> within the format parameters. That could be done instead by building
> up the format string with an extra formatting step: instead of
> "{x:{y}}".format(x=whatever, y=3) you could write
> "{{x,{y}}}".format(y=3).format(x=whatever). (Note that this is subtle:
> the final }}} are parsed as } followed by }}. Once the parser has seen
> a single {, the first } it sees is the matching closing } and adding
> another } after it won't affect it. The specifier cannot contain { or
> } at all.

There is another solution to this which is equally subtle, although 
fairly straightforward to parse. It involves defining the rules for 
escapes as follows:

    '{{' is an escaped '{'
    '}}' is an escaped '}', unless we are within a field.

So you can write things like {0:10,{1}}, and the final '}}' will be 
parsed as two separate closing brackets, since we're within a field 
definition.

 From a parsing standpoint, this is unambiguous, however I've held off 
on suggesting it because it might appear to be ambiguous to a casual reader.
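
To make the escape rule above concrete, here is a minimal sketch of a scanner that applies it. It is an illustration only, not the parser that accompanies the PEP: the name split_fields and the (kind, text) tuples are hypothetical, and field contents are returned unparsed.

```python
def split_fields(fmt):
    """Split fmt into ("text", s) and ("field", s) chunks.

    Outside a field, '{{' and '}}' are escapes for '{' and '}'.
    Inside a field, braces are simply counted, so a trailing '}}'
    closes a nested field and then the outer field.
    """
    chunks = []
    text = []
    i = 0
    n = len(fmt)
    while i < n:
        ch = fmt[i]
        if ch == "{":
            if fmt.startswith("{{", i):
                text.append("{")          # escaped opening brace
                i += 2
                continue
            if text:                      # flush pending literal text
                chunks.append(("text", "".join(text)))
                text = []
            depth = 1
            j = i + 1
            while j < n and depth:        # scan to the matching '}'
                if fmt[j] == "{":
                    depth += 1
                elif fmt[j] == "}":
                    depth -= 1
                j += 1
            if depth:
                raise ValueError("unterminated field")
            chunks.append(("field", fmt[i + 1:j - 1]))
            i = j
        elif ch == "}":
            if fmt.startswith("}}", i):
                text.append("}")          # escaped closing brace
                i += 2
                continue
            raise ValueError("single '}' outside a field")
        else:
            text.append(ch)
            i += 1
    if text:
        chunks.append(("text", "".join(text)))
    return chunks


chunks = split_fields("a{{b}} {0:10,{1}}")
```

Here the '}}' after 'b' is an escape because it occurs outside a field, while the final '}}' closes the nested {1} and then the outer field.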

> I like having a way to reuse the format parsing code while
> substituting something else for the formatting itself.
> 
> The PEP appears silent on what happens if there are too few or too
> many positional arguments, or if there are missing or unused keywords.
> Missing ones should be errors; I'm not sure about redundant (unused)
> ones. On the one hand complaining about those gives us more certainty
> that the format string is correct. On the other hand there are some
> use cases for passing lots of keyword parameters (e.g. simple web
> templating could pass a fixed set of variables using **dict). Even in
> i18n (translation) apps I could see the usefulness of allowing unused
> parameters

I am undecided on this issue as well, which is the reason that it's not 
mentioned in the PEP (yet).
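
For illustration, both policies can be sketched with the string.Formatter class that later shipped in the standard library, which exposes a check_unused_args() hook for exactly this decision. The class name StrictFormatter and the sample variables are hypothetical.

```python
import string


class StrictFormatter(string.Formatter):
    """Formatter variant that treats unused arguments as errors."""

    def check_unused_args(self, used_args, args, kwargs):
        # used_args holds the positional indexes and keyword names
        # that the format string actually referenced.
        unused = [i for i in range(len(args)) if i not in used_args]
        unused += sorted(k for k in kwargs if k not in used_args)
        if unused:
            raise ValueError("unused format arguments: %r" % (unused,))


variables = {"user": "guido", "host": "python.org"}

# Lenient policy (what str.format does): extra keywords are ignored,
# which suits the "pass a fixed set of variables" templating use case.
lenient = "{user}".format(**variables)

# Strict policy: every argument must be consumed.
strict_ok = StrictFormatter().format("{user}@{host}", **variables)
```

Missing arguments raise KeyError or IndexError under either policy; only the treatment of redundant ones differs.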

> On the issue of {a.b.c}: like several correspondents, I don't like the
> ambiguity of attribute vs. key refs much, even though it appears
> useful enough in practice in web frameworks I've used. It seems to
> violate the Zen of Python: "In the face of ambiguity, refuse the
> temptation to guess."
> 
> Unfortunately I'm pretty lukewarm about the proposal to support
> {a[b].c} since b is not a variable reference but a literal string 'b'.
> It is also relatively cumbersome to parse. I wish I could propose
> {a+b.c} for this case but that's so arbitrary...

Actually, it's not all that hard to parse, especially given that there 
is no need to deal with the 'nested' case.
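
As a small illustration of the point that b in {a[b].c} is a literal key and not a variable reference, here is how the syntax behaves in the str.format method that eventually shipped. The sample objects are made up for the example.

```python
from types import SimpleNamespace

a = {"b": SimpleNamespace(c="value")}
b = "not used"   # never consulted: [b] means the string key 'b'

# Item access with a literal key, followed by attribute access.
result = "{a[b].c}".format(a=a, b=b)

# An all-digit index is converted to an integer, so sequences work too.
item = "{0[1]}".format(["zero", "one"])
```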

I will be supplying a Python implementation of the parser along with the 
PEP. What I would prefer not to supply (although I certainly can if you 
feel it's necessary) is an optimized C implementation of the same 
parser, as well as the implementations of the various type-specific 
formatters.

> Even more unfortunately, I expect that dict key access is a pretty
> important use case so we'll have to address it somehow. I *don't*
> think there's an important use case for the ambiguity -- in any
> particular situation I expect that the programmer will know whether
> they are expecting a dict or an object with attributes.
> 
> Hm, perhaps {a at b.c} might work? It's not an existing binary operator.
> Or perhaps # or !.

[] is the most intuitive syntax by far IMHO. Let's run it up the 
flagpole and see if anybody salutes :)

> It's too late to think straight so this will have to be continued...

One additional issue that I would like some feedback on:

The way I have set up the API for writing custom formatters (not talking 
about the __format__ method here) allows the custom formatter object to 
examine the entire output string, not merely the part that it is 
responsible for; moreover, the custom formatter is free to modify 
the entire string. So for example, a custom formatter could tabify or 
un-tabify all previous text within the string.

The API could be made slightly simpler by eliminating this feature. The 
reason that I added it was specifically so that custom formatters could 
perform column-specific operations, like the old BASIC function that 
would print spaces up to a given column. Having generated my share of 
reports back in the old days (COBOL programming in the USAF), I thought 
it might be useful to have the ability to do operations based on the 
absolute column number.

Currently the API specifies that a custom formatter is passed an array 
object, and the custom formatter should append its data to the end of 
the array, but it is also free to examine and modify the rest of the array.

If I were to remove this feature, then instead of using an array, we'd 
simply have the custom formatter return a string like __format__ does.

So the question is - is the use case useful enough to keep this feature? 
What do people think of the use of the Python array type in this case?
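
A rough sketch of the column-based use case may help when weighing the two designs. Everything here is hypothetical (the names tab_to and render, and the use of a plain list as the output buffer); it only shows why padding to an absolute column needs to see the output so far, which a formatter that merely returns a string cannot do.

```python
def tab_to(column):
    """Return an array-style formatter that pads the current line to `column`."""
    def formatter(buffer):
        # Needs the text already produced, not just its own field.
        current_line = "".join(buffer).rsplit("\n", 1)[-1]
        buffer.append(" " * max(0, column - len(current_line)))
    return formatter


def render(parts):
    """Join literal parts, letting callables append to the shared buffer."""
    buffer = []
    for part in parts:
        if callable(part):
            part(buffer)
        else:
            buffer.append(str(part))
    return "".join(buffer)


report = render(["Name", tab_to(10), "Qty\n", "bolt", tab_to(10), "12"])
```

With a return-a-string API the same report would be produced by giving each field an explicit width instead.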

-- Talin

From steve at holdenweb.com  Fri May 19 11:24:43 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 19 May 2006 10:24:43 +0100
Subject: [Python-Dev] [Python-checkins] r46043 - peps/trunk/pep-0356.txt
In-Reply-To: <20060519043403.5D0CD1E400E@bag.python.org>
References: <20060519043403.5D0CD1E400E@bag.python.org>
Message-ID: <446D8EDB.8000902@holdenweb.com>

neal.norwitz wrote:

Please note that the "Need for Speed" sprint runs May 21 - 27.

Bringing the schedule forward on Alpha 3 makes it less possible to 
incorporate sprint changes into the 2.5 trunk.

Will it be acceptable to add new (performance) changes between Alpha 3 
and beta 1, or would developers prefer to put a fourth alpha into the 
schedule?

The intention is there should be no major functionality added by the 
sprint, simply that performance should be improved, so the schedule may 
work. It's just a call for the release manager really, I guess.

regards
  Steve

> Author: neal.norwitz
> Date: Fri May 19 06:34:02 2006
> New Revision: 46043
> 
> Modified:
>    peps/trunk/pep-0356.txt
> Log:
> Speed up schedule a bit, rc corresponds to OSCON.
> Gerhard checked in docs for sqlite.
> 
> 
> 
> Modified: peps/trunk/pep-0356.txt
> ==============================================================================
> --- peps/trunk/pep-0356.txt	(original)
> +++ peps/trunk/pep-0356.txt	Fri May 19 06:34:02 2006
> @@ -38,11 +38,11 @@
>  
>      alpha 1: April 5, 2006 [completed]
>      alpha 2: April 27, 2006 [completed]
> -    alpha 3: May 27, 2006 [planned]
> -    beta 1:  June 24, 2006 [planned]
> -    beta 2:  July 15, 2006 [planned]
> -    rc 1:    August 5, 2006 [planned]
> -    final:   August 19, 2006 [planned]
> +    alpha 3: May 25, 2006 [planned]
> +    beta 1:  June 14, 2006 [planned]
> +    beta 2:  July 6, 2006 [planned]
> +    rc 1:    July 27, 2006 [planned]
> +    final:   August 3, 2006 [planned]
>  
>  
>  Completed features for 2.5
> @@ -155,7 +155,6 @@
>      - Missing documentation
>        * ctypes (Thomas Heller)
>        * ElementTree/cElementTree (Fredrik Lundh)
> -      * pysqlite (Gerhard Haering)
>        * setuptools (written, needs conversion to proper format)
>  
>      - AST compiler problems
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
> 
> 
> 



-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden

From guido at python.org  Fri May 19 16:55:37 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 19 May 2006 07:55:37 -0700
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <446D84BD.8080208@acm.org>
References: <445D5357.9060409@acm.org>
	<ca471dc20605182337h66c8ce61o867a137db768b710@mail.gmail.com>
	<446D84BD.8080208@acm.org>
Message-ID: <ca471dc20605190755g2e9a6911v90e0b3d165ad40c2@mail.gmail.com>

On 5/19/06, Talin <talin at acm.org> wrote:
> Guido van Rossum wrote:
> > [http://www.python.org/dev/peps/pep-3101/]
> > http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp

[on width spec a la .NET]
> We already have that now, don't we? If you look at the docs for "String
> Formatting Operations" in the library reference, it shows that a
> negative sign on a field width indicates left justification.

Yes, but I was proposing to adopt the .NET syntax for the feature.
Making it a responsibility of the generic formatting code rather than
of each individual type-specific formatter makes it more likely that
it will be implemented correctly everywhere.
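
A minimal sketch of that division of labour, assuming a hypothetical helper apply_field: the type-specific formatter only produces text, and the generic code applies the width, with a negative width meaning left justification as in .NET's {index,width:spec} fields and in today's %-formatting.

```python
def apply_field(value, width=0, spec=""):
    """Format value with its type-specific spec, then pad generically."""
    text = format(value, spec)        # type-specific part
    if width < 0:
        return text.ljust(-width)     # negative width: left-justify
    return text.rjust(width)          # positive width: right-justify


# The existing %-operator convention referred to above.
percent_style = "%-8s|" % "spam"

left = apply_field(3.14159, -8, ".2f")
right = apply_field(42, 6)
```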

[on escaping]
> There is another solution to this which is equally subtle, although
> fairly straightforward to parse. It involves defining the rules for
> escapes as follows:
>
>     '{{' is an escaped '{'
>     '}}' is an escaped '}', unless we are within a field.
>
> So you can write things like {0:10,{1}}, and the final '}}' will be
> parsed as two separate closing brackets, since we're within a field
> definition.
>
>  From a parsing standpoint, this is unambiguous, however I've held off
> on suggesting it because it might appear to be ambiguous to a casual reader.

Sure. But I still think there isn't enough demand for variable
expansion *within* a field to bother. When's the last time you used a
* in a % format string? And how essential was it?
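
For reference, these are the three spellings under discussion, shown with the str.format method as it later shipped (which did end up allowing a nested field inside the format specifier). The variable names are only for the example.

```python
width = 6

# The existing feature: '*' takes the field width from the argument tuple.
star = "%*d" % (width, 42)

# Variable expansion within a field.
nested = "{0:{1}}".format(42, width)

# The two-step alternative: build the format string first, then apply it.
# The first pass turns "{{0:{w}}}" into "{0:6}".
two_pass = "{{0:{w}}}".format(w=width).format(42)
```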

[on error handling for unused variables]
> I am undecided on this issue as well, which is the reason that it's not
> mentioned in the PEP (yet).

There's another use case which suggests that perhaps all errors should
pass silently (or at least produce a string result instead of an
exception). It's a fairly common bug to have a broken format in an
except clause for a rare exception. This can cause major issues in
large programs because the one time you need the debug info badly you
don't get anything at all. (The logging module now has a special
work-around for this reason.) That's too bad, but it is a fact of
life.

If broken formats *never* caused exceptions (but instead some kind of
butchered conversion drawing attention to the problem) then at least
one source of frustration would be gone. If people like this I suggest
putting some kind of error message in the place of any format
conversion for which an error occurred. (If we left the {format}
itself in, then this would become a feature that people would rely on
in undesirable places.)

I wouldn't want to go so far as to catch exceptions from type-specific
formatters; but those formatters themselves should silently ignore bad
specifiers, or at least return an error string, rather than raise
exceptions.
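
One way to picture the "never raise" alternative is the sketch below, built on the string.Formatter class that later shipped in the standard library. The names LenientFormatter and _Missing and the wording of the marker text are assumptions; malformed format strings (for example an unmatched brace) still raise in this sketch.

```python
import string


class _Missing:
    """Placeholder for a field whose value could not be looked up."""

    def __init__(self, name):
        self.name = name


class LenientFormatter(string.Formatter):
    """Put an error marker in the output instead of raising."""

    def get_field(self, field_name, args, kwargs):
        try:
            return super().get_field(field_name, args, kwargs)
        except (KeyError, IndexError, AttributeError):
            return _Missing(field_name), field_name

    def convert_field(self, value, conversion):
        if isinstance(value, _Missing):
            return value              # keep the marker through !r / !s
        return super().convert_field(value, conversion)

    def format_field(self, value, format_spec):
        if isinstance(value, _Missing):
            return "<format error: no value for %r>" % value.name
        try:
            return super().format_field(value, format_spec)
        except (ValueError, TypeError) as exc:
            return "<format error: %s>" % exc


fmt = LenientFormatter()
message = fmt.format("{a} and {b}", a=1)
bad_spec = fmt.format("{0:zz}", 3)
```

The marker replaces the conversion and does not echo the original {format}, in line with the concern above about people relying on the pass-through.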

Still, this may be more painful for beginners learning to write format strings?

Until we have decided, the PEP should list the two alternatives in a
fair amount of detail as "undecided".

> I will be supplying a Python implementation of the parser along with the
> PEP. What I would prefer not to supply (although I certainly can if you
> feel it's necessary) is an optimized C implementation of the same
> parser, as well as the implementations of the various type-specific
> formatters.

There's no need for any performance work at this point, and C code is
"right out" (as the Pythons say). The implementation should be usable
as a readable "spec" to resolve gray areas in the PEP.

> [] is the most intuitive syntax by far IMHO. Let's run it up the
> flagpole and see if anybody salutes :)

Fair enough.

> The way I have set up the API for writing custom formatters (not talking
> about the __format__ method here) allows the custom formatter object to
> examine the entire output string, not merely the part that it is
> responsible for; moreover, the custom formatter is free to modify
> the entire string. So for example, a custom formatter could tabify or
> un-tabify all previous text within the string.

Ouch. I don't like that.

> The API could be made slightly simpler by eliminating this feature. The
> reason that I added it was specifically so that custom formatters could
> perform column-specific operations, like the old BASIC function that
> would print spaces up to a given column. Having generated my share of
> reports back in the old days (COBOL programming in the USAF), I thought
> it might be useful to have the ability to do operations based on the
> absolute column number.

Please drop it. Python has never had it and AFAICR it's never been requested.

> Currently the API specifies that a custom formatter is passed an array
> object, and the custom formatter should append its data to the end of
> the array, but it is also free to examine and modify the rest of the array.
>
> If I were to remove this feature, then instead of using an array, we'd
> simply have the custom formatter return a string like __format__ does.

Yes please.

> So the question is - is the use case useful enough to keep this feature?
> What do people think of the use of the Python array type in this case?

That rather constrains the formatting implementation, so I'd like to drop it.

BTW I think we should move this back to the py3k list -- the PEP is
3101 after all. That should simplify the PEP a bit because it no
longer has to distinguish between str and unicode. If we later decide
to backport it to 2.6 it should be easy enough to figure out what to
do with str vs. unicode (probably the same as we do for %).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Fri May 19 18:04:43 2006
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 19 May 2006 09:04:43 -0700
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
Message-ID: <20060519160443.GA7282@panix.com>

On Thu, May 18, 2006, Neal Norwitz wrote:
>
> I moved up the 2.5 release date by a bit.  Ideally, I wanted to have
> the final on July 27 which corresponds to OSCON.  This seems too
> aggressive since I haven't updated the schedule publicly.  So the new
> schedule has rc1 slated for July 27.
> 
>     http://www.python.org/dev/peps/pep-0356/
> 
> So far we've only had very few 2.5 bug reports for the first 2 alphas.
>  Hopefully people really are testing. :-)

This isn't up on python.org, please post the current schedule here.

Another point: with the Need For Speed sprint, should we perhaps allot
some time for any new code to gel?  Or is none of that code slated for
2.5?
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes

From rhettinger at ewtllc.com  Fri May 19 18:57:59 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 19 May 2006 09:57:59 -0700
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <20060519160443.GA7282@panix.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com>
Message-ID: <446DF917.4010709@ewtllc.com>

Aahz wrote:

>On Thu, May 18, 2006, Neal Norwitz wrote:
>  
>
>>I moved up the 2.5 release date by a bit.  Ideally, I wanted to have
>>the final on July 27 which corresponds to OSCON.  This seems too
>>aggressive since I haven't updated the schedule publicly.  So the new
>>schedule has rc1 slated for July 27.
>>
>>    http://www.python.org/dev/peps/pep-0356/
>>
>>So far we've only had very few 2.5 bug reports for the first 2 alphas.
>> Hopefully people really are testing. :-)
>>    
>>
>
>This isn't up on python.org, please post the current schedule here.
>
>Another point: with the Need For Speed sprint, should we perhaps allot
>some time for any new code to gel?  Or is none of that code slated for
>2.5?
>  
>

Also, there is a sprint in Arlington VA.

Both sprints will likely fix bugs, apply long standing patches, and 
improve performance for Py2.5.

Also, we could use another bugday after the next alpha.

Some "gelling" time would be helpful.


Raymond

From talin at acm.org  Fri May 19 19:24:37 2006
From: talin at acm.org (Talin)
Date: Fri, 19 May 2006 10:24:37 -0700
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <ca471dc20605190755g2e9a6911v90e0b3d165ad40c2@mail.gmail.com>
References: <445D5357.9060409@acm.org>	
	<ca471dc20605182337h66c8ce61o867a137db768b710@mail.gmail.com>	
	<446D84BD.8080208@acm.org>
	<ca471dc20605190755g2e9a6911v90e0b3d165ad40c2@mail.gmail.com>
Message-ID: <446DFF55.8010207@acm.org>

Guido van Rossum wrote:
> [on escaping]
> 
>> There is another solution to this which is equally subtle, although
>> fairly straightforward to parse. It involves defining the rules for
>> escapes as follows:
>>
>>     '{{' is an escaped '{'
>>     '}}' is an escaped '}', unless we are within a field.
>>
>> So you can write things like {0:10,{1}}, and the final '}}' will be
>> parsed as two separate closing brackets, since we're within a field
>> definition.
>>
>>  From a parsing standpoint, this is unambiguous, however I've held off
>> on suggesting it because it might appear to be ambiguous to a casual 
>> reader.
> 
> 
> Sure. But I still think there isn't enough demand for variable
> expansion *within* a field to bother. When's the last time you used a
> * in a % format string? And how essential was it?

True. I'm mainly trying to avoid excess debate by not dropping existing 
features unnecessarily. (Otherwise, you spend way too much time arguing 
with the handful of people out there that do rely on that use case.) But 
if you want to use your special "BDFL superpower" to shortcut the 
debate, I'm fine with that :)

> BTW I think we should move this back to the py3k list -- the PEP is
> 3101 after all. That should simplify the PEP a bit because it no
> longer has to distinguish between str and unicode. If we later decide
> to backport it to 2.6 it should be easy enough to figure out what to
> do with str vs. unicode (probably the same as we do for %).

All right, although my understanding is that the PEP should be escalated 
to c.l.p at some point before acceptance, and I figured py-dev would be 
a reasonable intermediate point before that. But it sounds like 3101 is 
going to go back into the shop for the moment, so that's a non-issue.

Since you seem to be in a PEP-review mode, could you have a look at 
3102? In particular, it seems that all of the controversies on that one 
have quieted down; virtually everyone seems in favor of the first part, 
and you have already ruled in favor of the second part. So I am not sure 
that there is anything more to discuss.

Perhaps I should go ahead and put 3102 on c.l.p at this point.

-- Talin

From guido at python.org  Fri May 19 20:04:01 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 19 May 2006 11:04:01 -0700
Subject: [Python-Dev] PEP 3101 Update
In-Reply-To: <446DFF55.8010207@acm.org>
References: <445D5357.9060409@acm.org>
	<ca471dc20605182337h66c8ce61o867a137db768b710@mail.gmail.com>
	<446D84BD.8080208@acm.org>
	<ca471dc20605190755g2e9a6911v90e0b3d165ad40c2@mail.gmail.com>
	<446DFF55.8010207@acm.org>
Message-ID: <ca471dc20605191104m4dd613e1hcce0fb74df8b4e65@mail.gmail.com>

On 5/19/06, Talin <talin at acm.org> wrote:
> Since you seem to be in a PEP-review mode, could you have a look at
> 3102? In particular, it seems that all of the controversies on that one
> have quieted down; virtually everyone seems in favor of the first part,
> and you have already ruled in favor of the second part. So I am not sure
> that there is anything more to discuss.
>
> Perhaps I should go ahead and put 3102 on c.l.p at this point.

+1

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Fri May 19 23:18:50 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 19 May 2006 17:18:50 -0400
Subject: [Python-Dev] Decimal and Exponentiation
In-Reply-To: <1148060287.088387.41150@j55g2000cwa.googlegroups.com>
References: <1148060287.088387.41150@j55g2000cwa.googlegroups.com>
Message-ID: <1f7befae0605191418q280ff3f3pa7585f25d9d3743e@mail.gmail.com>

[elventear]
> I need to do some numerical calculations that involve
> real numbers that are larger than what the native float can handle.
>
> I've tried to use Decimal, but I've found one main obstacle that I
> don't know how to sort. I need to do exponentiation with real
> exponents, but it seems that Decimal does not support non-integer
> exponents.
>
> I would appreciate if anyone could recommend a solution for this
> problem.

Wait <0.3 wink>.  Python's Decimal module intends to be a faithful
implementation of IBM's proposed standard for decimal arithmetic:

    http://www2.hursley.ibm.com/decimal/

Last December, ln, log10, exp, and exponentiation to non-integral
powers were added to the proposed standard, but nobody yet has written
implementation code for Python's module.  [Python-Dev:  somebody
wants to volunteer for this :-)]

If you're not a numeric expert, I wouldn't recommend that you try this
yourself (in particular, trying to implement x**y as exp(ln(x)*y)
using the same precision is mathematically correct but is numerically
badly naive).
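
To illustrate the precision issue, here is a minimal sketch using the ln() and exp() methods that the decimal module gained in later Python releases. Carrying extra working digits is the point: an absolute error in ln(x)*y becomes a relative error in the result. The function name power and the default of 10 guard digits are assumptions for the example, not a rigorous error bound; very large exponents would need more.

```python
from decimal import Decimal, localcontext


def power(x, y, guard=10):
    """Compute x**y as exp(ln(x)*y) with `guard` extra working digits."""
    with localcontext() as ctx:
        ctx.prec += guard             # work at higher precision
        result = (x.ln() * y).exp()
    return +result                    # round back to the caller's precision


root_two = power(Decimal(2), Decimal("0.5"))
big = power(Decimal("1e10000"), Decimal("3.01"))
```

Those same later releases also accept non-integral exponents directly in Decimal's ** operator, which is the better choice where available.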

The GNU GMP library (for which Python bindings are available) also
supports "big floats", but their power operation is also restricted to
integer powers and/or exact roots.  This can be painful even to try;
e.g.,

    >>> from gmpy import mpf
    >>> mpf("1e10000") ** mpf("3.01")

consumed well over a minute of CPU time (on a 3.4 GHz box) before dying with

    ValueError: mpq.pow fractional exponent, inexact-root

If you're working with floats outside the range of IEEE double, you
_probably_ want to be working with logarithms instead anyway; but that
depends on your app, and I don't want to know about it ;-)

From guido at python.org  Fri May 19 23:31:42 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 19 May 2006 14:31:42 -0700
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
Message-ID: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>

I'm having trouble getting the zlib module to build (SVN HEAD). The error is:

building 'zlib' extension
gcc -pthread -fPIC -fno-strict-aliasing -DNDEBUG -g -O3 -Wall
-Wstrict-prototypes -I. -I/home/guido/projects/python/trunk/./Include
-I../Include -I. -I/usr/local/include
-I/home/guido/projects/python/trunk/Include
-I/home/guido/projects/python/trunk/linux -c
/home/guido/projects/python/trunk/Modules/zlibmodule.c -o
build/temp.linux-i686-2.5/zlibmodule.o
/home/guido/projects/python/trunk/Modules/zlibmodule.c: In function
`PyZlib_uncopy':
/home/guido/projects/python/trunk/Modules/zlibmodule.c:723: warning:
implicit declaration of function `inflateCopy'
gcc -pthread -shared build/temp.linux-i686-2.5/zlibmodule.o
-L/usr/local/lib -lz -o build/lib.linux-i686-2.5/zlib.so
*** WARNING: renaming "zlib" since importing it failed:
build/lib.linux-i686-2.5/zlib.so: undefined symbol: inflateCopy

It seems I have libz 1.1.4. Is this no longer supported?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Fri May 19 23:43:39 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 19 May 2006 17:43:39 -0400
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
Message-ID: <1f7befae0605191443n36ff36cax9fe18bc809449b9a@mail.gmail.com>

[Guido]
> I'm having trouble getting the zlib module to build (SVN HEAD). The error is:
>
> building 'zlib' extension
> gcc -pthread -fPIC -fno-strict-aliasing -DNDEBUG -g -O3 -Wall
> -Wstrict-prototypes -I. -I/home/guido/projects/python/trunk/./Include
> -I../Include -I. -I/usr/local/include
> -I/home/guido/projects/python/trunk/Include
> -I/home/guido/projects/python/trunk/linux -c
> /home/guido/projects/python/trunk/Modules/zlibmodule.c -o
> build/temp.linux-i686-2.5/zlibmodule.o
> /home/guido/projects/python/trunk/Modules/zlibmodule.c: In function
> `PyZlib_uncopy':
> /home/guido/projects/python/trunk/Modules/zlibmodule.c:723: warning:
> implicit declaration of function `inflateCopy'
> gcc -pthread -shared build/temp.linux-i686-2.5/zlibmodule.o
> -L/usr/local/lib -lz -o build/lib.linux-i686-2.5/zlib.so
> *** WARNING: renaming "zlib" since importing it failed:
> build/lib.linux-i686-2.5/zlib.so: undefined symbol: inflateCopy
>
> It seems I have libz 1.1.4. Is this no longer supported?

That's very peculiar.  Python has its own copy of the zlib code now,
under Modules/zlib/.  That's version 1.2.3.

    http://mail.python.org/pipermail/python-dev/2005-December/058873.html

inflateCopy() is exported by Modules/zlib/inflate.c.  I wonder why the
last gcc line above has "-lz" in it?  Upgrade to Windows and
everything will be fine :-)
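
For context, the missing symbol backs the copy() method recently added to zlib's decompression objects. The sketch below shows how to check which zlib version the module was compiled against and what copy() is for; the payload and split point are arbitrary.

```python
import zlib

# Version of the zlib headers the module was built against.
built_against = zlib.ZLIB_VERSION

payload = b"need for speed " * 50
compressed = zlib.compress(payload)
split = len(compressed) // 2

decomp = zlib.decompressobj()
head = decomp.decompress(compressed[:split])

# copy() snapshots the decompressor state (this is what needs inflateCopy).
snapshot = decomp.copy()

rest_a = decomp.decompress(compressed[split:]) + decomp.flush()
rest_b = snapshot.decompress(compressed[split:]) + snapshot.flush()
```

Both the original and the snapshot continue independently from the same point in the stream.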

From andymac at bullseye.apana.org.au  Sat May 20 00:23:05 2006
From: andymac at bullseye.apana.org.au (Andrew MacIntyre)
Date: Sat, 20 May 2006 09:23:05 +1100
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
Message-ID: <446E4549.7050203@bullseye.apana.org.au>

Neal Norwitz wrote:
> I moved up the 2.5 release date by a bit.  Ideally, I wanted to have
> the final on July 27 which corresponds to OSCON.  This seems too
> aggressive since I haven't updated the schedule publicly.  So the new
> schedule has rc1 slated for July 27.
> 
>     http://www.python.org/dev/peps/pep-0356/
> 
> So far we've only had very few 2.5 bug reports for the first 2 alphas.
>  Hopefully people really are testing. :-)
> 
> I'll continue to bug people that have action items.

I'm sorry, but I find such advancing schedules with little warning quite
objectionable.  Particularly the cutoff for new functionality implicit
in the last of the alphas.

I have been working on patch 1454481 to try and make the original
alpha 3 deadline in the context of my other commitments.  I still
hope to get it done early next week, but having that deadline advance
without apparent warning just made things harder.

Andrew.

-------------------------------------------------------------------------
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au  (pref) | Snail: PO Box 370
        andymac at pcug.org.au             (alt) |        Belconnen ACT 2616
Web:    http://www.andymac.org/               |        Australia

From kbk at shore.net  Sat May 20 05:24:26 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 19 May 2006 23:24:26 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200605200324.k4K3OQR1004951@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  378 open ( +0) /  3238 closed (+22) /  3616 total (+22)
Bugs    :  907 open (+13) /  5831 closed (+20) /  6738 total (+33)
RFE     :  218 open ( +2) /   217 closed ( +2) /   435 total ( +4)

New / Reopened Patches
______________________

Patch fixing #1481770 (wrong shared lib ext on hpux ia64)  (2006-05-07)
CLOSED http://python.org/sf/1483325  opened by  David Everly

Add new top-level domains to cookielib  (2006-05-07)
CLOSED http://python.org/sf/1483395  reopened by  jjlee

Add new top-level domains to cookielib  (2006-05-07)
CLOSED http://python.org/sf/1483395  opened by  John J Lee

Wave.py support for ulaw and alaw audio  (2006-05-07)
       http://python.org/sf/1483545  opened by  Eric Woudenberg

tarfile.py fix for #1471427 and updates  (2006-05-09)
CLOSED http://python.org/sf/1484695  opened by  Lars Gustäbel

cookielib: reduce (fatal) dependency on "beta" logging?  (2006-05-09)
CLOSED http://python.org/sf/1484758  opened by  kxroberto

urllib2: resolves extremly slow import (of "everything")  (2006-05-09)
CLOSED http://python.org/sf/1484793  opened by  kxroberto

HTMLParser : A auto-tolerant parsing mode  (2006-05-11)
       http://python.org/sf/1486713  opened by  kxroberto

Patches and enhancements to turtle.py  (2006-05-11)
CLOSED http://python.org/sf/1486962  opened by  Vern Ceder

MacOSX: distutils support for -arch and -isysroot flags  (2006-05-13)
       http://python.org/sf/1488098  opened by  Ronald Oussoren

Memory alignment fix on SPARC  (2006-05-14)
CLOSED http://python.org/sf/1488312  opened by  Jan Palus

tarfile.py: support for file-objects and bz2 (cp. #1488634)  (2006-05-15)
CLOSED http://python.org/sf/1488881  opened by  Lars Gustäbel

Updates to syntax rules in reference manual  (2006-05-16)
       http://python.org/sf/1489771  opened by  Žiga Seilnacht

Patch for the urllib2 HOWTO  (2006-05-16)
CLOSED http://python.org/sf/1489784  opened by  Mike Foord

fix typo in os.utime() docstring  (2006-05-17)
CLOSED http://python.org/sf/1490189  opened by  M. Levinson

add os.chflags() and os.lchflags() where available  (2006-05-17)
       http://python.org/sf/1490190  opened by  M. Levinson

time.altzone does not include DST offset on Cygwin  (2006-05-17)
CLOSED http://python.org/sf/1490224  opened by  Christian Franke

PC new-logo-based icon set  (2006-05-17)
       http://python.org/sf/1490384  opened by  Andrew Clover

Describe Py_DEBUG and friends...  (2006-05-18)
       http://python.org/sf/1490989  opened by  Skip Montanaro

IDLE L&F on MacOSX   (2006-05-19)
       http://python.org/sf/1491759  opened by  Ronald Oussoren

Simple slice support for list.sort() and .reverse()  (2006-05-19)
       http://python.org/sf/1491804  opened by  Heiko Wundram

Complex representation  (2006-05-19)
       http://python.org/sf/1491866  opened by  Heiko Wundram

Fix for bug #1486663 mutable types check kwargs in tp_new  (2006-05-20)
       http://python.org/sf/1491939  opened by  Žiga Seilnacht

Patches Closed
______________

applesingle endianness issue  (2004-06-30)
       http://python.org/sf/982340  closed by  gbrandl

Patch fixing #1481770 (wrong shared lib ext on hpux ia64)  (2006-05-07)
       http://python.org/sf/1483325  closed by  loewis

Add new top-level domains to cookielib  (2006-05-07)
       http://python.org/sf/1483395  closed by  gbrandl

Add new top-level domains to cookielib  (2006-05-07)
       http://python.org/sf/1483395  closed by  gbrandl

Heavy revisions to urllib2 howto  (2006-05-01)
       http://python.org/sf/1479977  closed by  akuchling

Make urllib2 digest auth and basic auth play together  (2006-04-30)
       http://python.org/sf/1479302  closed by  gbrandl

Take advantage of BaseException/Exception split in cookielib  (2006-04-29)
       http://python.org/sf/1478993  closed by  gbrandl

tarfile.py fix for #1471427 and updates  (2006-05-09)
       http://python.org/sf/1484695  closed by  gbrandl

cookielib: reduce (fatal) dependency on "beta" logging?  (2006-05-09)
       http://python.org/sf/1484758  closed by  gbrandl

urllib2: resolves extremly slow import (of "everything")  (2006-05-09)
       http://python.org/sf/1484793  closed by  gbrandl

Fix doctest nit.  (2006-04-28)
       http://python.org/sf/1478292  closed by  tim_one

Remote debugging with pdb.py  (2003-04-14)
       http://python.org/sf/721464  closed by  gbrandl

urllib2: better import ftplib and gopherlib etc late  (2004-10-24)
       http://python.org/sf/1053150  closed by  loewis

detect %zd format for PY_FORMAT_SIZE_T  (2006-04-22)
       http://python.org/sf/1474907  closed by  bcannon

Patches and enhancements to turtle.py  (2006-05-11)
       http://python.org/sf/1486962  closed by  gbrandl

Improve docs for tp_clear and tp_traverse  (2006-04-19)
       http://python.org/sf/1473132  closed by  tim_one

Memory alignment fix on SPARC  (2006-05-14)
       http://python.org/sf/1488312  closed by  nnorwitz

tarfile.py: support for file-objects and bz2 (cp. #1488634)  (2006-05-15)
       http://python.org/sf/1488881  closed by  gbrandl

Add copy() method to zlib's compress and decompress objects  (2006-02-20)
       http://python.org/sf/1435422  closed by  gbrandl

Patch for the urllib2 HOWTO  (2006-05-16)
       http://python.org/sf/1489784  closed by  gbrandl

fix typo in os.utime() docstring  (2006-05-17)
       http://python.org/sf/1490189  closed by  gbrandl

time.altzone does not include DST offset on Cygwin  (2006-05-17)
       http://python.org/sf/1490224  closed by  gbrandl

great improvement for locale.py formatting functions  (2005-04-10)
       http://python.org/sf/1180296  closed by  gbrandl

New / Reopened Bugs
___________________

docs for 'profile' should refer to pstats.Stats  (2006-05-06)
CLOSED http://python.org/sf/1482988  opened by  Ori Peleg

Objects/genobject.c:53: gen_iternext: Assertion `f->f_back !  (2006-05-07)
       http://python.org/sf/1483133  opened by  svensoho

Add set.member() method  (2006-05-07)
CLOSED http://python.org/sf/1483384  opened by  Michael Tsai

Doctest bug  (2006-05-08)
CLOSED http://python.org/sf/1483785  opened by  stnsls

struct.unpack problem with @, =, < specifiers  (2006-05-08)
       http://python.org/sf/1483963  opened by  Christian Hudon

pydoc doesn't create html-file  (2006-05-08)
CLOSED http://python.org/sf/1483997  opened by  Gregor Lingl

hyper-threading locks up sleeping threads  (2006-05-08)
       http://python.org/sf/1484172  opened by  Olaf Meding

Output of KlocWork on Python2.4.3 sources  (2006-05-09)
       http://python.org/sf/1484556  opened by  Miki Tebeka

ctypes documentation obscure and incomplete  (2006-05-09)
       http://python.org/sf/1484580  opened by  Nicko van Someren

Doc for new_panel() missing scope info  (2006-05-09)
CLOSED http://python.org/sf/1484978  opened by  Troels Therkelsen

The name is qmail, not Qmail  (2006-05-09)
CLOSED http://python.org/sf/1485280  opened by  David Phillips

subprocess.Popen + cwd + relative args[0]  (2006-05-10)
CLOSED http://python.org/sf/1485447  opened by  Mark Sheppard

httplib: read/_read_chunked fails with ValueError sometime  (2006-05-11)
       http://python.org/sf/1486335  opened by  kxroberto

Over-zealous keyword-arguments check for built-in set class  (2006-05-11)
       http://python.org/sf/1486663  opened by  dib

OS X framework build for python 2.5 fails, configure is odd  (2006-05-11)
       http://python.org/sf/1486897  opened by  Christopher Knox

No examples or usage docs for urllib2  (2004-04-29)
CLOSED http://python.org/sf/944394  reopened by  fresh

2.5 rev 45972 fails build on Mac-Intel  (2006-05-12)
       http://python.org/sf/1487105  opened by  Alex Martelli

datetime.time and datetime.timedelta  (2006-05-12)
       http://python.org/sf/1487389  opened by  Chris Withers

can't pickle slice objects  (2006-05-12)
       http://python.org/sf/1487420  opened by  ganges master

Could BIND_FIRST be removed on HP-UX?  (2006-05-12)
       http://python.org/sf/1487481  opened by  Göran Uddeborg

SystemError with conditional expression in assignment  (2006-05-13)
CLOSED http://python.org/sf/1487966  opened by  Žiga Seilnacht

tarfile.TarFile.open might incorrectly raise ValueError  (2006-05-15)
CLOSED http://python.org/sf/1488634  opened by  Andrew Bennetts

appended file on Windows reads arbitrary data  (2006-05-15)
CLOSED http://python.org/sf/1488717  opened by  muimi

appended file on Windows reads arbitrary data  (2006-05-15)
CLOSED http://python.org/sf/1488779  opened by  muimi

endless loop in PyCFunction_Fini()  (2006-05-15)
       http://python.org/sf/1488906  opened by  Matejcik

Multiple dots in relative import statement raise SyntaxError  (2006-05-15)
       http://python.org/sf/1488915  opened by  Žiga Seilnacht

file.write + closed pipe   (2006-05-15)
       http://python.org/sf/1488934  opened by  Erik Demaine

difflib.Differ() doesn't always add hints for tab characters  (2006-05-16)
       http://python.org/sf/1488943  opened by  Henry

keyword and topic help broken in Pythonwin IDE  (2006-05-15)
       http://python.org/sf/1489051  opened by  BartlebyScrivener

2.4.3 fails to find Tcl/Tk on Solaris 10 x86_64   (2006-05-15)
       http://python.org/sf/1489246  opened by  Tim Mooney

time module missing from global mod index  (2006-05-16)
       http://python.org/sf/1489648  opened by  Skip Montanaro

%g incorrect conversion  (2006-05-18)
CLOSED http://python.org/sf/1490688  opened by  paul rubin

urllib.retrieve's reporthook called with non-helpful value  (2006-05-18)
       http://python.org/sf/1490929  opened by  Sidnei da Silva

glob_to_re problems  (2006-05-19)
       http://python.org/sf/1491431  opened by  Wayne Davison

argvemulator doesn't work on intel mac  (2006-05-19)
       http://python.org/sf/1491468  opened by  Ronald Oussoren

distutils adds (unwanted) -xcode=pic32 in the compile comman  (2006-05-19)
       http://python.org/sf/1491574  opened by  Toon Knapen

Pickle & schizophrenic classes don't match  (2006-05-19)
CLOSED http://python.org/sf/1491688  opened by  Ronald Oussoren

Bugs Closed
___________

asyncore is not listed in test_sundry  (2006-05-05)
       http://python.org/sf/1482746  closed by  gbrandl

docs for 'profile' should refer to pstats.Stats  (2006-05-06)
       http://python.org/sf/1482988  closed by  gbrandl

tarfile.py chokes on long names  (2006-04-16)
       http://python.org/sf/1471427  closed by  gbrandl

hpux ia64 shared lib ext should be ".so"  (2006-05-04)
       http://python.org/sf/1481770  closed by  nnorwitz

Doctest bug  (2006-05-08)
       http://python.org/sf/1483785  deleted by  stnsls

pydoc doesn't create html-file  (2006-05-08)
       http://python.org/sf/1483997  closed by  gbrandl

ctypes extension does not compile on Mac OS 10.3.9  (2006-03-29)
       http://python.org/sf/1460514  closed by  theller

Doc for new_panel() missing scope info  (2006-05-09)
       http://python.org/sf/1484978  closed by  gbrandl

Python does not check for POSIX compliant open() modes  (2006-03-31)
       http://python.org/sf/1462152  closed by  gbrandl

Building without C++ should not fail.  (2005-12-26)
       http://python.org/sf/1390321  closed by  loewis

The name is qmail, not Qmail  (2006-05-10)
       http://python.org/sf/1485280  closed by  gbrandl

subprocess.Popen + cwd + relative args[0]  (2006-05-10)
       http://python.org/sf/1485447  closed by  gbrandl

urllib2: better import ftplib and gopherlib etc late  (2004-10-13)
       http://python.org/sf/1046077  closed by  loewis

No examples or usage docs for urllib2  (2004-04-29)
       http://python.org/sf/944394  closed by  loewis

No examples or usage docs for urllib2  (2004-04-29)
       http://python.org/sf/944394  closed by  sf-robot

long path support in win32 part of os.listdir(posixmodule.c)  (2006-02-14)
       http://python.org/sf/1431582  closed by  loewis

SystemError with conditional expression in assignment  (2006-05-13)
       http://python.org/sf/1487966  closed by  nnorwitz

subprocess _active.remove(self) self not in list _active  (2005-05-10)
       http://python.org/sf/1199282  closed by  sf-robot

tarfile.TarFile.open might incorrectly raise ValueError  (2006-05-15)
       http://python.org/sf/1488634  closed by  gbrandl

appended file on Windows reads arbitrary data  (2006-05-15)
       http://python.org/sf/1488717  closed by  tim_one

appended file on Windows reads arbitrary data  (2006-05-15)
       http://python.org/sf/1488779  closed by  gbrandl

%g incorrect conversion  (2006-05-18)
       http://python.org/sf/1490688  closed by  gbrandl

non-keyword argument following keyword  (2006-04-22)
       http://python.org/sf/1474677  closed by  nnorwitz

Pickle & schizophrenic classes don't match  (2006-05-19)
       http://python.org/sf/1491688  closed by  ronaldoussoren

New / Reopened RFE
__________________

Backwards compatibility support for Py_ssize_t  (2006-05-10)
       http://python.org/sf/1485576  opened by  Konrad Hinsen

str.startswith/endswith could take a tuple?  (2006-05-19)
       http://python.org/sf/1491485  opened by  Tom Lynn

RFE Closed
__________

Add set.member() method  (2006-05-07)
       http://python.org/sf/1483384  closed by  rhettinger

Drop py.ico and pyc.ico  (2006-04-27)
       http://python.org/sf/1477968  closed by  loewis


From anthony at interlink.com.au  Sat May 20 05:36:34 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat, 20 May 2006 13:36:34 +1000
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <446DF917.4010709@ewtllc.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
Message-ID: <200605201336.36395.anthony@interlink.com.au>

Remember, the feature freeze isn't until beta1. New stuff can still go 
in after the next alpha, before beta1.

And pure speedup related items aren't likely to cause feature changes 
(I hope)

Anthony

From anthony at interlink.com.au  Sat May 20 05:37:38 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat, 20 May 2006 13:37:38 +1000
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <446E4549.7050203@bullseye.apana.org.au>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<446E4549.7050203@bullseye.apana.org.au>
Message-ID: <200605201337.40688.anthony@interlink.com.au>

On Saturday 20 May 2006 08:23, Andrew MacIntyre wrote:
> I'm sorry, but I find such advancing schedules with little warning
> quite objectionable.  Particularly the cutoff for new functionality
> implicit in the last of the alphas.

Nonono. Feature freeze is beta1.

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From martin at v.loewis.de  Sat May 20 08:01:53 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 20 May 2006 08:01:53 +0200
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <1f7befae0605191443n36ff36cax9fe18bc809449b9a@mail.gmail.com>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
	<1f7befae0605191443n36ff36cax9fe18bc809449b9a@mail.gmail.com>
Message-ID: <446EB0D1.8020209@v.loewis.de>

Tim Peters wrote:
> That's very peculiar.  Python has its own copy of the zlib code now,
> under Modules/zlib/.  That's version 1.2.3.
> 
>     http://mail.python.org/pipermail/python-dev/2005-December/058873.html
> 
> inflateCopy() is exported by Modules/zlib/inflate.c.  I wonder why the
> last gcc line above has "-lz" in it?

By convention, the Unix build doesn't use the included zlib. Unix users
typically insist on using the system libraries when they are available,
so nobody has even contributed a build procedure to use the included
zlib optionally.

Regards,
Martin

From martin at v.loewis.de  Sat May 20 08:25:12 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 20 May 2006 08:25:12 +0200
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
Message-ID: <446EB648.90804@v.loewis.de>

Guido van Rossum wrote:
> It seems I have libz 1.1.4. Is this no longer supported?

Apparently so. This function started to be used with

------------------------------------------------------------------------
r46012 | georg.brandl | 2006-05-16 09:38:27 +0200 (Tue, 16 May 2006) | 3 lines
Changed paths:
   M /python/trunk/Doc/lib/libzlib.tex
   M /python/trunk/Lib/test/test_zlib.py
   M /python/trunk/Misc/NEWS
   M /python/trunk/Modules/zlibmodule.c

Patch #1435422: zlib's compress and decompress objects now have a
copy() method.

------------------------------------------------------------------------

zlib itself contains inflateCopy since

Changes in 1.2.0 (9 March 2003)
    - Added inflateCopy() function to record state for random access on
      externally generated deflate streams (e.g. in gzip files)

The options for Python now are these:
1. require users to install zlib 1.2.x if they want the zlib module
   drawback: more work for the system administrator
2. conditionalize copy/uncopy on the system zlib being 1.2.x
   drawback: Python applications relying on these functions would
   break if the system zlib is too old
3. make setup.py fall back to the bundled zlib if the system zlib
   is too old
   drawback: you get all the problems of static linking, e.g.
   the size increase, and the problems with two zlib versions
   living in the same address space for some embedded Python
   applications
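Option 2 amounts to a version check before relying on inflateCopy(). A minimal sketch in Python, assuming the version string has the usual dotted form; the helper names parse_zlib_version and has_inflate_copy are hypothetical and not part of any patch discussed in this thread:

```python
import re
import zlib


def parse_zlib_version(text):
    """Return the leading dotted integers of a version string as a 3-tuple or longer."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("unrecognised zlib version: %r" % (text,))
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as "1.2" so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


def has_inflate_copy(version_text):
    """inflateCopy() was added in zlib 1.2.0, per the changelog quoted above."""
    return parse_zlib_version(version_text) >= (1, 2, 0)


# zlib.ZLIB_VERSION is the zlib version the module was compiled against.
print(zlib.ZLIB_VERSION, has_inflate_copy(zlib.ZLIB_VERSION))
```

With Guido's libz 1.1.4 the check would report False, so copy() would simply be left out rather than breaking the build.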

I'm not volunteering to implement any of the options.

Regards,
Martin



From g.brandl at gmx.net  Sat May 20 09:42:52 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 20 May 2006 09:42:52 +0200
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <446EB648.90804@v.loewis.de>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
	<446EB648.90804@v.loewis.de>
Message-ID: <e4mh9s$fgq$1@sea.gmane.org>

Martin v. Löwis wrote:
> Guido van Rossum wrote:
>> It seems I have libz 1.1.4. Is this no longer supported?
> 
> Apparently so. This function started to be used with
> 
> ------------------------------------------------------------------------
> r46012 | georg.brandl | 2006-05-16 09:38:27 +0200 (Tue, 16 May 2006) | 3 lines
> Changed paths:
>    M /python/trunk/Doc/lib/libzlib.tex
>    M /python/trunk/Lib/test/test_zlib.py
>    M /python/trunk/Misc/NEWS
>    M /python/trunk/Modules/zlibmodule.c
> 
> Patch #1435422: zlib's compress and decompress objects now have a
> copy() method.
> 
> ------------------------------------------------------------------------
> 
> zlib itself contains inflateCopy since
> 
> Changes in 1.2.0 (9 March 2003)
>     - Added inflateCopy() function to record state for random access on
>       externally generated deflate streams (e.g. in gzip files)
>
> The options for Python now are these:
> 1. require users to install zlib 1.2.x if they want the zlib module
>    drawback: more work for the system administrator
>
> 2. conditionalize copy/uncopy on the system zlib being 1.2.x
>    drawback: Python applications relying on these functions would
>    break if the system zlib is too old
>
> 3. make setup.py fall back to the bundled zlib if the system zlib
>    is too old
>    drawback: you get all the problems of static linking, e.g.
>    the size increase, and the problems with two zlib versions
>    living in the same address space for some embedded Python
>    applications
>
> I'm not volunteering to implement any of the options.

Of course, option 4 is to revert the patch if none of these options
are acceptable.

Georg


From me+python at modelnine.org  Thu May 18 08:59:26 2006
From: me+python at modelnine.org (Heiko Wundram)
Date: Thu, 18 May 2006 08:59:26 +0200
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605180006.51461.dcinege-mlists@psychosis.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
Message-ID: <200605180859.26327.me+python@modelnine.org>

On Thursday 18 May 2006 06:06, Dave Cinege wrote:
> This is useful, but possibly better put into practice as a separate
> method??

I personally don't think it's particularly useful, at least not in the 
special case that your patch tries to address.

1) Generally, you won't only have one character that does quoting, but 
several. Think of the Python syntax, where you have ", ', """ and ''', which 
all behave slightly differently. The logic for " and ' is simple enough to 
implement (basically that's what your patch does, and I'm sure it's easy 
enough to extend it to accept a range of characters as splitters), but if you 
have more complicated quoting operators (such as """), are you sure it's 
sensible to implement the logic in split()?

2) What should the result of "this is a \"test string".split(None,-1,'"') be? 
An exception (ParseError)? Silently ignoring the missing delimiter, and 
returning ['this','is','a','test string']? Ignoring the delimiter altogether, 
returning ['this','is','a','"test','string']? I don't think there's one case 
to satisfy all here...

3) What about escapes of the delimiter? Your current patch doesn't address 
them at all (AFAICT) at the moment, but what should the escaping character 
be? Should "escape processing" take place, i.e., what should the result 
of "this is a \\\"delimiter \\test".split(None,-1,'"') be?
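For comparison, the standard library's shlex module already takes a position on questions 2 and 3 for shell-style quoting: an unbalanced quote raises ValueError, and a backslash escapes the delimiter. A small sketch of that behaviour, not a proposal for split() itself:

```python
import shlex

# Balanced quotes: the quoted run stays together as one field.
balanced = shlex.split('this is a "test string"')
print(balanced)

# Question 2: a missing closing quote is an error, not silently ignored.
try:
    shlex.split('this is a "test string')
    unbalanced_error = None
except ValueError as exc:
    unbalanced_error = exc
print("unbalanced quote ->", unbalanced_error)

# Question 3: a backslash-escaped quote is kept as a literal character.
escaped = shlex.split('this is a \\"delimiter test')
print(escaped)
```

Other quoting conventions (csv dialects, Python's triple quotes) answer the same questions differently, which is the point being made above.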

Don't get me wrong, I personally find this functionality very, very 
interesting (I'm +0.5 on adding it in some way or another), especially as a 
part of the standard library (not necessarily as an extension to .split()).

But there's quite a lot of semantic stuff to get right before you can 
implement it properly; see the complexity of the csv module, where you have 
to define pretty much all of this in the dialect you use to parse the csv 
file...

Why not write up a PEP?

--- Heiko.

From dcinegemlists2-dated-1148410955.43df49 at psychosis.com  Thu May 18 21:02:31 2006
From: dcinegemlists2-dated-1148410955.43df49 at psychosis.com (Dave Cinege)
Date: Thu, 18 May 2006 15:02:31 -0400
Subject: [Python-Dev] New string method - splitquoted - New EmailAddress
In-Reply-To: <0f0c01c67a54$0371f7c0$b14c2a97@bagio>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<200605180900.00583.me+python-dev@modelnine.org>
	<0f0c01c67a54$0371f7c0$b14c2a97@bagio>
Message-ID: <200605181502.31989.dcinegemlists2@psychosis.com>

Sorry to all about tmda on my dcinege-mlists email addy. It was not supposed 
to be, however the dash in dcinege-mlists was flipping out the latest 
incarnation of my mail server config. Please use this address to reply to me
in this thread.

Dave


From dcinegemlists2-dated-1148412023.b4d3d0 at psychosis.com  Thu May 18 21:20:19 2006
From: dcinegemlists2-dated-1148412023.b4d3d0 at psychosis.com (Dave Cinege)
Date: Thu, 18 May 2006 15:20:19 -0400
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <ca471dc20605180811m1cec4860pab52731352df57da@mail.gmail.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<ca471dc20605180811m1cec4860pab52731352df57da@mail.gmail.com>
Message-ID: <200605181520.19915.dcinegemlists2@psychosis.com>

On Thursday 18 May 2006 11:11, Guido van Rossum wrote:
> This is not an appropriate function to add as a string method. There
> are too many conventions for quoting and too many details to get
> right. One method can't possibly handle them all without an enormous
> number of weird options. It's better to figure out how to do this with
> regexps or use some of the other approaches that have been suggested.
> (Did anyone mention the csv module yet? It deals with this too.)

Maybe my idea is better called splitexcept instead of splitquoted, as my goal 
is to (simply) provide a way to limit the split by delimiters, and not dive 
into an all encompassing quoting algorithm.

To me, this is in the spirit of the maxsplit option already present.

Dave


From dcinegemlists2-dated-1148418122.b868e0 at psychosis.com  Thu May 18 23:01:58 2006
From: dcinegemlists2-dated-1148418122.b868e0 at psychosis.com (Dave Cinege)
Date: Thu, 18 May 2006 17:01:58 -0400
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <446CD574.70903@scottdial.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<446CD574.70903@scottdial.com>
Message-ID: <200605181701.58169.dcinegemlists2@psychosis.com>

On Thursday 18 May 2006 16:13, you wrote:
> Dave Cinege wrote:
> > For example:
> >
> > s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
>
> My complaint with this example is that you are just using the wrong tool
> to do this job. If I was going to do this, I would've immediately jumped
> on the regex-press train.
>
> wifi_info = re.match('^\s+'
>                       'Chan:\s+(?P<channel>[0-9]+)\s+'
>                       'SNR:\s+(?P<snr>[0-9]+)\s+'
>                       'ESSID:\s+"(?P<essid>[^"]*)"\s+'
>                       'Enc:\s+(?P<encryption>[a-zA-Z]+)'
>                       , s)

For the 5 years I've been pythoning, I've used re probably twice. 
I find regex to be a tool of last resort, and quite a bit of effort to get 
right, as regex (for me) is quite prone to giving unintended results without 
a good deal of thought. I don't want to have to think. That's why I use 
python.  : )

.split() and slicing have always been python's holy grail for me, and I find it 
a lot easier to .replace() 'stray' chars with spaces or a delimiter and then 
split() that.  It's easier to read and (should be) a lot quicker to process 
than regex. (Which I care about, as I'm also often on embedded CPUs of a few 
hundred MHz)

So .split works just super duper.....but I keep running into situations where 
I'd like a substr to be excluded from the splitting.

The clearest one is excluding a 'quoted' string that has whitespace.
Here's another, albeit a very poor example: 

s = '\t\tFrequency:2.462 GHz (Channel 11)'	# This is real output from iwlist:
s.replace(':',')').replace(' (','))').split(None,-1,')')
['Frequency', '2.462 GHz', 'Channel 11']

I wanted to preserve the '2.462 GHz' substr. Let's assume, that could come out 
as '900 MHz' or '11.3409 GHz'. The above code gets what I want in 1 shot, 
either way. Show me an easier way, that doesn't need multiple splits, and 
string re-assembly, ....and I'll use it.
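One alternative that needs no new split() argument, sketched with only the two sample strings from this thread as evidence; the regular expression is an assumption about the line layout, not a general iwlist parser:

```python
import re
import shlex

line = '\t\tFrequency:2.462 GHz (Channel 11)'
# A label, then everything up to the parenthesis, then the parenthesised part.
match = re.match(r'\s*(\w+):(.+?)\s*\((.+)\)', line)
fields = list(match.groups())
print(fields)

wifi = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
# shlex keeps the double-quoted ESSID together as a single token.
tokens = shlex.split(wifi)
# Pair each "Key:" token with the value that follows it.
info = dict((key.rstrip(':'), value)
            for key, value in zip(tokens[0::2], tokens[1::2]))
print(info)
```

Both calls are single passes over the string, with no intermediate replace() chain and no re-assembly afterwards.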

Dave


From guido at python.org  Sat May 20 16:47:01 2006
From: guido at python.org (Guido van Rossum)
Date: Sat, 20 May 2006 07:47:01 -0700
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <e4mh9s$fgq$1@sea.gmane.org>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
	<446EB648.90804@v.loewis.de> <e4mh9s$fgq$1@sea.gmane.org>
Message-ID: <ca471dc20605200747i759c28d7v5cb0793093ada2ba@mail.gmail.com>

On 5/20/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Martin v. Löwis wrote:
> > Guido van Rossum wrote:
> >> It seems I have libz 1.1.4. Is this no longer supported?
> >
> > Apparently so. This function started to be used with
> >
> > ------------------------------------------------------------------------
> > r46012 | georg.brandl | 2006-05-16 09:38:27 +0200 (Tue, 16 May 2006) | 3 lines
> > Changed paths:
> >    M /python/trunk/Doc/lib/libzlib.tex
> >    M /python/trunk/Lib/test/test_zlib.py
> >    M /python/trunk/Misc/NEWS
> >    M /python/trunk/Modules/zlibmodule.c
> >
> > Patch #1435422: zlib's compress and decompress objects now have a
> > copy() method.
> >
> > ------------------------------------------------------------------------
> >
> > zlib itself contains inflateCopy since
> >
> > Changes in 1.2.0 (9 March 2003)
> >     - Added inflateCopy() function to record state for random access on
> >       externally generated deflate streams (e.g. in gzip files)
> >
> > The options for Python now are these:
> > 1. require users to install zlib 1.2.x if they want the zlib module
> >    drawback: more work for the system administrator

That's unacceptable; I'd have to fight a whole layer of bureaucracy to
change the system image (and I wouldn't even want to do this since one
premise here is that engineering workstations are interchangeable).

> > 2. conditionalize copy/uncopy on the system zlib being 1.2.x
> >    drawback: Python applications relying on these functions would
> >    break if the system zlib is too old

That would be fine for me. Since the feature is apparently brand new
(in Python) I doubt that I'd miss it.

> > 3. make setup.py fall back to the bundled zlib if the system zlib
> >    is too old
> >    drawback: you get all the problems of static linking, e.g.
> >    the size increase, and the problems with two zlib versions
> >    living in the same address space for some embedded Python
> >    applications

That would work too (probably better). As long as it prints some kind
of warning during the build, IMO this would be the best option. The
drawbacks don't matter for me.

> > I'm not volunteering to implement any of the options.
>
> Of course, option 4 is to revert the patch if none of these options
> are acceptable.

Not so quick. :-)

What was the purpose of the patch in the first place?

If you could volunteer #3 that would be great IMO.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat May 20 16:57:08 2006
From: guido at python.org (Guido van Rossum)
Date: Sat, 20 May 2006 07:57:08 -0700
Subject: [Python-Dev] New string method - splitquoted
In-Reply-To: <200605181701.58169.dcinegemlists2@psychosis.com>
References: <200605180006.51461.dcinege-mlists@psychosis.com>
	<446CD574.70903@scottdial.com>
	<200605181701.58169.dcinegemlists2@psychosis.com>
Message-ID: <ca471dc20605200757h65f44301s5792f5ef8985dc77@mail.gmail.com>

"I'm sorry Dave, I'm afraid I can't do that."

We hear you, Dave, but this is not a suitable function to add to the
standard library. Many respondents are trying to tell you that in many
different ways. If you keep arguing for it, we'll just ignore you.

--Guido

PS. Give up TMDA. Try Spambayes instead. It works much better and is
less annoying for your correspondents.

On 5/18/06, Dave Cinege
<dcinegemlists2-dated-1148418122.b868e0 at psychosis.com> wrote:
> On Thursday 18 May 2006 16:13, you wrote:
> > Dave Cinege wrote:
> > > For example:
> > >
> > > s = '      Chan: 11  SNR: 22  ESSID: "Spaced Out Wifi"  Enc: On'
> >
> > My complaint with this example is that you are just using the wrong tool
> > to do this job. If I was going to do this, I would've immediately jumped
> > on the regex-press train.
> >
> > wifi_info = re.match('^\s+'
> >                       'Chan:\s+(?P<channel>[0-9]+)\s+'
> >                       'SNR:\s+(?P<snr>[0-9]+)\s+'
> >                       'ESSID:\s+"(?P<essid>[^"]*)"\s+'
> >                       'Enc:\s+(?P<encryption>[a-zA-Z]+)'
> >                       , s)
>
> For the 5 years I've been pythoning, I've used re probably twice.
> I find regex to be a tool of last resort, and quite a bit of effort to get
> right, as regex (for me) is quite prone to giving unintended results without
> a good deal of thought. I don't want to have to think. That's why I use
> python.  : )
>
> .split() and slicing have always been python's holy grail for me, and I find it
> a lot easier to .replace() 'stray' chars with spaces or a delimiter and then
> split() that.  It's easier to read and (should be) a lot quicker to process
> than regex. (Which I care about, as I'm also often on embedded CPUs of a few
> hundred MHz)
>
> So .split works just super duper.....but I keep running into situations where
> I'd like a substr to be excluded from the splitting.
>
> The clearest one is excluding a 'quoted' string that has whitespace.
> Here's another, albeit a very poor example:
>
> s = '\t\tFrequency:2.462 GHz (Channel 11)'      # This is real output from iwlist:
> s.replace(':',')').replace(' (','))').split(None,-1,')')
> ['Frequency', '2.462 GHz', 'Channel 11']
>
> I wanted to preserve the '2.462 GHz' substr. Let's assume, that could come out
> as '900 MHz' or '11.3409 GHz'. The above code gets what I want in 1 shot,
> either way. Show me an easier way, that doesn't need multiple splits, and
> string re-assembly, ....and I'll use it.
>
> Dave


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Sun May 21 09:10:04 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 21 May 2006 00:10:04 -0700
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <200605201336.36395.anthony@interlink.com.au>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
	<200605201336.36395.anthony@interlink.com.au>
Message-ID: <ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>

On 5/19/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> Remember, the feature freeze isn't until beta1. New stuff can still go
> in after the next alpha, before beta1.
>
> And pure speedup related items aren't likely to cause feature changes
> (I hope)

I agree. Of course, it's preferable to get things in ASAP to get more
testing.  I should probably check in my perf patch for speeding up
function calls.  It would be great if someone picked that up in the
sprint and polished it off.  The last version I posted should work in
debug and normal modes, and adds some simple profiling.  I think the only
thing that it didn't have that Martin requested was more inlining of
PyCFunction_CallMethod(?).  I'm not sure if that would be faster or
not.

n

From martin at v.loewis.de  Sun May 21 12:01:44 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 21 May 2006 12:01:44 +0200
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <ca471dc20605200747i759c28d7v5cb0793093ada2ba@mail.gmail.com>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>	<446EB648.90804@v.loewis.de>
	<e4mh9s$fgq$1@sea.gmane.org>
	<ca471dc20605200747i759c28d7v5cb0793093ada2ba@mail.gmail.com>
Message-ID: <44703A88.3040800@v.loewis.de>

Guido van Rossum wrote:
> What was the purpose of the patch in the first place?

I don't fully understand it. I guess the objective of the patch
was to expose the feature of the underlying library. The SF
submission then gives the rationale for that feature as

"""
Copying a
(de)compression object allows a developer to store the
state of the (de)compressor at a certain point of the
input stream in order to more efficiently compress data
sharing some identical header, or to more efficiently
seek inside compressed data.
"""

The submitter first posted this message to c.l.p:

http://groups.google.com/group/comp.lang.python/msg/13f8ce4057162ad5

Apparently, one application is to take a snapshot of a gzip
file that is being created, then add to it, and if that fails
(e.g. because the size of the CD would be exceeded), truncate
it back to the snapshot (and then close the zlib stream).
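The "identical header" use from the SF rationale can be sketched with the copy() method that patch #1435422 added to compression objects. The payloads below are made up for illustration:

```python
import zlib

header = b"common header line\n" * 20

base = zlib.compressobj()
head_bytes = base.compress(header)  # may be empty; zlib buffers internally
snapshot = base.copy()              # compressor state right after the header

# Finish two independent streams from the same starting state.
stream_a = head_bytes + base.compress(b"body A") + base.flush()
stream_b = head_bytes + snapshot.compress(b"body B") + snapshot.flush()

print(zlib.decompress(stream_a) == header + b"body A")
print(zlib.decompress(stream_b) == header + b"body B")
```

The header is compressed once; each copy then continues from that state, which is also what makes the snapshot-and-roll-back use described above possible.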

Regards,
Martin



From aahz at pythoncraft.com  Sun May 21 15:53:13 2006
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 21 May 2006 06:53:13 -0700
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
	<200605201336.36395.anthony@interlink.com.au>
	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
Message-ID: <20060521135313.GA13431@panix.com>

On Sun, May 21, 2006, Neal Norwitz wrote:
> On 5/19/06, Anthony Baxter <anthony at interlink.com.au> wrote:
>>
>>Remember, the feature freeze isn't until beta1. New stuff can still go
>>in after the next alpha, before beta1.
> 
> I agree. Of course, it's preferable to get things in ASAP to get more
> testing.  

Nevertheless, I still think that putting out an alpha during or right
before a major sprint is awkward at best.  My testing is more focused on
alpha releases than the trunk, and I think other people work similarly.
Except for Anthony, the responses you've gotten to this surprising
change have been negative; this doesn't seem in keeping with our usual
decision-making process.

BTW, the PEP on python.org hasn't been updated.  Please post the current
schedule here.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes

From ncoghlan at gmail.com  Sun May 21 16:25:34 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 22 May 2006 00:25:34 +1000
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <20060521135313.GA13431@panix.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>	<20060519160443.GA7282@panix.com>
	<446DF917.4010709@ewtllc.com>	<200605201336.36395.anthony@interlink.com.au>	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
	<20060521135313.GA13431@panix.com>
Message-ID: <4470785E.7030803@gmail.com>

Aahz wrote:
> On Sun, May 21, 2006, Neal Norwitz wrote:
>> On 5/19/06, Anthony Baxter <anthony at interlink.com.au> wrote:
>>> Remember, the feature freeze isn't until beta1. New stuff can still go
>>> in after the next alpha, before beta1.
>> I agree. Of course, it's preferable to get things in ASAP to get more
>> testing.  
> 
> Nevertheless, I still think that putting out an alpha during or right
> before a major sprint is awkward at best.  My testing is more focused on
> alpha releases than the trunk, and I think other people work similarly.
> Except for Anthony, the responses you've gotten to this surprising
> change have been negative; this doesn't seem in keeping with our usual
> decision-making process.

Moving the alpha release from after a weekend to before a weekend is a bit of 
a hassle for me, too. I'd hoped to have the functional->functools move done 
this weekend, but wasn't too worried about getting it finished because I had 
next weekend as a fallback.

Now it appears I'll need to find time to do it sometime this week instead. 
That's certainly possible, but it's somewhat inconvenient. We don't all get to 
do Python-related things on work time ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From thomas at python.org  Sun May 21 16:33:26 2006
From: thomas at python.org (Thomas Wouters)
Date: Sun, 21 May 2006 16:33:26 +0200
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <20060521135313.GA13431@panix.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
	<200605201336.36395.anthony@interlink.com.au>
	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
	<20060521135313.GA13431@panix.com>
Message-ID: <9e804ac0605210733j52079d84hf8c71a428131a15c@mail.gmail.com>

On 5/21/06, Aahz <aahz at pythoncraft.com> wrote:
>
> On Sun, May 21, 2006, Neal Norwitz wrote:
> > On 5/19/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> >>
> >>Remember, the feature freeze isn't until beta1. New stuff can still go
> >>in after the next alpha, before beta1.
> >
> > I agree. Of course, it's preferable to get things in ASAP to get more
> > testing.
>
> Nevertheless, I still think that putting out an alpha during or right
> before a major sprint is awkward at best.  My testing is more focused on
> alpha releases than the trunk, and I think other people work similarly.
> Except for Anthony, the responses you've gotten to this surprising
> change have been negative; this doesn't seem in keeping with our usual
> decision-making process.


Allow me to pitch in my support for speeding up the releases, then. I don't
believe alpha releases (of Python) get serious testing. They whet appetites,
but until the feature-freeze it's really too much in flux to even worry
about extensive testing. Speeding up the releases makes sense to me,
considering how little ruckus they've made so far (in spite of two rather
complex internal changes), and it also means people get their hands on a
beta -- good testing material -- that much sooner.

An acceptable (to me) alternative would be to skip alpha3. The biggest
change is Martin's rewrite of various os.* methods, which IMHO isn't really
that big a deal (it's not *supposed* to change too many semantics, anyway :)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From me+python-dev at modelnine.org  Sun May 21 17:10:51 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Sun, 21 May 2006 17:10:51 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and list-comp
	syntax
Message-ID: <200605211710.51720.me+python-dev@modelnine.org>

Hi all!

The following PEP tries to make the case for a slight unification of for 
statement and list comprehension syntax.

Comments appreciated, including on the sample implementation.

===
PEP: xxx
Title: Unification of for-statement and list-comprehension syntax
Version: $Revision$
Last-Modified: $Date$
Author: Heiko Wundram <me at modelnine.org>
Status: Active
Type: Standards Track
Content-Type: text/plain
Created: 21-May-2006
Post-History: 21-May-2006 17:00 GMT+0200


Abstract

    When list comprehensions were introduced, they added the ability
    to add conditions which are tested before the expression which is
    associated with the list comprehension is evaluated. This is
    often used to create new lists which consist only of those items
    of the original list which match the specified condition(s). For
    example:

      [node for node in tree if node.haschildren()]

    will create a new list which only contains those items of the
    original list (tree) whose items match the haschildren()
    condition. Generator expressions work similarly.

    With a standard for-loop, this corresponds to adding a continue
    statement testing for the negated expression at the beginning of
    the loop body.

    As I've noticed that I find myself typing the latter quite often
    in code I write, it would only be sensible to add the corresponding
    syntax for the for statement:

      for node in tree if node.haschildren():
          <do something with node>

    as syntactic sugar for:

      for node in tree:
          if not node.haschildren():
              continue
          <do something with node>

    There are several other methods (including generator-expressions or
    list-comprehensions, the itertools module, or the builtin filter
    function) to achieve this same goal, but all of them make the code
    longer and harder to understand and/or require more memory, because
    of the generation of an intermediate list.
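
    The equivalence described above, and the alternatives just listed,
    can be sketched in runnable form. This is only an illustration,
    written for a present-day Python 3 interpreter rather than the 2.5
    trunk; the Node class, its haschildren() method and the sample tree
    are hypothetical stand-ins for the objects in the example.

```python
class Node:
    """Hypothetical stand-in for the tree nodes used in the example."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def haschildren(self):
        return bool(self.children)


tree = [Node("a", [Node("a1")]), Node("b"), Node("c", [Node("c1")])]

# 1. Guard with continue: the form the proposed syntax would abbreviate.
via_continue = []
for node in tree:
    if not node.haschildren():
        continue
    via_continue.append(node.name)

# 2. Generator expression: the filtering happens lazily, item by item.
via_genexp = []
for node in (n for n in tree if n.haschildren()):
    via_genexp.append(node.name)

# 3. Builtin filter (a list in Python 2, a lazy iterator in Python 3).
via_filter = []
for node in filter(Node.haschildren, tree):
    via_filter.append(node.name)
```

    All three loops visit the same nodes ("a" and "c" here); they differ
    only in where the condition is written.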


Implementation details

    The implementation of this feature requires changes to the Python
    grammar, to allow for a variable number of 'if'-expressions before
    the colon of a 'for'-statement:

      for_stmt: 'for' exprlist 'in' testlist_safe ('if' old_test)* ':'
                    suite ['else' ':' suite]

    This change would replace testlist with testlist_safe as the
    'in'-expression of a for statement, in line with the definition of
    list comprehensions in the Python grammar.

    Each of the 'if'-expressions is evaluated in turn (if present), until
    one is found False, in which case the 'for'-statement restarts at the
    next item from the generator of the 'in'-expression immediately
    (the tests are thus short-circuiting), or until all are found to be
    True (or there are no tests), in which case the suite body is executed.
    The behaviour of the 'else'-suite is unchanged.
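
    The evaluation order described above can be emulated with existing
    syntax. The following is a sketch under stated assumptions, not the
    patch's behaviour verified against the trunk: is_even() and is_big()
    are hypothetical predicates that record each call, so that the
    short-circuiting and the unchanged 'else'-suite become visible.

```python
calls = []  # records which test was evaluated for which item


def is_even(x):
    calls.append(("even", x))
    return x % 2 == 0


def is_big(x):
    calls.append(("big", x))
    return x > 2


seen = []
for x in range(5):
    # Emulates the proposed: for x in range(5) if is_even(x) if is_big(x):
    if not is_even(x):
        continue  # is_big() is never consulted for odd x
    if not is_big(x):
        continue
    seen.append(x)
else:
    # Runs as usual, because the loop was not left via break.
    seen.append("else")
```

    Only the item 4 passes both tests, and the second test is skipped
    whenever the first one fails.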

    The intermediate code that is generated is modelled after the
    byte-code that is generated for list comprehensions:

      def f():
          for x in range(10) if x == 1:
              print x

    would generate:

      2           0 SETUP_LOOP              42 (to 45)
                  3 LOAD_GLOBAL              0 (range)
                  6 LOAD_CONST               1 (10)
                  9 CALL_FUNCTION            1
                 12 GET_ITER
            >>   13 FOR_ITER                28 (to 44)
                 16 STORE_FAST               0 (x)
                 19 LOAD_FAST                0 (x)
                 22 LOAD_CONST               2 (1)
                 25 COMPARE_OP               2 (==)
                 28 JUMP_IF_FALSE            9 (to 40)
                 31 POP_TOP

      3          32 LOAD_FAST                0 (x)
                 35 PRINT_ITEM
                 36 PRINT_NEWLINE
                 37 JUMP_ABSOLUTE           13
            >>   40 POP_TOP
                 41 JUMP_ABSOLUTE           13
            >>   44 POP_BLOCK
            >>   45 LOAD_CONST               0 (None)
                 48 RETURN_VALUE

    where all tests are inserted immediately at the beginning of the
    loop body, and jump to a new block if found to be false which pops
    the comparison from the stack and jumps back to the beginning of
    the loop to fetch the next item.
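
    For comparison, the explicit 'continue' form can be disassembled
    with the standard dis module. This is a minimal sketch for a
    present-day Python 3 interpreter; opcode names, offsets and jump
    targets vary between versions and will not match the listing above,
    which comes from the patched 2.5 trunk.

```python
import dis


def f():
    # The spelling that the proposed syntax would abbreviate.
    for x in range(10):
        if x != 1:
            continue
        print(x)


# Collect the opcode names, then print the full listing for inspection.
opnames = [ins.opname for ins in dis.get_instructions(f)]
dis.dis(f)
```

    In both cases the test sits directly behind the instruction that
    fetches the next item, and a failed test jumps back to it.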

Implementation issues

    The changes are backwards-compatible, as they don't change the
    default behaviour of the 'for'-loop. Also, as the changes that
    this PEP proposes don't change the byte-code structure of the
    interpreter, old byte-code continues to run on Python with this
    addition unchanged.


Implementation

    A sample implementation (with updates to the grammar documentation
    and a small test case) is available at:

      http://sourceforge.net/tracker/index.php?func=detail&aid=1492509&group_id=5470&atid=305470


Copyright

    This document has been placed in the public domain.
===

--- Heiko.

From guido at python.org  Sun May 21 17:34:16 2006
From: guido at python.org (Guido van Rossum)
Date: Sun, 21 May 2006 08:34:16 -0700
Subject: [Python-Dev] zlib module doesn't build - inflateCopy() not found
In-Reply-To: <44703A88.3040800@v.loewis.de>
References: <ca471dc20605191431qaf6de33l8dac8342b2c14dbe@mail.gmail.com>
	<446EB648.90804@v.loewis.de> <e4mh9s$fgq$1@sea.gmane.org>
	<ca471dc20605200747i759c28d7v5cb0793093ada2ba@mail.gmail.com>
	<44703A88.3040800@v.loewis.de>
Message-ID: <ca471dc20605210834o435d0bbcybdc8c469a63df4b9@mail.gmail.com>

Then options 2 and 3 are both fine.

Not compiling at all is *not*, so if nobody has time to implement 2 or
3, we'll have to do 4.

--Guido

On 5/21/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum wrote:
> > What was the purpose of the patch in the first place?
>
> I don't fully understand it. I guess the objective of the patch
> was to expose the feature of the underlying library. The SF
> submission then gives the rationale for that feature as
>
> """
> Copying a
> (de)compression object allows a developer to store the
> state of the (de)compressor at a certain point of the
> input stream in order to more efficiently compress data
> sharing some identical header, or to more efficiently
> seek inside compressed data.
> """
>
> The submitter first posted this message to c.l.p:
>
> http://groups.google.com/group/comp.lang.python/msg/13f8ce4057162ad5
>
> Apparently, one application is to take a snapshot of a gzip
> file that is being created, then add to it, and if that fails
> (e.g. because the size of the CD would be exceeded), truncate
> it back to the snapshot (and then close the zlib stream).
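
The snapshot-and-roll-back use case quoted above might look roughly like
this with the copy() method of zlib compression objects. It is a sketch
only, written for a present-day Python 3 interpreter: try_append(), the
200-byte limit and the payloads are hypothetical and not taken from the
SF submission.

```python
import os
import zlib


def try_append(comp, written, chunk, limit):
    """Compress chunk unless the finished stream would exceed limit.

    Returns (compressor, bytes_written_so_far, accepted).
    """
    trial = comp.copy()             # work on a copy; comp is the snapshot
    out = trial.compress(chunk)
    closing = trial.copy().flush()  # what closing the stream now would cost
    if len(written) + len(out) + len(closing) > limit:
        return comp, written, False  # roll back to the snapshot
    return trial, written + out, True


comp = zlib.compressobj()
written = b""
comp, written, ok_small = try_append(comp, written, b"small payload", limit=200)
comp, written, ok_large = try_append(comp, written, os.urandom(4096), limit=200)

# Close the stream from whichever state was kept.
stream = written + comp.flush()
```

The second chunk is rejected because random data does not compress, so
the finished stream contains only the first payload.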
>
> Regards,
> Martin
>
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun May 21 17:38:49 2006
From: guido at python.org (Guido van Rossum)
Date: Sun, 21 May 2006 08:38:49 -0700
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <200605211710.51720.me+python-dev@modelnine.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
Message-ID: <ca471dc20605210838g697d0bf6v215bc5d0c7d2d800@mail.gmail.com>

-1. The contraction just makes it easier to miss the logic.

Also, it would be a parsing conflict for the new conditional
expressions (x if T else y).

This was proposed and rejected before.

--Guido

On 5/21/06, Heiko Wundram <me+python-dev at modelnine.org> wrote:
> Hi all!
>
> The following PEP tries to make the case for a slight unification of for
> statement and list comprehension syntax.
>
> Comments appreciated, including on the sample implementation.
>
> ===
> PEP: xxx
> Title: Unification of for-statement and list-comprehension syntax
> Version: $Revision$
> Last-Modified: $Date$
> Author: Heiko Wundram <me at modelnine.org>
> Status: Active
> Type: Standards Track
> Content-Type: text/plain
> Created: 21-May-2006
> Post-History: 21-May-2006 17:00 GMT+0200
>
>
> Abstract
>
>     When list comprehensions were introduced, they added the ability
>     to add conditions which are tested before the expression which is
>     associated with the list comprehension is evaluated. This is
>     often used to create new lists which consist only of those items
>     of the original list which match the specified condition(s). For
>     example:
>
>       [node for node in tree if node.haschildren()]
>
>     will create a new list which only contains those items of the
>     original list (tree) whose items match the haschildren()
>     condition. Generator expressions work similarly.
>
>     With a standard for-loop, this corresponds to adding a continue
>     statement testing for the negated expression at the beginning of
>     the loop body.
>
>     As I've noticed that I find myself typing the latter quite often
>     in code I write, it would only be sensible to add the corresponding
>     syntax for the for statement:
>
>       for node in tree if node.haschildren():
>           <do something with node>
>
>     as syntactic sugar for:
>
>       for node in tree:
>           if not node.haschildren():
>               continue
>           <do something with node>
>
>     There are several other methods (including generator-expressions or
>     list-comprehensions, the itertools module, or the builtin filter
>     function) to achieve this same goal, but all of them make the code
>     longer and harder to understand and/or require more memory, because
>     of the generation of an intermediate list.
>
>
> Implementation details
>
>     The implementation of this feature requires changes to the Python
>     grammar, to allow for a variable number of 'if'-expressions before
>     the colon of a 'for'-statement:
>
>       for_stmt: 'for' exprlist 'in' testlist_safe ('if' old_test)* ':'
>                     suite ['else' ':' suite]
>
>     This change would replace testlist with testlist_safe as the
>     'in'-expression of a for statement, in line with the definition of
>     list comprehensions in the Python grammar.
>
>     Each of the 'if'-expressions is evaluated in turn (if present), until
>     one is found False, in which case the 'for'-statement restarts at the
>     next item from the generator of the 'in'-expression immediately
>     (the tests are thus short-circuiting), or until all are found to be
>     True (or there are no tests), in which case the suite body is executed.
>     The behaviour of the 'else'-suite is unchanged.
>
>     The intermediate code that is generated is modelled after the
>     byte-code that is generated for list comprehensions:
>
>       def f():
>           for x in range(10) if x == 1:
>               print x
>
>     would generate:
>
>       2           0 SETUP_LOOP              42 (to 45)
>                   3 LOAD_GLOBAL              0 (range)
>                   6 LOAD_CONST               1 (10)
>                   9 CALL_FUNCTION            1
>                  12 GET_ITER
>             >>   13 FOR_ITER                28 (to 44)
>                  16 STORE_FAST               0 (x)
>                  19 LOAD_FAST                0 (x)
>                  22 LOAD_CONST               2 (1)
>                  25 COMPARE_OP               2 (==)
>                  28 JUMP_IF_FALSE            9 (to 40)
>                  31 POP_TOP
>
>       3          32 LOAD_FAST                0 (x)
>                  35 PRINT_ITEM
>                  36 PRINT_NEWLINE
>                  37 JUMP_ABSOLUTE           13
>             >>   40 POP_TOP
>                  41 JUMP_ABSOLUTE           13
>             >>   44 POP_BLOCK
>             >>   45 LOAD_CONST               0 (None)
>                  48 RETURN_VALUE
>
>     where all tests are inserted immediately at the beginning of the
>     loop body, and jump to a new block if found to be false which pops
>     the comparison from the stack and jumps back to the beginning of
>     the loop to fetch the next item.
>
> Implementation issues
>
>     The changes are backwards-compatible, as they don't change the
>     default behaviour of the 'for'-loop. Also, as the changes that
>     this PEP proposes don't change the byte-code structure of the
>     interpreter, old byte-code continues to run on Python with this
>     addition unchanged.
>
>
> Implementation
>
>     A sample implementation (with updates to the grammar documentation
>     and a small test case) is available at:
>
>
> http://sourceforge.net/tracker/index.php?func=detail&aid=1492509&group_id=5470&atid=305470
>
>
> Copyright
>
>     This document has been placed in the public domain.
> ===
>
> --- Heiko.


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From me+python-dev at modelnine.org  Sun May 21 17:57:46 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Sun, 21 May 2006 17:57:46 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <ca471dc20605210838g697d0bf6v215bc5d0c7d2d800@mail.gmail.com>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<ca471dc20605210838g697d0bf6v215bc5d0c7d2d800@mail.gmail.com>
Message-ID: <200605211757.46187.me+python-dev@modelnine.org>

Am Sonntag 21 Mai 2006 17:38 schrieb Guido van Rossum:
> -1. The contraction just makes it easier to miss the logic.

I actually don't think so, because it's pretty synonymous to what 'if' means 
for list comprehensions which use the same keywords (that's why I called 
it "unification of ... syntax"), but I guess it's superfluous to discuss this 
if you're -1.

> Also, it would be a parsing conflict for the new conditional
> expressions (x if T else y).

It isn't, if you change the grammar to use testlist_safe as 
the 'in'-expression, just as is used for list comprehensions. As I've said in 
the PEP, I've created a patch that implements this, and Python's test suite 
passes cleanly (except for a little buglet in test_dis, which stems from the 
fact that the generated byte-code for a for-loop is slightly altered by this 
patch).
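
The interaction with conditional expressions can be made concrete with
the ast module. This is a sketch for a present-day Python 3 interpreter,
where the iterable of a for statement may itself be an unparenthesized
conditional expression; the names a, b and flag are hypothetical. As far
as the 2.5 grammar goes, old_test excludes conditional expressions, so
under the proposed rule such an iterable would presumably need
parentheses. That is a reading of the grammar, not something checked
against the patch.

```python
import ast

# Valid today: the iterable is the conditional expression "a if flag else b".
source = "for x in a if flag else b:\n    pass\n"
loop = ast.parse(source).body[0]

is_for = isinstance(loop, ast.For)
iter_is_conditional = isinstance(loop.iter, ast.IfExp)

# The parenthesized spelling parses to exactly the same tree.
parenthesized = "for x in (a if flag else b):\n    pass\n"
same_tree = ast.dump(ast.parse(source)) == ast.dump(ast.parse(parenthesized))
```

So a trailing "if" in the loop header already has a meaning when an
"else" follows it, which is where the two readings could collide.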

> This was proposed and rejected before.

I haven't seen this proposed before (at least not in PEP form, or with a 
working implementation against the current trunk, or just in some form of 
mail on python-dev), so that's why I posted this. But, if I've really 
repeated things that were proposed before, feel free to ignore this.

--- Heiko.

From skip at pobox.com  Sun May 21 18:00:32 2006
From: skip at pobox.com (skip at pobox.com)
Date: Sun, 21 May 2006 11:00:32 -0500
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <9e804ac0605210733j52079d84hf8c71a428131a15c@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
	<200605201336.36395.anthony@interlink.com.au>
	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
	<20060521135313.GA13431@panix.com>
	<9e804ac0605210733j52079d84hf8c71a428131a15c@mail.gmail.com>
Message-ID: <17520.36512.329978.835781@montanaro.dyndns.org>


    Thomas> Allow me to pitch in my support for speeding up the releases,
    Thomas> then. I don't believe alpha releases (of Python) get serious
    Thomas> testing.

Is there nobody else out there that uses Python built from the current svn
repository as the main Python interpreter on a day-to-day basis?  Maybe that
doesn't qualify as rigorous testing.  I've been doing that for several
years.  I can only remember once or twice needing to fall back to something
less bleeding edge for a short while.

Skip

From thomas at python.org  Sun May 21 18:07:51 2006
From: thomas at python.org (Thomas Wouters)
Date: Sun, 21 May 2006 18:07:51 +0200
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <17520.36512.329978.835781@montanaro.dyndns.org>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
	<200605201336.36395.anthony@interlink.com.au>
	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
	<20060521135313.GA13431@panix.com>
	<9e804ac0605210733j52079d84hf8c71a428131a15c@mail.gmail.com>
	<17520.36512.329978.835781@montanaro.dyndns.org>
Message-ID: <9e804ac0605210907h2ba369f5ic83c56be105a62ce@mail.gmail.com>

On 5/21/06, skip at pobox.com <skip at pobox.com> wrote:
>
>
>     Thomas> Allow me to pitch in my support for speeding up the releases,
>     Thomas> then. I don't believe alpha releases (of Python) get serious
>     Thomas> testing.
>
> Is there nobody else out there that uses Python built from the current svn
> repository as the main Python interpreter on a day-to-day basis?  Maybe
> that
> doesn't qualify as rigorous testing.  I've been doing that for several
> years.  I can only remember once or twice needing to fall back to
> something
> less bleeding edge for a short while.


Sure, that counts. You only count as one, though (and so do I) -- and
neither of us is actually using an alpha release :-) I was talking about the
testing an alpha release attracts, in the wider population than just
python-dev subscribers. The fact that much testing goes on between releases
as well makes the alphas even less important. Whetting the appetites and
soliciting feedback on the major features is what they're for, in my eyes.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From steven.bethard at gmail.com  Sun May 21 18:08:08 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 21 May 2006 10:08:08 -0600
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <200605211757.46187.me+python-dev@modelnine.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<ca471dc20605210838g697d0bf6v215bc5d0c7d2d800@mail.gmail.com>
	<200605211757.46187.me+python-dev@modelnine.org>
Message-ID: <d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>

On 5/21/06, Heiko Wundram <me+python-dev at modelnine.org> wrote:
> Am Sonntag 21 Mai 2006 17:38 schrieb Guido van Rossum:
> > This was proposed and rejected before.
>
> I haven't seen this proposed before (at least not in PEP form, or with a
> working implementation against the current trunk, or just in some form of
> mail on python-dev), so that's why I posted this. But, if I've really
> repeated things that were proposed before, feel free to ignore this.

While this has been proposed before, I'd like to thank you for putting
together a full PEP and a working implementation.  I think you should
still submit the PEP, if for nothing else so that when the issue comes
up again, we can point to the PEP and explain that Guido's already
rejected it.

Steve
-- 
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From tjreedy at udel.edu  Sun May 21 20:31:11 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 21 May 2006 14:31:11 -0400
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
References: <200605211710.51720.me+python-dev@modelnine.org>
Message-ID: <e4qble$2ua$1@sea.gmane.org>


"Heiko Wundram" <me+python-dev at modelnine.org> wrote in message 
news:200605211710.51720.me+python-dev at modelnine.org...
>    As I've noticed that I find myself typing the latter quite often
>    in code I write, it would only be sensible to add the corresponding
>    syntax for the for statement:
>
>      for node in tree if node.haschildren():
>          <do something with node>
>
>    as syntactic sugar for:
>
>      for node in tree:
>          if not node.haschildren():
>              continue
>          <do something with node>

Isn't this the same as

      for node in tree:
          if  node.haschildren():
              <do something with node>

so that all you would save is ':\n' and the extra indents?

tjr





From me+python-dev at modelnine.org  Sun May 21 21:56:15 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Sun, 21 May 2006 21:56:15 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <e4qble$2ua$1@sea.gmane.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<e4qble$2ua$1@sea.gmane.org>
Message-ID: <200605212156.15265.me+python-dev@modelnine.org>

Am Sonntag 21 Mai 2006 20:31 schrieb Terry Reedy:
> Isn't this the same as
>
>       for node in tree:
>           if  node.haschildren():
>               <do something with node>
>
> so that all you would save is ':\n' and the extra indents?

Saving an extra indent and the ':\n' makes it quite a bit easier for me to 
read and understand in the long run. I find:

for x in range(10):
    if not x % 2:
        print "Another even number:",
        print x
        ...
        print "Finished processing the even number"

a lot harder to read and understand than:

for x in range(10):
    if x % 2:
        continue
    print "Another even number:",
    print x
    ...
    print "Finished processing the even number"

because with the latter it's much more obvious for me how the flow of control 
in the indented block is structured (and to what part of the loop the 
condition applies, namely to the whole loop body). That's why I'd prefer 
the "combination" of the two as:

for x in range(10) if not x % 2:
    print "Another even number:",
    print x
    ...
    print "Finished processing the even number"

because the logic is immediately obvious (at least if you find the logic to be 
obvious when reading list comprehensions, simply because the format is 
similar), and there's no extra indentation (as in your example), which makes 
the block, as I said, a lot easier to parse for my brain.

But, probably, that's personal preference.

--- Heiko.

From me+python-dev at modelnine.org  Sun May 21 22:04:43 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Sun, 21 May 2006 22:04:43 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<200605211757.46187.me+python-dev@modelnine.org>
	<d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>
Message-ID: <200605212204.44240.me+python-dev@modelnine.org>

Am Sonntag 21 Mai 2006 18:08 schrieb Steven Bethard:
> While this has been proposed before, I'd like to thank you for putting
> together a full PEP and a working implementation.  I think you should
> still submit the PEP, if for nothing else so that when the issue comes
> up again, we can point to the PEP and explain that Guido's already
> rejected it.

I'll submit the PEP tomorrow, after I've reworded it slightly. I'll also add 
Guido's rejection notice to the PEP, so that a number can be assigned to it 
properly and it can be added to the PEP list on python.org.

--- Heiko.

From talin at acm.org  Sun May 21 22:11:02 2006
From: talin at acm.org (Talin)
Date: Sun, 21 May 2006 13:11:02 -0700
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
 list-comp syntax
In-Reply-To: <d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>
References: <200605211710.51720.me+python-dev@modelnine.org>	<ca471dc20605210838g697d0bf6v215bc5d0c7d2d800@mail.gmail.com>	<200605211757.46187.me+python-dev@modelnine.org>
	<d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>
Message-ID: <4470C956.8010002@acm.org>

Steven Bethard wrote:
> On 5/21/06, Heiko Wundram <me+python-dev at modelnine.org> wrote:
> 
>>Am Sonntag 21 Mai 2006 17:38 schrieb Guido van Rossum:
>>
>>>This was proposed and rejected before.
>>
>>I haven't seen this proposed before (at least not in PEP form, or with a
>>working implementation against the current trunk, or just in some form of
>>mail on python-dev), so that's why I posted this. But, if I've really
>>repeated things that were proposed before, feel free to ignore this.
> 
> 
> While this has been proposed before, I'd like to thank you for putting
> together a full PEP and a working implementation.  I think you should
> still submit the PEP, if for nothing else so that when the issue comes
> up again, we can point to the PEP and explain that Guido's already
> rejected it.
> 
> Steve

I'd like to second Steve's point. I'm impressed by the thoroughness and 
organization of the PEP.

As a general guideline, I've noticed that proposals which are purely 
syntactic sugar are unlikely to be accepted unless there is some 
additional benefit other than just compression of source code.

-- Talin

From me+python-dev at modelnine.org  Sun May 21 23:17:46 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Sun, 21 May 2006 23:17:46 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <4470C956.8010002@acm.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>
	<4470C956.8010002@acm.org>
Message-ID: <200605212317.46181.me+python-dev@modelnine.org>

Am Sonntag 21 Mai 2006 22:11 schrieb Talin:
> As a general guideline, I've noticed that proposals which are purely
> syntactic sugar are unlikely to be accepted unless there is some
> additional benefit other than just compression of source code.

I know about this, but generally, I find there's more to this "syntactic 
sugar" than just source code compression.

1) It unifies the syntax for list comprehensions and for loops, which use the 
same keywords (and are thus identified easily). Basically, you can then think 
of a for-loop as a list comprehension which evaluates a suite instead of an 
expression, and doesn't store the evaluated items, or vice versa, which at 
the moment you can't, because of the difference in setup.

2) Just as I've replied to Terry J. Reedy, if you find list comprehensions easy 
to read, you're also bound to be able to understand what "for <expr> in 
<expr> if <expr>:" does, at least AFAICT.

3) Generally, indentation, as Terry J. Reedy suggested, isn't always good. If 
the body of the loop is more than a few lines long (which happens more often 
than not in my code), extra indentation is bound to confuse me. That's why I 
use "if not <expr>: continue" at the moment, so that I don't need extra 
indentation. If I could just append the condition to the loop to get it out 
of the way (just as you do with a list comprehension), I save the 
obnoxious "if not <expr>: continue" (which destroys my read flow of the body 
somewhat too), and still get the same behaviour without any extra 
(errorprone, in my eyes) indentation.

Anyway, it's fine if Guido says this is -1. I don't need to discuss this, as I 
guess it's a pretty futile attempt to get people to consider this against 
Guido's will. I just found this to be something I'd personally longed for for 
a long time, and had time free today, so I thought I'd just implement it. ;-) 

And, additionally, nobody's keeping me from making my own Python tree where I 
can keep this patch for my very personal scripts where I don't need to have 
interoperability. This is open source, isn't it? ;-)

--- Heiko.

From tdelaney at avaya.com  Mon May 22 01:28:28 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Mon, 22 May 2006 09:28:28 +1000
Subject: [Python-Dev] [Python-checkins] r46043 - peps/trunk/pep-0356.txt
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E702@au3010avexu1.global.avaya.com>

Steve Holden wrote:

> Will it be acceptable to add new (performance) changes between Alpha 3
> and beta 1, or would developers prefer to put a fourth alpha in to the
> schedule?
> 
> The intention is there should be no major functionality added by the
> sprint, simply that performance should be improved, so the schedule
> may work. It's just a call for the release manager really, I guess.

If there are no functionality changes, what would be the problem with
putting it in post-alpha?

Tim Delaney

From jcarlson at uci.edu  Mon May 22 01:59:25 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 21 May 2006 16:59:25 -0700
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <200605212317.46181.me+python-dev@modelnine.org>
References: <4470C956.8010002@acm.org>
	<200605212317.46181.me+python-dev@modelnine.org>
Message-ID: <20060521164055.689B.JCARLSON@uci.edu>


Heiko Wundram <me+python-dev at modelnine.org> wrote:
> 
> On Sunday 21 May 2006 22:11, Talin wrote:
> > As a general guideline, I've noticed that proposals which are purely
> > syntactic sugar are unlikely to be accepted unless there is some
> > additional benefit other than just compression of source code.
> 
> I know about this, but generally, I find there's more to this "syntactic 
> sugar" than just source code compression.
> 
> 1) It unifies the syntax for list comprehensions and for loops, which use the 

No, it /partially unifies/ list comprehensions and for loops.  To
actually unify them, you would need to allow for arbitrarily nested fors
and ifs...

for ... in ... [if ...] [for ... in ... [if ...]]*:

If I remember correctly, this was why it wasn't accepted before; because
actual unification is ugly.  Further, because quite literally the only
difference is some indentation, a colon, and a line break, the vast
majority of people would likely stick with "the one obvious way to do it",
which has always worked.
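
To make the "arbitrarily nested" point concrete, here is a small sketch in
current Python of a list comprehension with two for clauses and two if
clauses, next to the nested statements it corresponds to. The data and
variable names are made up for illustration only.

```python
# Hypothetical data, for illustration only.
rows = [[1, 2, 3], [], [4, 5, 6]]

# List comprehension with nested for/if clauses.
evens_lc = [x for row in rows if row for x in row if x % 2 == 0]

# The equivalent nested statements: each clause becomes one more level.
evens_loop = []
for row in rows:
    if row:
        for x in row:
            if x % 2 == 0:
                evens_loop.append(x)

print(evens_lc == evens_loop)
```

A fully unified for statement would have to accept the whole clause chain of
the first form in its header.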


> 2) Just as I've replied to Terry J. Reedy, if you find list comprehensions easy
> to read, you're also bound to be able to understand what "for <expr> in 
> <expr> if <expr>:" does, at least AFAICT.

Not everyone finds list comprehensions easy to read.


> 3) Generally, indentation, as Terry J. Reedy suggested, isn't always good. If
> the body of the loop is more than a few lines long (which happens more often 
> than not in my code), extra indentation is bound to confuse me. That's why I 
[snip]

I feel for you; I really do.  I've done the same thing myself. However,
I don't believe that it is a good practice, in general, and I don't
think that syntax should support this special case.


> And, additionally, nobody's keeping me from making my own Python tree where I 
> can keep this patch for my very personal scripts where I don't need to have 
> interoperability. This is open source, isn't it? ;-)

Do whatever you want.  Good luck with keeping your patches current
against whatever Python version you plan on using.


 - Josiah


From greg.ewing at canterbury.ac.nz  Mon May 22 02:22:03 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 May 2006 12:22:03 +1200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
 list-comp syntax
In-Reply-To: <200605211710.51720.me+python-dev@modelnine.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
Message-ID: <4471042B.70706@canterbury.ac.nz>

Heiko Wundram wrote:

>       for node in tree:
>           if not node.haschildren():
>               continue
>           <do something with node>

Er, you do realise that can be written more straightforwardly as

   for node in tree:
     if node.haschildren():
       <do something with node>
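
A quick sketch of why the two spellings are interchangeable, plus a
generator-expression spelling that already filters in the loop header
without new syntax. The Node class and its haschildren() body are stand-ins
invented for this example.

```python
class Node:
    """Hypothetical stand-in for the tree nodes in the example."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def haschildren(self):
        return bool(self.children)


tree = [Node("a", [Node("a1")]), Node("b"), Node("c", [Node("c1")])]

# Spelling 1: skip unwanted nodes with continue.
with_continue = []
for node in tree:
    if not node.haschildren():
        continue
    with_continue.append(node.name)

# Spelling 2: nest the body under the positive condition.
with_nested_if = []
for node in tree:
    if node.haschildren():
        with_nested_if.append(node.name)

# Spelling 3: existing syntax that filters in the loop header.
with_genexp = []
for node in (n for n in tree if n.haschildren()):
    with_genexp.append(node.name)
```

All three loops visit the same nodes; they differ only in layout.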

--
Greg

From greg.ewing at canterbury.ac.nz  Mon May 22 02:46:30 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 May 2006 12:46:30 +1200
Subject: [Python-Dev] PEP-xxx: Unification of for statement
 and	list-comp syntax
In-Reply-To: <200605212317.46181.me+python-dev@modelnine.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<d11dcfba0605210908p59451449hf77e95956863a7ed@mail.gmail.com>
	<4470C956.8010002@acm.org>
	<200605212317.46181.me+python-dev@modelnine.org>
Message-ID: <447109E6.7060002@canterbury.ac.nz>

Heiko Wundram wrote:

> 2) Just as I've replied to Terry J. Reedy, if you find list comprehensions easy
> to read, you're also bound to be able to understand what "for <expr> in 
> <expr> if <expr>:" does, at least AFAICT.

I tend to write non-trivial LCs on multiple lines, e.g.

   l = [foo(x)
     for x in stuff
       if something_about(x)]

for the very reason that it makes them easier to read.

--
Greg

From me+python-dev at modelnine.org  Mon May 22 06:56:44 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Mon, 22 May 2006 06:56:44 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <20060521164055.689B.JCARLSON@uci.edu>
References: <4470C956.8010002@acm.org>
	<200605212317.46181.me+python-dev@modelnine.org>
	<20060521164055.689B.JCARLSON@uci.edu>
Message-ID: <200605220656.44601.me+python-dev@modelnine.org>

On Monday 22 May 2006 01:59, Josiah Carlson wrote:
> > 1) It unifies the syntax for list comprehensions and for loops, which use
> > the
>
> No, it /partially unifies/ list comprehensions and for loops.  To
> actually unify them, you would need to allow for arbitrarily nested fors
> and ifs...
>
> for ... in ... [if ...] [for ... in ... [if ...]]*:
>
> If I remember correctly, this was why it wasn't accepted before; because
> actual unification is ugly.

This syntax is ugly, that's why the PEP doesn't try to make a case for this. 
But, one level equivalence to list comprehensions isn't bad, again, at least 
in my eyes.

> > 2) Just as I've replied to Terry J. Reedy, if you find list comprehensions
> > easy to read, you're also bound to be able to understand what "for <expr>
> > in <expr> if <expr>:" does, at least AFAICT.
>
> Not everyone finds list comprehensions easy to read.

Why has Python added list-comprehensions, then? (or at least, why has Python 
added the 'if'-expression to list-comprehensions if they're hard to read? 
filter/itertools/any of the proposed "workarounds" in the PEP would also work 
for list-comprehensions).

> > 3) Generally, indentation, as Terry J. Reedy suggested, isn't always good.
> > If the body of the loop is more than a few lines long (which happens more
> > often than not in my code), extra indentation is bound to confuse me.
> > That's why I
>
> [snip]
>
> I feel for you; I really do.  I've done the same thing myself. However,
> I don't believe that it is a good practice, in general, and I don't
> think that syntax should support this special case.

Why isn't this good practice? It's not always sensible to refactor loop code 
to call methods (to make the loop body shorter), and it's a pretty general 
case that you only want to iterate over part of a generator, not over the 
whole content. Because of this, list comprehensions grew the 'if'-clause. So: 
why doesn't the for-loop?
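
For reference, the filter/itertools workarounds mentioned above look roughly
like this in current Python (in 2006 the lazy variant was itertools.ifilter).
The numbers() generator and the is_wanted() predicate are made-up examples.

```python
from itertools import filterfalse


def numbers():
    """A generator standing in for any lazily produced sequence."""
    yield from range(10)


def is_wanted(n):
    # Hypothetical condition; any predicate works here.
    return n % 3 == 0


# Iterate over only part of the generator, with the body at one level.
kept = []
for n in filter(is_wanted, numbers()):
    kept.append(n)

# The complementary selection, using itertools.
skipped = list(filterfalse(is_wanted, numbers()))
```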

--- Heiko.

From me+python-dev at modelnine.org  Mon May 22 06:57:58 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Mon, 22 May 2006 06:57:58 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <4471042B.70706@canterbury.ac.nz>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<4471042B.70706@canterbury.ac.nz>
Message-ID: <200605220657.58925.me+python-dev@modelnine.org>

On Monday 22 May 2006 02:22, Greg Ewing wrote:
> Heiko Wundram wrote:
> >       for node in tree:
> >           if not node.haschildren():
> >               continue
> >           <do something with node>
>
> Er, you do realise that can be written more straightforwardly as
>
>    for node in tree:
>      if node.haschildren():
>        <do something with node>

Yes, of course. Read my replies to Terry J. Reedy, to Josiah Carlson, to Talin,
to see why I chose to compare it to the 'continue' syntax.

--- Heiko.

From me+python-dev at modelnine.org  Mon May 22 07:01:33 2006
From: me+python-dev at modelnine.org (Heiko Wundram)
Date: Mon, 22 May 2006 07:01:33 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <447109E6.7060002@canterbury.ac.nz>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<200605212317.46181.me+python-dev@modelnine.org>
	<447109E6.7060002@canterbury.ac.nz>
Message-ID: <200605220701.33916.me+python-dev@modelnine.org>

On Monday 22 May 2006 02:46, you wrote:
> Heiko Wundram wrote:
> > 2) Just as I've replied to Terry J. Reedy, if you find list comprehensions
> > easy to read, you're also bound to be able to understand what "for <expr>
> > in <expr> if <expr>:" does, at least AFAICT.
>
> I tend to write non-trivial LCs on multiple lines, e.g.
>
>    l = [foo(x)
>      for x in stuff
>        if something_about(x)]
>
> for the very reason that it makes them easier to read.

You can also do the same here (by using normal bracketing):

for <expr> in (<some>
               <non-trivial>
               <stuff>) if (<one expr> and
                            <two expr> and
                            <three expr>):
    foo(x)

--- Heiko.

From greg.ewing at canterbury.ac.nz  Mon May 22 08:25:57 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 May 2006 18:25:57 +1200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
 list-comp syntax
In-Reply-To: <200605220657.58925.me+python-dev@modelnine.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<4471042B.70706@canterbury.ac.nz>
	<200605220657.58925.me+python-dev@modelnine.org>
Message-ID: <44715975.2020108@canterbury.ac.nz>

Heiko Wundram wrote:

> Yes, of course. Read my replies to Terry J. Reedy, to Josiah Carlson, to Talin,
> to see why I chose to compare it to the 'continue' syntax.

I saw them. Your brain must be wired very differently
to mine, because I find loops with a continue in them
harder to follow than ones without -- exactly the
opposite of what you seem to prefer.

Also, I don't find an extra indentation level to be
a problem at all, unless the code under it is more
than a screen long -- in which case you've got big
readability problems already.

--
Greg

From greg.ewing at canterbury.ac.nz  Mon May 22 08:29:33 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 May 2006 18:29:33 +1200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
 list-comp syntax
In-Reply-To: <200605220701.33916.me+python-dev@modelnine.org>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<200605212317.46181.me+python-dev@modelnine.org>
	<447109E6.7060002@canterbury.ac.nz>
	<200605220701.33916.me+python-dev@modelnine.org>
Message-ID: <44715A4D.8080107@canterbury.ac.nz>

Heiko Wundram wrote:

> You can also do the same here (by using normal bracketing):
> 
> for <expr> in (<some>
>                <non-trivial>
>                <stuff>) if (<one expr> and
>                             <two expr> and
>                             <three expr>):

So you want to be able to write the if in-line,
and then format it so that it's no longer in-line?
The point of doing that eludes me...

--
Greg

From greg.ewing at canterbury.ac.nz  Mon May 22 08:44:47 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 May 2006 18:44:47 +1200
Subject: [Python-Dev] PEP-xxx: Unification of for statement
 and	list-comp syntax
In-Reply-To: <200605220656.44601.me+python-dev@modelnine.org>
References: <4470C956.8010002@acm.org>
	<200605212317.46181.me+python-dev@modelnine.org>
	<20060521164055.689B.JCARLSON@uci.edu>
	<200605220656.44601.me+python-dev@modelnine.org>
Message-ID: <44715DDF.10402@canterbury.ac.nz>

Heiko Wundram wrote:
> On Monday 22 May 2006 01:59, Josiah Carlson wrote:
> > Not everyone finds list comprehensions easy to read.
> 
> Why has Python added list-comprehensions, then? (or at least, why has Python 
> added the 'if'-expression to list-comprehensions if they're hard to read?

LCs are useful because they're expressions rather than statements.
Being expressions, they need if-clauses in order to be able to
conditionally include items in the list. LCs with if-clauses don't
*have* to be hard to read; you just need to lay them out on separate
lines, as you would do when writing nested statements.

With a for-statement, there is no need for if-clauses, since you
can use a nested if-statement to get the same effect. The only
possible reason for wanting an if-clause would be so that you
can write it on the same line, which reduces readability for no
corresponding benefit.
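
A small sketch of the distinction, assuming made-up foo() and
something_about() helpers: the comprehension can sit directly inside a call,
so its filter has to be part of the expression, while the statement form
gets the same effect from an ordinary nested if.

```python
stuff = range(10)


def something_about(x):
    # Hypothetical condition.
    return x % 2 == 1


def foo(x):
    # Hypothetical transformation.
    return x * x


# An expression: usable as an argument, laid out on separate lines.
total = sum([foo(x)
             for x in stuff
             if something_about(x)])

# A statement: the nested if does the same job.
total_loop = 0
for x in stuff:
    if something_about(x):
        total_loop += foo(x)
```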

--
Greg

From niko at alum.mit.edu  Mon May 22 10:07:27 2006
From: niko at alum.mit.edu (Niko Matsakis)
Date: Mon, 22 May 2006 10:07:27 +0200
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <44715975.2020108@canterbury.ac.nz>
References: <200605211710.51720.me+python-dev@modelnine.org>
	<4471042B.70706@canterbury.ac.nz>
	<200605220657.58925.me+python-dev@modelnine.org>
	<44715975.2020108@canterbury.ac.nz>
Message-ID: <71CEF4AC-AA61-48A3-9C89-03F4E4998906@alum.mit.edu>

> I saw them. Your brain must be wired very differently
> to mine, because I find loops with a continue in them
> harder to follow than ones without -- exactly the
> opposite of what you seem to prefer.

Delurking for no particular reason:

For what it's worth, I also favor the continue syntax Heiko compared  
his code against.  Without it, you have to scroll to the end of the  
loop to know whether  there is an else clause; it also allows you to  
have multiple conditions expressed up-front without a lot of  
indentation.  In general, I favor the idea that the main flow of  
computation should have the least indentation.  I also am +1 on this  
PEP, even if it is doomed, as I've often longed to use list- 
comprehension like syntax in a for loop; to me it is mentally taxing  
to remember which syntax is supported where.
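
A sketch of the style described above, with several conditions checked up
front and the main flow left at a single indentation level. The records and
the checks are invented for illustration.

```python
# Hypothetical input records.
records = [
    {"name": "a", "active": True, "size": 10},
    {"name": "b", "active": False, "size": 50},
    {"name": "c", "active": True, "size": 0},
    {"name": "d", "active": True, "size": 7},
]

processed = []
for rec in records:
    # Conditions are stated up front, one per guard.
    if not rec["active"]:
        continue
    if rec["size"] == 0:
        continue
    # Main flow stays at the loop's own indentation level.
    processed.append(rec["name"].upper())
```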

Just thought I'd throw Heiko some support. :)


Niko

From hannu at tm.ee  Sat May 20 18:53:56 2006
From: hannu at tm.ee (Hannu Krosing)
Date: Sat, 20 May 2006 19:53:56 +0300
Subject: [Python-Dev] Iterating generator from C (PostgreSQL's pl/python
	RETURN SETOF/RECORD iterator support broken on RedHat buggy libs)
Message-ID: <1148144037.3833.89.camel@localhost.localdomain>

I am moving this to -dev, as I hope there are more people reading it who
are competent in the internals :). So please reply to -dev only.

-------------

The question is about the use of generators in embedded v2.4 with asserts
enabled.

Can somebody explain why the code in try2.c works with wrappers 2 and 3
but crashes on the buggy assert for all others, that is, the pure generator
and wrappers 1, 4 and 5?

In other words, what does [i for i in gen] do differently from other
ways of iterating over gen, which helps it to avoid the assert, and
also, how could I write something that is still a generator but does
not trigger the assert?

While the right solution is to just fix Python (which is done for v2.5),
we need a workaround for the following reason.

We have added support for returning rows (as tuples, dictionaries
or classes) and sets of both scalars and rows to PostgreSQL's
pl/python embedded language, but there are some objections to getting this
into the distribution, because there is a bug in Python 2.4 that triggers a
wrong assert, and an additional bug in the RedHat rpm build process, which
leaves the buggy asserts in python.so.

So I hoped to write a simple wrapper class, but it only seems to work
when the generator is turned into a list, which is not a good solution as
it works directly against what generators are good for.
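
To illustrate the trade-off in the last paragraph: this is only a sketch of
lazy versus eager wrapping in current Python. It is not the code from try2.c
and not a workaround for the 2.4 assert; the class and function names are
hypothetical.

```python
produced = []


def rows():
    """Generator that records when each row is actually produced."""
    for i in range(3):
        produced.append(i)
        yield (i, i * i)


class LazyWrapper:
    """Hypothetical wrapper that stays lazy: it delegates one item at a time."""

    def __init__(self, gen):
        self._it = iter(gen)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


lazy = LazyWrapper(rows())
first = next(lazy)
produced_after_first = list(produced)  # only one row computed so far

eager = list(rows())  # computes every row immediately
```

Turning the generator into a list gives up exactly this one-row-at-a-time
behaviour, which is what a set-returning function would want to keep.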

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: try2.c
Type: text/x-csrc
Size: 1998 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060520/7ab05f48/attachment.c 

From skip at pobox.com  Mon May 22 15:16:02 2006
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 22 May 2006 08:16:02 -0500
Subject: [Python-Dev] [Python-checkins] r46043 - peps/trunk/pep-0356.txt
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E702@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E702@au3010avexu1.global.avaya.com>
Message-ID: <17521.47506.887303.572688@montanaro.dyndns.org>


    Tim> If there's no functionality changes, what would be the problem with
    Tim> putting it in post-alpha?

It still represents new code that may introduce new bugs.  In theory (and
generally in practice for Python), once you move into the beta stage all you
do is fix bugs.

Skip

From thomas at python.org  Mon May 22 15:22:06 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 22 May 2006 15:22:06 +0200
Subject: [Python-Dev] [Python-checkins] r46043 - peps/trunk/pep-0356.txt
In-Reply-To: <17521.47506.887303.572688@montanaro.dyndns.org>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E702@au3010avexu1.global.avaya.com>
	<17521.47506.887303.572688@montanaro.dyndns.org>
Message-ID: <9e804ac0605220622l6b44e147ge2496b82d05454aa@mail.gmail.com>

On 5/22/06, skip at pobox.com <skip at pobox.com> wrote:
>
>
>     Tim> If there's no functionality changes, what would be the problem with
>     Tim> putting it in post-alpha?
>
> It still represents new code that may introduce new bugs.  In theory (and
> generally in practice for Python), once you move into the beta stage all you
> do is fix bugs.
>

Note that the problem isn't 'post-alpha', it's 'post-beta' -- new features
(especially minor or long-agreed-upon) are fine up until beta1 ;)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From jcarlson at uci.edu  Mon May 22 21:20:12 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 22 May 2006 12:20:12 -0700
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <71CEF4AC-AA61-48A3-9C89-03F4E4998906@alum.mit.edu>
References: <44715975.2020108@canterbury.ac.nz>
	<71CEF4AC-AA61-48A3-9C89-03F4E4998906@alum.mit.edu>
Message-ID: <20060522121820.68B7.JCARLSON@uci.edu>


Niko Matsakis <niko at alum.mit.edu> wrote:
> 
> > I saw them. Your brain must be wired very differently
> > to mine, because I find loops with a continue in them
> > harder to follow than ones without -- exactly the
> > opposite of what you seem to prefer.
> 
> Delurking for no particular reason:
> 
> For what it's worth, I also favor the continue syntax Heiko compared  
> his code against.  Without it, you have to scroll to the end of the  
> loop to know whether  there is an else clause; it also allows you to  

One could also consider documenting the fact that there is no else
clause...

    for i in ...:
        if ...: #no else clause
            ...

 - Josiah


From jcarlson at uci.edu  Mon May 22 21:28:37 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 22 May 2006 12:28:37 -0700
Subject: [Python-Dev] PEP-xxx: Unification of for statement and
	list-comp syntax
In-Reply-To: <200605220656.44601.me+python-dev@modelnine.org>
References: <20060521164055.689B.JCARLSON@uci.edu>
	<200605220656.44601.me+python-dev@modelnine.org>
Message-ID: <20060522122023.68BA.JCARLSON@uci.edu>


Heiko Wundram <me+python-dev at modelnine.org> wrote:
> Why isn't this good practice? It's not always sensible to refactor loop code 
> to call methods (to make the loop body shorter), and it's a pretty general 
> case that you only want to iterate over part of a generator, not over the 
> whole content. Because of this, list comprehensions grew the 'if'-clause. So: 
> why doesn't the for-loop?

List comprehensions grew an if clause because they are expressions that
were supposed to replace one of the most common idioms in
Python:

    x = []
    for i in ...:
        if ...:
            x.append(...)
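
Spelled out with concrete (made-up) values, the idiom and the list
comprehension that replaces it look like this:

```python
# Hypothetical input.
words = ["alpha", "", "beta", "", "gamma"]

# The loop-and-append idiom.
x = []
for w in words:
    if w:
        x.append(w.upper())

# The list comprehension that expresses the same thing as one expression.
y = [w.upper() for w in words if w]
```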

And you know what?  They've done that very well.  Seemingly so well that
you are asking for the list comprehension syntax to make it back into
for loops.  So far, you have failed to convince me (and most others it
looks like) that it is a good idea, you have re-hashed the same
arguments you made in the PEP, and Guido chimed in as the first message
to say:

    "-1. The contraction just makes it easier to miss the logic."
    "This was proposed and rejected before."
    - Guido

If you want some advice: let it drop.


 - Josiah


From pje at telecommunity.com  Mon May 22 21:33:00 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 22 May 2006 15:33:00 -0400
Subject: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
In-Reply-To: <44526DD3.4010807@colorstudy.com>
References: <ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
	<ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
Message-ID: <5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>

At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:
>I'd like to include paste.lint with that as well (as wsgiref.lint or
>whatever).  Since the last discussion I enumerated in the docstring all
>the checks it does.  There's still some outstanding issues, mostly where
>I'm not sure if it is too restrictive (marked with @@ in the source).
>It's at:
>
>    http://svn.pythonpaste.org/Paste/trunk/paste/lint.py

Ian, I see this is under the MIT license.  Do you also have a PSF 
contributor agreement (to license under AFL/ASF)?  If not, can you place a 
copy of this under a compatible license so that I can add this to the 
version of wsgiref that gets checked into the stdlib?

(And if any PSF folks can chime in with easier ways to handle this, please do.)


From tdelaney at avaya.com  Mon May 22 23:24:58 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 23 May 2006 07:24:58 +1000
Subject: [Python-Dev] [Python-checkins] r46043 - peps/trunk/pep-0356.txt
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E707@au3010avexu1.global.avaya.com>

Thomas Wouters wrote:

> On 5/22/06, skip at pobox.com <skip at pobox.com> wrote:
> 
>     Tim> If there's no functionality changes, what would be the problem with
>     Tim> putting it in post-alpha?
> 
> It still represents new code that may introduce new bugs.  In theory (and
> generally in practice for Python), once you move into the beta stage all you
> do is fix bugs.
> 
> 
> Note that the problem isn't 'post-alpha', it's 'post-beta' -- new
> features (especially minor or long-agreed-upon) are fine up until
> beta1 ;)  

Isn't "beta" post-alpha? <wink>

I did of course mean once we were out of the alpha stage. If they're
purely performance changes (no new functionality) then there's no
problem with putting it in during beta - IMO even if it introduces new
bugs (so long as those bugs are gone by the time 2.5 goes final).

Beta is "feature set frozen".

Tim Delaney

From tdelaney at avaya.com  Mon May 22 23:31:40 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 23 May 2006 07:31:40 +1000
Subject: [Python-Dev] PEP-xxx: Unification of for statement and list-comp
	syntax
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E708@au3010avexu1.global.avaya.com>

Niko Matsakis wrote:

> For what it's worth, I also favor the continue syntax Heiko compared
> his code against.

I also commonly do early tests and either break or continue, but my
criterion is whether it makes the flow easier to understand. Sometimes it
does, sometimes it doesn't.
(on two lines of course ;) where things are considerably more complex

> In general, I favor the idea that the main flow of
> computation should have the least indentation.

Ditto, simply because it tends to be more readable.

> I also am +1 on this
> PEP, even if it is doomed, as I've often longed to use list-
> comprehension like syntax in a for loop; to me it is mentally taxing
> to remember which syntax is supported where.
> 
> Just thought I'd throw Heiko some support. :)

I'm -1 because there are numerous other ways to do it that are just as
clear IMO. No support from me Heiko ;)

Tim Delaney

From greg.ewing at canterbury.ac.nz  Tue May 23 02:30:15 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 May 2006 12:30:15 +1200
Subject: [Python-Dev] PEP-xxx: Unification of for statement
 and	list-comp syntax
In-Reply-To: <20060522121820.68B7.JCARLSON@uci.edu>
References: <44715975.2020108@canterbury.ac.nz>
	<71CEF4AC-AA61-48A3-9C89-03F4E4998906@alum.mit.edu>
	<20060522121820.68B7.JCARLSON@uci.edu>
Message-ID: <44725797.7070603@canterbury.ac.nz>

Niko Matsakis <niko at alum.mit.edu> wrote:

> For what it's worth, I also favor the continue syntax Heiko compared  
> his code against.  Without it, you have to scroll to the end of the  
> loop to know whether  there is an else clause;

Only if the code doesn't fit on one screen, which it should.

--
Greg

From monica at calabashanimation.com  Mon May 22 17:37:00 2006
From: monica at calabashanimation.com (Monica Kendall)
Date: Mon, 22 May 2006 10:37:00 -0500
Subject: [Python-Dev]  Charles Waldman?
Message-ID: <26ADF5AC-F1C4-4461-B5D3-B046731E4389@calabashanimation.com>

Hi Skip,

Did you ever find Charles Waldman? We are looking for him too.

Monica


From guido at python.org  Tue May 23 05:19:27 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 22 May 2006 20:19:27 -0700
Subject: [Python-Dev] Looking for Memories of Python Old-Timers
Message-ID: <ca471dc20605222019l2ce6f7c2ycb00f9d34f6688f4@mail.gmail.com>

How long have you used Python? 10 years or longer? Please tell us how
you first heard of the language, how you first used it, and how you
helped develop it (if you did). More recent reminiscences are welcome
too!

Please add them to my blog:
http://www.artima.com/weblogs/viewpost.jsp?thread=161207

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ianb at colorstudy.com  Tue May 23 06:23:01 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 May 2006 23:23:01 -0500
Subject: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
In-Reply-To: <5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>
References: <ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
	<ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
	<5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>
Message-ID: <44728E25.60807@colorstudy.com>

Phillip J. Eby wrote:
> At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:
>> I'd like to include paste.lint with that as well (as wsgiref.lint or
>> whatever).  Since the last discussion I enumerated in the docstring all
>> the checks it does.  There's still some outstanding issues, mostly where
>> I'm not sure if it is too restrictive (marked with @@ in the source).
>> It's at:
>>
>>    http://svn.pythonpaste.org/Paste/trunk/paste/lint.py
> 
> Ian, I see this is under the MIT license.  Do you also have a PSF 
> contributor agreement (to license under AFL/ASF)?  If not, can you place 
> a copy of this under a compatible license so that I can add this to the 
> version of wsgiref that gets checked into the stdlib?

I don't have a contributor agreement.  I can change the license in 
place, or sign an agreement, or whatever; someone should just tell me 
what to do.


-- 
Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org

From guido at python.org  Tue May 23 06:25:30 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 22 May 2006 21:25:30 -0700
Subject: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
In-Reply-To: <44728E25.60807@colorstudy.com>
References: <ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
	<5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>
	<44728E25.60807@colorstudy.com>
Message-ID: <ca471dc20605222125t35c010fal553219b3416b3ed5@mail.gmail.com>

This explains what to do, and which license to use:

http://www.python.org/psf/contrib/

--Guido

On 5/22/06, Ian Bicking <ianb at colorstudy.com> wrote:
> Phillip J. Eby wrote:
> > At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:
> >> I'd like to include paste.lint with that as well (as wsgiref.lint or
> >> whatever).  Since the last discussion I enumerated in the docstring all
> >> the checks it does.  There's still some outstanding issues, mostly where
> >> I'm not sure if it is too restrictive (marked with @@ in the source).
> >> It's at:
> >>
> >>    http://svn.pythonpaste.org/Paste/trunk/paste/lint.py
> >
> > Ian, I see this is under the MIT license.  Do you also have a PSF
> > contributor agreement (to license under AFL/ASF)?  If not, can you place
> > a copy of this under a compatible license so that I can add this to the
> > version of wsgiref that gets checked into the stdlib?
>
> I don't have a contributor agreement.  I can change the license in
> place, or sign an agreement, or whatever; someone should just tell me
> what to do.
>
>
> --
> Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Tue May 23 07:52:12 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 22 May 2006 22:52:12 -0700
Subject: [Python-Dev] [Python-checkins] r46064 - in python/trunk:
	Include/Python.h Include/pyport.h Misc/ACKS Misc/NEWS
	Modules/_localemodule.c Modules/main.c Modules/posixmodule.c
	Modules/sha512module.c PC/pyconfig.h Python/thread_nt.h
In-Reply-To: <20060522091613.3A64F1E4014@bag.python.org>
References: <20060522091613.3A64F1E4014@bag.python.org>
Message-ID: <ee2a432c0605222252m7a8ca95s3ad61af33c960464@mail.gmail.com>

On 5/22/06, martin.v.loewis <python-checkins at python.org> wrote:
> Author: martin.v.loewis
> Date: Mon May 22 11:15:18 2006
> New Revision: 46064
>
> Modified: python/trunk/Include/Python.h
> ==============================================================================
> --- python/trunk/Include/Python.h       (original)
> +++ python/trunk/Include/Python.h       Mon May 22 11:15:18 2006
> @@ -35,7 +35,9 @@
>  #endif
>
>  #include <string.h>
> +#ifndef DONT_HAVE_ERRNO_H
>  #include <errno.h>
> +#endif
>  #include <stdlib.h>
>  #ifdef HAVE_UNISTD_H
>  #include <unistd.h>

What is the reason for the DONT_HAVE_* macros?  Can we use the HAVE_*
versions?  There only seem to be 22 occurrences total:

Include/pyport.h:6
Modules/arraymodule.c:2
Modules/getbuildinfo.c:1
Modules/selectmodule.c:1
Objects/fileobject.c:3
PC/pyconfig.h:2
Python/ceval.c:1
Python/mystrtoul.c:1
Python/strtod.c:1
Python/thread.c:1
RISCOS/pyconfig.h:3

n

From pje at telecommunity.com  Tue May 23 08:01:20 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 23 May 2006 02:01:20 -0400
Subject: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
In-Reply-To: <ca471dc20605222125t35c010fal553219b3416b3ed5@mail.gmail.co
 m>
References: <44728E25.60807@colorstudy.com>
	<ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
	<5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>
	<44728E25.60807@colorstudy.com>
Message-ID: <5.1.1.6.0.20060523015956.01f5efd8@mail.telecommunity.com>

It's not clear to me whether this means that Ian can just relicense his 
code for me to slap into wsgiref and thence into Python by virtue of my own 
PSF contribution form and the compatible license, or whether it means Ian 
has to sign a form too.

At 09:25 PM 5/22/2006 -0700, Guido van Rossum wrote:
>This explains what to do, and which license to use:
>
>http://www.python.org/psf/contrib/
>
>--Guido
>
>On 5/22/06, Ian Bicking <ianb at colorstudy.com> wrote:
>>Phillip J. Eby wrote:
>> > At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:
>> >> I'd like to include paste.lint with that as well (as wsgiref.lint or
>> >> whatever).  Since the last discussion I enumerated in the docstring all
>> >> the checks it does.  There's still some outstanding issues, mostly where
>> >> I'm not sure if it is too restrictive (marked with @@ in the source).
>> >> It's at:
>> >>
>> >>    http://svn.pythonpaste.org/Paste/trunk/paste/lint.py
>> >
>> > Ian, I see this is under the MIT license.  Do you also have a PSF
>> > contributor agreement (to license under AFL/ASF)?  If not, can you place
>> > a copy of this under a compatible license so that I can add this to the
>> > version of wsgiref that gets checked into the stdlib?
>>
>>I don't have a contributor agreement.  I can change the license in
>>place, or sign an agreement, or whatever; someone should just tell me
>>what to do.
>>
>>
>>--
>>Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org
>
>
>--
>--Guido van Rossum (home page: http://www.python.org/~guido/)


From nnorwitz at gmail.com  Tue May 23 08:00:55 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 22 May 2006 23:00:55 -0700
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <20060521135313.GA13431@panix.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060519160443.GA7282@panix.com> <446DF917.4010709@ewtllc.com>
	<200605201336.36395.anthony@interlink.com.au>
	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>
	<20060521135313.GA13431@panix.com>
Message-ID: <ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>

Anthony's schedule is a bit up in the air which means this schedule
does not reflect what reality will be.  Perhaps we will skip a3
altogether which will give more time in a sense, though not in reality
since b1 in that case will hopefully be on or before June 14.  FWIW:

    alpha 3: May 25, 2006 [planned]
    beta 1:  June 14, 2006 [planned]
    beta 2:  July 6, 2006 [planned]
    rc 1:    July 27, 2006 [planned]
    final:   August 3, 2006 [planned]

As for wanting to get more things in, people ought to communicate that
here.  If there's something that should be added to the PEP, everyone
needs to know.

n
--

On 5/21/06, Aahz <aahz at pythoncraft.com> wrote:
> On Sun, May 21, 2006, Neal Norwitz wrote:
> > On 5/19/06, Anthony Baxter <anthony at interlink.com.au> wrote:
> >>
> >>Remember, the feature freeze isn't until beta1. New stuff can still go
> >>in after the next alpha, before beta1.
> >
> > I agree. Of course, it's preferable to get things in ASAP to get more
> > testing.
>
> Nevertheless, I still think that putting out an alpha during or right
> before a major sprint is awkward at best.  My testing is more focused on
> alpha releases than the trunk, and I think other people work similarly.
> Except for Anthony, the responses you've gotten to this surprising
> change have been negative; this doesn't seem in keeping with our usual
> decision-making process.
>
> BTW, the PEP on python.org hasn't been updated.  Please post the current
> schedule here.
> --
> Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/
>
> "I saw `cout' being shifted "Hello world" times to the left and stopped
> right there."  --Steve Gonedes
>

From tim.peters at gmail.com  Tue May 23 08:19:28 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 23 May 2006 06:19:28 +0000
Subject: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib
In-Reply-To: <5.1.1.6.0.20060523015956.01f5efd8@mail.telecommunity.com>
References: <ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>
	<5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>
	<44728E25.60807@colorstudy.com>
	<5.1.1.6.0.20060523015956.01f5efd8@mail.telecommunity.com>
Message-ID: <1f7befae0605222319o4650d7c9ndc9a6e729e280562@mail.gmail.com>

[Phillip J. Eby]
> It's not clear to me whether this means that Ian can just relicense his
> code for me to slap into wsgiref and thence into Python by virtue of my own
> PSF contribution form and the compatible license, or whether it means Ian
> has to sign a form too.

It's clearly best if Ian signs a form:  as

    http://www.python.org/psf/contrib/

explained, one of the primary aims of the PSF form is to give the PSF
legal right to _re_license the work.  You have no such right if Ian
merely licenses his code to you, and so also no basis on which you can
represent that the PSF can relicense the portions of your work derived
from Ian's.  If Ian can sign a PSF form, that keeps the legalities as
clear as possible.

From martin at v.loewis.de  Tue May 23 08:38:08 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 May 2006 08:38:08 +0200
Subject: [Python-Dev] [Python-checkins] r46064 - in
 python/trunk:	Include/Python.h Include/pyport.h Misc/ACKS
 Misc/NEWS	Modules/_localemodule.c Modules/main.c
 Modules/posixmodule.c	Modules/sha512module.c PC/pyconfig.h
 Python/thread_nt.h
In-Reply-To: <ee2a432c0605222252m7a8ca95s3ad61af33c960464@mail.gmail.com>
References: <20060522091613.3A64F1E4014@bag.python.org>
	<ee2a432c0605222252m7a8ca95s3ad61af33c960464@mail.gmail.com>
Message-ID: <4472ADD0.5060109@v.loewis.de>

Neal Norwitz wrote:
> What is the reason for the DONT_HAVE_* macros?  Can we use the HAVE_*
> versions?

I think the actual rationale is that the contributor didn't want to be
bothered with modifying configure.in, and running autoconf.

I accepted the change since systems which don't have, say, <errno.h>,
are "unusual", in the sense that they aren't standard C systems
(as standard C mandates errno.h, at least in a hosted environment).
So far, only one target platform doesn't have these things (Windows
CE), so the reasoning is that the burden of setting things right
should be on that system.

Of course, it would be fairly straight-forward to convert this
to standard autoconf machinery (if one remembers to update
PC/pyconfig.h accordingly). I'm sure Luke Dunstan would be willing
to revise the patch in that direction if that is agreed.

Regards,
Martin

From martin at v.loewis.de  Tue May 23 08:57:34 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 May 2006 08:57:34 +0200
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>	<20060519160443.GA7282@panix.com>
	<446DF917.4010709@ewtllc.com>	<200605201336.36395.anthony@interlink.com.au>	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>	<20060521135313.GA13431@panix.com>
	<ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>
Message-ID: <4472B25E.50408@v.loewis.de>

Neal Norwitz wrote:
> Anthony's schedule is a bit up in the air which means this schedule
> does not reflect what reality will be.  Perhaps we will skip a3
> altogether which will give more time in a sense, though not in reality
> since b1 in that case will hopefully be on or before June 14.  FWIW:
> 
>     alpha 3: May 25, 2006 [planned]
>     beta 1:  June 14, 2006 [planned]
>     beta 2:  July 6, 2006 [planned]
>     rc 1:    July 27, 2006 [planned]
>     final:   August 3, 2006 [planned]
> 
> As for wanting to get more things in, people ought to communicate that
> here.  If there's something that should be added to the PEP, everyone
> needs to know.

If alpha 3 doesn't happen until May 27 (which is what the PEP on the Web
*still* says, can somebody please fix the build process so that it
actually gets installed?), it should get dropped entirely. I'm not
available for creating installers in the following week, so the earliest
day to make alpha 3 would then be June 5, but traditionally, it would
be June 8.

Regards,
Martin

From tds333+pydev at gmail.com  Tue May 23 10:19:54 2006
From: tds333+pydev at gmail.com (Wolfgang Langner)
Date: Tue, 23 May 2006 10:19:54 +0200
Subject: [Python-Dev] Looking for Memories of Python Old-Timers
In-Reply-To: <ca471dc20605222019l2ce6f7c2ycb00f9d34f6688f4@mail.gmail.com>
References: <ca471dc20605222019l2ce6f7c2ycb00f9d34f6688f4@mail.gmail.com>
Message-ID: <4c45c1530605230119m703f2ab0p6d61c0fe1786efb6@mail.gmail.com>

Just a reminder: 10 years ago was the Python 1.3 release.
I can't remember how long I have used Python, but I can remember the first
release I used.

Use the cvs (svn) history of tags to get the date:

http://svn.python.org/view/python/tags/

I hope this helps. :-)

2006/5/23, Guido van Rossum <guido at python.org>:
> How long have you used Python? 10 years or longer? Please tell us how
> you first heard of the language, how you first used it, and how you
> helped develop it (if you did). More recent reminiscences are welcome
> too!

-- 
bye by Wolfgang

From ncoghlan at gmail.com  Tue May 23 11:41:01 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 23 May 2006 19:41:01 +1000
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>	<20060519160443.GA7282@panix.com>
	<446DF917.4010709@ewtllc.com>	<200605201336.36395.anthony@interlink.com.au>	<ee2a432c0605210010q770d046ft2e67ae5df5898607@mail.gmail.com>	<20060521135313.GA13431@panix.com>
	<ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>
Message-ID: <4472D8AD.70008@gmail.com>

Neal Norwitz wrote:
> Anthony's schedule is a bit up in the air which means this schedule
> does not reflect what reality will be.  Perhaps we will skip a3
> altogether which will give more time in a sense, though not in reality
> since b1 in that case will hopefully be on or before June 14.  FWIW:
> 
>     alpha 3: May 25, 2006 [planned]
>     beta 1:  June 14, 2006 [planned]
>     beta 2:  July 6, 2006 [planned]
>     rc 1:    July 27, 2006 [planned]
>     final:   August 3, 2006 [planned]
> 
> As for wanting to get more things in, people ought to communicate that
> here.  If there's something that should be added to the PEP, everyone
> needs to know.

Skipping alpha 3 wouldn't be a problem for the functional->functools move (and 
the new functools functions), which is the only thing I really want to get in 
before beta 1. Any errors created by that should be shallow ones that Buildbot 
will pick up.

So no objections here to going straight from a2 to b1 (although I'll try to 
get the module move done this week anyway).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From jafo-python-dev at tummy.com  Tue May 23 12:07:13 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Tue, 23 May 2006 04:07:13 -0600
Subject: [Python-Dev] Looking for Memories of Python Old-Timers
In-Reply-To: <4c45c1530605230119m703f2ab0p6d61c0fe1786efb6@mail.gmail.com>
References: <ca471dc20605222019l2ce6f7c2ycb00f9d34f6688f4@mail.gmail.com>
	<4c45c1530605230119m703f2ab0p6d61c0fe1786efb6@mail.gmail.com>
Message-ID: <20060523100712.GA32617@tummy.com>

2006/5/23, Guido van Rossum <guido at python.org>:
> How long have you used Python? 10 years or longer? Please tell us how
> you first heard of the language, how you first used it, and how you
> helped develop it (if you did). More recent reminiscences are welcome
> too!

I'm almost at the 10 year mark now.  I'll just blab a bit.

I'm a C guy from way back.  Back in the <gasp> '80s I liked C, but by the
mid-90s I was pretty tired of working at such a low level, having to do so
much code to do string manipulations, having to cut-and-paste code for
getting dynamic list code, etc...  I liked that C was small and I could
hold it all in my head, but it was just too low-level.

The problem was that I was starting to lose my immortality.  As I grew
older I just wanted to be more productive so I'd spend less time doing the
same damn thing over and over just to make progress.

Because I liked C, I really wanted C++ to be an option.  In 1995 I tried
out C++, but the compilers and environment then were extremely immature.
Templates were horrible.  I spent about 6 to 9 months giving it the
"college-try".  It just wasn't workable then though.

Next I tried Perl, another 6 to 9 months.  It made me realize that a
"scripting language" could be fast enough for much of what I was doing, and
the POSIX-like API was very familiar to my C and Unix background.  However,
I never got comfortable with Perl, just passing lists to functions never
seemed to work the way I expected.

Then I tried Java, only for about 3 months.  I mostly liked the language,
though it was much too "object-centric" for my taste.  By that, I mean even
a simple "hello world" program HAS to be an object.  My biggest complaint
was that in 1997, Java under Linux was in a horrible state.  Long running
processes would leak horribly, compiling was slow even on my mighty, mighty
dual PPro 200 box, and anything I wrote in Java was 32MB of RAM just to
start, even without the memory leaking.

A friend of mine, John Shipman, mentioned right around then that he was
learning Python and really liking it.  I took a look at it shortly after
Evelyn and I moved to our new house in 1998.

Python was everything I had wanted C++ to be.  Like C, it was simple and
easy to keep in my head.  It had a rich library, which appealed to my
desires for having to write less code.  It was operating at such a high
level that I felt productive.

I was working with an ISP at the time, and there was this Perl application
that generated accounting reports of dial-in user activity.  It had bugs
where it would generate slightly or sometimes vastly wrong data.  Worse, it
used RAM for holding all the data to generate the accounting.  It was
pretty easy to fill up all the memory with even a small or medium ISP's
records.

My first program was a re-implementation of the Perl program, with fewer
bugs.  I also used the shelve library to keep the accounting data out of
RAM.  It worked brilliantly.

Mostly I haven't done as much development as I would like.  I built a lot
of the Python RPMs for Linux, which is my primary contribution.  I also
have done some documentation changes and have added the string.rfind()
methods.  I've done a few things on pydotorg for the web-site, including
sprinting on the new site at PyCon.

Sean
-- 
 The price of freedom is responsibility, but it's a bargain because freedom
 is priceless.  -- Hugh Downs
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From jafo-python-dev at tummy.com  Tue May 23 13:08:06 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Tue, 23 May 2006 05:08:06 -0600
Subject: [Python-Dev] Changing python int to "long long".
Message-ID: <20060523110805.GB32617@tummy.com>

We've been discussing the possibility of converting the Python int type to
long long (from long).  I played around with it some, and it's going to be
a huge change that probably will break most C extensions until.  However,
as uncletimmy says, "Python is so stinking slow" that it probably won't
make much of a negative impact on the performance.

The reason we're looking at this at Need for Speed is that EWT has a lot of
code that uses ints between the 32 and 64 bit range.  It'd be a hard change
to make to convert to long long for regular ints.  I'm going to speak with
John and Runar today about if we can make just a long long extension type
and live with it not integrating with other type coercion as well as an int
would.

The big deal right now is on 32 bit platforms, giving the 64-bits for int.
However, it will also be a win for 64-bit platforms for ints that fall
between 64 and 128 bits.

So, I ran pybench on a half-arsed conversion of ints to long long, and only
lost around 1.11% overall.  However, for simple math I got a solid 25%
speed-up (n = ((((i << 33) * 2) + 1) / 3) - 2), and for 1<<35 I got a 34%
speed up, compared with them getting converted to Python longs.

For ints that will fit entirely within 32-bits, I'm seeing around 11% worse
performance when going to 64-bit types.
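
To make the three ranges concrete, here is a minimal sketch in pure Python
(not part of the patch; the helper names are made up for illustration). It
only classifies a value by the narrowest signed width that can hold it,
and restates the benchmark expression with // standing in for the integer
division that / performs on ints here:

```python
def fits_signed(value, bits):
    """Return True if value is representable as a signed integer of that width."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


def storage_class(value):
    """Name the narrowest representation discussed in this thread (hypothetical helper)."""
    if fits_signed(value, 32):
        return "32-bit"
    if fits_signed(value, 64):
        return "64-bit"
    return "arbitrary precision"


def benchmark_expr(i):
    """The simple-math expression from the timing run, using integer division."""
    return ((((i << 33) * 2) + 1) // 3) - 2
```

Both 1 << 35 and benchmark_expr(1) land in the "64-bit" class, which is the
case that speeds up; anything that already fits in 32 bits is the case that
slows down.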

This is all on the 2.5 trunk.

My conclusion is that it's probably enough of a performance drop that we
can't just do it wholesale.  It has a big gain for ints between 32 and 64
bits on 32-bit platforms, but for <32-bit ints it's going to be pretty bad.
Within the next few years many platforms, particularly performance-critical
ones, will probably be 64 bit anyway, so fewer people will probably be able
to see an advantage to it.

The ideal would probably be to have another int type and up-convert to a
long long, THEN go to arbitrary precision, but that's a pretty big change.
Less big than converting the int to long long though.  The easiest would be
to make an extension that has long long, if that will work for speeding up
applications like EWT has.  I'll follow up with them on this.

Any thoughts on this?

Thanks,
Sean
-- 
 Linux:  When you need to run like a greased weasel.
                 -- Sean Reifschneider, 1998
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From facundobatista at gmail.com  Tue May 23 14:32:40 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Tue, 23 May 2006 09:32:40 -0300
Subject: [Python-Dev] Decimal and Exponentiation
In-Reply-To: <1f7befae0605191418q280ff3f3pa7585f25d9d3743e@mail.gmail.com>
References: <1148060287.088387.41150@j55g2000cwa.googlegroups.com>
	<1f7befae0605191418q280ff3f3pa7585f25d9d3743e@mail.gmail.com>
Message-ID: <e04bdf310605230532l1855cbc0v9b73c9478459354b@mail.gmail.com>

2006/5/19, Tim Peters <tim.peters at gmail.com>:

> If you're not a numeric expert, I wouldn't recommend that you try this
> yourself (in particular, trying to implement x**y as exp(ln(x)*y)
> using the same precision is mathematically correct but is numerically
> badly naive).

I won't be able to start looking at this for at least two weeks (I have a
conference, and need to finish my papers).

Tim, we both know that I'm not, from any point of view, a numeric
expert. So, I'd ask you a favor.

Could you please send here some examples, for a given precision, of
perilous "not-int ** not-int" situations, just to add them to the test
cases and be sure that the modifications to Decimal are safe enough?

Or just point to some docs?

Thank you very much!

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From tim.peters at gmail.com  Tue May 23 16:48:52 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 23 May 2006 14:48:52 +0000
Subject: [Python-Dev] Decimal and Exponentiation
In-Reply-To: <e04bdf310605230532l1855cbc0v9b73c9478459354b@mail.gmail.com>
References: <1148060287.088387.41150@j55g2000cwa.googlegroups.com>
	<1f7befae0605191418q280ff3f3pa7585f25d9d3743e@mail.gmail.com>
	<e04bdf310605230532l1855cbc0v9b73c9478459354b@mail.gmail.com>
Message-ID: <1f7befae0605230748q51a8caf1k38ee6920e81562ff@mail.gmail.com>

[Facundo Batista]
> I'd start to see this not before two weeks (I have a conference, and
> need to finish my papers).
>
> Tim, we both know that I'm not, from any point of view, a numeric
> expert. So, I'd ask you a favor.
>
> Could you please send here some examples, for a given precision, of
> perilous "not-int ** not-int" situations, just to add them to the test
> cases and be sure that the modifications to Decimal are safe enough?
>
> Or just point to some docs?

This is all part of IBM's spec (31 Mar 2006) now:

    http://www2.hursley.ibm.com/decimal/decarith.html

Here's the change log from December:

"""
Changes in Draft 1.50 (9 December 2005)

    * The exp operation for raising e to a power has been added.
    * The ln natural logarithm and log10 base 10 logarithm operations
have been added.
    * The power operation has been redefined to allow raising a number
to a non-integral power.
"""

The change log for the test vectors (at the same site) implies the
tests have also been updated to match:

"""
Changes in Version 2.35 (27 November 2005)

    * The exp, ln, and log10 operations and testcase groups have been added.
    * The power testcase group has been greatly extended, to cover
non-integer second operands, and the powersqrt testcase group has been
added.
"""

So tests probably aren't a problem.  Implementation will be, and I'll
do that if I can -- designing efficient arbitrary-precision
transcendental functions is an obscure skill.
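
As a rough illustration of why computing x**y as exp(ln(x)*y) at the
working precision is numerically naive, here is a sketch that assumes a
Decimal type with ln() and exp() methods (the decimal module in later
Python releases has them; 2.5 does not). The function names and the choice
of 10 guard digits are illustrative assumptions, not a proposed
implementation:

```python
from decimal import Decimal, localcontext


def naive_pow(x, y, prec):
    """exp(ln(x) * y) with every step rounded to the target precision."""
    with localcontext() as ctx:
        ctx.prec = prec
        return (x.ln() * y).exp()


def guarded_pow(x, y, prec, guard=10):
    """Same formula, but carried out with extra guard digits and rounded once."""
    with localcontext() as ctx:
        ctx.prec = prec + guard
        result = (x.ln() * y).exp()
    with localcontext() as ctx:
        ctx.prec = prec
        return +result  # unary plus rounds to the current context


x, y = Decimal("1.5"), Decimal("300.5")
naive = naive_pow(x, y, 9)
guarded = guarded_pow(x, y, 9)
# The two results disagree in their trailing digits: the rounding error in
# ln(x) * y is magnified by exp() in proportion to the size of the exponent.
print(naive, guarded)
```

Guard digits are only the simplest remedy; how many are needed depends on
the size of ln(x)*y, which is part of what makes a proper implementation
hard.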

From coder_infidel at hotmail.com  Tue May 23 17:28:42 2006
From: coder_infidel at hotmail.com (Luke Dunstan)
Date: Tue, 23 May 2006 23:28:42 +0800
Subject: [Python-Dev] [Python-checkins] r46064 - in
	python/trunk:	Include/Python.h Include/pyport.h Misc/ACKS
	Misc/NEWS	Modules/_localemodule.c Modules/main.c
	Modules/posixmodule.c	Modules/sha512module.c PC/pyconfig.h
	Python/thread_nt.h
References: <20060522091613.3A64F1E4014@bag.python.org><ee2a432c0605222252m7a8ca95s3ad61af33c960464@mail.gmail.com>
	<4472ADD0.5060109@v.loewis.de>
Message-ID: <BAY109-DAV147E15E853B6B7D289816D8A9B0@phx.gbl>


----- Original Message ----- 
From: "Martin v. Löwis" <martin at v.loewis.de>
To: "Neal Norwitz" <nnorwitz at gmail.com>
Cc: "Python Dev" <python-dev at python.org>
Sent: Tuesday, May 23, 2006 2:38 PM
Subject: Re: [Python-Dev] [Python-checkins] r46064 - in python/trunk: 
Include/Python.h Include/pyport.h Misc/ACKS Misc/NEWS 
Modules/_localemodule.c Modules/main.c Modules/posixmodule.c 
Modules/sha512module.c PC/pyconfig.h Python/thread_nt.h


> Neal Norwitz wrote:
>> What is the reason for the DONT_HAVE_* macros?  Can we use the HAVE_*
>> versions?
>
> I think the actual rationale is that the contributor didn't want to be
> bothered with modifying configure.in, and running autoconf.

I thought that smaller patches would generally be preferred, but yes it is 
also more effort to modify configure, especially when doing development on 
Windows. In the case of errno.h there were already 3 occurrences of 
DONT_HAVE_ERRNO_H in the source and no occurrences of HAVE_ERRNO_H, and I 
wanted to be consistent.

> I accepted the change since systems which don't have, say, <errno.h>,
> are "unusual", in the sense that they aren't standard C systems
> (as standard C mandates errno.h, at least in a hosted environment).
> So far, only one target platform doesn't have these things (Windows
> CE), so the reasoning is that the burden of setting things right
> should be on that system.

Yes, this is also my reasoning and is supported by the comments in 
Include/pyport.h about sys/stat.h.

> Of course, it would be fairly straight-forward to convert this
> to standard autoconf machinery (if one remembers to update
> PC/pyconfig.h accordingly). I'm sure Luke Dunstan would be willing
> to revise the patch in that direction if that is agreed.
>
> Regards,
> Martin

Yes, I can do that. But what about the other 3 versions of pyconfig.h in 
platform subdirectories?

Luke

From guido at python.org  Tue May 23 18:47:56 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 23 May 2006 09:47:56 -0700
Subject: [Python-Dev] Changing python int to "long long".
In-Reply-To: <20060523110805.GB32617@tummy.com>
References: <20060523110805.GB32617@tummy.com>
Message-ID: <ca471dc20605230947i5970a41eyab865d46d0fe9f18@mail.gmail.com>

On 5/23/06, Sean Reifschneider <jafo-python-dev at tummy.com> wrote:
> We've been discussing the possibility of converting the Python int type to
> long long (from long).  I played around with it some, and it's going to be
> a huge change that probably will break most C extensions until.  However,
> as uncletimmy says, "Python is so stinking slow" that it probably won't
> make much of a negative impact on the performance.

Based on the stability picture alone I think this must wait for 2.6.

> The reason we're looking at this at Need for Speed is that EWT has a lot of
> code that uses ints between the 32 and 64 bit range.  It'd be a hard change
> to make to convert to long long for regular ints.  I'm going to speak with
> John and Runar today about if we can make just a long long extension type
> and live with it not integrating with other type coercion as well as an int
> would.

Actually it should be able to integrate almost perfectly given that
2.5 has __index__.
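
A minimal sketch of what that integration could look like, written in
Python rather than as a C extension type. The Int64 class is hypothetical
and stands in for the proposed extension type; only __index__ itself is
the real protocol:

```python
class Int64:
    """Hypothetical stand-in for a 64-bit integer extension type."""

    def __init__(self, value):
        if not -(1 << 63) <= value < (1 << 63):
            raise OverflowError("value does not fit in a signed 64-bit integer")
        self._value = value

    def __index__(self):
        # Lets the object be used wherever Python needs a true integer
        # index, such as sequence subscripts and slice bounds.
        return self._value


letters = "abcdef"
third = letters[Int64(2)]             # subscript with the custom type
middle = letters[Int64(1):Int64(4)]   # slice bounds with the custom type
```

Arithmetic with regular ints would still need the usual numeric methods;
__index__ only covers the places where an exact integer is required.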

> The big deal right now is on 32 bit platforms, giving the 64-bits for int.
> However, it will also be a win for 64-bit platforms for ints that fall
> between 64 and 128 bits.

But who needs those? And who says that long long is 128 bits on a
64-bit machine? (At least on Windows 64, long is still 32 bit, so I
expect long long to be 64 bit there.)
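
One way to check rather than guess, sketched with the ctypes module that
is new in 2.5 (the helper name and the table of types are made up for
illustration):

```python
import ctypes

# C integer types of interest, paired with their ctypes equivalents.
C_TYPES = (
    ("int", ctypes.c_int),
    ("long", ctypes.c_long),
    ("long long", ctypes.c_longlong),
)


def c_integer_widths():
    """Return the width in bits of each C integer type on the running platform."""
    return {name: ctypes.sizeof(ctype) * 8 for name, ctype in C_TYPES}


for name, bits in sorted(c_integer_widths().items()):
    print("%-10s %d bits" % (name, bits))
```

The C standard only promises that long long is at least 64 bits wide, so
the output has to be read per platform.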

> So, I ran pybench on a half-arsed conversion of ints to long long, and only
> lost around 1.11% over all.  However, for simple math I got a solid 25%
> speed-up (n = ((((i << 33) * 2) + 1) / 3) - 2), and for 1<<35 I got a 34%
> speed up, compared with them getting converted to Python longs.
>
> For ints that will fit entirely within 32-bits, I'm seeing around 11% worse
> performance when going to 64-bit types.
>
> This is all on the 2.5 trunk.
>
> My conclusion is that it's probably enough of a performance drop that we
> can't just do it wholesale.  It has a big gain for ints within 32 and 64
> on 32-bit platforms, but for <32-bit ints it's going to be pretty bad.
> Within the next few years many platforms, particularly performance-critical
> ones, will probably be 64 bit anyway, so fewer people will probably be able
> to see an advantage to it.

Indeed. I say EWT can solve the problem much cheaper by using 64-bit
hardware, compared to the cost imposed on the rest of the world.

> The ideal would probably be to have another int type and up-convert to a
> long long, THEN go to arbitrary precision, but that's a pretty big change.
> Less big than converting the int to long long though.  The easiest would be
> to make an extension that has long long, if that will work for speeding up
> applications like EWT has.  I'll follow up with them on this.
>
> Any thoughts on this?

In 2.6, I'd be okay with standardizing int on 64 bits everywhere (I
don't think bothering with 128 bits on 64-bit platforms is worth it).
In 2.5, I think we should leave this alone.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From brett at python.org  Tue May 23 21:46:52 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 23 May 2006 12:46:52 -0700
Subject: [Python-Dev] SSH key for work computer
Message-ID: <bbaeab100605231246g5ef4eba7tb37d217540e85764@mail.gmail.com>

Can someone install the attached SSH key (it's for my work machine)?  The
fingerprint is::

 cd:69:15:52:b2:e5:dc:2e:73:f1:62:1a:12:49:2b:a1 bcannon at google.com


Also, how hard is it to have a specific key uninstalled?  The reason I ask
is that my internship is only for three months so that after that the key
won't be valid.

-Brett
-------------- next part --------------
A non-text attachment was scrubbed...
Name: id_dsa.pub
Type: application/octet-stream
Size: 1121 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060523/38413a7b/attachment.obj 

From jimjjewett at gmail.com  Tue May 23 22:30:55 2006
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 23 May 2006 16:30:55 -0400
Subject: [Python-Dev] 2.5 schedule
Message-ID: <fb6fbf560605231330n145e0266o891603c0b31d8352@mail.gmail.com>

Thomas Wouters  wrote:

> The fact that much testing goes on between releases
> as well makes the alphas even less important.

I suspect the amount of MS Windows testing done between releases is
fairly small.  And what little there is doesn't always use the same
compiler that will eventually be used for the final release.

A snapshot build would fill this gap almost as well as a formal alpha.

-jJ

From tim.peters at gmail.com  Tue May 23 22:50:17 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 23 May 2006 20:50:17 +0000
Subject: [Python-Dev] SSH key for work computer
In-Reply-To: <bbaeab100605231246g5ef4eba7tb37d217540e85764@mail.gmail.com>
References: <bbaeab100605231246g5ef4eba7tb37d217540e85764@mail.gmail.com>
Message-ID: <1f7befae0605231350v3ec9543fs3b5cb194238c8ecf@mail.gmail.com>

[Brett Cannon]
> Can someone install the attached SSH key (it's for my work machine)?  The
> fingerprint is::
>
>  cd:69:15:52:b2:e5:dc:2e:73:f1:62:1a:12:49:2b:a1
> bcannon at google.com

I tried.  Scream at someone else if it didn't work ;-)

> Also, how hard is it to have a specific key uninstalled?

While I have yet to try that, it appears dead easy, provided you
remember the comment on the key in question ("bcannon at google.com" in
this case).

From martin at v.loewis.de  Tue May 23 23:53:12 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 23 May 2006 23:53:12 +0200
Subject: [Python-Dev] [Python-checkins] r46064 -
 in	python/trunk:	Include/Python.h Include/pyport.h
 Misc/ACKS	Misc/NEWS	Modules/_localemodule.c Modules/main.c
 Modules/posixmodule.c	Modules/sha512module.c
 PC/pyconfig.h	Python/thread_nt.h
In-Reply-To: <BAY109-DAV147E15E853B6B7D289816D8A9B0@phx.gbl>
References: <20060522091613.3A64F1E4014@bag.python.org><ee2a432c0605222252m7a8ca95s3ad61af33c960464@mail.gmail.com>	<4472ADD0.5060109@v.loewis.de>
	<BAY109-DAV147E15E853B6B7D289816D8A9B0@phx.gbl>
Message-ID: <44738448.1030803@v.loewis.de>

Luke Dunstan wrote:
> Yes, I can do that. But what about the other 3 versions of pyconfig.h in 
> platform subdirectories?

If you think you can fix them, go ahead and include changes; no need to
test these changes. If they are wrong, some user of the platform will
hopefully provide fixes.

Likewise, if you don't think you can fix them, don't bother - some
user of the platform will provide the missing pieces.

Or, if no contributions from a user arrive, it either means that
it just works fine, or that there are no users.

Regards,
Martin

From martin at v.loewis.de  Wed May 24 00:01:10 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 24 May 2006 00:01:10 +0200
Subject: [Python-Dev] Changing python int to "long long".
In-Reply-To: <20060523110805.GB32617@tummy.com>
References: <20060523110805.GB32617@tummy.com>
Message-ID: <44738626.4000706@v.loewis.de>

Sean Reifschneider wrote:
> The big deal right now is on 32 bit platforms, giving the 64-bits for int.
> However, it will also be a win for 64-bit platforms for ints that fall
> between 64 and 128 bits.

As Guido suggests: long long isn't 128 bits on most 64-bit platforms
(AFAIK).

> My conclusion is that it's probably enough of a performance drop that we
> can't just do it wholesale.  It has a big gain for ints within 32 and 64
> on 32-bit platforms, but for <32-bit ints it's going to be pretty bad.
> Within the next few years many platforms, particularly performance-critical
> ones, will probably be 64 bit anyway, so fewer people will probably be able
> to see an advantage to it.

I'm more worried about the actual breakage than about the speed loss.

For example, what will the "i" modifier for PyArg_ParseTuple do?
What will PyInt_AsLong do? What about systems where "long long"
is not supported?

Not that those can't be resolved (they were resolved, in some way or
the other, for ssize_t). However, I think a complete specification of
the proposed changes should be available before integrating this
into Python.

Regards,
Martin

From tim.peters at gmail.com  Wed May 24 00:27:39 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 23 May 2006 22:27:39 +0000
Subject: [Python-Dev] Changing python int to "long long".
In-Reply-To: <ca471dc20605230947i5970a41eyab865d46d0fe9f18@mail.gmail.com>
References: <20060523110805.GB32617@tummy.com>
	<ca471dc20605230947i5970a41eyab865d46d0fe9f18@mail.gmail.com>
Message-ID: <1f7befae0605231527x485be893od0758f60b1e68619@mail.gmail.com>

[Guido]
> ...
> In 2.6, I'd be okay with standardizing int on 64 bits everywhere (I
> don't think bothering with 128 bits on 64-bit platforms is worth it).
> In 2.5, I think we should leave this alone.

Nobody panic.  This wasn't on the table for 2.5, and as Martin points
out it needs more specification regardless.  If the current ~10%
slowdown for 32-bit ints in its presence (as measured by something
;-)) persists, I expect it would have a justifiably hard time getting
into 2.6.

I'm blessing small language changes at the sprint, but so far (and I
expect going on) they're harmless.  For example, Georg Brandl wanted
to change PyErr_NewException() to accept a tuple of base classes,
because some of the decimal module's exceptions have multiple bases
and coding that bit in C was needlessly convoluted without liberating
PyErr_NewException() from its historic single-inheritance limitation.
I think that's fine, so approved it.

Nevertheless, if someone feels a patch committed to the trunk goes
over the edge, do complain to me:  it's my job to push back here when
needed.  OTOH, I expect other "wild ideas" from the sprint to be
brought up here, and when they are please just assume that they're not
being proposed for 2.5 unless that's explicitly stated.

From anthony at interlink.com.au  Wed May 24 03:57:11 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed, 24 May 2006 11:57:11 +1000
Subject: [Python-Dev] 2.5 schedule
In-Reply-To: <ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>
References: <ee2a432c0605182206h3a07b1f7v31a9a8ea7da938ad@mail.gmail.com>
	<20060521135313.GA13431@panix.com>
	<ee2a432c0605222300y7098f5adtec7726ed7c8a6327@mail.gmail.com>
Message-ID: <200605241157.17651.anthony@interlink.com.au>

Ok - we're going to skip the 3rd alpha, and the next release will be 
beta1, currently scheduled for June 14th. With all the changes this 
week from the needforspeed sprint, cutting a release seems like too 
much of a risk.


From dalke at dalkescientific.com  Tue May 23 16:41:18 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue, 23 May 2006 14:41:18 +0000
Subject: [Python-Dev] PySequence_Fast_GET_ITEM in string join
Message-ID: <e412b467a9baf3f98e0421cb428d1dd6@dalkescientific.com>

The Sourceforge tracker is kaputt so I'm sending it here, in part
because the effbot says it's interesting.

I can derive from list and override __getitem__

 >>> class Spam(list):
...   def __getitem__(self, i):
...     print i
...     return 2
...
 >>> Spam()[1]
1
2
 >>> Spam()[9]
9
2
 >>>

Now consider the following

 >>> class Spam(list):
...   def __getitem__(self, i):
...     print "Asking for", i
...     if i == 0: return "zero"
...     if i == 1: return "one"
...     raise IndexError, i
...
 >>> Spam()[1]
Asking for 1
'one'
 >>>

Spiffy!  For my next trick....

 >>> "".join(Spam())
''
 >>>

The relevant code in stringobject uses PySequence_Fast_GET_ITEM(seq, i),
which likely doesn't know about my derived __getitem__.

         p = PyString_AS_STRING(res);
         for (i = 0; i < seqlen; ++i) {
                 size_t n;
                 item = PySequence_Fast_GET_ITEM(seq, i);
                 n = PyString_GET_SIZE(item);
                 memcpy(p, PyString_AS_STRING(item), n);
                 p += n;
                 if (i < seqlen - 1) {
                         memcpy(p, sep, seplen);
                         p += seplen;
                 }
         }


The Unicode join has the same problem

 >>> class Spam(list):
...   def __getitem__(self, i):
...     print "Asking for", i
...     if i == 0: return "zero"
...     if i == 1: return u"one"
...     raise IndexError, i
...
 >>> "".join(Spam())
''

While if my class is not derived from list, everything is copacetic.

 >>> class Spam:
...   def __getitem__(self, i):
...     print "Asking for", i
...     if i == 0: return "zero"
...     if i == 1: return u"one"
...     raise IndexError, i
...
 >>> "".join(Spam())
Asking for 0
Asking for 1
Asking for 2
u'zeroone'
 >>>

Ditto for deriving from object.

 >>> class Spam(object):
...   def __getitem__(self, i):
...     print "Asking for", i
...     if i == 0: return "zero"
...     if i == 1: return "one"
...     raise IndexError, i
...
 >>> "".join(Spam())
Asking for 0
Asking for 1
Asking for 2
'zeroone'
 >>>

					Andrew
					dalke at dalkescientific.com


From dalke at dalkescientific.com  Tue May 23 16:52:21 2006
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Tue, 23 May 2006 14:52:21 +0000
Subject: [Python-Dev] PySequence_Fast_GET_ITEM in string join
In-Reply-To: <e412b467a9baf3f98e0421cb428d1dd6@dalkescientific.com>
References: <e412b467a9baf3f98e0421cb428d1dd6@dalkescientific.com>
Message-ID: <edcd9e4d2c7dbcdb077ddbd6be0a9467@dalkescientific.com>

Me [Andrew Dalke] said:
> The relevant code in stringobject uses PySequence_Fast_GET_ITEM(seq, 
> i),
> which likely doesn't know about my derived __getitem__.

Oops, I didn't know what the code was doing well enough.  The
relevant problem is

         seq = PySequence_Fast(orig, "");

That calls __iter__ on my derived list object, which knows nothing
about my overridden __getitem__

So things are working as designed.
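
The difference between the two cases can be reproduced in a few lines. This is a sketch assuming Python 3 (the sessions above are Python 2); the names fake_item, ListSpam and PlainSpam are made up for illustration.

```python
def fake_item(i):
    # Pretend to be a two-element sequence, as in the sessions above.
    if i == 0:
        return "zero"
    if i == 1:
        return "one"
    raise IndexError(i)


class ListSpam(list):
    # A real (empty) list whose __getitem__ disagrees with its storage.
    def __getitem__(self, i):
        return fake_item(i)


class PlainSpam:
    # No __iter__, so iteration falls back to calling __getitem__
    # with 0, 1, 2, ... until IndexError is raised.
    def __getitem__(self, i):
        return fake_item(i)


from_list = "".join(ListSpam())    # join sees the empty list storage
from_plain = "".join(PlainSpam())  # join goes through __getitem__

print(repr(from_list), repr(from_plain))
```

Explicit indexing such as ListSpam()[1] still goes through the override; only code that reads the underlying list storage bypasses it.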

Well, back to blundering about.  Too much brennivin. ;)

					Andrew
					dalke at dalkescientific.com


From jafo-python-dev at tummy.com  Wed May 24 12:24:41 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Wed, 24 May 2006 04:24:41 -0600
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
Message-ID: <20060524102441.GD32617@tummy.com>

We're working at the sprint on tracking this down.  I want to provide some
history first and then what we're looking for feedback on.

Steve Holden found this on Sunday, the pybench try/except test shows a ~60%
slowdown from 2.4.3 to 2.5a2.  The original test is, roughly:

   for i in range(N):
      try: raise ValueError, 'something'
      except: pass

But changing it to the following shows 0% slowdown from 2.4.3 to 2.5a2:

   e = ValueError('something')
   for i in range(N):
      try: raise e
      except: pass

The relevant change between 2.4.3 and 2.5a2 is Brett Cannon's patch to make
all exceptions new-style objects.
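
The two loops can be timed side by side with the timeit module. This is a sketch assuming Python 3, so it will not reproduce the 2.4.3 versus 2.5a2 numbers; the function names raise_fresh and raise_reused are made up for illustration.

```python
import timeit


def raise_fresh(n):
    # Instantiate a new exception on every iteration.
    caught = 0
    for _ in range(n):
        try:
            raise ValueError("something")
        except ValueError:
            caught += 1
    return caught


def raise_reused(n):
    # Instantiate once, raise the same instance every time.
    # Python 3 attaches traceback state to the instance on each raise,
    # so n is kept modest here.
    e = ValueError("something")
    caught = 0
    for _ in range(n):
        try:
            raise e
        except ValueError:
            caught += 1
    return caught


if __name__ == "__main__":
    n = 10000
    for fn in (raise_fresh, raise_reused):
        best = min(timeit.repeat(lambda: fn(n), number=1, repeat=3))
        print("%-13s %.3f usec per raise" % (fn.__name__, best / n * 1e6))
```

The gap between the two results approximates the cost of instantiating the exception, separate from the cost of raising and catching it.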

Brett provided the following direction:

   >Right, I meant change how it (BaseException) is written.  Right now
   >it uses PyMethodDef for magic methods and just uses PyType_New()
   >as a constructor.  I was wondering if, for some reason, it would be
   >faster if you used a PyType_Type definition for BaseException and
   >used the proper C struct to associate the methods with the class.

Richard Jones has done some investigation, and we're looking at moving
away from the current implementation, which is basically a direct
reimplementation of the old-style exception that inherits from object.
Converting it to a type in C should reduce the cost dramatically.

We're looking for feedback on where this may cause problems or break
things.  Thoughts?

Thanks,
Sean
-- 
 Thieves broke into Scotland Yard yesterday and stole all the toilets.
 Detectives say they have nothing to go on.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From mwh at python.net  Wed May 24 12:36:52 2006
From: mwh at python.net (Michael Hudson)
Date: Wed, 24 May 2006 11:36:52 +0100
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <20060524102441.GD32617@tummy.com> (Sean Reifschneider's
	message of "Wed, 24 May 2006 04:24:41 -0600")
References: <20060524102441.GD32617@tummy.com>
Message-ID: <2mmzd74sl7.fsf@starship.python.net>

Sean Reifschneider <jafo-python-dev at tummy.com> writes:

> We're working at the sprint on tracking this down.  I want to provide some
> history first and then what we're looking for feedback on.
>
> Steve Holden found this on Sunday, the pybench try/except test shows a ~60%
> slowdown from 2.4.3 to 2.5a2.  The original test is, roughly:
>
>    for i in range(N):
>       try: raise ValueError, 'something'
>       except: pass
>
> But changing it to the following shows 0% slowdown from 2.4.3 to 2.5a2:
>
>    e = ValueError('something')
>    for i in range(N):
>       try: raise e
>       except: pass
>
> The relevant change between 2.4.3 and 2.5a2 is Brett Cannon's patch to make
> all exceptions new-style objects.

Could it just be that instantiating instances of new-style classes is
slower than instantiating instances of old-style classes?  There's not
anything in what you've posted to suggest that exceptions are involved
directly.

Cheers,
mwh

-- 
  Get out your salt shakers folks, this one's going to take more
  than one grain.                 -- Ator in an Ars Technica news item

From fredrik at pythonware.com  Wed May 24 13:23:09 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 24 May 2006 13:23:09 +0200
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <2mmzd74sl7.fsf@starship.python.net>
References: <20060524102441.GD32617@tummy.com>
	<2mmzd74sl7.fsf@starship.python.net>
Message-ID: <e51fmp$8p4$1@sea.gmane.org>

Michael Hudson wrote:

> Could it just be that instantiating instances of new-style classes is
> slower than instantiating instances of old-style classes?  There's not
> anything in what you've posted to suggest that exceptions are involved
> directly.

python -mtimeit -s "class Exception(object): pass" "Exception()"
1000000 loops, best of 3: 0.284 usec per loop

python -mtimeit -s "class Exception: pass" "Exception()"
1000000 loops, best of 3: 0.388 usec per loop

python -mtimeit "Exception()"
1000000 loops, best of 3: 1.95 usec per loop
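
The same kind of measurement can be scripted rather than run from the shell. This is a sketch assuming Python 3, where every class is new-style, so the old-style case cannot be reproduced; best_usec is a made-up helper name.

```python
import timeit


def best_usec(stmt, setup="pass", number=100000, repeat=3):
    # Best-of-N time per execution of stmt, in microseconds.
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


# A trivial user-defined class versus the built-in Exception type.
user_class = best_usec("E()", setup="class E(object): pass")
builtin_exc = best_usec("Exception()")

print("user-defined class: %.3f usec per loop" % user_class)
print("built-in Exception: %.3f usec per loop" % builtin_exc)
```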

</F>


From fredrik at pythonware.com  Wed May 24 15:02:17 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 24 May 2006 15:02:17 +0200
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <e51fmp$8p4$1@sea.gmane.org>
References: <20060524102441.GD32617@tummy.com>	<2mmzd74sl7.fsf@starship.python.net>
	<e51fmp$8p4$1@sea.gmane.org>
Message-ID: <e51lgs$ueq$1@sea.gmane.org>

Fredrik Lundh wrote:

>> Could it just be that instantiating instances of new-style classes is
>> slower than instantiating instances of old-style classes?  There's not
>> anything in what you've posted to suggest that exceptions are involved
>> directly.

for completeness, here's the corresponding results from 2.4:

python -mtimeit -s "class Exception(object): pass" "Exception()"
1000000 loops, best of 3: 0.278 usec per loop

python -mtimeit -s "class Exception: pass" "Exception()"
1000000 loops, best of 3: 0.387 usec per loop

python -mtimeit "Exception()"
1000000 loops, best of 3: 0.989 usec per loop

> python -mtimeit -s "class Exception(object): pass" "Exception()"
> 1000000 loops, best of 3: 0.284 usec per loop
> 
> python -mtimeit -s "class Exception: pass" "Exception()"
> 1000000 loops, best of 3: 0.388 usec per loop
> 
> python -mtimeit "Exception()"
> 1000000 loops, best of 3: 1.95 usec per loop

</F>


From jafo at tummy.com  Wed May 24 12:56:48 2006
From: jafo at tummy.com (Sean Reifschneider)
Date: Wed, 24 May 2006 04:56:48 -0600
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <2mmzd74sl7.fsf@starship.python.net>
References: <20060524102441.GD32617@tummy.com>
	<2mmzd74sl7.fsf@starship.python.net>
Message-ID: <20060524105648.GF32617@tummy.com>

On Wed, May 24, 2006 at 11:36:52AM +0100, Michael Hudson wrote:
>Could it just be that instantiating instances of new-style classes is
>slower than instantiating instances of old-style classes?  There's not
>anything in what you've posted to suggest that exceptions are involved
>directly.

Sorry, I should have mentioned that.  Yes, instantiating new-style classes
is slower than old-style, but only by about 10%.  The slow-down for
try/except is more like 60%.  So, it's not just the new versus old
difference.  It's much, much more than that.

Thanks,
Sean
-- 
 The Law of Software Development and Envelopment at MIT: Every program
 in development at MIT expands until it can read mail.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From jeremy at alum.mit.edu  Wed May 24 17:37:27 2006
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 24 May 2006 11:37:27 -0400
Subject: [Python-Dev] PySequence_Fast_GET_ITEM in string join
In-Reply-To: <edcd9e4d2c7dbcdb077ddbd6be0a9467@dalkescientific.com>
References: <e412b467a9baf3f98e0421cb428d1dd6@dalkescientific.com>
	<edcd9e4d2c7dbcdb077ddbd6be0a9467@dalkescientific.com>
Message-ID: <e8bf7a530605240837m245e5593h8ea58c0828419d32@mail.gmail.com>

On 5/23/06, Andrew Dalke <dalke at dalkescientific.com> wrote:
> Me [Andrew Dalke] said:
> > The relevant code in stringobject uses PySequence_Fast_GET_ITEM(seq,
> > i),
> > which likely doesn't know about my derived __getitem__.
>
> Oops, I didn't know what the code was doing well enough.  The
> relevant problem is
>
>          seq = PySequence_Fast(orig, "");
>
> That calls __iter__ on my derived list object, which knows nothing
> about my overridden __getitem__

Put another way, the problem is related to your initial claim:
> I can derive from list and override __getitem__

You can do that, but you should not.  It's almost a bug that you can.

Jeremy


>
> So things are working as designed.
>
> Well, back to blundering about.  Too much brennivin. ;)
>
>                                         Andrew
>                                         dalke at dalkescientific.com
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu
>

From fredrik at pythonware.com  Wed May 24 19:36:59 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 24 May 2006 19:36:59 +0200
Subject: [Python-Dev] replace on empty strings
Message-ID: <e525jn$th1$1@sea.gmane.org>

so, which one is correct ?

Python 2.4.3
 >>> "".replace("", "a")
''
 >>> u"".replace(u"", u"a")
u'a'

</F>


From jafo-python-dev at tummy.com  Wed May 24 19:39:27 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Wed, 24 May 2006 11:39:27 -0600
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <20060524102441.GD32617@tummy.com>
References: <20060524102441.GD32617@tummy.com>
Message-ID: <20060524173927.GG32617@tummy.com>

We've done some more research on it, and Richard Jones is working on it
right now.  We'll see how it works, probably tomorrow.

Thanks,
Sean
-- 
 If you don't have time to do it right, when will you ever find time to do
 it over?
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From guido at python.org  Wed May 24 19:45:14 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 24 May 2006 10:45:14 -0700
Subject: [Python-Dev] replace on empty strings
In-Reply-To: <e525jn$th1$1@sea.gmane.org>
References: <e525jn$th1$1@sea.gmane.org>
Message-ID: <ca471dc20605241045h6b03380ft39fbda90d6694aae@mail.gmail.com>

On 5/24/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> so, which one is correct ?
>
> Python 2.4.3
>  >>> "".replace("", "a")
> ''
>  >>> u"".replace(u"", u"a")
> u'a'

Since 'x'.replace('', 'a') and u'x'.replace('', u'a') return 'axa' and
u'axa', respectively, I conclude that the unicode version is correct
and the 8-bit string version is an anomaly.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ianb at colorstudy.com  Wed May 24 22:25:08 2006
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 24 May 2006 15:25:08 -0500
Subject: [Python-Dev] [Web-SIG]   Adding wsgiref to stdlib
In-Reply-To: <44728E25.60807@colorstudy.com>
References: <ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>	<ca471dc20604281103o3cc9ea5fnd7e03dd724f7b0b0@mail.gmail.com>	<5.1.1.6.0.20060522153116.03ea4468@mail.telecommunity.com>
	<44728E25.60807@colorstudy.com>
Message-ID: <4474C124.3030507@colorstudy.com>

Ian Bicking wrote:
> Phillip J. Eby wrote:
> 
>>At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:
>>
>>>I'd like to include paste.lint with that as well (as wsgiref.lint or
>>>whatever).  Since the last discussion I enumerated in the docstring all
>>>the checks it does.  There's still some outstanding issues, mostly where
>>>I'm not sure if it is too restrictive (marked with @@ in the source).
>>>It's at:
>>>
>>>   http://svn.pythonpaste.org/Paste/trunk/paste/lint.py
>>
>>Ian, I see this is under the MIT license.  Do you also have a PSF 
>>contributor agreement (to license under AFL/ASF)?  If not, can you place 
>>a copy of this under a compatible license so that I can add this to the 
>>version of wsgiref that gets checked into the stdlib?
> 
> 
> I don't have a contributor agreement.  I can change the license in 
> place, or sign an agreement, or whatever; someone should just tell me 
> what to do.

I faxed in a contributor agreement, and added this to the comment
header of the file:


# Also licenced under the Apache License, 2.0: http://opensource.org/licenses/apache2.0.php
# Licensed to PSF under a Contributor Agreement

From greg.ewing at canterbury.ac.nz  Thu May 25 02:53:03 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 25 May 2006 12:53:03 +1200
Subject: [Python-Dev] replace on empty strings
In-Reply-To: <e525jn$th1$1@sea.gmane.org>
References: <e525jn$th1$1@sea.gmane.org>
Message-ID: <4474FFEF.5060305@canterbury.ac.nz>

Fredrik Lundh wrote:
> so, which one is correct ?
> 
> Python 2.4.3
>  >>> "".replace("", "a")
> ''
>  >>> u"".replace(u"", u"a")
> u'a'

Probably there shouldn't be any "correct" in this case,
i.e. the result of replacing an empty string should be
undefined (because any string contains infinitely many
empty substrings).

+0 on raising an exception if you try.

--
Greg


From tim.peters at gmail.com  Thu May 25 03:33:11 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 25 May 2006 01:33:11 +0000
Subject: [Python-Dev] replace on empty strings
In-Reply-To: <4474FFEF.5060305@canterbury.ac.nz>
References: <e525jn$th1$1@sea.gmane.org> <4474FFEF.5060305@canterbury.ac.nz>
Message-ID: <1f7befae0605241833o148ad09dnf69f915e1dfc4fd6@mail.gmail.com>

[/F]
>> so, which one is correct ?
>>
>> Python 2.4.3
>>  >>> "".replace("", "a")
>> ''
>>  >>> u"".replace(u"", u"a")
>> u'a'

[Greg Ewing]
> Probably there shouldn't be any "correct" in this case,
> i.e. the result of replacing an empty string should be
> undefined (because any string contains infinitely many
> empty substrings).

Where are they?  For a string s, I count s[0:0], s[1:1], ...,
s[len(s):len(s)], or len(s)+1 empty substrings in all.

While str and unicode `replace` currently disagree about that when
len(s)==0, they agree when len(s)>0:

>>> " ".replace("", "A")
'A A'
>>> u" ".replace("", "A")
u'A A'

> +0 on raising an exception if you try.

I'd be +1, except the idea that there are len(s)+1 empty substrings in
a string is pretty much ubiquitous:

>>> "" in ""
True
>>> u"" in u""
True
>>> "".index("")
0
>>> u"".index(u"")
0
>>> " ".rindex("")
1
>>> u" ".rindex(u"")
1
>>> "".count("")
1
>>> u"".count(u"")
1
>>> " ".count("")
2
>>> u" ".count(u"")
2

So the current str.replace really is an oddball.
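
The len(s)+1 rule can be written down directly. This is a sketch assuming Python 3, where str behaves like the 2.4 unicode type in the examples above; replace_empty and count_empty are made-up helper names.

```python
def count_empty(s):
    # Empty substrings sit at s[0:0], s[1:1], ..., s[len(s):len(s)].
    return sum(1 for i in range(len(s) + 1) if s[i:i] == "")


def replace_empty(s, new):
    # Replacing every empty substring means inserting `new` before each
    # character and once more at the end.
    return new.join([""] + list(s) + [""])


for s in ("", " ", "x"):
    print(repr(s), count_empty(s), repr(replace_empty(s, "a")))
```

Under this rule the empty string has exactly one empty substring, so replacing it yields one copy of the replacement, matching the unicode result quoted above.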

From guido at python.org  Thu May 25 03:33:47 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 24 May 2006 18:33:47 -0700
Subject: [Python-Dev] replace on empty strings
In-Reply-To: <4474FFEF.5060305@canterbury.ac.nz>
References: <e525jn$th1$1@sea.gmane.org> <4474FFEF.5060305@canterbury.ac.nz>
Message-ID: <ca471dc20605241833lb386b38nd4390fbb660249b9@mail.gmail.com>

On 5/24/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Fredrik Lundh wrote:
> > so, which one is correct ?
> >
> > Python 2.4.3
> >  >>> "".replace("", "a")
> > ''
> >  >>> u"".replace(u"", u"a")
> > u'a'
>
> Probably there shouldn't be any "correct" in this case,
> i.e. the result of replacing an empty string should be
> undefined (because any string contains infinitely many
> empty substrings).

No. That's what older versions of Python did, and it was changed to
the current behavior, except someone screwed up the edge case for
8-bit strings.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From exarkun at divmod.com  Thu May 25 04:03:47 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Wed, 24 May 2006 22:03:47 -0400
Subject: [Python-Dev] Socket regression
In-Reply-To: 0
Message-ID: <20060525020347.28682.1767320259.divmod.quotient.4992@ohm>

I'd like to point out sf bug #1494314 ( http://sourceforge.net/tracker/index.php?func=detail&aid=1494314&group_id=5470&atid=105470 ) as an important one to fix before 2.5.  It's clearly a regression and the fix should be simple (there's a patch on the ticket).

Jean-Paul

From nnorwitz at gmail.com  Thu May 25 09:28:58 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 25 May 2006 00:28:58 -0700
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <20060524173927.GG32617@tummy.com>
References: <20060524102441.GD32617@tummy.com>
	<20060524173927.GG32617@tummy.com>
Message-ID: <ee2a432c0605250028s38fa271egaeabc1b1d42e1ba6@mail.gmail.com>

This is probably orthogonal to the problem, however, you may want to
look into trying to speed up Python/errors.c.  This link will probably
not work due to sessions, but click on the latest run for python and
Python/errors.c

http://coverage.livinglogic.de/coverage/web/selectEntry.do?template=2850&entryToSelect=547460

If you look at the coverage numbers, many functions are called
millions of times.  In many cases traceback is passed as NULL in other
functions within errors.c.  If you could get rid of that check and
possibly others and do some fast pathing in there, it might speed up
exceptions some (or normal cases where exceptions are raised
internally, then discarded).

n
--

On 5/24/06, Sean Reifschneider <jafo-python-dev at tummy.com> wrote:
> We've done some more research on it, and Richard Jones is working on it
> right now.  We'll see how it works, probably tomorrow.
>
> Thanks,
> Sean
> --
>  If you don't have time to do it right, when will you ever find time to do
>  it over?
> Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
> tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
>

From ronaldoussoren at mac.com  Thu May 25 15:12:01 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 25 May 2006 15:12:01 +0200
Subject: [Python-Dev] extensions and multiple source files with the same
	basename
Message-ID: <38D80CE9-E11E-43E7-B80C-2B72D0E4B27E@mac.com>

Hi,

I'm trying to port ctypes to darwin/x86 (aka the new intel macs),
which went pretty smoothly. I am running into some odd behaviour of
distutils now that I'm trying to port those changes to the trunk.

ctypes uses libffi, which contains source files in various
platform-specific directories, such as:

libffi/src/powerpc/ffi_darwin.c
libffi/src/x86/ffi_darwin.c

To make it possible to build a universal binary of ctypes when
--enable-universalsdk is specified I'm using some preprocessor trickery
and add both versions of the file to the source list of the _ctypes
extension.

So far so good. This works just fine when building the out-of-tree
copy of ctypes. However, when I build the in-tree copy of ctypes
distutils suddenly tries to compile both these source files into the
same object file (that is, both get compiled into
temp.../libffi/ffi_darwin.o). That obviously isn't what I want to happen.

The cause of this behaviour is CCompiler._setup_compile: this calls
self.object_filenames with strip_dir=True when doing a python build.
Removing this argument (and therefore having object_filenames fall back
to the default value False for strip_dir) doesn't seem to have bad
effects on the build on OSX.
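
The collision is easy to see in a simplified model of the object-name mapping. This is a sketch only, not the distutils code: object_filename and the build/temp output directory are made up for illustration.

```python
import posixpath


def object_filename(source, strip_dir=False, output_dir="build/temp"):
    # Simplified model: swap the extension for ".o" and optionally drop
    # the directory part, as strip_dir=True does.
    base, _ext = posixpath.splitext(source)
    if strip_dir:
        base = posixpath.basename(base)
    return posixpath.join(output_dir, base + ".o")


sources = [
    "libffi/src/powerpc/ffi_darwin.c",
    "libffi/src/x86/ffi_darwin.c",
]

stripped = [object_filename(s, strip_dir=True) for s in sources]
kept = [object_filename(s) for s in sources]

print(stripped)  # both sources map to the same object file
print(kept)      # directory part preserved, so the names stay distinct
```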

Does anyone know why distutils behaves like this and if it would be  
safe to change this behaviour?

Ronald



From runar at runar.net  Thu May 25 17:01:36 2006
From: runar at runar.net (Runar Petursson)
Date: Thu, 25 May 2006 15:01:36 +0000
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
Message-ID: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>

We've been talking this week about ideas for speeding up the parsing of
Longs coming out of files or network.  The use case is having a large string
with embedded longs and parsing them to real longs.  One approach would be
to use a simple slice:
long(mystring[x:y])

an expensive operation in a tight loop.  The proposed solution is to add
further keyword arguments to Long (such as):

long(mystring, base=10, start=x, end=y)

The start/end would allow for negative indexes, as slices do, but otherwise
simply limit the scope of the parsing.  There are other solutions, using
buffer-like objects and such, but this seems like a simple win for anyone
parsing a lot of text.  I implemented it in a branch  runar-longslice-branch,
but it would need to be updated with Tim's latest improvements to long.
Then you may ask, why not do it for everything else parsing from string--to
which I say it should.  Thoughts?
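
The intended semantics of the start/end arguments can be sketched in pure Python. This is an illustration assuming Python 3, not the code in the branch: parse_long is a made-up name, it handles digits only (no sign or whitespace), and being pure Python it shows the behaviour rather than the speed.

```python
def parse_long(text, base=10, start=None, end=None):
    # Normalize start/end the way a slice would, including negative indexes.
    lo, hi, _step = slice(start, end).indices(len(text))
    if lo >= hi:
        raise ValueError("empty range for parse_long()")
    value = 0
    for i in range(lo, hi):
        # int() raises ValueError for a character that is not a digit
        # in the given base.
        value = value * base + int(text[i], base)
    return value


record = "abc1234xyz"
print(parse_long(record, start=3, end=7))    # same digits as record[3:7]
print(parse_long(record, start=-7, end=-3))  # negative indexes, same range
```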

-- 
Runar Petursson
Betur
runar at betur.net -- http://betur.net

From jafo-python-dev at tummy.com  Thu May 25 17:19:02 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Thu, 25 May 2006 09:19:02 -0600
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
References: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
Message-ID: <20060525151902.GH32617@tummy.com>

On Thu, May 25, 2006 at 03:01:36PM +0000, Runar Petursson wrote:
>simply limit the scope of the parsing.  There are other solutions, using
>buffer-like objects and such, but this seems like a simple win for anyone
>parsing a lot of text.  I implemented it in a branch  

I'll expand on that a little bit, since I talked to Runar yesterday about
it.  My original blurb on the #nfs IRC channel was that this would seem to
have the "camel's nose under the tent" problem, that once this is in you
probably want it in lots of other places where slices may be used.  Another
interface would be something like a lazy string slice: functions that
understood it could operate directly on the sub-buffer in C, while
oblivious functions would have to create a new slice.

Later, John showed Martin and me the Java byte buffer class, and Martin is
working on that now.  This would probably provide a solution.  The Java
interface allows you to push a new start and end, and the buffer then acts
like it's the string in that range.  Useful for parsing without having to
create new objects in a tight loop.

Conversion functions taking a start and end are the low-hanging fruit, but
I think in the long term I would prefer something that does what's
described above.  I believe that Martin is expecting to have something
this week to try.

Thanks,
Sean
-- 
 A computer lets you make more mistakes faster than any invention in human
 history -- with the possible exceptions of handguns and tequila.
                 -- Mitch Ratcliffe
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>


From exarkun at divmod.com  Thu May 25 17:28:07 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Thu, 25 May 2006 11:28:07 -0400
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
Message-ID: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>

On Thu, 25 May 2006 15:01:36 +0000, Runar Petursson <runar at runar.net> wrote:
>We've been talking this week about ideas for speeding up the parsing of
>Longs coming out of files or network.  The use case is having a large string
>with embedded longs and parsing them to real longs.  One approach would be
>to use a simple slice:
>long(mystring[x:y])
>
>an expensive operation in a tight loop.  The proposed solution is to add
>further keyword arguments to Long (such as):
>
>long(mystring, base=10, start=x, end=y)
>
>The start/end would allow for negative indexes, as slices do, but otherwise
>simply limit the scope of the parsing.  There are other solutions, using
>buffer-like objects and such, but this seems like a simple win for anyone
>parsing a lot of text.  I implemented it in a branch  runar-longslice-branch,
>but it would need to be updated with Tim's latest improvements to long.
>Then you may ask, why not do it for everything else parsing from string--to
>which I say it should.  Thoughts?

This really seems like a poor option.  Why fix the problem with a hundred special cases instead of a single general solution?

Hmm, one reason could be that the general solution doesn't work:

  exarkun at kunai:~$ python
  Python 2.4.3 (#2, Apr 27 2006, 14:43:58) 
  [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> long(buffer('1234', 0, 3))
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  ValueError: null byte in argument for long()
  >>> long(buffer('123a', 0, 3))
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  ValueError: invalid literal for long(): 123a
  >>> 

Still, fixing that seems like a better idea. ;)

Jean-Paul

From guido at python.org  Thu May 25 17:30:33 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 25 May 2006 08:30:33 -0700
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <20060525151902.GH32617@tummy.com>
References: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
	<20060525151902.GH32617@tummy.com>
Message-ID: <ca471dc20605250830y3c60b903v1815eb6eef61db44@mail.gmail.com>

On 5/25/06, Sean Reifschneider <jafo-python-dev at tummy.com> wrote:
> Conversion functions taking a start and end are the low-hanging fruit, but
> I think in the long term I would prefer something that does what's
> described above.  I believe that Martin is expecting to have something
> this week to try.

I'm -1 on this. It adds considerable API complications for something
that 99.99% of users can do without regrets by using a slice. Yes, I
know regex and string searches already have this API complication.
They have a much more important use case. I don't want every API that
takes a string argument to also take slice arguments; then everybody
ends up doing duplicate work.

If you really need this for speed, I recommend you make it a custom extension.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhettinger at ewtllc.com  Thu May 25 17:33:35 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 25 May 2006 08:33:35 -0700
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
References: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
Message-ID: <4475CE4F.9030809@ewtllc.com>

Runar Petursson wrote:

> We've been talking this week about ideas for speeding up the parsing 
> of Longs coming out of files or network.  The use case is having a 
> large string with embedded longs and parsing them to real longs.  One 
> approach would be to use a simple slice:
> long(mystring[x:y])
>
> an expensive operation in a tight loop.  The proposed solution is to 
> add further keyword arguments to Long (such as):
>
> long(mystring, base=10, start=x, end=y)
>
> The start/end would allow for negative indexes, as slices do, but 
> otherwise simply limit the scope of the parsing.  There are other 
> solutions, using buffer-like objects and such, but this seems like a 
> simple win for anyone parsing a lot of text.  I implemented it in a 
> branch  runar-longslice-branch, but it would need to be updated with 
> Tim's latest improvements to long.  Then you may ask, why not do it 
> for everything else parsing from string--to which I say it should.  
> Thoughts?


-1 This is a somewhat specialized parsing application and should not be 
allowed to muck-up an otherwise simple, general-purpose API.   
Micro-optimizations do not warrant API changes.   Certainly, it should 
not be propagated to everything that can parse from a string.  Also, you 
are likely to find that the cost of varargs and kwargs will offset the 
slicing gains.

I think the whole notion is off base.  The int(mystring[x:y]) situation 
is only important when you're doing it many times.  Often, you'll be 
doing other conversions as well.  So, you would be better-off creating a 
text version of the struct module that would enable you to extract and 
convert many elements from a long record stored as text.   Alternately, 
you could expose a function styled after fscanf().  IOW, better to 
provide a general-purpose text parsing tool than to muck-up the whole 
language for one specialized application.
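
A minimal sketch of that record-oriented idea, assuming fixed-width text
fields.  The name unpack_text and its (start, end, converter) field layout
are hypothetical, chosen only for illustration, and the code is written for
current Python, where int() covers what long() did.  It still slices
internally; the point is the shape of the API, which a C implementation
could back with in-place conversion.

```python
def unpack_text(record, fields):
    """Convert several fixed-width fields of one text record in a single call.

    fields is a sequence of (start, end, converter) triples; this layout is
    a hypothetical choice made for this sketch, not an existing API.
    """
    return [convert(record[start:end]) for start, end, convert in fields]


# One record holding an integer, a float and a short tag.
layout = [(0, 4, int), (4, 9, float), (9, 12, str)]
values = unpack_text("0012 3.50abc", layout)
```

Describing the whole record once keeps the per-field call overhead out of
the caller's loop, which is where the slicing cost was being paid.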


Raymond

From bob at redivi.com  Thu May 25 17:36:00 2006
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 25 May 2006 15:36:00 +0000
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>
References: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>
Message-ID: <71659010-4A23-428E-81D4-3481980866C2@redivi.com>


On May 25, 2006, at 3:28 PM, Jean-Paul Calderone wrote:

> On Thu, 25 May 2006 15:01:36 +0000, Runar Petursson  
> <runar at runar.net> wrote:
>> We've been talking this week about ideas for speeding up the  
>> parsing of
>> Longs coming out of files or network.  The use case is having a  
>> large string
>> with embedded longs and parsing them to real longs.  One approach  
>> would be
>> to use a simple slice:
>> long(mystring[x:y])
>>
>> an expensive operation in a tight loop.  The proposed solution is  
>> to add
>> further keyword arguments to Long (such as):
>>
>> long(mystring, base=10, start=x, end=y)
>>
>> The start/end would allow for negative indexes, as slices do, but  
>> otherwise
>> simply limit the scope of the parsing.  There are other solutions,  
>> using
>> buffer-like objects and such, but this seems like a simple win for  
>> anyone
>> parsing a lot of text.  I implemented it in a branch  runar- 
>> longslice-
>> branch,
>> but it would need to be updated with Tim's latest improvements to  
>> long.
>> Then you may ask, why not do it for everything else parsing from  
>> string--to
>> which I say it should.  Thoughts?
>
> This really seems like a poor option.  Why fix the problem with a  
> hundred special cases instead of a single general solution?
>
> Hmm, one reason could be that the general solution doesn't work:
>
>   exarkun at kunai:~$ python
>   Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
>   [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
>   Type "help", "copyright", "credits" or "license" for more  
> information.
>   >>> long(buffer('1234', 0, 3))
>   Traceback (most recent call last):
>     File "<stdin>", line 1, in ?
>   ValueError: null byte in argument for long()
>   >>> long(buffer('123a', 0, 3))
>   Traceback (most recent call last):
>     File "<stdin>", line 1, in ?
>   ValueError: invalid literal for long(): 123a

One problem with buffer() is that it does a memcpy of the buffer. A  
zero-copy version of buffer (a view on some object that implements  
the buffer API) would be nice.

-bob


From jcarlson at uci.edu  Thu May 25 18:00:44 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 25 May 2006 09:00:44 -0700
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <71659010-4A23-428E-81D4-3481980866C2@redivi.com>
References: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>
	<71659010-4A23-428E-81D4-3481980866C2@redivi.com>
Message-ID: <20060525085346.690D.JCARLSON@uci.edu>


Bob Ippolito <bob at redivi.com> wrote:
> One problem with buffer() is that it does a memcpy of the buffer. A  
> zero-copy version of buffer (a view on some object that implements  
> the buffer API) would be nice.

Did buffers change?  I seem to remember that there were segfaulting
conditions when using array and buffer due to arrays being resizable and
leaving buffers pointing at ether...or is the memcpy part of the buffer
'fix'?  If so, the changes in performance and memory use may surprise
long-time users of buffer.

 - Josiah

P.S. There is a possible solution to the "buffer pointing at ether due
to realloc" problem, but it would rely on a non-existing (from what I
understand about memory allocation) "would a realloc of pointer foo
point to the same address?"  Of course the other alternative is to just
always alloc when resizing arrays, etc., leaving a version of the old
object around until all buffers pointing to that object have been freed.


From runar at runar.net  Thu May 25 17:59:18 2006
From: runar at runar.net (Runar Petursson)
Date: Thu, 25 May 2006 15:59:18 +0000
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <ca471dc20605250830y3c60b903v1815eb6eef61db44@mail.gmail.com>
References: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
	<20060525151902.GH32617@tummy.com>
	<ca471dc20605250830y3c60b903v1815eb6eef61db44@mail.gmail.com>
Message-ID: <6d6d77a60605250859j7e36f566wb10b08dd14b60d3b@mail.gmail.com>

On 5/25/06, Guido van Rossum <guido at python.org> wrote:
>
>
> If you really need this for speed, I recommend you make it a custom
> extension.
>
The custom extension is a good idea, and what we do now.  But now that the
long algorithm is so fast, it would be nice to expose it at least at the C
level to use a substring.  Any solution--be it buffer-based or even a
module--could have no reuse of the code without an end pointer.  If we added
the end pointer to a FromSubstring (or somesuch) function (leaving
PyLong_FromString alone), it would be easily reused from any parsing
module, assuming it didn't have any negative performance implications.

From blais at furius.ca  Thu May 25 18:20:37 2006
From: blais at furius.ca (Martin Blais)
Date: Thu, 25 May 2006 12:20:37 -0400
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <6d6d77a60605250859j7e36f566wb10b08dd14b60d3b@mail.gmail.com>
References: <6d6d77a60605250801l4583fb71r6f3c3e75f39387fc@mail.gmail.com>
	<20060525151902.GH32617@tummy.com>
	<ca471dc20605250830y3c60b903v1815eb6eef61db44@mail.gmail.com>
	<6d6d77a60605250859j7e36f566wb10b08dd14b60d3b@mail.gmail.com>
Message-ID: <8393fff0605250920l5e15490pdf06fa478899f49a@mail.gmail.com>

On 5/25/06, Runar Petursson <runar at runar.net> wrote:
>
> On 5/25/06, Guido van Rossum <guido at python.org> wrote:
> >
> > If you really need this for speed, I recommend you make it a custom
> extension.
> >
> >
>  The custom extension is a good idea, and what we do now.  But now that the
> long algorithm is so fast, it would be nice to expose it at least at the C
> level to use a substring.  Any solution--be it buffer based or even a
> module could have no reuse of the code without an End Pointer.  If we added
> the end pointer to a FromSubstring (or somesuch) function (leaving
> PyLong_FromString alone), it would be easily reused from any parsing module.
>  Assuming it didn't have any negative performance implication.

This discussion about conversion of longs is very much related to the
new hot buffer class I'm currently adding in the bytebuf branch, which
will directly tackle the issue of I/O on a fixed buffer without
intermediate string creation or memory copy.  You will be able to pack
and unpack structs directly into and out of a fixed memory buffer.  In
case you're interested, it is a clone of the Java NIO ByteBuffer class
for Python.  I'm making minor modifications to the socket and struct
modules to support I/O using the buffer protocol.  The hot buffer
class will live in a module and neither of the socket nor struct
modules will depend on it.

Following Fredrik's suggestion I will write up a mini-PEP for
discussion of this idea and will move the code into the sandbox (and out
of a branch).

Additionally, I must say, all that hot water in the blue lagoon this
morning was a really hot justification for the "hot buffer" name.
http://furius.ca/tmp/nfs3/html/2006-05-25.11185801.html




--
Martin Blais

From tim.peters at gmail.com  Thu May 25 18:20:52 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 25 May 2006 16:20:52 +0000
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <71659010-4A23-428E-81D4-3481980866C2@redivi.com>
References: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>
	<71659010-4A23-428E-81D4-3481980866C2@redivi.com>
Message-ID: <1f7befae0605250920r55ff0798me0f053ad9d94cd54@mail.gmail.com>

[Jean-Paul Calderone]
>> ...
>> Hmm, one reason could be that the general solution doesn't work:
>>
>>   exarkun at kunai:~$ python
>>   Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
>>   [GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
>>   Type "help", "copyright", "credits" or "license" for more information.
>> >>> long(buffer('1234', 0, 3))
>>   Traceback (most recent call last):
>>     File "<stdin>", line 1, in ?
>>   ValueError: null byte in argument for long()
>> >>> long(buffer('123a', 0, 3))
>>   Traceback (most recent call last):
>>     File "<stdin>", line 1, in ?
>>   ValueError: invalid literal for long(): 123a

[Bob Ippolito]
> One problem with buffer() is that it does a memcpy of the buffer.

It does not, at least not in the cases above.  In fact, that's
essentially _why_ they fail now.  Here's what actually happens (in
outline) for the first example:  a buffer object is created that
merely points to the "1234" string object, recording the offset of 0
and the length of 3.  PyLong_FromString() only sees the starting
address, and-- as it always does --parses until it hits a character
that doesn't make sense for the input base.  That happens to be the
NUL at the end of the "1234" string object guts (Python string objects
are always NUL-terminated, BTW).

PyLong_FromString() is perfectly happy, but converted "1234" rather
than "123".  It's PyLong_FromString()'s _caller_ that's unhappy,
because PyLong_FromString also told the caller that it stopped parsing
at offset 4, and then

	if (end != s + len) {
		PyErr_SetString(PyExc_ValueError,
				"null byte in argument for long()");

The assumption here is bad:  the caller assumes that if end != s+len,
it must be the case that PyLong_FromString() stopped parsing _before_
hitting the end, and did so _because_ it hit an embedded NUL byte.
Neither is true in this case, so the error message is senseless.

In any case, none of this would have happened if the buffer object had
done a memcpy of the 3 requested bytes, and NUL-terminated it.

> A zero-copy version of buffer (a view on some object that implements
> the buffer API) would be nice.

Above we already had that.  The internal parsing APIs don't currently
support the buffer's "offset & length" view of the world, so have no
chance of working as hoped with any kind of buffer object now.
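
The failure described above can be modelled in a few lines of Python.  This
is only a sketch under stated assumptions: parse_decimal is a hypothetical
stand-in for the C parser, not the real PyLong_FromString(), and the
trailing "\0" is written out explicitly to stand for the NUL that
terminates the string object's storage.

```python
DIGITS = "0123456789"


def parse_decimal(storage, start, stop=None):
    """Parse decimal digits from storage, returning (value, end_position).

    With stop=None the parser only knows where to start, so it runs until
    the first character that is not a digit.  Passing stop models an end
    pointer, which is an assumption of this sketch.
    """
    limit = len(storage) if stop is None else stop
    pos = start
    value = 0
    while pos < limit and storage[pos] in DIGITS:
        value = value * 10 + DIGITS.index(storage[pos])
        pos += 1
    return value, pos


storage = "1234\0"      # string contents plus the terminating NUL
offset, length = 0, 3   # what buffer('1234', 0, 3) records

# Start address only: the parser runs through "1234" and stops at the NUL.
value, end = parse_decimal(storage, offset)
overrun = end != offset + length   # the caller's check, which then misfires

# With an explicit end, parsing stops where the view stops.
bounded_value, bounded_end = parse_decimal(storage, offset, offset + length)
```

The unbounded call returns 1234 with end at 4, so the caller's comparison
against offset + length fails even though no embedded NUL was involved; the
bounded call returns 123 and stops at 3.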

From g.brandl at gmx.net  Thu May 25 19:47:59 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 25 May 2006 17:47:59 +0000
Subject: [Python-Dev] Import hooks falling back on built-in import machinery?
Message-ID: <e54qkl$i44$1@sea.gmane.org>

While working on a patch involving sys.path_importer_cache, I discovered
that if a path_hooks import hook has been found for a given sys.path entry,
but isn't able to import a specific module, find_module() tries to import
the module from this sys.path entry as a regular file.

This results in e.g. doing an open call to /usr/lib/python25.zip/x.py when
I do "import x", while I think that once a sys.path entry has been identified
as belonging to a sys.path_hooks importer instance, it shouldn't be handled
by the built-in machinery anymore since the path string could be anything.

PEP 302 says "...it was chosen to ... simply fall back to the built-in logic
if no hook on sys.path_hooks could handle the path item", but that's
referring to sys.path entries, not individual module names to be imported.
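
The difference between the two behaviours can be shown with a toy model.
Everything here is hypothetical and for illustration only: ToyImporter,
find_on_entry and the fall_back flag do not exist in the interpreter, and
the real logic lives in C.  The sketch only assumes what is stated above,
namely that a cached importer exists for the sys.path entry but cannot find
the requested module.

```python
class ToyImporter:
    """Stand-in for a path hook importer that knows a fixed set of modules."""

    def __init__(self, known):
        self.known = set(known)

    def find_module(self, name):
        # Return a loader (here: the importer itself) or None.
        return self if name in self.known else None


def find_on_entry(name, entry, importer_cache, builtin_find, fall_back):
    """Look up one module name on one sys.path entry (toy model)."""
    importer = importer_cache.get(entry)
    if importer is not None:
        found = importer.find_module(name)
        if found is not None:
            return found
        if not fall_back:
            return None          # proposed: the entry belongs to the importer
    return builtin_find(name, entry)


probes = []


def builtin_find(name, entry):
    # Record the file the built-in machinery would try to open.
    probes.append(entry + "/" + name + ".py")
    return None


cache = {"/usr/lib/python25.zip": ToyImporter(["os"])}

find_on_entry("x", "/usr/lib/python25.zip", cache, builtin_find, fall_back=True)
current_probes = list(probes)

del probes[:]
find_on_entry("x", "/usr/lib/python25.zip", cache, builtin_find, fall_back=False)
proposed_probes = list(probes)
```

With fall_back=True the model probes /usr/lib/python25.zip/x.py, matching
the open call described above; with fall_back=False it never touches the
file system for that entry.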

Georg


From guido at python.org  Thu May 25 20:25:50 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 25 May 2006 11:25:50 -0700
Subject: [Python-Dev] Import hooks falling back on built-in import
	machinery?
In-Reply-To: <e54qkl$i44$1@sea.gmane.org>
References: <e54qkl$i44$1@sea.gmane.org>
Message-ID: <ca471dc20605251125q373d7c68q9b4170a48bd6b2cd@mail.gmail.com>

Good catch. I think this would save a certain number of unnecessary
stat() calls.

But it does change semantics, so we can't fix this in 2.4. In 2.5, we
should warn hook authors that this might affect them.

The PEP ought to be updated to clarify this.

--Guido

On 5/25/06, Georg Brandl <g.brandl at gmx.net> wrote:
> While working on a patch involving sys.path_importer_cache, I discovered
> that if a path_hooks import hook has been found for a given sys.path entry,
> but isn't able to import a specific module, find_module() tries to import
> the module from this sys.path entry as a regular file.
>
> This results in e.g. doing an open call to /usr/lib/python25.zip/x.py when
> I do "import x", while I think that once a sys.path entry has been identified
> as belonging to a sys.path_hooks importer instance, it shouldn't be handled
> by the built-in machinery anymore since the path string could be anything.
>
> PEP 302 says "...it was chosen to ... simply fall back to the built-in logic
> if no hook on sys.path_hooks could handle the path item", but that's
> referring to sys.path entries, not individual module names to be imported.
>
> Georg
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Thu May 25 21:07:49 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 25 May 2006 19:07:49 +0000
Subject: [Python-Dev] Import hooks falling back on built-in import
	machinery?
In-Reply-To: <ca471dc20605251125q373d7c68q9b4170a48bd6b2cd@mail.gmail.com>
References: <e54qkl$i44$1@sea.gmane.org>
	<ca471dc20605251125q373d7c68q9b4170a48bd6b2cd@mail.gmail.com>
Message-ID: <e54vaa$32l$1@sea.gmane.org>

Guido van Rossum wrote:
> Good catch. I think this would save a certain number of unnecessary
> stat() calls.
> 
> But it does change semantics, so we can't fix this in 2.4. In 2.5, we
> should warn hook authors that this might affect them.
> 
> The PEP ought to be updated to clarify this.

Okay. The patch fixing this is at #921466. I'll see what I can do about
the PEP.

Georg


From fredrik at pythonware.com  Thu May 25 22:59:50 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 25 May 2006 22:59:50 +0200
Subject: [Python-Dev] A Horrible Inconsistency
Message-ID: <e555s1$q2j$1@sea.gmane.org>

 >>> -1 * (1, 2, 3)
()
 >>> -(1, 2, 3)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: bad operand type for unary -

We Really Need To Fix This!

[\F]


From martin at v.loewis.de  Thu May 25 23:04:06 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 25 May 2006 23:04:06 +0200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e555s1$q2j$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
Message-ID: <44761BC6.50708@v.loewis.de>

Fredrik Lundh wrote:
>  >>> -1 * (1, 2, 3)
> ()
>  >>> -(1, 2, 3)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: bad operand type for unary -
> 
> We Really Need To Fix This!

I can't find this inconsistency horrible.

py> +"Hello"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: bad operand type for unary +
py> +1*"Hello"
'Hello'

Regards,
Martin

From g.brandl at gmx.net  Thu May 25 23:06:49 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 25 May 2006 21:06:49 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <44761BC6.50708@v.loewis.de>
References: <e555s1$q2j$1@sea.gmane.org> <44761BC6.50708@v.loewis.de>
Message-ID: <e5569d$rjr$1@sea.gmane.org>

Martin v. Löwis wrote:
> Fredrik Lundh wrote:
>>  >>> -1 * (1, 2, 3)
>> ()
>>  >>> -(1, 2, 3)
>> Traceback (most recent call last):
>>    File "<stdin>", line 1, in <module>
>> TypeError: bad operand type for unary -
>> 
>> We Really Need To Fix This!
> 
> I can't find this inconsistency horrible.
> 
> py> +"Hello"
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: bad operand type for unary +
> py> +1*"Hello"
> 'Hello'

Don't tell me that! I was actually working on a patch right now...

Georg


From guido at python.org  Thu May 25 23:09:45 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 25 May 2006 14:09:45 -0700
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e555s1$q2j$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
Message-ID: <ca471dc20605251409h2799962eu326991f7228c796e@mail.gmail.com>

You're joking right?

On 5/25/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
>  >>> -1 * (1, 2, 3)
> ()
>  >>> -(1, 2, 3)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: bad operand type for unary -
>
> We Really Need To Fix This!
>
> [\F]

Doesn't the real effbot have /F as sig?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ronaldoussoren at mac.com  Thu May 25 23:10:47 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 25 May 2006 23:10:47 +0200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <44761BC6.50708@v.loewis.de>
References: <e555s1$q2j$1@sea.gmane.org> <44761BC6.50708@v.loewis.de>
Message-ID: <2065A73B-8068-4514-860C-DE3175B532E8@mac.com>


On 25-mei-2006, at 23:04, Martin v. Löwis wrote:

> Fredrik Lundh wrote:
>>  >>> -1 * (1, 2, 3)
>> ()
>>  >>> -(1, 2, 3)
>> Traceback (most recent call last):
>>    File "<stdin>", line 1, in <module>
>> TypeError: bad operand type for unary -
>>
>> We Really Need To Fix This!
>
> I can't find this inconsistency horrible.
>
> py> +"Hello"
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: bad operand type for unary +
> py> +1*"Hello"
> 'Hello'

I don't know which one Fredrik thinks is wrong, but I think the  
result of -1*(1,2,3) is very surprising. I'd expect an exception here.

Ronald

>
> Regards,
> Martin


From martin at v.loewis.de  Thu May 25 23:15:44 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 25 May 2006 23:15:44 +0200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <2065A73B-8068-4514-860C-DE3175B532E8@mac.com>
References: <e555s1$q2j$1@sea.gmane.org> <44761BC6.50708@v.loewis.de>
	<2065A73B-8068-4514-860C-DE3175B532E8@mac.com>
Message-ID: <44761E80.8040605@v.loewis.de>

Ronald Oussoren wrote:
> I don't know which one Fredrik thinks is wrong, but I think the result
> of -1*(1,2,3) is very surprising. I'd expect an exception here.

I agree, but this has nothing to do with whether or not the unary -
is supported.

Regards,
Martin

From jafo-python-dev at tummy.com  Thu May 25 23:16:10 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Thu, 25 May 2006 15:16:10 -0600
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e5569d$rjr$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org> <44761BC6.50708@v.loewis.de>
	<e5569d$rjr$1@sea.gmane.org>
Message-ID: <20060525211610.GI32617@tummy.com>

On Thu, May 25, 2006 at 09:06:49PM +0000, Georg Brandl wrote:
>Don't tell me that! I was actually working on a patch right now...

While undoubtedly a performance patch, it wasn't on the list of tasks to do
today.  You risk Steve's wrath!

Thanks,
Sean
-- 
 In the end, we will remember not the words of our enemies, but the silence
 of our friends.  -- Martin Luther King Jr.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From rhettinger at ewtllc.com  Thu May 25 23:17:35 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 25 May 2006 14:17:35 -0700
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e555s1$q2j$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
Message-ID: <44761EEF.4010103@ewtllc.com>

Fredrik Lundh wrote:

> >>> -1 * (1, 2, 3)
>()
> >>> -(1, 2, 3)
>Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>TypeError: bad operand type for unary -
>
>We Really Need To Fix This!
>  
>

The second one doesn't bug me.  Unary minus on a sequence is meaningless.

The first is a bit odd.  When using the * operator for sequence 
repetition, I don't expect it to have the same commutative property as 
multiplication.  IOW, "seq * n" makes sense but "n * seq" is a bit 
weird.  Also, I'm not clear on the rationale for transforming negative 
repetition counts to zero instead of raising an exception.  OTOH, 
neither of these has EVER been an issue for me or anyone I know.


Raymond

From tim.peters at gmail.com  Thu May 25 23:18:00 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 25 May 2006 21:18:00 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e555s1$q2j$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
Message-ID: <1f7befae0605251418h5521aa66t754a2268f8031b62@mail.gmail.com>

[Fredrik]
>  >>> -1 * (1, 2, 3)
> ()
>  >>> -(1, 2, 3)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: bad operand type for unary -
>
> We Really Need To Fix This!

What's broken?  It's generally true that

    n*s == s*n == empty_container_of_type_type(s)

whenever s is a sequence and n is an integer <= 0.  The above is just
an instance of that.  See footnote 2 in Library Ref section 2.3.6
Sequence Types.
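
A short check of that rule, written for current Python and using only
built-in sequence types.  The variable names are illustrative.

```python
seq = (1, 2, 3)

# n * s and s * n agree, and both give an empty tuple when n <= 0.
results = [n * seq == seq * n == () for n in (0, -1, -5)]

# The same holds for other built-in sequence types.
same_for_other_types = (-1 * [1, 2] == []) and (-1 * "ab" == "")

# Unary minus is a separate operation that tuples do not define.
try:
    -seq
    unary_minus_supported = True
except TypeError:
    unary_minus_supported = False
```

Repetition with a non-positive count and unary negation are unrelated
operations, which is why one yields an empty tuple and the other raises
TypeError.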

From guido at python.org  Thu May 25 23:23:56 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 25 May 2006 14:23:56 -0700
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <44761EEF.4010103@ewtllc.com>
References: <e555s1$q2j$1@sea.gmane.org> <44761EEF.4010103@ewtllc.com>
Message-ID: <ca471dc20605251423q7b17b686s128a45e460b8075a@mail.gmail.com>

On 5/25/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
> Fredrik Lundh wrote:
>
> > >>> -1 * (1, 2, 3)
> >()
> > >>> -(1, 2, 3)
> >Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >TypeError: bad operand type for unary -
> >
> >We Really Need To Fix This!
> >
> >
>
> The second one doesn't bug me.  Unary minus on a sequence is meaningless.
>
> The first is a bit odd.  When using the * operator for sequence
> repetition, I don't expect it to have the same commutative property as
> multiplication.  IOW, "seq * n" makes sense but "n * seq" is a bit
> weird.  Also, I'm not clear on the rationale for transforming negative
> repetition counts to zero instead of raising an exception.  OTOH,
> neither of these has EVER been an issue for me or anyone I know.

It would be very strange if n*s didn't return the same thing as s*n.
In fact, I can't even decide which one feels more natural!

As to truncation to 0, that's debatable but can't be fixed until py3k
-- surely lots of code (perhaps accidentally) depends on it.

Still unclear which one \F wanted fixed...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Thu May 25 23:25:07 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 25 May 2006 21:25:07 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <44761EEF.4010103@ewtllc.com>
References: <e555s1$q2j$1@sea.gmane.org> <44761EEF.4010103@ewtllc.com>
Message-ID: <1f7befae0605251425j607a3a6aoae2c84943424669a@mail.gmail.com>

[Raymond Hettinger]
...
> Also, I'm not clear on the rationale for transforming negative
> repetition counts to zero instead of raising an exception.

There are natural use cases.  Here's one:  you have a string and want
to right-justify it to 80 columns with blanks if it's shorter than 80.

    s = " " * (80 - len(s)) + s

does the right thing in all cases, because the blank repetition
gracefully collapses to an empty string whenever len(s) >= 80.  I've
used that (and variants thereof) dozens of times.  Then again, I don't
believe I've ever used the "integer * sequence" form (i.e., with the
integer on the left).
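
The idiom can be wrapped up and checked as follows.  This is a minimal
sketch: the helper name right_justify and its width parameter are
illustrative, not from the message above.

```python
def right_justify(s, width=80):
    """Pad s on the left with blanks up to width; longer strings pass through."""
    # A negative repetition count yields "", so no length test is needed.
    return " " * (width - len(s)) + s


short = right_justify("abc")
long_line = "x" * 100
unchanged = right_justify(long_line)
```

For strings this gives the same result as the built-in str.rjust() method;
the repetition form has the advantage of working the same way for lists and
tuples.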

From fredrik at pythonware.com  Thu May 25 23:30:16 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 25 May 2006 23:30:16 +0200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <ca471dc20605251409h2799962eu326991f7228c796e@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<ca471dc20605251409h2799962eu326991f7228c796e@mail.gmail.com>
Message-ID: <e557l8$u1$1@sea.gmane.org>

Guido van Rossum wrote:

>> We Really Need To Fix This!
>>
>> [\F]
> 
> Doesn't the real effbot have /F as sig?

yeah, we've had some trouble with fake bots lately.  I mean, there's a 
timbot posting to this thread, but I know for sure that the real Tim got 
tired of hacking on Python earlier tonight, and retired to his hotel 
room.  too much silica mud, I suppose.

</F>


From steve at holdenweb.com  Thu May 25 23:44:41 2006
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 25 May 2006 22:44:41 +0100
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e557l8$u1$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>	<ca471dc20605251409h2799962eu326991f7228c796e@mail.gmail.com>
	<e557l8$u1$1@sea.gmane.org>
Message-ID: <44762549.8000403@holdenweb.com>

Fredrik Lundh wrote:
> Guido van Rossum wrote:
> 
> 
>>>We Really Need To Fix This!
>>>
>>>[\F]
>>
>>Doesn't the real effbot have /F as sig?
> 
> 
> yeah, we've had some trouble with fake bots lately.  I mean, there's a 
> timbot posting to this thread, but I know for sure that the real Tim got 
> tired of hacking on Python earlier tonight, and retired to his hotel 
> room.  too much silica mud, I suppose.
> 
In actual fact the effbot has lately found itself so permeated with 
Windows that it has become constitutionally incapable of using a forward 
slash. Don't know what's with the square brackets though ...

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From g.brandl at gmx.net  Thu May 25 23:45:00 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 25 May 2006 21:45:00 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <1f7befae0605251425j607a3a6aoae2c84943424669a@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org> <44761EEF.4010103@ewtllc.com>
	<1f7befae0605251425j607a3a6aoae2c84943424669a@mail.gmail.com>
Message-ID: <e558h1$36e$1@sea.gmane.org>

Tim Peters wrote:
> [Raymond Hettinger]
> ...
>> Also, I'm not clear on the rationale for transforming negative
>> repetition counts to zero instead of raising an exception.
> 
> There are natural use cases.  Here's one:  you have a string and want
> to right-justify it to 80 columns with blanks if it's shorter than 80.
> 
>     s = " " * (80 - len(s)) + s
> 
> does the right thing in all cases, because the blank repetition
> gracefully collapses to an empty string whenever len(s) >= 80.  I've
> used that (and variants thereof) dozens of times.  Then again, I don't
> believe I've ever used the "integer * sequence" form (i.e., with the
> integer on the left).

I originally brought that up because decimal.py uses it.
(try to translate this to C code...)

Georg


From g.brandl at gmx.net  Thu May 25 23:54:06 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 25 May 2006 21:54:06 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e5569d$rjr$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org> <44761BC6.50708@v.loewis.de>
	<e5569d$rjr$1@sea.gmane.org>
Message-ID: <e55922$50k$1@sea.gmane.org>

Georg Brandl wrote:
> Martin v. Löwis wrote:
>> Fredrik Lundh wrote:
>>>  >>> -1 * (1, 2, 3)
>>> ()
>>>  >>> -(1, 2, 3)
>>> Traceback (most recent call last):
>>>    File "<stdin>", line 1, in <module>
>>> TypeError: bad operand type for unary -
>>> 
>>> We Really Need To Fix This!
>> 
>> I can't find this inconsistency horrible.
>> 
>> py> +"Hello"
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in ?
>> TypeError: bad operand type for unary +
>> py> +1*"Hello"
>> 'Hello'
> 
> Don't tell me that! I was actually working on a patch right now...

Since I've already been bombarded with questions by some fellow Germans (who
once again prove that they've got no sense of humour at all ;): this
was a joke.

georG


From fredrik at pythonware.com  Fri May 26 00:21:14 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 26 May 2006 00:21:14 +0200
Subject: [Python-Dev] whatever happened to string.partition ?
Message-ID: <e55akk$a2q$1@sea.gmane.org>

does anyone remember?  given what we're currently working on,
implementing it would take roughly no time at all.  do people still
think it's a good idea ?

</F> <!-- note: angle brackets, forward slash -->


From steve at holdenweb.com  Fri May 26 00:36:24 2006
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 25 May 2006 23:36:24 +0100
Subject: [Python-Dev] This
In-Reply-To: <44762E77.3080303@ewtllc.com>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
Message-ID: <44763168.4030308@holdenweb.com>

Raymond Hettinger wrote:
> Fredrik Lundh wrote:
> 
> 
>>does anyone remember?  given what we're currently working on, 
>>implementing it would take roughly no time at all.  do people still 
>>think it's a good idea ?
>>
>></F> <!-- note: angle brackets, forward slash -->
>>
>> 
>>
> 
> 
> Yes.  It went to my todo list and is awaiting some free raymond-cycles 
> to finish it up.  I've been task saturated of late but would like to get 
> this and a number of other patches complete for Py2.5.
> 
> Also, Crutcher had long ago posted a patch for exec/eval to accept 
> generic mapping arguments.   If someone like Tim or /F has the time, 
> feel free to pick it up.
> 
This reminds me I am tasked with trying to find out what the interface 
to timeit.py is supposed to look like. Raymond, your name has been 
mentioned as someone who took part in the discussions. Google hasn't 
given me a lot to go on. Anyone?

[Follow-ups to python-dev would be best, I suspect].

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden

From jafo-python-dev at tummy.com  Fri May 26 00:42:15 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Thu, 25 May 2006 16:42:15 -0600
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <20060524102441.GD32617@tummy.com>
References: <20060524102441.GD32617@tummy.com>
Message-ID: <20060525224215.GJ32617@tummy.com>

>Brett provided the following direction:
>   >Right, I meant change how it (BaseException) is written.  Right now
>   >it uses PyMethodDef for magic methods and just uses PyType_New()
>   >as a constructor.  I was wondering if, for some reason, it would be
>   >faster if you used a PyType_Type definition for BaseException and
>   >used the proper C struct to associate the methods with the class.

Just a status update.  Richard has implemented enough of this that I've
been able to run some performance tests, and this changes this pybench
test from being 57% slower than 2.4.3 to being 32% faster.  Go Richard.
It's looking pretty good so far.

Now we just have to keep it from exploding.

Thanks,
Sean
-- 
 Denial is more than just a river in Egypt, you know?
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
      Back off man. I'm a scientist.   http://HackingSociety.org/


From fredrik at pythonware.com  Fri May 26 00:36:17 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 26 May 2006 00:36:17 +0200
Subject: [Python-Dev] whatever happened to string.partition ?
In-Reply-To: <e55b8k$c39$1@sea.gmane.org>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<e55b8k$c39$1@sea.gmane.org>
Message-ID: <e55bgs$c39$2@sea.gmane.org>

Fredrik Lundh wrote:

> no need to wait for any raymond-cycles here; just point me to the latest 
> version of the proposal, and it'll be in the trunk within 30 minutes.

are these still valid?

     http://mail.python.org/pipermail/python-dev/2005-August/055764.html
     http://mail.python.org/pipermail/python-dev/2005-August/055770.html

</F>


From rhettinger at ewtllc.com  Fri May 26 00:45:27 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 25 May 2006 15:45:27 -0700
Subject: [Python-Dev] [Python-checkins] whatever happened to
	string.partition ?
In-Reply-To: <e55bgs$c39$2@sea.gmane.org>
References: <e55adp$8m5$1@sea.gmane.org>
	<44762E77.3080303@ewtllc.com>	<e55b8k$c39$1@sea.gmane.org>
	<e55bgs$c39$2@sea.gmane.org>
Message-ID: <44763387.80308@ewtllc.com>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060525/108269cb/attachment.htm 

From rasky at develer.com  Fri May 26 00:47:41 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Fri, 26 May 2006 00:47:41 +0200
Subject: [Python-Dev] whatever happened to string.partition ?
References: <e55akk$a2q$1@sea.gmane.org>
Message-ID: <00d201c6804d$3b01d210$bf03030a@trilan>

Fredrik Lundh wrote:

> does anyone remember?  given what we're currently working on,
> implementing it would take roughly no time at all.  do people still
> think it's a good idea ?
>
> </F> <!-- note: angle brackets, forward slash -->

I had enquired about this before but got no answer. I assume nobody's working on
it at the moment. I also proposed slightly different semantics which would
prevent much boilerplate in the stdlib:
http://mail.python.org/pipermail/python-dev/2006-March/062582.html
-- 
Giovanni Bajo


From rhettinger at ewtllc.com  Fri May 26 00:49:43 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 25 May 2006 15:49:43 -0700
Subject: [Python-Dev] This
In-Reply-To: <44763168.4030308@holdenweb.com>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<44763168.4030308@holdenweb.com>
Message-ID: <44763487.3080908@ewtllc.com>

Steve Holden wrote:

>
> This reminds me I am tasked with trying to find out what the interface 
> to timeit.py is supposed to look like. Raymond, your name has been 
> mentioned as someone who took part in the discussions. Google hasn't 
> given me a lot to go on. Anyone?
>

IIRC, Guido's words were that too much of the smarts in timeit.py were 
in the command-line interface and not in the Timer object where they should 
be.

So, this should be a simple refactoring exercise to expose more of the 
existing API.  No inventiveness is required.

While you're at it, please document the import trick for testing existing 
code without copying it into a text string.
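
One reading of that import trick, as a minimal sketch for a current
Python 3 interpreter (an assumption, since this thread is about 2.5):
the setup string imports the code under test, so the timed statement is
just a call. math.sqrt is only a stand-in for whatever existing function
is being measured.

```python
import timeit

# The setup string imports existing code instead of pasting its source
# into the statement; math.sqrt is only a stand-in for the function under test.
timer = timeit.Timer(stmt="sqrt(2.0)", setup="from math import sqrt")

# timeit() returns the total time in seconds for `number` executions of stmt.
elapsed = timer.timeit(number=10000)
print("10000 calls took %.6f seconds" % elapsed)
```

For code living in a script, the same setup string would presumably read
"from __main__ import some_function" instead.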


Raymond


From rhettinger at ewtllc.com  Fri May 26 00:51:04 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Thu, 25 May 2006 15:51:04 -0700
Subject: [Python-Dev] whatever happened to string.partition ?
In-Reply-To: <00d201c6804d$3b01d210$bf03030a@trilan>
References: <e55akk$a2q$1@sea.gmane.org>
	<00d201c6804d$3b01d210$bf03030a@trilan>
Message-ID: <447634D8.9080603@ewtllc.com>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060525/bcea630a/attachment.html 

From bob at redivi.com  Fri May 26 00:58:21 2006
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 25 May 2006 22:58:21 +0000
Subject: [Python-Dev] Returning int instead of long from struct when
	possible for performance
Message-ID: <D09E0DAC-2F46-488D-9318-CFC7E1BA3ACD@redivi.com>

Python ints are a heck of a lot faster to work with than Python longs  
and have the additional benefit that psyco
<http://psyco.sourceforge.net> can optimize the hell out of int but can't do  
anything at all for long. This is important because psyco is  
currently in pretty wide-spread use amongst people who need to  
squeeze every bit of performance out of their Python programs, and  
often yields 4x or better performance improvement in real-world tests.

A more far reaching change would be to add a new PyInteger_From* API  
for code that doesn't care if long or int is returned, but I don't  
see much of a reason to go that far at this point since the rest of  
the standard library wouldn't benefit much.

Unfortunately, this change to the struct module slightly alters the  
documented API for the following format codes: I, L, q, Q. Currently  
it is documented that those format codes will always return longs,  
regardless of their value.

I've prototyped this change on the trunk (behind a currently  
undefined PY_USE_INT_WHEN_POSSIBLE macro). The standard library test  
suite passes with this enabled, but it would break any doctests in  
third party code that check the result of an unpack operation with  
one of those format codes. Given the interchangeability of int and  
long, I don't foresee any other complications with this change.
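
A quick way to see which type each of those format codes yields on a
given interpreter, sketched for Python 3 (an assumption; there int and
long are a single type, so every value should come back as int). The
"<" prefix selects the standard sizes so the result does not depend on
the platform.

```python
import struct

# Pack one small value with each of the format codes named above.
# "<" selects standard sizes: I and L are 4 bytes, q and Q are 8 bytes.
packed = struct.pack("<ILqQ", 1, 2, 3, 4)
values = struct.unpack("<ILqQ", packed)

# Show the value and the type name that unpack produced for each code.
for code, value in zip("ILqQ", values):
    print(code, value, type(value).__name__)
```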

Thoughts?

-bob


From tim.peters at gmail.com  Fri May 26 01:05:53 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 25 May 2006 23:05:53 +0000
Subject: [Python-Dev] Returning int instead of long from struct when
	possible for performance
In-Reply-To: <D09E0DAC-2F46-488D-9318-CFC7E1BA3ACD@redivi.com>
References: <D09E0DAC-2F46-488D-9318-CFC7E1BA3ACD@redivi.com>
Message-ID: <1f7befae0605251605q654c243dg992a1a9da0fae1eb@mail.gmail.com>

[Bob Ippolito]
> ...
> Unfortunately, this change to the struct module slightly alters the
> documented API for the following format codes: I, L, q, Q. Currently
> it is documented that those format codes will always return longs,
> regardless of their value.

I view that more as having documented the CPython implementation du
jour than as documenting the language.  IOW, it was a doc bug <0.5
wink>.

> I've prototyped this change on the trunk (behind a currently
> undefined PY_USE_INT_WHEN_POSSIBLE macro). The standard library test
> suite passes with this enabled, but it would break any doctests in
> third party code that check the result of an unpack operation with
> one of those format codes. Given the interchangeability of int and
> long, I don't foresee any other complications with this change.
>
> Thoughts?

+1, and for 2.5.  Even int() doesn't always return an int anymore, and
it's just stupid to bear the burden of an unbounded long when it's not
really needed.  As to doctest breakage, I'm wholly unsympathetic to
any doctest user except me, and I don't have any doctests that would
break.  So that's a total non-issue :-)

From shane.holloway at ieee.org  Fri May 26 00:53:08 2006
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Thu, 25 May 2006 16:53:08 -0600
Subject: [Python-Dev] whatever happened to string.partition ?
In-Reply-To: <e55akk$a2q$1@sea.gmane.org>
References: <e55akk$a2q$1@sea.gmane.org>
Message-ID: <D70E8AFA-6873-4009-AFE1-CFC9226B8323@ieee.org>

I will certainly look forward to using it.


On May 25, 2006, at 16:21, Fredrik Lundh wrote:

> does anyone remember?  given what we're currently working on,
> implementing it would take roughly no time at all.  do people still
> think it's a good idea ?
>
> </F> <!-- note: angle brackets, forward slash -->
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/shane.holloway%40ieee.org


From greg.ewing at canterbury.ac.nz  Fri May 26 02:39:04 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 26 May 2006 12:39:04 +1200
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <1f7befae0605250920r55ff0798me0f053ad9d94cd54@mail.gmail.com>
References: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>
	<71659010-4A23-428E-81D4-3481980866C2@redivi.com>
	<1f7befae0605250920r55ff0798me0f053ad9d94cd54@mail.gmail.com>
Message-ID: <44764E28.4020403@canterbury.ac.nz>

Tim Peters wrote:
> PyLong_FromString() only sees the starting
> address, and-- as it always does --parses until it hits a character
> that doesn't make sense for the input base.

This is the bug, then. long() shouldn't be using
PyLong_FromString() to convert its argument, but
something that takes an address and a size. (Is
there a PyLong_FromStringAndSize()? If not, there
should be.)

--
Greg

From greg.ewing at canterbury.ac.nz  Fri May 26 02:52:51 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 26 May 2006 12:52:51 +1200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <44762549.8000403@holdenweb.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<ca471dc20605251409h2799962eu326991f7228c796e@mail.gmail.com>
	<e557l8$u1$1@sea.gmane.org> <44762549.8000403@holdenweb.com>
Message-ID: <44765163.4070208@canterbury.ac.nz>

Steve Holden wrote:

> In actual fact the effbot has lately found itself so permeated with 
> Windows that it has become constitutionally incapable of using a forward 
> slash. Don't know what's with the square brackets though ...

I was thinking maybe that message had resulted from
a Windows and a VMS port of the effbot that got
mixed together somehow...

--
Greg

From guido at python.org  Fri May 26 03:38:02 2006
From: guido at python.org (Guido van Rossum)
Date: Thu, 25 May 2006 18:38:02 -0700
Subject: [Python-Dev] [Python-checkins] whatever happened to
	string.partition ?
In-Reply-To: <e55cmn$fsl$1@sea.gmane.org>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<e55b8k$c39$1@sea.gmane.org> <e55bgs$c39$2@sea.gmane.org>
	<44763387.80308@ewtllc.com> <e55cmn$fsl$1@sea.gmane.org>
Message-ID: <ca471dc20605251838k3bd6caefyd24d9da515d86ac0@mail.gmail.com>

On 5/25/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> Raymond Hettinger wrote:
>
> > IIRC, Skip had developed a smart  version that returned lazy string
> > objects that kept a reference and pointers to the original string
> > (instead of making its own copy of the string components).  The string
> > subclass would expand itself and free the reference if necessary for a
> > subsequent string operation.  The main purpose was to handle the cases
> > where one fragment of the other was never used or just had a length
> > check.  Also it was helpful when partition was used lisp-style to
> > repeatedly break-off head/tail fragments.
>
> that sounds nice in theory, but I completely fail to see how that can be
> implemented without ripping out the existing string object and replacing
> it with something entirely different, or (via horrible macro tricks)
> slow down virtually all other use of PyStringObject.
>
> skip?  was this a real design or a Py3K bluesky idea?

I suggest to forget about that particular idea and just do what you
were thinking of originally. str.partition() and unicode.partition()
should be as simple as possible, to cover the 80% use case FAST.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Fri May 26 04:03:45 2006
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 25 May 2006 21:03:45 -0500
Subject: [Python-Dev] [Python-checkins] whatever happened to
 string.partition ?
In-Reply-To: <ca471dc20605251838k3bd6caefyd24d9da515d86ac0@mail.gmail.com>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<e55b8k$c39$1@sea.gmane.org> <e55bgs$c39$2@sea.gmane.org>
	<44763387.80308@ewtllc.com> <e55cmn$fsl$1@sea.gmane.org>
	<ca471dc20605251838k3bd6caefyd24d9da515d86ac0@mail.gmail.com>
Message-ID: <17526.25089.221798.743153@montanaro.dyndns.org>


    >> skip?  was this a real design or a Py3K bluesky idea?

Just about nothing I do is a real design. <wink>  I don't remember the
weather conditions that day either.  I was probably just noodling around
based upon discussions on python-dev that were mutated in my brain by gamma
rays from Alpha Centauri.

    Guido> I suggest to forget about that particular idea and just do what
    Guido> you were thinking of originally. str.partition() and
    Guido> unicode.partition() should be as simple as possible, to cover the
    Guido> 80% use case FAST.

Agreed.  Get it right then optimize it during the next NeedForSpeed sprint.

Skip

From ronaldoussoren at mac.com  Fri May 26 10:35:02 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 26 May 2006 10:35:02 +0200
Subject: [Python-Dev] SQLite header scan order
Message-ID: <D7831F80-A8A3-45C9-A7A5-EA1F36D86352@mac.com>

Hi,

The current version of setup.py looks for the sqlite header files in  
a number of sqlite-specific directories before looking into the  
default inc_dirs. I'd like to reverse that order because that would  
make it possible to override the version of sqlite that gets picked  
up. Any objections to that?
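
Why the order matters, as a rough sketch rather than the actual setup.py
logic: whichever directory comes first in the search list wins. The
helper find_header and the directory names are hypothetical, and two
throwaway directories stand in for the real include paths.

```python
import os
import tempfile

def find_header(filename, search_dirs):
    """Return the first directory in search_dirs that contains filename, else None."""
    for directory in search_dirs:
        if os.path.exists(os.path.join(directory, filename)):
            return directory
    return None

# Two throwaway directories stand in for a default include dir and an
# sqlite-specific one; both hold a (fake, empty) sqlite3.h.
root = tempfile.mkdtemp()
default_dir = os.path.join(root, "default_inc")
sqlite_dir = os.path.join(root, "sqlite_inc")
for directory in (default_dir, sqlite_dir):
    os.makedirs(directory)
    open(os.path.join(directory, "sqlite3.h"), "w").close()

# Sqlite-specific directories first: the default location cannot override them.
print(find_header("sqlite3.h", [sqlite_dir, default_dir]) == sqlite_dir)
# Default inc_dirs first: a header placed there is the one picked up.
print(find_header("sqlite3.h", [default_dir, sqlite_dir]) == default_dir)
```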

Ronald

From bob at redivi.com  Fri May 26 11:26:38 2006
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 26 May 2006 09:26:38 +0000
Subject: [Python-Dev] SQLite header scan order
In-Reply-To: <D7831F80-A8A3-45C9-A7A5-EA1F36D86352@mac.com>
References: <D7831F80-A8A3-45C9-A7A5-EA1F36D86352@mac.com>
Message-ID: <46ED4D1E-C1D0-40BE-A41B-18C4AB093476@redivi.com>


On May 26, 2006, at 8:35 AM, Ronald Oussoren wrote:

> The current version of setup.py looks for the sqlite header files in
> a number of sqlite-specific directories before looking into the
> default inc_dirs. I'd like to reverse that order because that would
> make it possible to override the version of sqlite that gets picked
> up. Any objections to that?

+1, the version that ships with Mac OS X 10.4 is pretty old.

-bob


From ronaldoussoren at mac.com  Fri May 26 11:33:32 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 26 May 2006 11:33:32 +0200
Subject: [Python-Dev] SQLite header scan order
In-Reply-To: <46ED4D1E-C1D0-40BE-A41B-18C4AB093476@redivi.com>
References: <D7831F80-A8A3-45C9-A7A5-EA1F36D86352@mac.com>
	<46ED4D1E-C1D0-40BE-A41B-18C4AB093476@redivi.com>
Message-ID: <8BC9C9D6-FE07-43D8-BE69-47CAD357EECF@mac.com>


On 26-mei-2006, at 11:26, Bob Ippolito wrote:

>
> On May 26, 2006, at 8:35 AM, Ronald Oussoren wrote:
>
>> The current version of setup.py looks for the sqlite header files in
>> a number of sqlite-specific directories before looking into the
>> default inc_dirs. I'd like to reverse that order because that would
>> make it possible to override the version of sqlite that gets picked
>> up. Any objections to that?
>
> +1, the version that ships with Mac OS X 10.4 is pretty old.

That's why I want this :-). I'll just check in a change and see who  
starts to yell. We're still in alpha-mode after all.

Ronald


From tim.peters at gmail.com  Fri May 26 11:48:16 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 26 May 2006 09:48:16 +0000
Subject: [Python-Dev] Cost-Free Slice into FromString constructors--Long
In-Reply-To: <44764E28.4020403@canterbury.ac.nz>
References: <20060525152807.28682.1443912561.divmod.quotient.5169@ohm>
	<71659010-4A23-428E-81D4-3481980866C2@redivi.com>
	<1f7befae0605250920r55ff0798me0f053ad9d94cd54@mail.gmail.com>
	<44764E28.4020403@canterbury.ac.nz>
Message-ID: <1f7befae0605260248p41c53ea1w385c621dec0a1a6e@mail.gmail.com>

[Tim]
>> PyLong_FromString() only sees the starting
>> address, and-- as it always does --parses until it hits a character
>> that doesn't make sense for the input base.

[Greg Ewing]
> This is the bug, then. long() shouldn't be using
> PyLong_FromString() to convert its argument, but
> something that takes an address and a size. (Is
> there a PyLong_FromStringAndSize()? If not, there
> should be.)

Yes, that's what I said ;-):

    The internal parsing APIs don't currently support the buffer's "offset &
    length" view of the world, so have no chance of working as hoped
    with any kind of buffer object now.

Note that this isn't limited to long():  _no_ internal parsing routine
has a "sliceish" API now, and none of the code on the way toward
calling parsing routines pays any attention to the fact that it can't work
when handed a buffer object.  All of that must change.  At least the
complex() constructor gives up at once:

>>> b = buffer("123a", 0, 3)
>>> long(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for long() with base 10: '123a'
>>> int(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '123a'
>>> float(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for float(): 123a
>>> complex(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: complex() argument must be a string or a number

From walter at livinglogic.de  Fri May 26 11:59:05 2006
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 26 May 2006 11:59:05 +0200
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <ee2a432c0605250028s38fa271egaeabc1b1d42e1ba6@mail.gmail.com>
References: <20060524102441.GD32617@tummy.com>	<20060524173927.GG32617@tummy.com>
	<ee2a432c0605250028s38fa271egaeabc1b1d42e1ba6@mail.gmail.com>
Message-ID: <4476D169.7000603@livinglogic.de>

Neal Norwitz wrote:

> This is probably orthogonal to the problem, however, you may want to
> look into trying to speed up Python/errors.c.  This link will probably
> not work due to sessions, but click on the latest run for python and
> Python/errors.c
> 
> http://coverage.livinglogic.de/coverage/web/selectEntry.do?template=2850&entryToSelect=547460
> 
> If you look at the coverage numbers, many functions are called
> millions of times.  In many cases traceback is passed as NULL in other
> functions within errors.c.  If you could get rid of that check and
> possibly others and do some fast pathing in there, it might speed up
> exceptions some (or normal cases where exceptions are raised
> internally, then discarded).

If that helps, here is the list of the 1000 most executed lines of the
last run of the test suite:

http://styx.livinglogic.de/~walter/python/topcoverage.html

Servus,
   Walter


From richard at commonground.com.au  Fri May 26 11:17:55 2006
From: richard at commonground.com.au (Richard Jones)
Date: Fri, 26 May 2006 09:17:55 +0000
Subject: [Python-Dev] [Python-checkins] r46247 - in
	python/branches/sreifschneider-newnewexcept: Makefile.pre.in
	Objects/exceptions.c Python/exceptions.c
In-Reply-To: <fb6fbf560605251400j1afa9f16kd42bf591f482b712@mail.gmail.com>
References: <20060525194304.CF03E1E401C@bag.python.org>
	<fb6fbf560605251400j1afa9f16kd42bf591f482b712@mail.gmail.com>
Message-ID: <E0069453-5C40-473F-9A33-2D744506AEE3@commonground.com.au>

On 25/05/2006, at 9:00 PM, Jim Jewett wrote:
>> +/*
>> + *    OverflowWarning extends Warning
>> + */
>> +SimpleExtendsException(PyExc_Warning, OverflowWarning, "Base  
>> class for warnings about numeric overflow.  Won't exist in Python  
>> 2.5.");
>
> Take it out now?

What do people say? I implemented this one because it was implemented  
in the old code in 2.5. Happy to remove it.


     Richard


From steve at holdenweb.com  Fri May 26 12:19:16 2006
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 26 May 2006 11:19:16 +0100
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <44765163.4070208@canterbury.ac.nz>
References: <e555s1$q2j$1@sea.gmane.org>	<ca471dc20605251409h2799962eu326991f7228c796e@mail.gmail.com>	<e557l8$u1$1@sea.gmane.org>
	<44762549.8000403@holdenweb.com>
	<44765163.4070208@canterbury.ac.nz>
Message-ID: <e56kms$2h0$2@sea.gmane.org>

Greg Ewing wrote:
> Steve Holden wrote:
> 
> 
>>In actual fact the effbot has lately found itself so permeated with 
>>Windows that it has become constitutionally incapable of using a forward 
>>slash. Don't know what's with the square brackets though ...
> 
> 
> I was thinking maybe that message had resulted from
> a Windows and a VMS port of the effbot that got
> mixed together somehow...
> 
That would be interesting: if we start to see semicolons in the sig 
strings we'll know which version of the bot is mailing.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From thomas at python.org  Fri May 26 12:32:20 2006
From: thomas at python.org (Thomas Wouters)
Date: Fri, 26 May 2006 12:32:20 +0200
Subject: [Python-Dev] Returning int instead of long from struct when
	possible for performance
In-Reply-To: <1f7befae0605251605q654c243dg992a1a9da0fae1eb@mail.gmail.com>
References: <D09E0DAC-2F46-488D-9318-CFC7E1BA3ACD@redivi.com>
	<1f7befae0605251605q654c243dg992a1a9da0fae1eb@mail.gmail.com>
Message-ID: <9e804ac0605260332l6cf1863ek3b6b3f9c71e04c63@mail.gmail.com>

On 5/26/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> [Bob Ippolito]
> > Given the interchangeability of int and
> > long, I don't foresee any other complications with this change.
> >
> > Thoughts?
>
> +1, and for 2.5.  Even int() doesn't always return an int anymore, and
> it's just stupid to bear the burden of an unbounded long when it's not
> really needed.


Completely agreed. We've been unifying longs and integers for whole
releases, I cannot imagine anyone not getting the hint. Ints and longs are
the same thing, deal with it. Whether you get an int or a long for a
particular value is a platform-dependent accident anyway (people seem to be
ignoring 64-bit hardware all the time... I know Tim does :)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From aahz at pythoncraft.com  Fri May 26 13:12:54 2006
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 26 May 2006 04:12:54 -0700
Subject: [Python-Dev] SQLite status?
Message-ID: <20060526111254.GA29786@panix.com>

We're coming down to the wire on _Python for Dummies_, and I'm trying to
persuade the publisher to stick a blurb about SQLite on the cover, but my
last understanding was that there was a small chance we might pull SQLite
for insufficient docs.  Is that still true?
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes

From g.brandl at gmx.net  Fri May 26 13:43:46 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 26 May 2006 11:43:46 +0000
Subject: [Python-Dev] Request for patch review
Message-ID: <e56plp$jel$1@sea.gmane.org>

I've worked on two patches for NeedForSpeed, and would like someone
familiar with the areas they touch to review them before I check them
in, breaking all the buildbots which aren't broken yet ;)

They are:

http://python.org/sf/1346214
    Better dead code elimination for the AST compiler

http://python.org/sf/921466
    Reduce number of open calls on startup GB

Thanks!
Georg


From amk at amk.ca  Fri May 26 14:44:43 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Fri, 26 May 2006 08:44:43 -0400
Subject: [Python-Dev] partition() variants
Message-ID: <20060526124443.GA19416@localhost.localdomain>

I didn't find an answer in the str.partition() thread in the archives
(it's enormous, so easy to miss the right message), so I have two
questions:

1) Is str.rpartition() still wanted?

2) What about adding partition() to the re module?
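
For reference while answering, a small sketch for a current Python 3
interpreter (an assumption). str.partition and str.rpartition both exist
in released Python; re_partition below is a hypothetical helper showing
one possible shape for a regex variant, not something the re module
provides.

```python
import re

# str.partition splits at the first occurrence of the separator,
# str.rpartition at the last; both always return a 3-tuple.
print("key=value=more".partition("="))
print("key=value=more".rpartition("="))
print("no separator".partition("="))

def re_partition(pattern, text):
    """Hypothetical regex analogue: (head, matched separator, tail)."""
    match = re.search(pattern, text)
    if match is None:
        # Mirror str.partition: whole string first, then two empty strings.
        return (text, "", "")
    return (text[:match.start()], match.group(0), text[match.end():])

print(re_partition(r"\s*[:=]\s*", "name : effbot"))
```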

--amk


From walter at livinglogic.de  Fri May 26 15:42:39 2006
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 26 May 2006 15:42:39 +0200
Subject: [Python-Dev] partition() variants
In-Reply-To: <20060526124443.GA19416@localhost.localdomain>
References: <20060526124443.GA19416@localhost.localdomain>
Message-ID: <447705CF.9000306@livinglogic.de>

A.M. Kuchling wrote:

> I didn't find an answer in the str.partition() thread in the archives
> (it's enormous, so easy to miss the right message), so I have two
> questions:
> 
> 1) Is str.rpartition() still wanted?
> 
> 2) What about adding partition() to the re module?

And what happens if the separator is an instance of a subclass?

class s2(str):
    def __repr__(self):
        return "s2(%r)" % str(self)

print "foobar".partition(s2("o"))

Currently this prints:
   ('f', s2('o'), 'obar')
Should this be
   ('f', 'o', 'obar')
or not?

And what about:
   print s2("foobar").partition("x")
Currently this prints
   (s2('foobar'), '', '')

Servus,
   Walter


From anthony at interlink.com.au  Fri May 26 16:14:04 2006
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sat, 27 May 2006 00:14:04 +1000
Subject: [Python-Dev] SQLite status?
In-Reply-To: <20060526111254.GA29786@panix.com>
References: <20060526111254.GA29786@panix.com>
Message-ID: <200605270014.07847.anthony@interlink.com.au>

On Friday 26 May 2006 21:12, Aahz wrote:
> We're coming down to the wire on _Python for Dummies_, and I'm
> trying to persuade the publisher to stick a blurb about SQLite on
> the cover, but my last understanding was that there was a small
> chance we might pull SQLite for insufficient docs.  Is that still
> true?


Not in my universe, we won't. As far as I know, the docs are checked 
in. I still need to review them.


-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From arigo at tunes.org  Fri May 26 16:38:50 2006
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 26 May 2006 16:38:50 +0200
Subject: [Python-Dev] Returning int instead of long from struct when
	possible for performance
In-Reply-To: <D09E0DAC-2F46-488D-9318-CFC7E1BA3ACD@redivi.com>
References: <D09E0DAC-2F46-488D-9318-CFC7E1BA3ACD@redivi.com>
Message-ID: <20060526143850.GA30741@code0.codespeak.net>

Hi Bob,

On Thu, May 25, 2006 at 10:58:21PM +0000, Bob Ippolito wrote:
> Python ints are a heck of a lot faster to work with than Python longs  
> and have the additional benefit that psyco
> <http://psyco.sourceforge.net> can optimize the hell out of int but can't do  
> anything at all for long.

Alternatively, someone could implement in Psyco longs that are known to
fit in an int, and longs that are known to fit in an unsigned int, and
longs that fit in a C long long, etc...  Given a full range of long
variants, it would be easier and faster to emulate a CPython behavior
where the user-visible return type is consistent.

So, as far as Psyco is concerned I'm -1 on this change.  (I won't deny
its speed and memory benefits to plain CPython, though.)


Armin

From facundobatista at gmail.com  Fri May 26 17:28:03 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 May 2006 12:28:03 -0300
Subject: [Python-Dev] This
In-Reply-To: <44763487.3080908@ewtllc.com>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<44763168.4030308@holdenweb.com> <44763487.3080908@ewtllc.com>
Message-ID: <e04bdf310605260828q7e5cb00mf97ee0d95fc094e8@mail.gmail.com>

2006/5/25, Raymond Hettinger <rhettinger at ewtllc.com>:

> IIRC, Guido's words were that too much of the smarts in timeit.py were
> in the command-line interface and not in the Timer object where they should
> be.

Just end user experience's two cents here (btw, this line is correct
at English level?)

I found myself a lot of times going to command line to test a
function, just to use timeit.py, because doing that from the
interactive interpreter was more complicated.

Maybe there should be some "quick measure method" like,

>>> import timeit
>>> timeit.no_effort(some_function_defined_before)
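
A minimal sketch of what such a helper might look like, assuming a
current Python 3 interpreter where timeit.Timer accepts a callable
directly. The name quick_measure and its defaults are hypothetical;
nothing like it is being claimed to exist in timeit.

```python
import timeit

def quick_measure(func, number=1000, repeat=3):
    """Hypothetical helper: best per-call time in seconds for a zero-argument callable."""
    # Timer accepts a callable directly, so no statement string is needed.
    timer = timeit.Timer(func)
    # repeat() returns one total per run; the minimum is the least disturbed run.
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number

def some_function_defined_before():
    return sum(range(100))

per_call = quick_measure(some_function_defined_before)
print("%.3e seconds per call" % per_call)
```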

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From facundobatista at gmail.com  Fri May 26 17:37:02 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 May 2006 12:37:02 -0300
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e555s1$q2j$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
Message-ID: <e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>

2006/5/25, Fredrik Lundh <fredrik at pythonware.com>:

>  >>> -1 * (1, 2, 3)
> ()
>  >>> -(1, 2, 3)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: bad operand type for unary -
>
> We Really Need To Fix This!

I don't see an inconsistency here. The operator "*" is not a
multiplier as in math, it's more a "repeater", so math multiplier
attributes don't apply here.

There's no concept like "negative tuple" or "positive tuple", so the
second example is clearly an error.

Regarding the first line, the docs expressly say "Values of n less
than 0 are treated as 0".

I think that we can do one of the following when we find "-1 * (1, 2, 3)":

- Treat -1 as 0 and return an empty tuple (current behavior).
- Treat the negative as a reverser, so we get back (3, 2, 1).
- Raise an error.

Personally, +0 on the third.
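
The current behavior can be checked with a short script, sketched here
for a Python 3 interpreter (an assumption; the exact TypeError wording
may differ between versions). The last two lines show spellings that
already reverse a tuple explicitly.

```python
seq = (1, 2, 3)

# Repetition with a count below zero is treated like a count of zero.
print(-1 * seq)

# Unary minus is not defined for tuples, so this raises TypeError.
try:
    -seq
except TypeError as exc:
    print("TypeError:", exc)

# Reversal already has explicit spellings.
print(seq[::-1])
print(tuple(reversed(seq)))
```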

Regards,

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From jafo-python-dev at tummy.com  Fri May 26 17:41:11 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Fri, 26 May 2006 09:41:11 -0600
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
Message-ID: <20060526154111.GL32617@tummy.com>

On Fri, May 26, 2006 at 12:37:02PM -0300, Facundo Batista wrote:
>- Treat the negative as a reverser, so we get back (3, 2, 1).

Then we could get:

   >>> print -123
   321

Yay!

Thanks,
Sean
-- 
 Sometimes it pays to stay in bed on Monday, rather than spending the rest
 of the week debugging Monday's code.  -- Christopher Thompson
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From facundobatista at gmail.com  Fri May 26 17:44:45 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 May 2006 12:44:45 -0300
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <20060526154111.GL32617@tummy.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
	<20060526154111.GL32617@tummy.com>
Message-ID: <e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>

2006/5/26, Sean Reifschneider <jafo-python-dev at tummy.com>:

> On Fri, May 26, 2006 at 12:37:02PM -0300, Facundo Batista wrote:
> >- Treat the negative as a reverser, so we get back (3, 2, 1).
>
> Then we could get:
>
>    >>> print -123
>    321

An integer is NOT a sequence.

OTOH, that should be consistent with

>>> -1 * "123"
"321"

And remember I voted for returning an error, ;)

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From g.brandl at gmx.net  Fri May 26 17:50:06 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 26 May 2006 15:50:06 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org>	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>	<20060526154111.GL32617@tummy.com>
	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
Message-ID: <e5783j$9ef$1@sea.gmane.org>

Facundo Batista wrote:
> 2006/5/26, Sean Reifschneider <jafo-python-dev at tummy.com>:
> 
>> On Fri, May 26, 2006 at 12:37:02PM -0300, Facundo Batista wrote:
>> >- Treat the negative as a reverser, so we get back (3, 2, 1).
>>
>> Then we could get:
>>
>>    >>> print -123
>>    321
> 
> An integer is NOT a sequence.
> 
> OTOH, that should be consistent to
> 
>>>> -1 * "123"
> "321"

This is actually a nice idea, because it's even a more nonintuitive
answer for Python newbies posting to c.l.py asking how to reverse
a string <wink>

Georg


From fredrik at pythonware.com  Fri May 26 17:56:58 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 26 May 2006 17:56:58 +0200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <20060526154111.GL32617@tummy.com>
References: <e555s1$q2j$1@sea.gmane.org>	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
	<20060526154111.GL32617@tummy.com>
Message-ID: <e578g6$bn7$1@sea.gmane.org>

Sean Reifschneider wrote:

>> - Treat the negative as a reverser, so we get back (3, 2, 1).
> 
> Then we could get:
> 
>    >>> print -123
>    321
> 
> Yay!

and while we're at it, let's fix this:

     >>> 0.66 * (1, 2, 3)
     (1, 2)

and maybe even this

     >>> 0.5 * (1, 2, 3)
     (1, 1)

but I guess the latter one might need a pronunciation.

</F>


From fdrake at acm.org  Fri May 26 18:00:44 2006
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 May 2006 12:00:44 -0400
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e5783j$9ef$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
	<e5783j$9ef$1@sea.gmane.org>
Message-ID: <200605261200.45290.fdrake@acm.org>

On Friday 26 May 2006 11:50, Georg Brandl wrote:
 > This is actually a nice idea, because it's even a more nonintuitive
 > answer for Python newbies posting to c.l.py asking how to reverse
 > a string <wink>

Even better:

    "123"*-1

We'd get to explain:

- what the "*-" operator is all about, and

- why we'd use it with a string and an int.

I see possibilities here.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From g.brandl at gmx.net  Fri May 26 18:00:19 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 26 May 2006 16:00:19 +0000
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e578g6$bn7$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>	<20060526154111.GL32617@tummy.com>
	<e578g6$bn7$1@sea.gmane.org>
Message-ID: <e578mo$bq1$1@sea.gmane.org>

Fredrik Lundh wrote:
> Sean Reifschneider wrote:
> 
>>> - Treat the negative as a reverser, so we get back (3, 2, 1).
>> 
>> Then we could get:
>> 
>>    >>> print -123
>>    321
>> 
>> Yay!
> 
> and while we're at it, let's fix this:
> 
>      >>> 0.66 * (1, 2, 3)
>      (1, 2)
> 
> and maybe even this
> 
>      >>> 0.5 * (1, 2, 3)
>      (1, 1)
> 
> but I guess the latter one might need a pronunciation.

No, no, no! Floating point is so inaccurate! It has to be

 >>> Decimal("0.5") * (1, 2, 3)
(1, Decimal("1"))

Georg


From skip at pobox.com  Fri May 26 18:09:09 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 26 May 2006 11:09:09 -0500
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <200605261200.45290.fdrake@acm.org>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
	<e5783j$9ef$1@sea.gmane.org> <200605261200.45290.fdrake@acm.org>
Message-ID: <17527.10277.553493.224697@montanaro.dyndns.org>


    Fred> I see possibilities here.  :-)

Fred appears to be looking for more job security. ;-)

Skip

From fredrik at pythonware.com  Fri May 26 18:15:26 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 26 May 2006 18:15:26 +0200
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <200605261200.45290.fdrake@acm.org>
References: <e555s1$q2j$1@sea.gmane.org>	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>	<e5783j$9ef$1@sea.gmane.org>
	<200605261200.45290.fdrake@acm.org>
Message-ID: <e579iq$gdo$1@sea.gmane.org>

Fred L. Drake, Jr. wrote:

> Even better:
> 
>     "123"*-1
> 
> We'd get to explain:
> 
> - what the "*-" operator is all about, and
> 
> - why we'd use it with a string and an int.
> 
> I see possibilities here.  :-)

the infamous "*-" clear operator?  who snuck that one into python?

</F>


From facundobatista at gmail.com  Fri May 26 18:16:18 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 May 2006 13:16:18 -0300
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <200605261200.45290.fdrake@acm.org>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
	<e5783j$9ef$1@sea.gmane.org> <200605261200.45290.fdrake@acm.org>
Message-ID: <e04bdf310605260916w561fb292k41ec76c72b17a804@mail.gmail.com>

2006/5/26, Fred L. Drake, Jr. <fdrake at acm.org>:

> Even better:
>
>     "123"*-1
>
> We'd get to explain:
>
> - what the "*-" operator is all about, and
>
> - why we'd use it with a string and an int.
>
> I see possibilities here.  :-)

Aaaaall these different ways reinforce my vote: we should get an error...

Regards,

--
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From tim.peters at gmail.com  Fri May 26 18:19:14 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 26 May 2006 16:19:14 +0000
Subject: [Python-Dev] Low-level exception invariants?
Message-ID: <1f7befae0605260919u158acdd6u1c0df38cb675d2c4@mail.gmail.com>

In various places we store triples of exception info, like a
PyFrameObject's f_exc_type, f_exc_value, and f_exc_traceback PyObject*
members.

No invariants are documented, and that's a shame.  Patch 1145039 aims
to speed ceval a bit by relying on a weak guessed invariant, but I'd
like to make the stronger claim (which allows for stronger speedups)
that, at any moment,

    all three are NULL
or
    all three are not NULL

Now I know that's not true, although Python's test suite doesn't
provoke a violation.  In particular, PyErr_SetNone() forces
tstate->curexc_value to NULL.  Other code then special-cases the snot
out of XYZ_value, replacing NULL with Py_None on the fly, like

	PyErr_Fetch(&exc, &val, &tb);
	if (val == NULL) {
		val = Py_None;
		Py_INCREF(val);
	}

in the main eval loop.  I'd much rather change PyErr_SetNone() to set
the value to Py_None to begin with -- that function is almost never
called, so special-casing in that is less damaging than special-casing
NULL everywhere else.

The docs are confused about this now too, claiming:

void PyErr_SetNone(PyObject *type)
    This is a shorthand for "PyErr_SetObject(type, Py_None)".

It's actually a shorthand for PyErr_SetObject(type, NULL) now.

Does anyone see a real problem with _establishing_ an "all NULL or
none NULL" invariant?  It makes outstanding ;-) _conceptual_ sense to
me, and would allow removing some masses of distributed test-branch
special-casing in code trying to use these guys.
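
The C-level fields are not reachable from Python code, but the same
"all or none" shape can be observed through sys.exc_info(), which
reports the exception currently being handled. The sketch below is
only a Python-level analogue of the proposed invariant; the helper
name exc_slots_set is made up for illustration.

```python
import sys

def exc_slots_set():
    """Count how many of the three sys.exc_info() slots are not None."""
    return sum(item is not None for item in sys.exc_info())

outside = exc_slots_set()          # no exception is being handled here

try:
    raise KeyError                 # raised as a bare class, no explicit value
except KeyError:
    inside = exc_slots_set()       # type, value and traceback are all set
    value = sys.exc_info()[1]      # an instance, even though none was given

print(outside, inside, repr(value))
```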

From steven.bethard at gmail.com  Fri May 26 18:25:28 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri, 26 May 2006 10:25:28 -0600
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e04bdf310605260916w561fb292k41ec76c72b17a804@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
	<e5783j$9ef$1@sea.gmane.org> <200605261200.45290.fdrake@acm.org>
	<e04bdf310605260916w561fb292k41ec76c72b17a804@mail.gmail.com>
Message-ID: <d11dcfba0605260925r71dc2fcascc8594dc57bacb3f@mail.gmail.com>

On 5/26/06, Facundo Batista <facundobatista at gmail.com> wrote:
> Aaaaall this different ways enforce my vote: we should get an error...

Perhaps you missed Tim's post, so here's a few lines of my own code
that I know would break:

    padding = [None] * (self.width - len(leaves))
    left_padding = [None] * (self.left_width - len(left_window))
    right_padding = [None] * (self.right_width - len(right_window))

Sure, I could write these as:

    padding = [None] * max(0, self.width - len(leaves))
    left_padding = [None] * max(0, self.left_width - len(left_window))
    right_padding = [None] * max(0, self.right_width - len(right_window))

But if this change goes in, I want a big "we're breaking backwards
compatibility" message somewhere.  I say if you really want an
exception raised for these cases, ask for the behavior change in
Python 3000.  All it would give us in Python 2.X is a bunch of broken
code.
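
To make the dependency concrete, here is a small sketch of the padding
idiom. The function name pad and its arguments are invented for this
example; the point is only that the unguarded form works today because
a negative count yields an empty list.

```python
def pad(items, width):
    """Right-pad items with None up to width; longer inputs pass through."""
    # Relies on [None] * negative == [] when items is already too long.
    return items + [None] * (width - len(items))

def pad_explicit(items, width):
    """Same result, but safe even if negative counts raised an error."""
    return items + [None] * max(0, width - len(items))

print(pad([1, 2], 4))         # [1, 2, None, None]
print(pad([1, 2, 3], 2))      # [1, 2, 3]
```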

STeVe
-- 
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From rcohen at snurgle.org  Fri May 26 18:34:15 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 12:34:15 -0400
Subject: [Python-Dev] epoll implementation
Message-ID: <20060526163415.GD18740@snurgle.org>

I wrote an epoll implementation which can be used as a drop-in replacement
for parts of the select module (assuming the program is using only poll).
The code can currently be used by doing:

import epoll as select

It was released under the Python license on sourceforge:
http://sourceforge.net/projects/pyepoll

Is there any interest in incorporating this into the standard python
distribution?

Ross

From facundobatista at gmail.com  Fri May 26 18:40:02 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 May 2006 13:40:02 -0300
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <d11dcfba0605260925r71dc2fcascc8594dc57bacb3f@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260844g535b655cv49941d22e5faed08@mail.gmail.com>
	<e5783j$9ef$1@sea.gmane.org> <200605261200.45290.fdrake@acm.org>
	<e04bdf310605260916w561fb292k41ec76c72b17a804@mail.gmail.com>
	<d11dcfba0605260925r71dc2fcascc8594dc57bacb3f@mail.gmail.com>
Message-ID: <e04bdf310605260940x25df7e72y589b65b31ae8259d@mail.gmail.com>

2006/5/26, Steven Bethard <steven.bethard at gmail.com>:

> On 5/26/06, Facundo Batista <facundobatista at gmail.com> wrote:
> > Aaaaall this different ways enforce my vote: we should get an error...
> ...
> But if this change goes in, I want a big "we're breaking backwards
> incompatibility" message somewhere.  I say if you really want an
> exception raised for these cases, ask for the behavior change in
> Python 3000.  All it would give us in Python 2.X is a bunch of broken
> code.

Of course this change cannot be made in 2.x.

Regards,

-- 

.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From brett at python.org  Fri May 26 18:40:58 2006
From: brett at python.org (Brett Cannon)
Date: Fri, 26 May 2006 09:40:58 -0700
Subject: [Python-Dev] [Python-checkins] r46247 - in
	python/branches/sreifschneider-newnewexcept: Makefile.pre.in
	Objects/exceptions.c Python/exceptions.c
In-Reply-To: <E0069453-5C40-473F-9A33-2D744506AEE3@commonground.com.au>
References: <20060525194304.CF03E1E401C@bag.python.org>
	<fb6fbf560605251400j1afa9f16kd42bf591f482b712@mail.gmail.com>
	<E0069453-5C40-473F-9A33-2D744506AEE3@commonground.com.au>
Message-ID: <bbaeab100605260940v4b7ba7ecs554ac773e9c23188@mail.gmail.com>

Since no Python code should be generating the warning (I would double-check
that, obviously), I say take it out.  At least from a PEP 352 perspective
there is nothing about keeping it (or removing it).

-Brett

On 5/26/06, Richard Jones <richard at commonground.com.au> wrote:
>
> On 25/05/2006, at 9:00 PM, Jim Jewett wrote:
> >> +/*
> >> + *    OverflowWarning extends Warning
> >> + */
> >> +SimpleExtendsException(PyExc_Warning, OverflowWarning, "Base
> >> class for warnings about numeric overflow.  Won't exist in Python
> >> 2.5.");
> >
> > Take it out now?
>
> What do people say? I implemented this one because it was implemented
> in the old code in 2.5. Happy to remove it.
>
>
>      Richard
>

From skip at pobox.com  Fri May 26 18:47:43 2006
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 26 May 2006 11:47:43 -0500
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526163415.GD18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org>
Message-ID: <17527.12591.496473.707923@montanaro.dyndns.org>


    Ross> I wrote an epoll implementation which can be used as a drop-in
    Ross> replacement for parts of the select module
    ...
    Ross> Is there any interest in incorporating this into the standard
    Ross> python distribution?

Without going to the trouble of downloading epoll (always an adventure with
SourceForget), can you explain what epoll does, how it's better than (parts
of) select, how widely it's used and how stable it is?

Thx,

Skip


From jason.orendorff at gmail.com  Fri May 26 18:50:58 2006
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Fri, 26 May 2006 12:50:58 -0400
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
Message-ID: <bb8868b90605260950x178407ccpabb0f21c45af503f@mail.gmail.com>

On 5/26/06, Facundo Batista <facundobatista at gmail.com> wrote:
> I think that we can do one of the following, when we found "-1 * (1, 2, 3)":
>
> - Treat -1 as 0 and return an empty tuple (actual behavior).
> - Treat the negative as a reverser, so we get back (3, 2, 1).
> - Raise an error.

No, no, no.  The important invariant is that n * seq is
loop(seq)[:n*len(seq)] where loop(seq) is an endless loop of the
elements of seq.

So obviously, if n is negative, the result should be an infinite
sequence that's == to loop(seq).
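
For non-negative n the invariant can be checked with itertools, using
cycle() as a stand-in for loop(seq). This is only a sketch: the helper
name repeat_via_loop is invented, and it is limited to n >= 0 because
islice() refuses a negative stop value.

```python
from itertools import cycle, islice

def repeat_via_loop(seq, n):
    """Build n * seq by taking the first n*len(seq) items of an endless loop."""
    return type(seq)(islice(cycle(seq), n * len(seq)))

seq = (1, 2, 3)
for n in range(4):
    print(n, repeat_via_loop(seq, n) == n * seq)
```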

-j

From guido at python.org  Fri May 26 18:58:43 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 May 2006 09:58:43 -0700
Subject: [Python-Dev] partition() variants
In-Reply-To: <447705CF.9000306@livinglogic.de>
References: <20060526124443.GA19416@localhost.localdomain>
	<447705CF.9000306@livinglogic.de>
Message-ID: <ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>

On 5/26/06, Walter Dörwald <walter at livinglogic.de> wrote:
> A.M. Kuchling wrote:
>
> > I didn't find an answer in the str.partition() thread in the archives
> > (it's enormous, so easy to miss the right message), so I have two
> > questions:
> >
> > 1) Is str.rpartition() still wanted?

Can't remember. Raymond?

> > 2) What about adding partition() to the re module?

No.

> And what happens if the separator is an instance of a subclass?
>
> class s2(str):
>     def __repr__(self):
>         return "s2(%r)" % str(self)
>
> print "foobar".partition(s2("o"))
>
> Currently this prints:
>    ('f', s2('o'), 'obar')
> Should this be
>    ('f', 'o', 'obar')
> or not?
>
> And what about:
>    print s2("foobar").partition("x")
> Currently this prints
>    (s2('foobar'), '', '')

These are both fine with me.
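
For anyone who wants to re-run Walter's experiment, here is the same
snippet with print as a function, plus the type of each piece. Whether
the pieces come back as s2 or plain str may depend on the Python
version, so the sketch prints the types rather than assuming them.

```python
class s2(str):
    def __repr__(self):
        return "s2(%r)" % str(self)

found = "foobar".partition(s2("o"))       # separator is a subclass instance
missing = s2("foobar").partition("x")     # subject is a subclass instance

print(found, [type(part).__name__ for part in found])
print(missing, [type(part).__name__ for part in missing])
```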

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri May 26 19:05:54 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 May 2006 10:05:54 -0700
Subject: [Python-Dev] Low-level exception invariants?
In-Reply-To: <1f7befae0605260919u158acdd6u1c0df38cb675d2c4@mail.gmail.com>
References: <1f7befae0605260919u158acdd6u1c0df38cb675d2c4@mail.gmail.com>
Message-ID: <ca471dc20605261005r1352bbe2w72bf3e009c286268@mail.gmail.com>

+1, if you can also prove that the traceback will never be null. I
failed at that myself last time I tried, but I didn't try very long or
hard.

--Guido

On 5/26/06, Tim Peters <tim.peters at gmail.com> wrote:
> In various places we store triples of exception info, like a
> PyFrameObject's f_exc_type, f_exc_value, and f_exc_traceback PyObject*
> members.
>
> No invariants are documented, and that's a shame.  Patch 1145039 aims
> to speed ceval a bit by relying on a weak guessed invariant, but I'd
> like to make the stronger claim (which allows for stronger speedups)
> that, at any moment,
>
>     all three are NULL
> or
>     all three are not NULL
>
> Now I know that's not true, although Python's test suite doesn't
> provoke a violation.  In particular, PyErr_SetNone() forces
> tstate->curexc_value to NULL.  Other code then special-cases the snot
> out of XYZ_value, replacing NULL with Py_None on the fly, like
>
>         PyErr_Fetch(&exc, &val, &tb);
>         if (val == NULL) {
>                 val = Py_None;
>                 Py_INCREF(val);
>         }
>
> in the main eval loop.  I'd much rather change PyErr_SetNone() to set
> the value to Py_None to begin with -- that function is almost never
> called, so special-casing in that is less damaging than special-casing
> NULL everywhere else.
>
> The docs are confused about this now too, claiming:
>
> void PyErr_SetNone(PyObject *type)
>     This is a shorthand for "PyErr_SetObject(type, Py_None)".
>
> It's actually a shorthand for PyErr_SetObject(type, NULL) now.
>
> Does anyone see a real problem with _establishing_ an "all NULL or
> none NULL" invariant?  It makes outstanding ;-) _conceptual_ sense to
> me, and would allow removing some masses of distributed test-branch
> special-casing in code trying to use these guys.


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri May 26 19:07:41 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 May 2006 10:07:41 -0700
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526163415.GD18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org>
Message-ID: <ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>

I don't know what epoll is.

On a related note, perhaps the needforspeed folks should look into
supporting kqueue on systems where it's available? That's a really
fast BSD-originated API to replace select/poll. (It's fast due to the
way the API is designed.)

--Guido

On 5/26/06, Ross Cohen <rcohen at snurgle.org> wrote:
> I wrote an epoll implementation which can be used as a drop-in replacement
> for parts of the select module (assuming the program is using only poll).
> The code can currently be used by doing:
>
> import epoll as select
>
> It was released under the Python license on sourceforge:
> http://sourceforge.net/projects/pyepoll
>
> Is there any interest in incorporating this into the standard python
> distribution?
>
> Ross


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From exarkun at divmod.com  Fri May 26 19:10:30 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 26 May 2006 13:10:30 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <17527.12591.496473.707923@montanaro.dyndns.org>
Message-ID: <20060526171030.28682.359208431.divmod.quotient.5735@ohm>

On Fri, 26 May 2006 11:47:43 -0500, skip at pobox.com wrote:
>
>    Ross> I wrote an epoll implementation which can be used as a drop-in
>    Ross> replacement for parts of the select module
>    ...
>    Ross> Is there any interest in incorporating this into the standard
>    Ross> python distribution?
>
>Without going to the trouble of downloading epoll (always an adventure with
>SourceForget), can you explain what epoll does, how it's better than (parts
>of) select, how widely it's used and how stable it is?

epoll is a high-performance io notification mechanism provided by linux 2.6.  It supports more sockets than select: it has no FD_SETSIZE or equivalent. It is more efficient than select: it scales roughly with the number of events instead of with the number of sockets.  There is little or no controversy over its superiority to select and poll.

The idea was first introduced to linux about five years ago, but the API has changed a lot in the interval.  The epoll(4) API has been stable for all of linux 2.6.

Including a wrapper for this functionality would be quite useful for many python network apps.  However, I think with ctypes going into 2.5, it might be better to consider providing epoll support using a ctypes-based module rather than an extension module.

Of course, if there is a volunteer to maintain and support an extension module, that's better than nothing.  PyEpoll is missing a couple features I would like to see - the size of the epoll set is hard-coded to FD_SETSIZE, for example: while this makes little difference to Python 2.4.3 (since you cannot use sockets with fileno >= FD_SETSIZE at all in that version of Python), it is a serious limitation for other versions of Python.  Similarly, the number of events to retrieve when invoking epoll_wait() is also hardcoded to FD_SETSIZE.  Real applications often want to tune this value to avoid being overwhelmed by io events.

These features could easily be added, I suspect, since they are primarily just extra integer parameters to various methods.
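
As an illustration of the kind of tuning meant here: later Python
versions ship select.epoll on Linux, and its poll() method takes a
maxevents argument. The sketch below uses that API, not PyEpoll's; the
helper name drain_in_batches and the batch size are made up, and the
whole thing is skipped on platforms without epoll.

```python
import select
import socket

def drain_in_batches(socks, batch_size):
    """Yield lists of ready fds, at most batch_size per epoll wait."""
    ep = select.epoll()
    try:
        pending = {s.fileno(): s for s in socks}
        for fd in pending:
            ep.register(fd, select.EPOLLIN)
        while pending:
            events = ep.poll(1.0, batch_size)   # timeout in seconds, maxevents
            if not events:
                break                           # nothing left that is readable
            batch = []
            for fd, _mask in events:
                pending[fd].recv(4096)          # consume data so fd goes quiet
                ep.unregister(fd)
                del pending[fd]
                batch.append(fd)
            yield batch
    finally:
        ep.close()

batches = []
if hasattr(select, "epoll"):                    # Linux only
    pairs = [socket.socketpair() for _ in range(3)]
    for _reader, writer in pairs:
        writer.sendall(b"x")                    # make every reader ready
    batches = list(drain_in_batches([r for r, _ in pairs], batch_size=2))
    for reader, writer in pairs:
        reader.close()
        writer.close()
print([len(batch) for batch in batches])
```

Capping the batch size bounds how much work one wakeup can hand to
the application, which is the "avoid being overwhelmed" case above.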

Of course the other standard things should be added as well - documentation and test coverage, neither of which seem to be present at all in PyEpoll.

Jean-Paul


From rcohen at snurgle.org  Fri May 26 19:13:44 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 13:13:44 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <17527.12591.496473.707923@montanaro.dyndns.org>
References: <20060526163415.GD18740@snurgle.org>
	<17527.12591.496473.707923@montanaro.dyndns.org>
Message-ID: <20060526171344.GE18740@snurgle.org>

On Fri, May 26, 2006 at 11:47:43AM -0500, skip at pobox.com wrote:
> 
>     Ross> I wrote an epoll implementation which can be used as a drop-in
>     Ross> replacement for parts of the select module
>     ...
>     Ross> Is there any interest in incorporating this into the standard
>     Ross> python distribution?
> 
> Without going to the trouble of downloading epoll (always an adventure with
> SourceForget), can you explain what epoll does, how it's better than (parts
> of) select, how widely it's used and how stable it is?

Sure, I can provide a short version of what's on this page:
http://www.kegel.com/c10k.html

epoll is a replacement for the standard poll() system call. It's available
on Linux, introduced sometime during the 2.5 development series. Whereas poll
requires that an fd list be maintained in userspace and handed in on every
call, epoll maintains that list in the kernel and provides (de)registration
functions for individual fds. This prevents the kernel from having to scan
potentially thousands of file descriptors for each call to discover only a
handful which had activity.

Oddly enough, the interface to the poll call in the python select module
much more closely resembles the design of epoll (kudos to whoever designed
that).
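
For reference, this is roughly what that registration-style interface
looks like with select.poll. It is a minimal sketch assuming a
Unix-like system; readable_fds is a name invented for this example.

```python
import select
import socket

def readable_fds(socks, timeout_ms=1000):
    """Return the sorted fds among socks that have data waiting to be read."""
    poller = select.poll()
    for s in socks:
        poller.register(s.fileno(), select.POLLIN)   # register once, up front
    try:
        events = poller.poll(timeout_ms)             # no fd list passed here
        return sorted(fd for fd, mask in events if mask & select.POLLIN)
    finally:
        for s in socks:
            poller.unregister(s.fileno())

a, b = socket.socketpair()
c, d = socket.socketpair()
b.sendall(b"ping")                # makes a readable; c stays idle
ready = readable_fds([a, c])
expected = [a.fileno()]
for s in (a, b, c, d):
    s.close()
print(ready == expected)
```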

This code has undergone some testing with both codeville and BitTorrent, but
I'm not going to claim it's seen as much testing as it should.

The NeedForSpeed wiki lists /dev/epoll as a goal for the twisted folks. I
assume that the naming here is either confusion with the Solaris /dev/poll
or that it refers to the initial implementation of epoll, which I believe
used a device file. Either way, this would give them that support for free,
as well as everyone else.

Ross

From tjreedy at udel.edu  Fri May 26 19:25:03 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 26 May 2006 13:25:03 -0400
Subject: [Python-Dev] This
References: <e55adp$8m5$1@sea.gmane.org>
	<44762E77.3080303@ewtllc.com><44763168.4030308@holdenweb.com>
	<44763487.3080908@ewtllc.com>
	<e04bdf310605260828q7e5cb00mf97ee0d95fc094e8@mail.gmail.com>
Message-ID: <e57dle$v78$1@sea.gmane.org>


"Facundo Batista" <facundobatista at gmail.com> wrote in message 
news:e04bdf310605260828q7e5cb00mf97ee0d95fc094e8 at mail.gmail.com...

> Just end user experience's two cents here
> (btw, this line is correct  at English level?)

Since you asked...your question would be better written "is this line 
correct English?"
And the line before, while not formal English of the kind needed, say, for 
Decimal docs, works well enough for me as an expression of an informal 
conversational thought.

> Maybe should exist some "quick measure method" like,

 Maybe *there* should....
[This is one situation where English is wordier than, for instance, 
Spanish.]

Terry Jan Reedy




From jafo-python-dev at tummy.com  Fri May 26 19:23:09 2006
From: jafo-python-dev at tummy.com (Sean Reifschneider)
Date: Fri, 26 May 2006 11:23:09 -0600
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526171344.GE18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org>
	<17527.12591.496473.707923@montanaro.dyndns.org>
	<20060526171344.GE18740@snurgle.org>
Message-ID: <20060526172309.GM32617@tummy.com>

On Fri, May 26, 2006 at 01:13:44PM -0400, Ross Cohen wrote:
>The NeedForSpeed wiki lists /dev/epoll as a goal for the twisted folks. I

Just FYI, none of the Twisted items made it onto the list of items being
worked.

Thanks,
Sean
-- 
 "You're thinking of Mr. Wizard."  "[Emilio Lizardo's] a top scientist,
 dumbkopf."  "So was Mr. Wizard."  -- _Buckaroo_Banzai_
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From fredrik at pythonware.com  Fri May 26 19:28:47 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 26 May 2006 19:28:47 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
Message-ID: <e57dsb$vjd$1@sea.gmane.org>

Guido van Rossum wrote:

> I don't know what epoll is.
> 
> On a related note, perhaps the needforspeed folks should look into
> supporting kqueue on systems where it's available? That's a really
> fast BSD-originated API to replace select/poll. (It's fast due to the
> way the API is designed.)

roughly speaking, epoll is kqueue for linux.

</F>


From rcohen at snurgle.org  Fri May 26 19:31:33 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 13:31:33 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526171030.28682.359208431.divmod.quotient.5735@ohm>
References: <17527.12591.496473.707923@montanaro.dyndns.org>
	<20060526171030.28682.359208431.divmod.quotient.5735@ohm>
Message-ID: <20060526173133.GF18740@snurgle.org>

On Fri, May 26, 2006 at 01:10:30PM -0400, Jean-Paul Calderone wrote:
> Including a wrapper for this functionality would be quite useful for many python network apps.  However, I think with ctypes going into 2.5, it might be better to consider providing epoll support using a ctypes-based module rather than an extension module.

I have to admit that I wrote this for python 2.4 and haven't yet educated
myself on ctypes. If this is what's desired I can put some energy into it.
However, the PyEpoll module is largely a modified version of the python 2.4
select module.

> Of course, if there is a volunteer to maintain and support an extension module, that's better than nothing.  PyEpoll is missing a couple features I would like to see - the size of the epoll set is hard-coded to FD_SETSIZE, for example: while this makes little difference to Python 2.4.3 (since you cannot use sockets with fileno >= FD_SETSIZE at all in that version of Python), it is a serious limitation for other versions of Python.  Similarly, the number of events to retrieve when invoking epoll_wait() is also hardcoded to FD_SETSIZE.  Real applications often want to tune this value to avoid being overwhelmed by io events.

It is not true that this module limits the number of file descriptors to
FD_SETSIZE. You were probably looking at the number of file descriptors
returned by the epoll call, which is already limited by FD_SETSIZE because
of the size of the event array. In any case, this can be made tunable.

> Of course the other standard things should be added as well - documentation and test coverage, neither of which seem to be present at all in PyEpoll.

Like I said, this code is a modified version of the python select module.
It should (I hope, unless I accidentally removed some) include all the
documentation from there, and any test cases for that module will apply
here. The API provided to python programs has not changed, only the backend
implementation of the API. My hope is that this can be included as a
compile-time option for the select module instead of being its own module.

Ross

From facundobatista at gmail.com  Fri May 26 19:35:43 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 May 2006 14:35:43 -0300
Subject: [Python-Dev] This
In-Reply-To: <e57dle$v78$1@sea.gmane.org>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<44763168.4030308@holdenweb.com> <44763487.3080908@ewtllc.com>
	<e04bdf310605260828q7e5cb00mf97ee0d95fc094e8@mail.gmail.com>
	<e57dle$v78$1@sea.gmane.org>
Message-ID: <e04bdf310605261035i418aedd6r7f233897d1fa72f1@mail.gmail.com>

2006/5/26, Terry Reedy <tjreedy at udel.edu>:

> And the line before, while not formal English of the kind needed, say, for
> Decimal docs, works well enough for me as an expression of an informal
> conversational thought.

That's why Decimal docs were written by Raymond, ;)

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From tim.peters at gmail.com  Fri May 26 19:46:59 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 26 May 2006 17:46:59 +0000
Subject: [Python-Dev] Low-level exception invariants?
In-Reply-To: <ca471dc20605261005r1352bbe2w72bf3e009c286268@mail.gmail.com>
References: <1f7befae0605260919u158acdd6u1c0df38cb675d2c4@mail.gmail.com>
	<ca471dc20605261005r1352bbe2w72bf3e009c286268@mail.gmail.com>
Message-ID: <1f7befae0605261046h6d99236cneac5f9671dd0ac13@mail.gmail.com>

[Guido]
> +1, if you can also prove that the traceback will never be null. I
> failed at that myself last time I tried, but I didn't try very long or
> hard.

Thanks!  I'm digging.

Stuck right now on this miserable problem that's apparently been here
forever:  I changed PyErr_SetObject to start like so:

"""
PyErr_SetObject(PyObject *exception, PyObject *value)
{
	   assert(exception != NULL);
"""

This triggered instantly upon firing up Python, and it was _called_ via

    PyErr_Format(PyExc_AttributeError, ...

!  I thought I had gone insane, until I realized this happens during
_PyExc_Init():  that can try to raise exceptions before the exception
pointers have been initialized.  So long as we can get into exception
machinery while the exception objects point to trash, there aren't
many useful invariants ;-)

From martin at v.loewis.de  Fri May 26 20:03:12 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 May 2006 20:03:12 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526163415.GD18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org>
Message-ID: <447742E0.4090803@v.loewis.de>

Ross Cohen wrote:
> Is there any interest in incorporating this into the standard python
> distribution?

I would like to see epoll support in Python, but not in the way PyEpoll
is packaged. Instead, I think it should go into the select module,
and be named epoll_*.

Likewise, if kqueue was ever supported, I think it should also go
into the select module.

Regards,
Martin

From csw at k12hq.com  Fri May 26 20:08:17 2006
From: csw at k12hq.com (Christopher Weimann)
Date: Fri, 26 May 2006 14:08:17 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
Message-ID: <20060526180817.GA63090@tektite.k12usa.internal>

On 05/26/2006-10:07AM, Guido van Rossum wrote:
> 
> I don't know what epoll is.
> 
> On a related note, perhaps the needforspeed folks should look into
> supporting kqueue on systems where it's available? That's a really
> fast BSD-originated API to replace select/poll. (It's fast due to the
> way the API is designed.)
> 

http://python-hpio.net/trac/wiki/LibEventPython

  libevent-python is a CPython wrapper extension for Niels Provos'
  libevent, a small, fast asynchronous IO framework that supports
  select(), poll(), kqueue(), /dev/poll, and epoll().

Libevent can be found here http://monkey.org/~provos/libevent/
and it explains the advantages of epoll/kqueue over select/poll.



From jonathan-lists at cleverdevil.org  Fri May 26 19:57:53 2006
From: jonathan-lists at cleverdevil.org (Jonathan LaCour)
Date: Fri, 26 May 2006 13:57:53 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
Message-ID: <A710291B-DB4F-46BA-9343-3D0D9021B814@cleverdevil.org>

Guido van Rossum wrote:
> On a related note, perhaps the needforspeed folks should look into
> supporting kqueue on systems where it's available? That's a really
> fast BSD-originated API to replace select/poll. (It's fast due to the
> way the API is designed.)

There is, in fact, an implementation of kqueue for Python already
available.  I have not used it myself, but it is available here:

     http://python-hpio.net/trac/

maybe the needforspeed people could take a look at this?

--
Jonathan LaCour
http://cleverdevil.org



From rcohen at snurgle.org  Fri May 26 20:31:33 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 14:31:33 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <447742E0.4090803@v.loewis.de>
References: <20060526163415.GD18740@snurgle.org> <447742E0.4090803@v.loewis.de>
Message-ID: <20060526183133.GG18740@snurgle.org>

On Fri, May 26, 2006 at 08:03:12PM +0200, "Martin v. Löwis" wrote:
> Ross Cohen wrote:
> > Is there any interest in incorporating this into the standard python
> > distribution?
> 
> I would like to see epoll support in Python, but not in the way PyEpoll
> is packaged. Instead, I think it should go into the select module,
> and be named epoll_*.
> 
> Likewise, if kqueue was ever supported, I think it should also go
> into the select module.

I agree that it should go into the select module, but not as a separate
set of calls. epoll is strictly better than poll. kqueue is strictly
better than poll. Windows has its own completion ports API. Why should
an application developer have to detect what platform they are running on?
Why not simply provide the best implementation for the platform through the
python API everyone is already using?

Ross

From rhettinger at ewtllc.com  Fri May 26 20:34:35 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Fri, 26 May 2006 11:34:35 -0700
Subject: [Python-Dev] partition() variants
In-Reply-To: <ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>
References: <20060526124443.GA19416@localhost.localdomain>	<447705CF.9000306@livinglogic.de>
	<ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>
Message-ID: <44774A3B.3080909@ewtllc.com>

 >   1) Is str.rpartition() still wanted?



Yes.

From martin at v.loewis.de  Fri May 26 20:42:07 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 May 2006 20:42:07 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <A710291B-DB4F-46BA-9343-3D0D9021B814@cleverdevil.org>
References: <20060526163415.GD18740@snurgle.org>	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
	<A710291B-DB4F-46BA-9343-3D0D9021B814@cleverdevil.org>
Message-ID: <44774BFF.9090203@v.loewis.de>

Jonathan LaCour wrote:
> There is, in fact, an implementation of kqueue for Python already
> available.  I have not used it myself, but it is available here:
> 
>      http://python-hpio.net/trac/
> 
> maybe the needforspeed people could take a look at this?

"take a look" is really quite involved. We cannot just incorporate the
code without permission of the author. So if the code is itself
acceptable, then somebody would have to talk the author into
contributing.

Regards,
Martin

From fredrik at pythonware.com  Fri May 26 20:39:51 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 26 May 2006 20:39:51 +0200
Subject: [Python-Dev] partition() variants
In-Reply-To: <44774A3B.3080909@ewtllc.com>
References: <20060526124443.GA19416@localhost.localdomain>	<447705CF.9000306@livinglogic.de>	<ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>
	<44774A3B.3080909@ewtllc.com>
Message-ID: <e57i1i$epq$2@sea.gmane.org>

Raymond Hettinger wrote:

> 1) Is str.rpartition() still wanted?
> 
> Yes.

I might have missed my earlier 30-minute deadline by one minute (not
my fault! I was distracted! seriously!), but this time, I actually
managed to get the code in there *before* I saw the pronouncement ;-)

</F>


From martin at v.loewis.de  Fri May 26 20:48:42 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 May 2006 20:48:42 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526183133.GG18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org> <447742E0.4090803@v.loewis.de>
	<20060526183133.GG18740@snurgle.org>
Message-ID: <44774D8A.2000600@v.loewis.de>

Ross Cohen wrote:
> I agree that it should go into the select module, but not as a separate
> set of calls. epoll is strictly better than poll. kqueue is strictly
> better than poll. Windows has its own completion ports API. Why should
> an application developer have to detect what platform they are running on?

Because the APIs have different semantics. Design some API for epoll,
and make it replace select or poll (your choice), and I will create you
an application that breaks under your poll "emulation".

If a complete emulation was possible, the OS developers would not have
invented a new API.

> Why not simply provide the best implementation for the platform through the
> python API everyone is already using?

Well, what *is* the API everyone is already using? This is really
something for a higher-level API that assumes a certain processing
model (unlike the OS API, which assumes no processing model).

Python has a tradition of exposing system APIs *as is*, without
second-guessing either the OS developers or the application
developers. Then, we put a unified API *on top* of that. For
event processing, the standard library always provided
asyncore/asynchat, although people don't like it. I'm sure
the Twisted people would integrate epoll in no time, and with
no API change for *their* users.

Regards,
Martin

From exarkun at divmod.com  Fri May 26 20:49:44 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 26 May 2006 14:49:44 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526183133.GG18740@snurgle.org>
Message-ID: <20060526184944.28682.581801558.divmod.quotient.5772@ohm>

On Fri, 26 May 2006 14:31:33 -0400, Ross Cohen <rcohen at snurgle.org> wrote:
>On Fri, May 26, 2006 at 08:03:12PM +0200, "Martin v. Löwis" wrote:
>> Ross Cohen wrote:
>> > Is there any interest in incorporating this into the standard python
>> > distribution?
>>
>> I would like to see epoll support in Python, but not in the way PyEpoll
>> is packaged. Instead, I think it should go into the select module,
>> and be named epoll_*.
>>
>> Likewise, if kqueue was ever supported, I think it should also go
>> into the select module.
>
>I agree that it should go into the select module, but not as a separate
>set of calls.

How about "not *only* as a separate set of calls"?  If poll(2) and epoll(4) are both available on the underlying platform, then they should both be exposed to Python as separate APIs.  Then, on top of that, a relatively simple layer which selects the most efficient mechanism can be exposed, and developers can be encouraged to use that instead.
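
A minimal sketch of such a layer, assuming the select module exposes
epoll and poll as separate constructors whose objects share register()
and poll() methods (as later Python versions do). The names best_poller
and wait_readable are made up for illustration; this is not a proposed
API.

```python
import select


def best_poller():
    """Return (name, poller) for the most efficient mechanism available."""
    if hasattr(select, "epoll"):
        return "epoll", select.epoll()
    if hasattr(select, "poll"):
        return "poll", select.poll()
    raise RuntimeError("neither epoll nor poll is available here")


def wait_readable(sock, timeout_seconds):
    """Report whether sock becomes readable within the timeout."""
    name, poller = best_poller()
    # Look the flag up by name on each side instead of hard-coding a number.
    read_flag = select.EPOLLIN if name == "epoll" else select.POLLIN
    poller.register(sock.fileno(), read_flag)
    # The two objects disagree on units: epoll takes seconds, poll takes
    # milliseconds, so the layer has to hide that difference.
    if name == "epoll":
        timeout = timeout_seconds
    else:
        timeout = int(timeout_seconds * 1000)
    try:
        events = poller.poll(timeout)
    finally:
        if name == "epoll":
            poller.close()  # epoll owns a file descriptor of its own
    return any(fd == sock.fileno() for fd, _ in events)
```

Callers only ever see wait_readable(); which mechanism ran underneath
stays an internal detail, while the two raw constructors remain
available to anyone who wants them directly.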

Hiding the difference between poll and epoll would be detrimental to more advanced developers, since it would hide the edge-triggered mode of epoll, which is more efficient than the level-triggered mode. Only the level-triggered mode can be considered API-compatible with poll(2) without adding a much more complex layer on top.

>epoll is strictly better than poll. kqueue is strictly
>better than poll.

AIUI, kqueue actually isn't implemented for PTYs on OS X, whereas poll(2) is.  Given this, I don't think kqueue is actually strictly better.  Although hopefully Apple will get their act together and fix this deficiency.

>Windows has its own completion ports API. Why should
>an application developer have to detect what platform they are running on?
>Why not simply provide the best implementation for the platform through the
>python API everyone is already using?

So far as this is possible, I think it is a good idea.

Jean-Paul

From exarkun at divmod.com  Fri May 26 20:57:05 2006
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 26 May 2006 14:57:05 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526173133.GF18740@snurgle.org>
Message-ID: <20060526185705.28682.1860098948.divmod.quotient.5776@ohm>

On Fri, 26 May 2006 13:31:33 -0400, Ross Cohen <rcohen at snurgle.org> wrote:
>On Fri, May 26, 2006 at 01:10:30PM -0400, Jean-Paul Calderone wrote:
>> Of course, if there is a volunteer to maintain and support an extension module, that's better than nothing.  PyEpoll is missing a couple features I would like to see - the size of the epoll set is hard-coded to FD_SETSIZE, for example: while this makes little difference to Python 2.4.3 (since you cannot use sockets with fileno >= FD_SETSIZE at all in that version of Python), it is a serious limitation for other versions of Python.  Similarly, the number of events to retrieve when invoking epoll_wait() is also hardcoded to FD_SETSIZE.  Real applications often want to tune this value to avoid being overwhelmed by io events.
>
>It is not true that this module limits the number of file descriptors to
>FD_SETSIZE. You were probably looking at the number of file descriptors
>returned by the epoll call, which is already limited by FD_SETSIZE because
>of the size of event array. In any case, this can be made tunable.

Whoops, you're right, of course.  I forgot that the argument to epoll_create is only a hint, not a limit.  Sorry about that.

Jean-Paul

From rcohen at snurgle.org  Fri May 26 21:19:42 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 15:19:42 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <44774D8A.2000600@v.loewis.de>
References: <20060526163415.GD18740@snurgle.org> <447742E0.4090803@v.loewis.de>
	<20060526183133.GG18740@snurgle.org> <44774D8A.2000600@v.loewis.de>
Message-ID: <20060526191942.GH18740@snurgle.org>

On Fri, May 26, 2006 at 08:48:42PM +0200, "Martin v. Löwis" wrote:
> Ross Cohen wrote:
> > I agree that it should go into the select module, but not as a separate
> > set of calls. epoll is strictly better than poll. kqueue is strictly
> > better than poll. Windows has its own completion ports API. Why should
> > an application developer have to detect what platform they are running on?
> 
> Because the APIs have different semantics. Design some API for epoll,
> and make it replace select or poll (your choice), and I will create you
> an application that breaks under your poll "emulation".

It doesn't emulate the poll system call, it implements the poll call from
the select module. See below.

> If a complete emulation was possible, the OS developers would not have
> invented a new API.

True, and as I mentioned before, the python API more closely matches epoll
in this case. The level-triggered mode of epoll is an almost perfect match.
Someone went to some lengths to hide the details of the system poll
interface.

> > Why not simply provide the best implementation for the platform through the
> > python API everyone is already using?
> 
> Well, what *is* the API everyone is already using? This is really
> something for a higher-level API that assumes a certain processing
> model (unlike the OS API, which assumes no processing model).

Without resorting to extension modules which are not part of the standard
python library, people are using either select.poll or select.select. I
was referring to select.poll, which is a better interface, but I believe
isn't available on OS X.

> Python has a tradition of exposing system APIs *as is*, without
> second-guessing either the OS developers or the application
> developers. Then, we put a unified API *on top* of that. For
> event processing, the standard library always provided
> asyncore/asynchat, although people don't like it.

Someone already broke that tradition in this case. If the select.poll
implementation is improved then everyone reaps the benefit.

Ross

From martin at v.loewis.de  Fri May 26 22:12:15 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 May 2006 22:12:15 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526191942.GH18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org> <447742E0.4090803@v.loewis.de>
	<20060526183133.GG18740@snurgle.org> <44774D8A.2000600@v.loewis.de>
	<20060526191942.GH18740@snurgle.org>
Message-ID: <4477611F.8050609@v.loewis.de>

Ross Cohen wrote:
> True, and as I mentioned before, the python API more closely matches epoll
> in this case. The level-triggered mode of epoll is an almost perfect match.
> Someone went to some lengths to hide the details of the system poll
> interface.

Ah right, I missed that point. That makes it difficult to find an
application that would break :-) One problem is that epoll allocates
another file handle, when poll does not; the other problem could exist
when the EPOLL constants differ in value from the POLL constants (which
isn't the case on Linux), and then somebody uses numeric values instead
of symbolic ones for register().

That said, I would be in favour of having select.poll "silently" use
epoll where available. Of course, it would be good if a "cheap" run-time
test could be made whether epoll is available at run-time (although
just waiting for ENOSYS from epoll_create is probably cheap enough).
Also, it would be good if the application could find out it is using
epoll; for example, epollObject could expose a fileno member.
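
A rough sketch of that run-time test, assuming a select.epoll()
constructor that raises OSError with ENOSYS when the running kernel
lacks the call (the constructor exists in later Python versions;
open_poller and is_using_epoll are hypothetical helper names):

```python
import errno
import select


def open_poller():
    """Return (poller, uses_epoll), preferring epoll when the kernel has it."""
    epoll_ctor = getattr(select, "epoll", None)  # absent if not compiled in
    if epoll_ctor is not None:
        try:
            return epoll_ctor(), True
        except OSError as exc:
            # ENOSYS: built with epoll support, but this kernel lacks it.
            if exc.errno != errno.ENOSYS:
                raise
    return select.poll(), False


def is_using_epoll(poller):
    """Only the epoll-backed object owns a descriptor, hence a fileno()."""
    return hasattr(poller, "fileno")
```

The fallback costs one failed system call at construction time, and the
fileno check lets an application discover which mechanism it got
without a separate flag.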

Regards,
Martin

From rcohen at snurgle.org  Fri May 26 22:22:25 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 16:22:25 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526184944.28682.581801558.divmod.quotient.5772@ohm>
References: <20060526183133.GG18740@snurgle.org>
	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
Message-ID: <20060526202225.GI18740@snurgle.org>

On Fri, May 26, 2006 at 02:49:44PM -0400, Jean-Paul Calderone wrote:
> On Fri, 26 May 2006 14:31:33 -0400, Ross Cohen <rcohen at snurgle.org> wrote:
> >
> >I agree that it should go into the select module, but not as a separate
> >set of calls.
> 
> How about "not *only* as a separate set of calls"?  If poll(2) and epoll(4) are both available on the underlying platform, then they should both be exposed to Python as separate APIs.  Then, on top of that, a relatively simple layer which selects the most efficient mechanism can be exposed, and developers can be encouraged to use that instead.

This is reasonable, though it requires some surgery to the select module.
select.poll is a very reasonable layer over whatever mechanism is available.
The system poll call would need to be properly wrapped and exposed in
addition to epoll, because currently it is not.

> Hiding the difference between poll and epoll would be detrimental to more advanced developers, since it would hide the edge-triggered mode of epoll, which is more efficient than the level-triggered mode. Only the level-triggered mode can be considered API-compatible with poll(2) without adding a much more complex layer on top.

I've never been fond of the edge-triggered mode. It's a pretty minor
optimization, it saves scanning the set of previously returned fds to see
if the events on them are still relevant. Given that there are added
pitfalls to using that mode, it never seemed worth it. Any layer in python
which tries to smooth over the differences will certainly kill the benefits.

Of course, that's just my personal preference. If someone wants to use
the edge-triggered mode, by all means let them.

> >epoll is strictly better than poll. kqueue is strictly
> >better than poll.
> 
> AIUI, kqueue actually isn't implemented for PTYs on OS X, whereas poll(2) is.  Given this, I don't think kqueue is actually strictly better.  Although hopefully Apple will get their act together and fix this deficiency.

Ok, I'm not familiar with intimate details of kqueue. However, if there
were a select.poll implementation based on kqueue, it would still be an
improvement, since there isn't *any* implementation on OS X right now.

Ross

From martin at v.loewis.de  Fri May 26 22:27:14 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Fri, 26 May 2006 22:27:14 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526184944.28682.581801558.divmod.quotient.5772@ohm>
References: <20060526184944.28682.581801558.divmod.quotient.5772@ohm>
Message-ID: <447764A2.9040103@v.loewis.de>

> How about "not *only* as a separate set of calls"?  If poll(2) and
> epoll(4) are both available on the underlying platform, then they
> should both be exposed to Python as separate APIs.  Then, on top of
> that, a relatively simple layer which selects the most efficient
> mechanism can be exposed, and developers can be encouraged to use
> that instead.

Why does it have to be a separate API? Isn't the edge-triggered
mode just a matter of passing EPOLLET into register?

If so, I would advise against having two sets of functions,
although having two sets of constants is probably fine.
I would just provide a second constructor, select.epoll(),
which guarantees that epoll is used as the implementation
strategy (i.e. it would raise OSError if the kernel doesn't
support epoll_create).

Regards,
Martin


From rcohen at snurgle.org  Fri May 26 22:34:36 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 16:34:36 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <4477611F.8050609@v.loewis.de>
References: <20060526163415.GD18740@snurgle.org> <447742E0.4090803@v.loewis.de>
	<20060526183133.GG18740@snurgle.org> <44774D8A.2000600@v.loewis.de>
	<20060526191942.GH18740@snurgle.org> <4477611F.8050609@v.loewis.de>
Message-ID: <20060526203436.GJ18740@snurgle.org>

On Fri, May 26, 2006 at 10:12:15PM +0200, "Martin v. Löwis" wrote:
> That said, I would be in favour of having select.poll "silently" use
> epoll where available. Of course, it would be good if a "cheap" run-time
> test could be made whether epoll is available at run-time (although
> just waiting for ENOSYS from epoll_create is probably cheap enough).
> Also, it would be good if the application could find out it is using
> epoll; for example, epollObject could expose a fileno member.

Woo! I'd be happy to code this up. First I'd like to get a feel for the
preferred way of getting this integrated. Would people rather see:

1) Provide an epoll implementation which is used "silently" when the call is available.

2) Expose both poll(2) and epoll(4) in python and build select.poll on top of whatever is available.

Ok, so 2 is only different in that it exposes the lower level APIs. I'd like
to know if that's what people want.

Ross

From walter at livinglogic.de  Fri May 26 23:09:22 2006
From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 26 May 2006 23:09:22 +0200 (CEST)
Subject: [Python-Dev] partition() variants
In-Reply-To: <ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>
References: <20060526124443.GA19416@localhost.localdomain>
	<447705CF.9000306@livinglogic.de>
	<ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>
Message-ID: <61044.89.54.23.151.1148677762.squirrel@isar.livinglogic.de>

Guido van Rossum wrote:
> On 5/26/06, Walter Dörwald <walter at livinglogic.de> wrote:
> [...]
>> And what happens if the separator is an instance of a subclass?
>>
>> class s2(str):
>>     def __repr__(self):
>>         return "s2(%r)" % str(self)
>>
>> print "foobar".partition(s2("o"))
>>
>> Currently this prints:
>>    ('f', s2('o'), 'obar')
>> Should this be
>>    ('f', 'o', 'obar')
>> or not?
>>
>> And what about:
>>    print s2("foobar").partition("x")
>> Currently this prints
>>    (s2('foobar'), '', '')
>
> These are both fine with me.

split() doesn't behave that way:

>>> s2("foobar").split("x")
['foobar']

Servus,
   Walter




From guido at python.org  Fri May 26 23:27:01 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 May 2006 14:27:01 -0700
Subject: [Python-Dev] partition() variants
In-Reply-To: <61044.89.54.23.151.1148677762.squirrel@isar.livinglogic.de>
References: <20060526124443.GA19416@localhost.localdomain>
	<447705CF.9000306@livinglogic.de>
	<ca471dc20605260958l577e6ea9u6777da4d74f04e99@mail.gmail.com>
	<61044.89.54.23.151.1148677762.squirrel@isar.livinglogic.de>
Message-ID: <ca471dc20605261427j451d64dfv2d5e7ac06d5c9eac@mail.gmail.com>

I think you're getting to implementation details here. Whether a new
string is returned or a reference to the old one is an optimization
decision. I don't think it's worth legislating this behavior one way
or another (especially since it's mostly a theoretical issue).
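
A small sketch of code that stays correct either way, assuming only
that the returned pieces compare equal to plain strings (same_values is
a made-up helper; s2 is the subclass from the example quoted below):

```python
class s2(str):
    """str subclass from the quoted example; only the repr differs."""

    def __repr__(self):
        return "s2(%r)" % str(self)


def same_values(result, expected):
    """Compare pieces as plain strings, ignoring which str type came back."""
    return [str(item) for item in result] == list(expected)
```

Code written this way does not care whether partition() or split()
hands back the original object, the subclass instance, or a fresh str.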

--Guido

On 5/26/06, Walter Dörwald <walter at livinglogic.de> wrote:
> Guido van Rossum wrote:
> > On 5/26/06, Walter Dörwald <walter at livinglogic.de> wrote:
> > [...]
> >> And what happens if the separator is an instance of a subclass?
> >>
> >> class s2(str):
> >>     def __repr__(self):
> >>         return "s2(%r)" % str(self)
> >>
> >> print "foobar".partition(s2("o"))
> >>
> >> Currently this prints:
> >>    ('f', s2('o'), 'obar')
> >> Should this be
> >>    ('f', 'o', 'obar')
> >> or not?
> >>
> >> And what about:
> >>    print s2("foobar").partition("x")
> >> Currently this prints
> >>    (s2('foobar'), '', '')
> >
> > These are both fine with me.
>
> split() doesn't behave that way:
>
> >>> s2("foobar").split("x")
> ['foobar']
>
> Servus,
>    Walter
>
>
>
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Fri May 26 23:41:04 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 May 2006 23:41:04 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526203436.GJ18740@snurgle.org>
References: <20060526163415.GD18740@snurgle.org> <447742E0.4090803@v.loewis.de>
	<20060526183133.GG18740@snurgle.org> <44774D8A.2000600@v.loewis.de>
	<20060526191942.GH18740@snurgle.org> <4477611F.8050609@v.loewis.de>
	<20060526203436.GJ18740@snurgle.org>
Message-ID: <447775F0.7020301@v.loewis.de>

Ross Cohen wrote:
> 1) Provide an epoll implementation which is used "silently" when the call is available.
> 
> 2) Expose both poll(2) and epoll(4) in python and build select.poll on top of whatever is available.
> 
> Ok, so 2 is only different in that it exposes the lower level APIs. I'd like
> to know if that's what people want.

Supporting EPOLLET reliably looks like a valid use case, so there must
be some way to specify "I really want epoll". Whether this is through
a second entry point, or through an optional flag to poll, I don't care
(providing the flag might be actually more expressive, since it would
also allow one to specify "I want poll", although I can't see a use case
for that).

Regards,
Martin

From ronaldoussoren at mac.com  Fri May 26 23:43:05 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 26 May 2006 23:43:05 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526202225.GI18740@snurgle.org>
References: <20060526183133.GG18740@snurgle.org>
	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
	<20060526202225.GI18740@snurgle.org>
Message-ID: <26140B95-02D1-4C9D-AF82-533061173EB9@mac.com>


On 26-mei-2006, at 22:22, Ross Cohen wrote:


>>
>> AIUI, kqueue actually isn't implemented for PTYs on OS X, whereas  
>> poll(2) is.  Given this, I don't think kqueue is actually strictly  
>> better.  Although hopefully Apple will get their act together and  
>> fix this deficiency.
>
> Ok, I'm not familiar with intimate details of kqueue. However, if  
> there
> were a select.poll implementation based on kqueue, it would still  
> be an
> improvement, since there isn't *any* implementation on OS X right now.

Huh? OS X supports poll just fine, at least in recent versions. 10.3's  
poll is slightly broken, the one on 10.4 works fine (AFAIK).

Ronald


From ocean at m2.ccsnet.ne.jp  Fri May 26 23:44:00 2006
From: ocean at m2.ccsnet.ne.jp (H.Yamamoto)
Date: Sat, 27 May 2006 06:44:00 +0900
Subject: [Python-Dev] patch for mbcs codecs
Message-ID: <20060527064400.CBC2A9A0.ocean@m2.ccsnet.ne.jp>

Hello. This is my first post.

I sent a patch fixing an mbcs codec bug to SourceForge, and I was advised to find
someone here with a multibyte version of Windows to look at the patch.

http://sourceforge.net/tracker/index.php?func=detail&aid=1455898&group_id=5470&atid=305470

Is there anyone? Any help will be appreciated.

Thank you.


From martin at v.loewis.de  Fri May 26 23:56:06 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 26 May 2006 23:56:06 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526202225.GI18740@snurgle.org>
References: <20060526183133.GG18740@snurgle.org>	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
	<20060526202225.GI18740@snurgle.org>
Message-ID: <44777976.9010904@v.loewis.de>

Ross Cohen wrote:
> I've never been fond of the edge-triggered mode. It's a pretty minor
> optimization, it saves scanning the set of previously returned fds to see
> if the events on them are still relevant. Given that there are added
> pitfalls to using that mode, it never seemed worth it. Any layer in python
> which tries to smooth over the differences will certainly kill the benefits.

I thought about this (even though I never used it), and I think there
are good uses for EPOLLET. Suppose you have a large set of descriptors
to which you may want to write.

So you poll for write, then write, then poll again. Eventually you fill
the write buffer, and need to wait for the other side to read some data;
you block in poll. Later, you have sent all data you wanted to send,
yet the connection remains open. So you have to *remove* the fd from
the poll list, because otherwise poll would return immediately each
time. So you keep adding and removing file descriptors to the epoll
set, essentially for each write operation.

I imagine it would be more efficient to just leave the write file
descriptors in the set, even if you have sent all data. With EPOLLET,
the epoll_wait will just ignore those descriptors, and it will watch
only those for which you have actually filled the transmit queue,
and where you have pending data.

This feature sounds reasonable to me.
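
A sketch of the difference on Linux, assuming an epoll object with the
same register()/poll() methods as select.poll (select.epoll in later
Python versions; writable_reports is a made-up helper). The descriptor
is registered once and left in the set while nothing is being sent:

```python
import select
import socket


def writable_reports(extra_flags, rounds=3):
    """Count how often an idle, writable socket is reported by epoll."""
    left, right = socket.socketpair()
    poller = select.epoll()
    try:
        # Register once and never remove it, as described above.
        poller.register(left.fileno(), select.EPOLLOUT | extra_flags)
        reports = 0
        for _ in range(rounds):
            # Zero timeout: just ask what is ready right now.
            reports += len(poller.poll(0))
        return reports
    finally:
        poller.close()
        left.close()
        right.close()
```

With no extra flag (level-triggered) the idle socket should be reported
on every round; with select.EPOLLET it should be reported once and then
stay quiet until the transmit queue fills and drains again.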

> Ok, I'm not familiar with intimate details of kqueue. However, if there
> were a select.poll implementation based on kqueue, it would still be an
> improvement, since there isn't *any* implementation on OS X right now.

That depends on the kernel version, no? I thought 10.4 added poll.

Also, kqueue support would be for the BSDs primarily. If somebody
contributed a patch, I would accept it providing it was disabled on
OSX (assuming kqueue really is not a true poll replacement on this
system).

Regards,
Martin

From ronaldoussoren at mac.com  Sat May 27 00:07:36 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sat, 27 May 2006 00:07:36 +0200
Subject: [Python-Dev] warnings about missing __init__.py in toplevel
	directories
Message-ID: <4E174EB4-CB3D-4E36-9532-9BBABF090D6C@mac.com>

Hi,

Some time ago a warning was introduced for directories on sys.path  
that don't contain an __init__.py but have the same name as a package/ 
module that is being imported.

Is it intentional that this triggers for toplevel imports? These  
warnings are triggered in the build process for PyObjC, which is in  
itself pretty harmless but very annoying.


We have a source-deps directory that contains several external  
packages that are used during the build. Those external packages are  
included through the svn:external mechanism and are in directories  
named after those packages, with a .pth file to include them on  
sys.path. In most cases the package name is the same as the python  
package, which is triggering the warning.

Our package structure:

	setup.py
	source-deps/
		subprocess/
		subprocess.pth # includes subprocess on sys.path
		py2app/
		py2app.pth  # includes py2app on sys.path
		...
	... # the regular sources, documentation and stuff

Ronald

From guido at python.org  Sat May 27 00:46:11 2006
From: guido at python.org (Guido van Rossum)
Date: Fri, 26 May 2006 15:46:11 -0700
Subject: [Python-Dev] warnings about missing __init__.py in toplevel
	directories
In-Reply-To: <4E174EB4-CB3D-4E36-9532-9BBABF090D6C@mac.com>
References: <4E174EB4-CB3D-4E36-9532-9BBABF090D6C@mac.com>
Message-ID: <ca471dc20605261546r6fe2efbcm30bc5bae7d99b042@mail.gmail.com>

On 5/26/06, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> Some time ago a warning was introduced for directories on sys.path
> that don't contain an __init__.py but have the same name as a package/
> module that is being imported.
>
> Is it intentional that this triggers for toplevel imports?

Yes, since that's the major use case.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rcohen at snurgle.org  Sat May 27 00:46:51 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 18:46:51 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <26140B95-02D1-4C9D-AF82-533061173EB9@mac.com>
References: <20060526183133.GG18740@snurgle.org>
	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
	<20060526202225.GI18740@snurgle.org>
	<26140B95-02D1-4C9D-AF82-533061173EB9@mac.com>
Message-ID: <20060526224650.GL18740@snurgle.org>

On Fri, May 26, 2006 at 11:43:05PM +0200, Ronald Oussoren wrote:
> >Ok, I'm not familiar with intimate details of kqueue. However, if  
> >there
> >were a select.poll implementation based on kqueue, it would still  
> >be an
> >improvement, since there isn't *any* implementation on OS X right now.
> 
> Huh? OS X supports poll just fine, at least in recent versions. 10.3's
> poll is slightly broken; the one on 10.4 works fine (AFAIK).

I'm probably not up to date. All I could find reference to was that poll()
on 10.3 is emulated using select().

Ross

From rcohen at snurgle.org  Sat May 27 01:23:01 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Fri, 26 May 2006 19:23:01 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <44777976.9010904@v.loewis.de>
References: <20060526183133.GG18740@snurgle.org>
	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
	<20060526202225.GI18740@snurgle.org> <44777976.9010904@v.loewis.de>
Message-ID: <20060526232301.GM18740@snurgle.org>

On Fri, May 26, 2006 at 11:56:06PM +0200, "Martin v. Löwis" wrote:
> I thought about this (even though I never used it), and I think there
> are good uses for EPOLLET.

There are, if the programmer wants to deal with it. It's easy enough to add
the flag and give them the choice. I'll put together a select module which
adds this code and uses it if the right stuff is present at compile and
run time.
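
A rough sketch of what "add the flag and give them the choice" could look like from Python. It assumes the select.epoll object and the select.EPOLLET constant found in the select module of current Python, which is not the API of the pyepoll package discussed here. The helper make_poller is hypothetical, and the example needs a platform that has poll or epoll.

```python
import select
import socket

def make_poller(edge_triggered=False):
    """Return (poller, read_mask, backend), preferring epoll where it exists."""
    if hasattr(select, "epoll"):
        mask = select.EPOLLIN
        if edge_triggered:
            mask |= select.EPOLLET   # opt-in; level-triggered stays the default
        return select.epoll(), mask, "epoll"
    # Plain poll has no edge-triggered mode, so the flag is ignored here.
    return select.poll(), select.POLLIN, "poll"

a, b = socket.socketpair()
fd = a.fileno()
poller, mask, backend = make_poller(edge_triggered=True)
poller.register(fd, mask)
b.send(b"x")
# epoll.poll() takes seconds; select.poll().poll() takes milliseconds.
events = poller.poll(1.0 if backend == "epoll" else 1000)
print(backend, events)
a.close()
b.close()
```

The presence check happens at run time via hasattr(), so the same code runs where only poll is available.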

epoll also allows 64 bits of data to be tucked away and returned when events
happen. Could be useful for saving a dict lookup for every event. However,
there are some refcounting issues. Dict lookup per event could be traded
for one on deregistration. All it needs is a small forward-compatible
extension to the current select.poll API.

Ross

From greg.ewing at canterbury.ac.nz  Sat May 27 03:22:46 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 27 May 2006 13:22:46 +1200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <e57dsb$vjd$1@sea.gmane.org>
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
	<e57dsb$vjd$1@sea.gmane.org>
Message-ID: <4477A9E6.2020108@canterbury.ac.nz>

Fredrik Lundh wrote:

> roughly speaking, epoll is kqueue for linux.

There are many different select-like things around now
(select, poll, epoll, kqueue -- are there others?) and
random combinations of them seem to be available on any
given platform. This makes writing platform-independent
code that needs select-like functionality very awkward.

Rather than adding yet another platform-dependent module,
I'd like to see a unified Python interface in the stdlib
that uses whichever is the best one available.

--
Greg

From steve at holdenweb.com  Sat May 27 03:27:20 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 27 May 2006 02:27:20 +0100
Subject: [Python-Dev] epoll implementation
In-Reply-To: <4477A9E6.2020108@canterbury.ac.nz>
References: <20060526163415.GD18740@snurgle.org>	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>	<e57dsb$vjd$1@sea.gmane.org>
	<4477A9E6.2020108@canterbury.ac.nz>
Message-ID: <e589te$nbd$1@sea.gmane.org>

Greg Ewing wrote:
> Fredrik Lundh wrote:
> 
> 
>>roughly speaking, epoll is kqueue for linux.
> 
> 
> There are many different select-like things around now
> (select, poll, epoll, kqueue -- are there others?) and
> random combinations of them seem to be available on any
> given platform. This makes writing platform-independent
> code that needs select-like functionality very awkward.
> 
> Rather than adding yet another platform-dependent module,
> I'd like to see a unified Python interface in the stdlib
> that uses whichever is the best one available.
> 
Of course that would mean establishing which *was* the best available 
which, as we've seen this week, may not be easy.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From shlomme at gmx.net  Thu May 25 19:42:49 2006
From: shlomme at gmx.net (Torsten Marek)
Date: Thu, 25 May 2006 19:42:49 +0200
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
Message-ID: <4475EC99.7050006@gmx.net>

Hi,

Lately I've found myself reimplementing a generator that provides a
sliding-window view over a sequence, and I think this function is generally
useful enough that it might be included in itertools.

Basically, what the generator does is return every run of m consecutive
elements from a sequence [i0, i1, ... in]. For m = 3, it returns
[i0, i1, i2], [i1, i2, i3], ... [in-2, in-1, in].
In code, it looks like this:

>>> list(iwindow(range(0,5), 3))
[[0, 1, 2], [1, 2, 3], [2, 3, 4]]

This list can be generated by using izip(a, a[1:], ..., a[m-1:]), but typing
all the sequence arguments gets tedious. If a is not a sequence but a
generator, tee(...) and islice have to be used.
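
The tee/islice route can be sketched as follows. This is only an illustration, not the attached implementation. The name windows_via_tee is made up, the windows come out as tuples, and the built-in zip of current Python stands in for the izip of this thread.

```python
from itertools import islice, tee

def windows_via_tee(iterable, m):
    """Yield tuples of m consecutive items using tee() and islice()."""
    copies = tee(iterable, m)
    # Advance the k-th copy by k items so the copies are staggered.
    staggered = [islice(it, k, None) for k, it in enumerate(copies)]
    return zip(*staggered)   # stops as soon as the shortest copy runs out

result = list(windows_via_tee(iter(range(5)), 3))
print(result)
```

Because zip stops at the shortest input, an iterable shorter than m yields no windows at all with this approach.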

It might also be desirable to pad the windows, so that the sequence of
windows starts with [pad, pad, ..., i0] and ends with [in, pad, pad, ...]

>>> list(iwindow(range(0,5), 3, pad=True))
[[None, None, 0], [None, 0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, None],
[4, None, None]]

Additionally, the value used for padding can be specified. This makes the
argument list of this function rather long, but the last two arguments are
optional anyway:
iwindow(iterable, window_size=3, pad = False, padding_value = None)
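
Since the attachment is not preserved in the archive, here is a minimal sketch of a generator with that signature. It is an assumption about the behaviour rather than the attached code. It returns lists to match the examples above, and it yields nothing when an unpadded iterable is shorter than the window, which is only one possible answer to the open questions.

```python
from collections import deque
from itertools import chain, repeat

def iwindow(iterable, window_size=3, pad=False, padding_value=None):
    """Yield lists of window_size consecutive items from iterable."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    items = iter(iterable)
    if pad:
        # Surround the data with window_size - 1 padding values on each side.
        head = repeat(padding_value, window_size - 1)
        tail = repeat(padding_value, window_size - 1)
        items = chain(head, items, tail)
    window = deque(maxlen=window_size)   # the oldest item drops out on append
    for item in items:
        window.append(item)
        if len(window) == window_size:
            yield list(window)

plain = list(iwindow(range(0, 5), 3))
padded = list(iwindow(range(0, 5), 3, pad=True))
print(plain)
print(padded)
```

The deque keeps memory use at one window regardless of input length, so the sketch also works on generators.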

Some open questions remain:
- should iwindow return lists or tuples?
- what happens if the length of the iterable is smaller than the window size,
and no padding is specified? Is this an error? Should the generator return no
value at all or one window that is too small?


I've attached a Python implementation of this function. If the function is
deemed to be actually useful, I'd be happy to brush up my C and provide a C
implementation along with docs and tests.


best,

Torsten

PS: Please CC me, as I'm not subscribed to the list
-- 
Torsten Marek <shlomme at gmx.net>
ID: A244C858 -- FP: 1902 0002 5DFC 856B F146  894C 7CC5 451E A244 C858
Keyserver: subkeys.pgp.net


-------------- next part --------------
A non-text attachment was scrubbed...
Name: iwindow.py
Type: application/x-python
Size: 666 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060525/97d16655/attachment.bin 

From rcohen at snurgle.org  Thu May 25 21:41:15 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Thu, 25 May 2006 15:41:15 -0400
Subject: [Python-Dev] epoll implementation
Message-ID: <20060525194114.GV11922@snurgle.org>

I wrote an epoll implementation which can be used as a drop-in replacement
for parts of the select module (assuming the program is using only poll).
The code can currently be used by doing:

import epoll as select

It was released under the Python license on sourceforge:
http://sourceforge.net/projects/pyepoll

Is there any interest in incorporating this into the standard python
distribution?

Ross

From aleaxit at gmail.com  Sat May 27 05:59:24 2006
From: aleaxit at gmail.com (Alex Martelli)
Date: Fri, 26 May 2006 20:59:24 -0700
Subject: [Python-Dev] epoll implementation
In-Reply-To: <e589te$nbd$1@sea.gmane.org>
References: <20060526163415.GD18740@snurgle.org>	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>	<e57dsb$vjd$1@sea.gmane.org>
	<4477A9E6.2020108@canterbury.ac.nz> <e589te$nbd$1@sea.gmane.org>
Message-ID: <D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>


On May 26, 2006, at 6:27 PM, Steve Holden wrote:

> Greg Ewing wrote:
>> Fredrik Lundh wrote:
>>
>>
>>> roughly speaking, epoll is kqueue for linux.
>>
>>
>> There are many different select-like things around now
>> (select, poll, epoll, kqueue -- are there others?) and
>> random combinations of them seem to be available on any
>> given platform. This makes writing platform-independent
>> code that needs select-like functionality very awkward.
>>
>> Rather than adding yet another platform-dependent module,
>> I'd like to see a unified Python interface in the stdlib
>> that uses whichever is the best one available.
>>
> Of course that would mean establishing which *was* the best available
> which, as we've seen this week, may not be easy.

I believe it's: kqueue on FreeBSD (for recent-enough versions  
thereof), otherwise epoll where available and nonbuggy, otherwise  
poll ditto, otherwise select -- that's roughly what Twisted uses for  
Reactor implementations, if memory serves me well.  The platform- 
based heuristic should try to identify things this way but let the  
developer easily override if they know better.  (One might add a  
Windows event loop as the best implementation available there --  
Twisted does -- and GUI toolkit based event loops -- but in general  
that takes an abstraction level similar to Twisted's Reactor... maybe  
we should just repurpose that bit of Twisted?-)
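
The heuristic described above can be sketched roughly like this. The function names are hypothetical, the ordering is a paraphrase of this message rather than Twisted's actual logic, and the sketch only checks that a call exists, not that it is "nonbuggy". It relies on the kqueue, epoll, poll and select attributes that the select module of current Python exposes where the platform has them.

```python
import select
import sys

def ranked_backends(platform=sys.platform):
    """Preference order from the message: kqueue is ranked first on FreeBSD only."""
    order = ["epoll", "poll", "select"]
    if platform.startswith("freebsd"):
        order.insert(0, "kqueue")
    return order

def pick_backend(override=None, platform=sys.platform):
    """Return the first available backend name, or the developer's override."""
    if override is not None:
        if not hasattr(select, override):
            raise ValueError("backend %r is not available here" % override)
        return override
    for name in ranked_backends(platform):
        if hasattr(select, name):
            return name
    raise RuntimeError("no select-like call available")

chosen = pick_backend()
forced = pick_backend(override="select")
print(chosen, forced)
```

Later Python versions ship a selectors module whose DefaultSelector makes a comparable choice automatically.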

I don't think this is feasible for 2.5 (too late in the cycle to add  
major new stuff, IMHO), but it's well worth it for 2.6 (again IMHO).


Alex


From nnorwitz at gmail.com  Sat May 27 06:51:55 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 26 May 2006 21:51:55 -0700
Subject: [Python-Dev] Need for Speed Sprint status
Message-ID: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>

First off, good work to everyone involved.  You did a tremendous job.
I just hope to hell you're done, because I can't keep up! :-)

It would help me enormously if someone could summarize the status and
everything that went on.  These are the things that would help me the
most.

 * What are the speed diffs before/after the sprint
 * What was modified (summary)
 * What is left to do
    - doc
    - tests
    - code
 * Which branches are still planning to remain active
 * Lessons learned, how we can improve for the next time
 * Suggestions for further areas to look into improving

It looks like there were a lot of additions to the string test suite,
that's great.  I'm not sure if the other areas touched got similar
boosts to their tests.  It would be good to upgrade all tests to
verify corner cases of the implementation.  These tests should also be
documented as testing the implementation rather than the
language spec.  We don't want to write obscure tests that can't pass
in other impls like Jython, IronPython, or PyPy.

I will turn my amd64 box back on tomorrow and will also run Python
through valgrind and pychecker when I get a chance.  There are a
couple of new Coverity complaints that need to be addressed.

Cheers,
n

From python at rcn.com  Sat May 27 07:00:34 2006
From: python at rcn.com (Raymond Hettinger)
Date: Fri, 26 May 2006 22:00:34 -0700
Subject: [Python-Dev] Need for Speed Sprint status
References: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
Message-ID: <001201c6814a$7e0222b0$dc00000a@RaymondLaptop1>

> It would help me enormously if someone could summarize the status and
> everything that went on.  These are the things that would help me the
> most.
> 
> * What are the speed diffs before/after the sprint
> * What was modified (summary)
> * What is left to do
>    - doc
>    - tests
>    - code
> * Which branches are still planning to remain active
> * Lessons learned, how we can improve for the next time
> * Suggestions for further areas to look into improving

http://wiki.python.org/moin/NeedForSpeed/Successes
http://wiki.python.org/moin/NeedForSpeed/Failures
http://wiki.python.org/moin/NeedForSpeed/Deferred



Raymond

From rganesan at myrealbox.com  Sat May 27 07:13:07 2006
From: rganesan at myrealbox.com (Ganesan Rajagopal)
Date: Sat, 27 May 2006 10:43:07 +0530
Subject: [Python-Dev] epoll implementation
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
	<e57dsb$vjd$1@sea.gmane.org> <4477A9E6.2020108@canterbury.ac.nz>
	<e589te$nbd$1@sea.gmane.org>
Message-ID: <xkvhhd3c3va4.fsf@grajagop-lnx.cisco.com>

>>>>> Steve Holden <steve at holdenweb.com> writes:

>> Rather than adding yet another platform-dependent module,
>> I'd like to see a unified Python interface in the stdlib
>> that uses whichever is the best one available.
>> 
> Of course that would mean establishing which *was* the best available 
> which, as we've seen this week, may not be easy.

There is a C async event notification library called libevent which wraps
kqueue, epoll, /dev/poll etc and provides a unified interface (See 
http://www.monkey.org/~provos/libevent/).

Ganesan


-- 
Ganesan Rajagopal


From martin at v.loewis.de  Sat May 27 08:36:12 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 27 May 2006 08:36:12 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060526232301.GM18740@snurgle.org>
References: <20060526183133.GG18740@snurgle.org>
	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
	<20060526202225.GI18740@snurgle.org> <44777976.9010904@v.loewis.de>
	<20060526232301.GM18740@snurgle.org>
Message-ID: <4477F35C.6090601@v.loewis.de>

Ross Cohen wrote:
> epoll also allows 64 bits of data to be tucked away and returned when events
> happen. Could be useful for saving a dict lookup for every event. However,
> there are some refcounting issues. Dict lookup per event could be traded
> for one on deregistration. All it needs is a small forward-compatible
> extension to the current select.poll API.

I don't understand. You only need the dictionary on registration, to
find out whether this is a new registration or a modification of an
existing one. How could you save the dict lookup at registration time?
And what would you store in the additional data?

Regards,
Martin

From martin at v.loewis.de  Sat May 27 08:49:50 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 27 May 2006 08:49:50 +0200
Subject: [Python-Dev] warnings about missing __init__.py in
	toplevel	directories
In-Reply-To: <4E174EB4-CB3D-4E36-9532-9BBABF090D6C@mac.com>
References: <4E174EB4-CB3D-4E36-9532-9BBABF090D6C@mac.com>
Message-ID: <4477F68E.1000403@v.loewis.de>

Ronald Oussoren wrote:
> Some time ago a warning was introduced for directories on sys.path  
> that don't contain an __init__.py but have the same name as a package/ 
> module that is being imported.
> 
> Is it intentional that this triggers for toplevel imports? These  
> warnings are triggered in the build process for PyObjC, which is in  
> itself pretty harmless but very annoying.

They were very close to not being harmless earlier this year: Python
was almost changed to actually treat the directory as a package
even if it does not contain an __init__.py. In that case, the directory
itself would have been imported, not the thing inside. The warning
is (also) a hint that you should do some renaming - future versions
of Python might drop the need for __init__.

Regards,
Martin

From martin at v.loewis.de  Sat May 27 08:51:09 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 27 May 2006 08:51:09 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <4477A9E6.2020108@canterbury.ac.nz>
References: <20060526163415.GD18740@snurgle.org>	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>	<e57dsb$vjd$1@sea.gmane.org>
	<4477A9E6.2020108@canterbury.ac.nz>
Message-ID: <4477F6DD.7040403@v.loewis.de>

Greg Ewing wrote:
> There are many different select-like things around now
> (select, poll, epoll, kqueue -- are there others?) and
> random combinations of them seem to be available on any
> given platform. This makes writing platform-independent
> code that needs select-like functionality very awkward.
> 
> Rather than adding yet another platform-dependent module,
> I'd like to see a unified Python interface in the stdlib
> that uses whichever is the best one available.

For epoll, that interface seems to be select.poll.

Regards,
Martin

From martin at v.loewis.de  Sat May 27 08:56:51 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 27 May 2006 08:56:51 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>
References: <20060526163415.GD18740@snurgle.org>	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>	<e57dsb$vjd$1@sea.gmane.org>	<4477A9E6.2020108@canterbury.ac.nz>
	<e589te$nbd$1@sea.gmane.org>
	<D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>
Message-ID: <4477F833.7050606@v.loewis.de>

Alex Martelli wrote:
>> Of course that would mean establishing which *was* the best available
>> which, as we've seen this week, may not be easy.
> 
> I believe it's: kqueue on FreeBSD ...

Such a statement assumes they are semantically equivalent. However, I
believe they aren't. A specific usage pattern can be expressed with
any of them, and for such a usage pattern you can even come up with a
ranking.

For example, to check whether a single specific file descriptor is
ready, I would assume that poll is best (even better than epoll and
kqueue). To check whether file descriptor 0 is ready, I would assume
that select is even better.
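
To make the single-descriptor usage pattern concrete, here is a small sketch that checks one socket for readability, using poll where the platform has it and select otherwise. The helper is_readable is hypothetical, and the sketch makes no claim about which call is actually faster.

```python
import select
import socket

def is_readable(sock, timeout=0.0):
    """Report whether one descriptor has data waiting (non-blocking if timeout is 0)."""
    if hasattr(select, "poll"):
        poller = select.poll()
        poller.register(sock.fileno(), select.POLLIN)
        return bool(poller.poll(int(timeout * 1000)))   # poll() wants milliseconds
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)

a, b = socket.socketpair()
before = is_readable(a)               # nothing has been sent yet
b.send(b"x")
after = is_readable(a, timeout=1.0)   # one byte is now waiting
print(before, after)
a.close()
b.close()
```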

Regards,
Martin

From rcohen at snurgle.org  Sat May 27 09:00:14 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Sat, 27 May 2006 03:00:14 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <4477F35C.6090601@v.loewis.de>
References: <20060526183133.GG18740@snurgle.org>
	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>
	<20060526202225.GI18740@snurgle.org> <44777976.9010904@v.loewis.de>
	<20060526232301.GM18740@snurgle.org> <4477F35C.6090601@v.loewis.de>
Message-ID: <20060527070014.GR18740@snurgle.org>

On Sat, May 27, 2006 at 08:36:12AM +0200, "Martin v. Löwis" wrote:
> Ross Cohen wrote:
> > epoll also allows 64 bits of data to be tucked away and returned when events
> > happen. Could be useful for saving a dict lookup for every event. However,
> > there are some refcounting issues. Dict lookup per event could be traded
> > for one on deregistration. All it needs is a small forward-compatible
> > extension to the current select.poll API.
> 
> I don't understand. You only need the dictionary on registration, to
> find out whether this is a new registration or a modification of an
> existing one. How could you save the dict lookup at registration time?
> And what would you store in the additional data?

The first thing any user of the poll interface does with the file descriptor
is map it to some state object. That's where the lookup can be saved: the
object can just be handed back directly. The problem is that when the fd is
unregistered, we won't get back the PyObject pointer and so can't decrement
the refcount; it has to be stored and looked up manually.
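
The two styles can be contrasted with a short sketch. It uses the data argument of selectors.register() from later Python versions purely to show the interface shape. That module keeps its own fd-to-key table internally, so it illustrates "hand the object back" rather than the epoll data-field trick itself. ConnState is a made-up class.

```python
import selectors
import socket

class ConnState:
    """Hypothetical per-connection state object."""
    def __init__(self, label):
        self.label = label

a, b = socket.socketpair()
state = ConnState("peer-a")

# Style 1: the caller keeps its own fd -> state table and looks it up per event.
states_by_fd = {a.fileno(): state}

# Style 2: the object rides along with the registration and is handed back.
sel = selectors.DefaultSelector()
sel.register(a, selectors.EVENT_READ, data=state)

b.send(b"ping")
ready = sel.select(timeout=1.0)
handed_back = [key.data for key, _events in ready]
looked_up = [states_by_fd[key.fd] for key, _events in ready]
print(handed_back[0].label)

# unregister() returns the key, so the attached object is recoverable on removal.
removed = sel.unregister(a)
sel.close()
a.close()
b.close()
```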

Ross

From rcohen at snurgle.org  Sat May 27 09:46:49 2006
From: rcohen at snurgle.org (Ross Cohen)
Date: Sat, 27 May 2006 03:46:49 -0400
Subject: [Python-Dev] epoll implementation
In-Reply-To: <e589te$nbd$1@sea.gmane.org>
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
	<e57dsb$vjd$1@sea.gmane.org> <4477A9E6.2020108@canterbury.ac.nz>
	<e589te$nbd$1@sea.gmane.org>
Message-ID: <20060527074649.GS18740@snurgle.org>

On Sat, May 27, 2006 at 02:27:20AM +0100, Steve Holden wrote:
> Greg Ewing wrote:
> > Rather than adding yet another platform-dependent module,
> > I'd like to see a unified Python interface in the stdlib
> > that uses whichever is the best one available.
> > 
> Of course that would mean establishing which *was* the best available 
> which, as we've seen this week, may not be easy.

Lots of systems (Windows? what's that?) implement both select and poll. I
have yet to see a system which implements more than 1 of epoll, kqueue,
/dev/poll, I/O completion ports, etc.

You'd have to search long and hard for a case where the non-select/poll
interfaces didn't perform better. Correctness/completeness is, of course,
a stronger criterion than performance, so if kqueue on OS X doesn't support
all relevant types of file descriptors, then it shouldn't be used.

Fortunately, the python select.poll interface is a good match for all of
epoll, kqueue and /dev/poll. kqueue supports watching a lot more than file
descriptors, but that doesn't matter much for this discussion. kqueue also
mashes register/unregister/poll into a single call, which contributes to
how confusing the interface is, though from what I can tell it really is
basically the same as epoll.

As for Windows, the WSASelect interface is truly awful. The python library
currently sets it so you can pass in up to 512 descriptors, but don't count
on getting even that many. There is a hard limit on the number of sockets
a *user* can have open (varies based on OS version) with WSASelect and it
will just start returning WSAENOBUFS once you go over. You then have the
task of guessing how many you need to close before it will start working
again. Firewalls especially can eat into the pool.

IOCP has neither the WSAENOBUFS problem nor the 512 descriptor limit. It
would be nice (if you or your customers run Windows) if someone did a
select.poll interface with IOCP as a backend.

Ross

From greg.ewing at canterbury.ac.nz  Sat May 27 09:53:56 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 27 May 2006 19:53:56 +1200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>
References: <20060526163415.GD18740@snurgle.org>
	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>
	<e57dsb$vjd$1@sea.gmane.org> <4477A9E6.2020108@canterbury.ac.nz>
	<e589te$nbd$1@sea.gmane.org>
	<D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>
Message-ID: <44780594.2040507@canterbury.ac.nz>

Alex Martelli wrote:
> On May 26, 2006, at 6:27 PM, Steve Holden wrote:

>>Of course that would mean establishing which *was* the best available
>>which, as we've seen this week, may not be easy.
> 
> I believe it's: kqueue on FreeBSD (for recent-enough versions  
> thereof), otherwise epoll where available and nonbuggy, otherwise  
> poll ditto, otherwise select

It would be an improvement if it would just pick *some*
implementation that worked, even if it weren't strictly
the best.

--
Greg

From martin at v.loewis.de  Sat May 27 10:00:49 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 27 May 2006 10:00:49 +0200
Subject: [Python-Dev] epoll implementation
In-Reply-To: <20060527070014.GR18740@snurgle.org>
References: <20060526183133.GG18740@snurgle.org>	<20060526184944.28682.581801558.divmod.quotient.5772@ohm>	<20060526202225.GI18740@snurgle.org>
	<44777976.9010904@v.loewis.de>	<20060526232301.GM18740@snurgle.org>
	<4477F35C.6090601@v.loewis.de> <20060527070014.GR18740@snurgle.org>
Message-ID: <44780731.2020306@v.loewis.de>

Ross Cohen wrote:
> The first thing any user of the poll interface does with the file descriptor
> is map it to some state object. That's where the lookup can be saved, the
> object can just be handed back directly. Problem being that when the fd is
> unregistered, we won't get back the PyObject pointer so can't decrement
> the refcount, has to be stored and looked up manually.

Ah. That would be an interface change; I would not do that. The lookup
is usually-constant and very fast. It is complexity that grows with the
number of file descriptors that these APIs worry about.

Regards,
Martin

From tim.peters at gmail.com  Sat May 27 10:31:32 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 27 May 2006 08:31:32 +0000
Subject: [Python-Dev] Need for Speed Sprint status
In-Reply-To: <001201c6814a$7e0222b0$dc00000a@RaymondLaptop1>
References: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
	<001201c6814a$7e0222b0$dc00000a@RaymondLaptop1>
Message-ID: <1f7befae0605270131u5a8f6748m176a4173bb81bab9@mail.gmail.com>

> http://wiki.python.org/moin/NeedForSpeed/Successes
> http://wiki.python.org/moin/NeedForSpeed/Failures
> http://wiki.python.org/moin/NeedForSpeed/Deferred

And

    http://wiki.python.org/moin/ListOfPerformanceRelatedPatches

All of these are linked to from the top page:

    http://wiki.python.org/moin/NeedForSpeed

We still have 8 hours of sprinting today.  I'm sure we'll run
before+after benchmarks (at least pystone + pybench) on a variety of
boxes, and record the results, before it's over.  Many speedups won't
be reflected in that (e.g., major struct-module speedups &
enhancements, various string->integer speedups, major psyco speedups,
...).

From martin at v.loewis.de  Sat May 27 10:44:36 2006
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 27 May 2006 10:44:36 +0200
Subject: [Python-Dev] [Python-checkins] This
In-Reply-To: <e57dle$v78$1@sea.gmane.org>
References: <e55adp$8m5$1@sea.gmane.org>	<44762E77.3080303@ewtllc.com><44763168.4030308@holdenweb.com>	<44763487.3080908@ewtllc.com>	<e04bdf310605260828q7e5cb00mf97ee0d95fc094e8@mail.gmail.com>
	<e57dle$v78$1@sea.gmane.org>
Message-ID: <44781174.1050102@v.loewis.de>

Terry Reedy wrote:
>> Just end user experience's two cents here
>> (btw, this line is correct  at English level?)
> 
> Since you asked...your question would be better written "is this line 
> correct English?"
> And the line before, while not formal English of the kind needed, say, for 
> Decimal docs, works well enough for me as an expression of an informal 
> conversational thought.

Wouldn't it still be conventional to have an article somewhere?
e.g. " Just /some/ end user's two cents here" (also, isn't "two cents"
and "experience" roughly the same thing, so that one is redundant?)

Regards,
Martin

From steve at holdenweb.com  Sat May 27 11:47:08 2006
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 27 May 2006 10:47:08 +0100
Subject: [Python-Dev] Need for Speed Sprint status
In-Reply-To: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
References: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
Message-ID: <4478201C.2070903@holdenweb.com>

Neal Norwitz wrote:
> First off, good work to everyone involved.  You did a tremendous job.
> I just hope to hell you're done, because I can't keep up! :-)
> 
Not quite done yet, but I will be encouraging the team to start wrapping 
up in time to draw a line under everything that *isn't* going to 
continue by 17:00 GMT today, when we all turn into pumpkins.

> It would help me enormously if someone could summarize the status and
> everything that went on.  These are the things that would help me the
> most.
> 
>  * What are the speed diffs before/after the sprint
>  * What was modified (summary)
>  * What is left to do
>     - doc
>     - tests
>     - code
>  * Which branches are still planning to remain active
>  * Lessons learned, how we can improve for the next time
>  * Suggestions for further areas to look into improving
> 
One of today's most important missions for me is to make sure that the 
results of the sprint are adequately documented, including pointers to 
further work that might yield speedups.

> It looks like there were a lot of additions to the string test suite,
> that's great.  I'm not sure if the other areas touched got similar
> boosts to their tests.  It would be good to upgrade all tests to
> verify corner cases of the implementation.  These tests should also be
> documented that they are to test the implementation rather than the
> language spec.  We don't want to write obscure tests that can't pass
> in other impls like Jython, IronPython, or PyPy.
> 
Yes, we could really do with implementation-specific tests for this stuff.

> I will turn my amd64 box back on tomorrow and will also run Python
> through valgrind and pychecker when I get a chance.  There are a
> couple of new Coverity complaints that need to be addressed.
> 
That'll be great. Again, let me say we've had terrific support from many 
other developers. I hope the whole community benefits from this sprint.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden


From aahz at pythoncraft.com  Sat May 27 12:59:59 2006
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 27 May 2006 03:59:59 -0700
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
In-Reply-To: <4475EC99.7050006@gmx.net>
References: <4475EC99.7050006@gmx.net>
Message-ID: <20060527105959.GA29292@panix.com>

On Thu, May 25, 2006, Torsten Marek wrote:
>
> Some open questions remain:
> - should iwindow return lists or tuples?
> - what happens if the length of the iterable is smaller than the window size,
> and no padding is specified? Is this an error? Should the generator return no
> value at all or one window that is too small?

You should probably try this idea out on comp.lang.python; if there is
approval, you'll probably need to write a PEP because of these issues.
Note that my guess is that the complexity of the issues relative to the
benefit means that BDFL will probably veto it.

Side note: if you do want to discuss this further on python-dev, please
subscribe.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes

From aahz at pythoncraft.com  Sat May 27 13:02:28 2006
From: aahz at pythoncraft.com (Aahz)
Date: Sat, 27 May 2006 04:02:28 -0700
Subject: [Python-Dev] A Horrible Inconsistency
In-Reply-To: <e578g6$bn7$1@sea.gmane.org>
References: <e555s1$q2j$1@sea.gmane.org>
	<e04bdf310605260837i4cba67d6u8a70c5228916b6ad@mail.gmail.com>
	<20060526154111.GL32617@tummy.com> <e578g6$bn7$1@sea.gmane.org>
Message-ID: <20060527110228.GB29292@panix.com>

On Fri, May 26, 2006, Fredrik Lundh wrote:
>
> and while we're at it, let's fix this:
> 
>      >>> 0.66 * (1, 2, 3)
>      (1, 2)
> 
> and maybe even this
> 
>      >>> 0.5 * (1, 2, 3)
>      (1, 1)
> 
> but I guess the latter one might need a pronunciation.

This should certainly get fixed in 3.0 thanks to __index__
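
For reference, a short sketch of how __index__ settles this in current Python 3: sequence repetition accepts only objects that provide __index__, so a float multiplier is rejected instead of being truncated. The Count class is made up for the illustration.

```python
import operator

class Count:
    """Hypothetical integer-like type that opts in through __index__."""
    def __init__(self, n):
        self.n = n

    def __index__(self):
        return self.n

data = (1, 2, 3)
repeated = Count(2) * data   # allowed: Count supplies __index__
as_int = operator.index(Count(2))

try:
    0.5 * data               # floats have no __index__
    float_error = None
except TypeError as exc:
    float_error = str(exc)

print(repeated, as_int, float_error)
```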
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes

From jjl at pobox.com  Sat May 27 13:03:09 2006
From: jjl at pobox.com (John J Lee)
Date: Sat, 27 May 2006 11:03:09 +0000 (UTC)
Subject: [Python-Dev] [Python-checkins] This
In-Reply-To: <44781174.1050102@v.loewis.de>
References: <e55adp$8m5$1@sea.gmane.org>
	<44762E77.3080303@ewtllc.com><44763168.4030308@holdenweb.com>
	<44763487.3080908@ewtllc.com>
	<e04bdf310605260828q7e5cb00mf97ee0d95fc094e8@mail.gmail.com>
	<e57dle$v78$1@sea.gmane.org> <44781174.1050102@v.loewis.de>
Message-ID: <Pine.LNX.4.64.0605271059560.8533@localhost>

On Sat, 27 May 2006, "Martin v. Löwis" wrote:
[...]
>>> Just end user experience's two cents here
>>> (btw, this line is correct  at English level?)
[...]
> Wouldn't it still be conventional to have an article somewhere?
> e.g. " Just /some/ end user's two cents here"

Yes, but "one" (or maybe "an") rather than "some".


> (also, isn't "two cents"
> and "experience" roughly the same thing, so that one is redundant?)

There's no such thing as a synonym of course, but it does sound funny, 
yes.


John

From bob at redivi.com  Sat May 27 14:06:58 2006
From: bob at redivi.com (Bob Ippolito)
Date: Sat, 27 May 2006 12:06:58 +0000
Subject: [Python-Dev] [Python-checkins] r46300 - in python/trunk:
	Lib/socket.py Lib/test/test_socket.py Lib/test/test_struct.py
	Modules/_struct.c Modules/arraymodule.c Modules/socketmodule.c
In-Reply-To: <ca471dc20605260956t35aafa71tcad19b926c0c4b41@mail.gmail.com>
References: <20060526120329.9A9671E400C@bag.python.org>
	<ca471dc20605260956t35aafa71tcad19b926c0c4b41@mail.gmail.com>
Message-ID: <CD67B679-50C4-4398-81BF-40F3ED6B106B@redivi.com>

On May 26, 2006, at 4:56 PM, Guido van Rossum wrote:

> On 5/26/06, martin.blais <python-checkins at python.org> wrote:
>> Log:
>> Support for buffer protocol for socket and struct.
>>
>> * Added socket.recv_buf() and socket.recvfrom_buf() methods, that use
>>   the buffer protocol (send and sendto already did).
>>
>> * Added struct.pack_to(), that is the corresponding buffer compatible
>>   method to unpack_from().
>
> Hm... The file object has a similar method readinto(). Perhaps the
> methods introduced here could follow that lead instead of using two
> different new naming conventions?

(speaking specifically about struct and not socket)

pack_to and unpack_from are named as such because they work with objects
that support the buffer API (not file-like-objects). I couldn't find any
existing convention for objects that manipulate buffers in such a way.
If there is an existing convention then I'd be happy to rename these.

"readinto" seems to imply that some kind of position is being  
incremented. Grammatically it only works if it's implemented on all buffer
objects, but in this case it's implemented on the Struct type.
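
For reference, a small sketch of the buffer-oriented pair under discussion.
It assumes the name struct.pack_into(), which is how current Python spells
the pack_to() operation; the format string, buffer size and offset are made
up for illustration.

```python
import struct

# A writable object supporting the buffer protocol (not a file-like object)
buf = bytearray(8)

# Pack two little-endian unsigned shorts into the buffer at byte offset 2
struct.pack_into("<HH", buf, 2, 1, 515)

# Read them back from the same offset; no position is tracked or advanced
values = struct.unpack_from("<HH", buf, 2)
print(values)
```

Calling unpack_from() a second time with the same offset returns the same
values, which is the difference from a readinto()-style method on a stream.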

-bob


From ronaldoussoren at mac.com  Sat May 27 15:24:11 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Sat, 27 May 2006 15:24:11 +0200
Subject: [Python-Dev] warnings about missing __init__.py in toplevel
	directories
In-Reply-To: <4477F68E.1000403@v.loewis.de>
References: <4E174EB4-CB3D-4E36-9532-9BBABF090D6C@mac.com>
	<4477F68E.1000403@v.loewis.de>
Message-ID: <3E82B8A8-FAC3-4A01-8982-E412603015C8@mac.com>


On 27-mei-2006, at 8:49, Martin v. Löwis wrote:

> Ronald Oussoren wrote:
>> Some time ago a warning was introduced for directories on sys.path
>> that don't contain an __init__.py but have the same name as a
>> package/module that is being imported.
>>
>> Is it intentional that this triggers for toplevel imports? These
>> warnings are triggered in the build process for PyObjC, which is in
>> itself pretty harmless but very annoying.
>
> They were very close to not being harmless earlier this year: Python
> was almost changed to actually treat the directory as a package
> even if it does not contain an __init__.py. In that case, the directory
> itself would have been imported, not the thing inside. The warning
> is (also) a hint that you should do some renaming - future versions
> of Python might drop the need for __init__.

I'd do the renaming even if the warning were to stay as a hint for users
who forgot to create the __init__ file; I'm trying to keep my code
warning-free. I asked just to make sure that the warning is intentional
for toplevel packages (which it is). I vaguely recall that there was some
discussion about enabling the warning only for subpackages.

Ronald


From fredrik at pythonware.com  Sat May 27 17:51:51 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 27 May 2006 17:51:51 +0200
Subject: [Python-Dev] getting rid of confusing "expected a character buffer
	object" messages
Message-ID: <e59sil$1l2$1@sea.gmane.org>

several string methods accept either strings or objects that support 
the buffer api, and end up raising an "expected a character buffer 
object" error if you pass in something else.  this isn't exactly helpful 
for non-experts -- the term "character buffer object" doesn't appear in 
any python tutorials I've seen.

     >>> "x".find(1)
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: expected a character buffer object


I think this should be fixed, but I cannot make up my mind on what's the best 
way to fix it:

     1. map all errors from PyObject_AsCharBuffer to "expected string
        or compatible object", or some such.

     2. modify PyObject_AsCharBuffer so it returns different negative
        error codes depending on whether the object doesn't support the
        protocol at all, or if something went wrong when using it.

     3. to avoid changing the PyObject_AsCharBuffer return value, add a
        PyObject_AsCharBufferEx that does the above

     4. do something entirely different

     5. fugetabotit

comments?

</F>


From rasky at develer.com  Sat May 27 18:06:44 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Sat, 27 May 2006 18:06:44 +0200
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
References: <4475EC99.7050006@gmx.net> <20060527105959.GA29292@panix.com>
Message-ID: <0a5d01c681a7$8d7f75a0$05432597@bagio>

Aahz <aahz at pythoncraft.com> wrote:

>> Some open questions remain:
>> - should iwindow return lists or tuples?
>> - what happens if the length of the iterable is smaller than the
>> window size, and no padding is specified? Is this an error? Should
>> the generator return no value at all or one window that is too small?
>
> You should probably try this idea out on comp.lang.python; if there is
> approval, you'll probably need to write a PEP because of these issues.
> Note that my guess is that the complexity of the issues relative to
> the benefit means that BDFL will probably veto it.


A PEP for adding a simple generator to a library of generators??

The function does look useful to me. It's useful even if it returns a tuple
instead of a list, or if it doesn't support padding. It's still better than
nothing, and can still be successfully used even if it requires some smallish
wrapper to achieve the exact functionality. I know I have been implementing
something similar very often. Since when do we need a full PEP process,
nitpicking the small details to death, just to add a simple function?

Giovanni Bajo


From raymond.hettinger at verizon.net  Sat May 27 18:48:08 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 May 2006 09:48:08 -0700
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
References: <4475EC99.7050006@gmx.net> <20060527105959.GA29292@panix.com>
	<0a5d01c681a7$8d7f75a0$05432597@bagio>
Message-ID: <005101c681ad$55e76070$dc00000a@RaymondLaptop1>

> Aahz <aahz at pythoncraft.com> wrote:
>>> Some open questions remain:
>>> - should iwindow return lists or tuples?
>>> - what happens if the length of the iterable is smaller than the
>>> window size, and no padding is specified? Is this an error? Should
>>> the generator return no value at all or one window that is too small?


An itertools windowing function was previously discussed and rejected.


Raymond

From tjreedy at udel.edu  Sat May 27 23:22:55 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 27 May 2006 17:22:55 -0400
Subject: [Python-Dev] getting rid of confusing "expected a character
	bufferobject" messages
References: <e59sil$1l2$1@sea.gmane.org>
Message-ID: <e5afvd$mi0$1@sea.gmane.org>


"Fredrik Lundh" <fredrik at pythonware.com> wrote in message 
news:e59sil$1l2$1 at sea.gmane.org...
> several string methods accept either strings or objects that support
> the buffer api, and end up raising an "expected a character buffer
> object" error if you pass in something else.  this isn't exactly helpful
> for non-experts -- the term "character buffer object" doesn't appear in
> any python tutorials I've seen.
>
>     >>> "x".find(1)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: expected a character buffer object
>
>
> I think this should be fixed, but I cannot make up my mind on what's the best

'expected a string or character buffer' would be an improvement.

tjr




From tjreedy at udel.edu  Sun May 28 00:03:24 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 27 May 2006 18:03:24 -0400
Subject: [Python-Dev] Need for Speed Sprint status
References: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
	<4478201C.2070903@holdenweb.com>
Message-ID: <e5aiba$s8h$1@sea.gmane.org>


"Steve Holden" <steve at holdenweb.com> wrote in message 
news:4478201C.2070903 at holdenweb.com...
>> It looks like there were a lot of additions to the string test suite,
>> that's great.  I'm not sure if the other areas touched got similar
>> boosts to their tests.  It would be good to upgrade all tests to
>> verify corner cases of the implementation.  These tests should also be
>> documented that they are to test the implementation rather than the
>> language spec.  We don't want to write obscure tests that can't pass
>> in other impls like Jython, IronPython, or PyPy.
>>
> Yes, we could really do with implementation-specific tests for this 
> stuff.

Just a reminder: in mid-April, there was a short thread on Python-3000 (to 
be continued here) entitled "Separating out CPython and core Python tests".

I suggested separate files in separate directories.  Brett C. suggested, I 
believe, decorators to mark tests within a file.  Guido suggested maybe 
both, as appropriate.  The plan was to defer marking or movement of 
existing tests until 2.6, but let's clearly comment any new 
implementation-specific tests as such.
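
A rough sketch of the decorator approach, written against today's unittest
rather than anything that existed at the time. The cpython_only name and the
test class are assumptions made for this example; it skips a marked test on
any interpreter that does not report itself as CPython.

```python
import platform
import sys
import unittest

def cpython_only(test):
    """Hypothetical marker: skip the test unless running on CPython."""
    is_cpython = platform.python_implementation() == "CPython"
    return unittest.skipUnless(is_cpython, "CPython implementation detail")(test)

class RefcountTests(unittest.TestCase):
    def test_language_level(self):
        # Guaranteed by the language spec on every implementation
        self.assertEqual(len("abc"), 3)

    @cpython_only
    def test_refcount_api(self):
        # Implementation detail: reference counts are a CPython notion
        self.assertTrue(hasattr(sys, "getrefcount"))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(RefcountTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

On another implementation the marked test shows up as skipped, not failed,
so the same file can serve both purposes.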

Terry Jan Reedy






From greg.ewing at canterbury.ac.nz  Sun May 28 02:26:34 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 28 May 2006 12:26:34 +1200
Subject: [Python-Dev] [Python-checkins] This
In-Reply-To: <44781174.1050102@v.loewis.de>
References: <e55adp$8m5$1@sea.gmane.org> <44762E77.3080303@ewtllc.com>
	<44763168.4030308@holdenweb.com> <44763487.3080908@ewtllc.com>
	<e04bdf310605260828q7e5cb00mf97ee0d95fc094e8@mail.gmail.com>
	<e57dle$v78$1@sea.gmane.org> <44781174.1050102@v.loewis.de>
Message-ID: <4478EE3A.8010407@canterbury.ac.nz>

Martin v. Löwis wrote:
> 
>>>Just end user experience's two cents here
>>>(btw, this line is correct  at English level?)
>
> Wouldn't it still be conventional to have an article somewhere?
> e.g. " Just /some/ end user's two cents here"

Or "Just two cents' worth of end-user experience here",
which is almost as concise as the original and much
easier to parse.

--
Greg

From ncoghlan at gmail.com  Sun May 28 03:57:51 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 28 May 2006 11:57:51 +1000
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
In-Reply-To: <005101c681ad$55e76070$dc00000a@RaymondLaptop1>
References: <4475EC99.7050006@gmx.net>
	<20060527105959.GA29292@panix.com>	<0a5d01c681a7$8d7f75a0$05432597@bagio>
	<005101c681ad$55e76070$dc00000a@RaymondLaptop1>
Message-ID: <4479039F.1060501@gmail.com>

Raymond Hettinger wrote:
>> Aahz <aahz at pythoncraft.com> wrote:
>>>> Some open questions remain:
>>>> - should iwindow return lists or tuples?
>>>> - what happens if the length of the iterable is smaller than the
>>>> window size, and no padding is specified? Is this an error? Should
>>>> the generator return no value at all or one window that is too small?
> 
> 
> An itertools windowing function was previously discussed and rejected.

A python-dev Google search for "itertools window" found me your original 
suggestion to include Jack Diederich's itertools.window in Python 2.3 (which 
was only deferred because 2.3 was already past beta 1 at that point).

I couldn't find any discussion of the idea after that (aside from your 
pointing out that you'd never found a real-life use case for the pairwise() 
recipe in the docs, which is a basic form of windowing).

One option would be to add a windowing function to the recipes in the 
itertools docs. Something like:


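from itertools import tee, islice, izip

def window(iterable, window_len=2, window_step=1):
     # One independent iterator per position in the window
     iterators = tee(iterable, window_len)
     for skip_steps, itr in enumerate(iterators):
         # Advance the n-th iterator by n items
         for ignored in islice(itr, skip_steps):
             pass
     window_itr = izip(*iterators)
     if window_step != 1:
         # islice() takes positional arguments only
         window_itr = islice(window_itr, 0, None, window_step)
     return window_itr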

As you can see, I'd leave the padding out of this function signature - if you 
want padding you can more easily specify it directly when calculating the 
iterable to be windowed. Do you want padding before and after the sequence, or 
just after? Pad with None or with zeroes? Better to expand on the "padnone" 
recipe:

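from itertools import chain, repeat

def pad(iterable, pad_value=None, times=None):
     """Pads an iterable with a repeating value

        The quantity of padding may be limited by specifying
        a specific number of occurrences.
     """
     if times is None:
         # repeat() rejects None as a count; omit it for endless padding
         return chain(iterable, repeat(pad_value))
     return chain(iterable, repeat(pad_value, times))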

def padwindow(iterable, window_len, pad_value=None):
     """Pads an iterable appropriately for the given window length

        Padding is added to both the front and back of the sequence
        such that the first and last windows will each contain one
        value from the actual sequence.
     """
     return chain(repeat(pad_value, window_len-1),
                  iterable,
                  repeat(pad_value, window_len-1))
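
A quick way to try these recipes on a current interpreter. This is only a
sketch, ported to Python 3 (zip in place of izip) with the step handed to
islice() positionally; the sample inputs are made up.

```python
from itertools import chain, islice, repeat, tee

def window(iterable, window_len=2, window_step=1):
    iterators = tee(iterable, window_len)
    for skip_steps, itr in enumerate(iterators):
        # Advance the n-th iterator by n items
        for ignored in islice(itr, skip_steps):
            pass
    window_itr = zip(*iterators)
    if window_step != 1:
        window_itr = islice(window_itr, 0, None, window_step)
    return window_itr

def padwindow(iterable, window_len, pad_value=None):
    # window_len - 1 pad values on each side of the real data
    return chain(repeat(pad_value, window_len - 1),
                 iterable,
                 repeat(pad_value, window_len - 1))

print(list(window("abcd", 3)))
print(list(window(range(6), 2, 2)))
print(list(window(padwindow([1, 2], 2), 2)))
```

Keeping the padding in a separate function means the caller decides where
and with what to pad before the data ever reaches window().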


Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From raymond.hettinger at verizon.net  Sun May 28 05:04:11 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sat, 27 May 2006 20:04:11 -0700
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
References: <4475EC99.7050006@gmx.net> <20060527105959.GA29292@panix.com>
	<0a5d01c681a7$8d7f75a0$05432597@bagio>
	<005101c681ad$55e76070$dc00000a@RaymondLaptop1>
	<4479039F.1060501@gmail.com>
Message-ID: <000c01c68203$65f3b120$dc00000a@RaymondLaptop1>

From: "Nick Coghlan" <ncoghlan at gmail.com>
> A python-dev Google search for "itertools window" found me your original 
> suggestion to include Jack Diederich's itertools.window in Python 2.3 (which 
> was only deferred because 2.3 was already past beta 1 at that point).
>
> I couldn't find any discussion of the idea after that (aside from your 
> pointing out that you'd never found a real-life use case for the pairwise() 
> recipe in the docs, which is a basic form of windowing).
>
> One option would be to add a windowing function to the recipes in the 
> itertools docs. Something like:
>
>
> def window(iterable, window_len=2, window_step=1):
>     iterators = tee(iterable, window_len)
>     for skip_steps, itr in enumerate(iterators):
>         for ignored in islice(itr, skip_steps):
>             pass
>     window_itr = izip(*iterators)
>     if window_step != 1:
>         window_itr = islice(window_itr, 0, None, window_step)
>     return window_itr

No thanks.  The resolution of this one was that windowing iterables is not a 
good idea.  It is the natural province of sequences, not iterables.  With 
sequences, it is a matter of using an index and offset.  With iterables, there 
is a great deal of data shifting.  Also note that some of the uses are subsumed 
by collections.deque().  In implementing a draft of itertools window, we found 
that the data shifting was unnatural and unavoidable (unless you output some 
sort of buffer instead of a real tuple).  Also, we looked at use cases and found 
that most had solutions that were dominated by some other approach.  The 
addition of a windowing tool would tend to steer people away from better 
solutions.  In short, after much deliberation and experimenting with a sample 
implementation, the idea was rejected.  Hopefully, it will stay dead and no one 
will start a crusade for it simply because it can be done and because it seems 
cute.

The thought process was documented in a series of newsgroup postings:
    http://groups.google.com/group/comp.lang.python/msg/026da8f9eec4becf

The SF history is less informative because most of the discussions were held by 
private email:
   http://www.python.org/sf/756253

If someone aspires to code some new itertools, I have approved two new ones, 
imerge() and izip_longest().  The pure python code for imerge() is at:
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/491285

The impetus behind izip_longest() was expressed in a newsgroup thread, 
http://groups.google.com/group/comp.lang.python/browse_thread/thread/f424f63bfdb77c4/38c31991133757f7 . 
The discussion elicited neither broad support, nor condemnation.  Also, the use 
cases were sparse.  Ultimately, I was convinced by encountering a couple of 
natural use cases.  Another thought is that while other solutions are usually 
available for any given use case, there is a natural tendency to reach for a 
zip-type tool whenever presented with lock-step iteration issues, even when the 
inputs are of uneven length.  Note that the signature for izip_longest() needs 
to include an optional pad value (defaulting to None) -- there are plenty of use 
cases where an empty string, zero, or a null object would be a preferred pad 
value.
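
A sketch of what such a signature looks like in use. It assumes the spelling
found in today's standard library, itertools.zip_longest() with a fillvalue
keyword, and uses heapq.merge() for the sorted-merge role described for
imerge(); the sample data is invented.

```python
from heapq import merge
from itertools import zip_longest

names = ["alice", "bob", "carol"]
scores = [90, 85]

# The default pad value is None
default_pad = list(zip_longest(names, scores))

# An explicit pad value, e.g. zero for missing numbers
zero_pad = list(zip_longest(names, scores, fillvalue=0))

# Merging already-sorted inputs lazily into one sorted stream
merged = list(merge([1, 4, 9], [2, 3, 10]))

print(default_pad)
print(zero_pad)
print(merged)
```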

Cheers,



Raymond 


From g.brandl at gmx.net  Sun May 28 09:02:15 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 28 May 2006 07:02:15 +0000
Subject: [Python-Dev] Remove METH_OLDARGS?
Message-ID: <e5bhs7$509$1@sea.gmane.org>

In the process of reviewing and possibly extending getargs.c, I stumbled
over the "compatibility" flag supposedly used for METH_OLDARGS functions.
No code in the core uses this calling convention any more, and it has been
deprecated for quite a long time (since 2.2), so would it be appropriate to end
support for it in 2.5?

[G]


From nnorwitz at gmail.com  Sun May 28 09:14:41 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 28 May 2006 00:14:41 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5bhs7$509$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>
Message-ID: <ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>

On 5/28/06, Georg Brandl <g.brandl at gmx.net> wrote:
> In the process of reviewing and possibly extending getargs.c, I stumbled
> over the "compatibility" flag supposedly used for METH_OLDARGS functions.
> No code in the core uses this calling convention any more, and it has been
> deprecated for quite a long time (since 2.2), so would it be appropriate to end
> support for it in 2.5?

There's still a ton used under Modules.  Also, if no flag is
specified, it will default to 0 (ie, METH_OLDARGS).  I wonder how many
third party modules use METH_OLDARGS directly or more likely
indirectly.

The most important modules to fix from grep (and memory):

Modules/_sre.c:    /* FIXME: use METH_OLDARGS instead of 0 or fix to use METH_VARARGS */
Modules/_sre.c:    /*        METH_OLDARGS is not in Python 1.5.2 */
Modules/_tkinter.c:     {"call",               Tkapp_Call, METH_OLDARGS},
Modules/_tkinter.c:     {"globalcall",         Tkapp_GlobalCall, METH_OLDARGS},
Modules/_tkinter.c:     {"merge",              Tkapp_Merge, METH_OLDARGS},

$ grep -c METH_OLDARGS */*.c | egrep -v :0
Modules/_sre.c:2
Modules/_tkinter.c:3
Modules/audioop.c:24
Modules/clmodule.c:19
Modules/flmodule.c:103
Modules/fmmodule.c:6
Modules/glmodule.c:430
Modules/svmodule.c:34
Objects/methodobject.c:2

I would like to get rid of the flag, but I'm not sure we can do it
safely until 3.0.

n

From python at rcn.com  Sun May 28 09:17:00 2006
From: python at rcn.com (Raymond Hettinger)
Date: Sun, 28 May 2006 00:17:00 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
References: <e5bhs7$509$1@sea.gmane.org>
Message-ID: <005401c68226$b740c040$dc00000a@RaymondLaptop1>

> In the process of reviewing and possibly extending getargs.c, I stumbled
> over the "compatibility" flag supposedly used for METH_OLDARGS functions.
> No code in the core uses this calling convention any more, and it has been
> deprecated for quite a long time (since 2.2), so would it be appropriate to
> end support for it in 2.5?

FWIW, I wouldn't miss it, nor would any of the code I've ever seen.


Raymond 

From ncoghlan at gmail.com  Sun May 28 12:40:22 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 28 May 2006 20:40:22 +1000
Subject: [Python-Dev] Proposal for a new itertools function: iwindow
In-Reply-To: <000c01c68203$65f3b120$dc00000a@RaymondLaptop1>
References: <4475EC99.7050006@gmx.net>
	<20060527105959.GA29292@panix.com>	<0a5d01c681a7$8d7f75a0$05432597@bagio>
	<005101c681ad$55e76070$dc00000a@RaymondLaptop1>
	<4479039F.1060501@gmail.com>
	<000c01c68203$65f3b120$dc00000a@RaymondLaptop1>
Message-ID: <44797E16.70302@gmail.com>

Raymond Hettinger wrote:
> No thanks.  The resolution of this one was that windowing iterables is 
> not a good idea.  It is the natural province of sequences, not 
> iterables.  With sequences, it is a matter of using an index and 
> offset.  With iterables, there is a great deal of data shifting.  Also 
> note that some of the uses are subsumed by collections.deque().

Interesting - the old 'hammer-nail' tunnel vision strikes again, I guess.

So moving the question to a different part of the docs, would it make sense to 
include a deque recipe showing how to use a deque in a generator to permit 
windowing of an arbitrary iterator? Something like:

def window(iterable, window_len=2, window_step=1):
      itr = iter(iterable)
      step_range = xrange(window_step)
      current_window = collections.deque(islice(itr, window_len))
      while window_len == len(current_window):
          yield current_window
          for idx in step_range:
              current_window.popleft()
          current_window.extend(islice(itr, window_step))

Even if an application doesn't use the generator approach directly, it still 
illustrates how to use a deque for windowing instead of as a FIFO queue or a 
stack.
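
A variant of the same idea for a current interpreter, offered as a sketch
and not as a proposed doc recipe. It differs from the version above in two
assumed design choices: each window is yielded as a tuple snapshot instead
of the live deque, and the deque's maxlen does the discarding, so a step
larger than the window length is handled too.

```python
from collections import deque
from itertools import islice

def window(iterable, window_len=2, window_step=1):
    """Yield tuples of window_len items; assumes window_step >= 1."""
    itr = iter(iterable)
    # maxlen makes the deque drop old items from the left automatically
    current = deque(islice(itr, window_len), maxlen=window_len)
    while len(current) == window_len:
        yield tuple(current)          # snapshot, safe to keep around
        new_items = list(islice(itr, window_step))
        if len(new_items) < window_step:
            return                    # input exhausted before a full step
        current.extend(new_items)

print(list(window(range(5), 3)))
print(list(window(range(7), 2, 3)))
```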

> The thought process was documented in a series of newsgroup postings:
>    http://groups.google.com/group/comp.lang.python/msg/026da8f9eec4becf

Thanks for that reference - I was searching the wrong list, so I didn't find it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From tim.peters at gmail.com  Sun May 28 13:23:23 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 28 May 2006 11:23:23 +0000
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	refleak (101)
In-Reply-To: <20060528090534.GA5026@python.psfb.org>
References: <20060528090534.GA5026@python.psfb.org>
Message-ID: <1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>

[... a huge number of reference leaks reported ...]

FYI, I "reduced" the relatively simple test_bisect's leaks to this
self-contained program:

libreftest = """
    No actual doctests here.
"""

import doctest
import gc

def main():
    from sys import gettotalrefcount as trc
    for i in range(10):
        doctest.testmod()
        print trc()
        doctest.master = None
        gc.collect()

if __name__ == "__main__":
    main()

Running that here in a debug build:

C:\Code\python\PCbuild>python_d blah.py
54867
54873
54879
54885
54891
54897
54903
54909
54915
54921

So it leaks 6 references per iteration, and merely invoking
doctest.testmod() is all it takes to provoke it.  Comment out the:

         test_support.run_doctest(test_bisect, verbose)

line in test_bisect.py, and test_bisect stops leaking too (which isn't
a trivial stmt:  its doctests are a small part of test_bisect --
_most_ of it isn't leaking).

What happens next isn't obvious to me -- nobody has touched doctest.py
in over 2 weeks, so that's not the _source_ of the problem.  Alas, I
won't have computer access for the next 13 hours or so, and have to
leave this here for now.

FWIW, the biggest change that went in since that last "normal" refleak
run is the new exception reworking, so that has to be a top suspect.

From thomas at python.org  Sun May 28 13:31:15 2006
From: thomas at python.org (Thomas Wouters)
Date: Sun, 28 May 2006 13:31:15 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
Message-ID: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>

I'm seeing a dubious failure of test_gzip and test_tarfile on my AMD64
machine. It's triggered by the recent struct changes, but I'd say it's
probably caused by a bug/misfeature in zlibmodule: zlib.crc32 is the result
of a zlib 'crc32' function call, which returns an unsigned long. zlib.crc32
turns that unsigned long into a (signed) Python int, which means a number
beyond 1<<31 goes negative on 32-bit systems and other systems with 32-bit
longs, but stays positive on systems with 64-bit longs:

(32-bit)
>>> zlib.crc32("foobabazr")
-271938108

(64-bit)
>>> zlib.crc32("foobabazr")
4023029188

The old structmodule coped with that:
>>> struct.pack("<l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("<l", 4023029188)
'\xc4\x8d\xca\xef'

The new one does not:
>>> struct.pack("<l", -271938108)
'\xc4\x8d\xca\xef'
>>> struct.pack("<l", 4023029188)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Lib/struct.py", line 63, in pack
    return o.pack(*args)
struct.error: 'l' format requires -2147483647 <= number <= 2147483647

The structmodule should be fixed (and a test added ;) but I'm also wondering
if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my suggested fix
would be to change the PyInt_FromLong() call to PyLong_FromUnsignedLong(),
making zlib always return positive numbers -- it might break some code on
32-bit platforms, but that code is already broken on 64-bit platforms. But I
guess I'm okay with the long being changed into an actual 32-bit signed
number on 64-bit platforms, too.
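
The two candidate fixes amount to picking one 32-bit interpretation and
sticking to it. A small sketch of the arithmetic, using the two values
quoted above; the helper names to_unsigned32() and to_signed32() are made up
for this example.

```python
import struct
import zlib

def to_unsigned32(value):
    """Map any Python int onto the unsigned 32-bit range."""
    return value & 0xFFFFFFFF

def to_signed32(value):
    """Map any Python int onto the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

signed, unsigned = -271938108, 4023029188   # the two values quoted above
print(to_unsigned32(signed) == unsigned)
print(to_signed32(unsigned) == signed)

# Packing succeeds once the value matches the format's signedness
print(struct.pack("<l", to_signed32(unsigned)) == struct.pack("<L", unsigned))

# Normalizing a checksum makes the result independent of the platform's long
crc = to_unsigned32(zlib.crc32(b"foobabazr"))
print(struct.pack("<L", crc))
```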

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From thomas at python.org  Sun May 28 13:35:02 2006
From: thomas at python.org (Thomas Wouters)
Date: Sun, 28 May 2006 13:35:02 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	refleak (101)
In-Reply-To: <1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
References: <20060528090534.GA5026@python.psfb.org>
	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
Message-ID: <9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>

On 5/28/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> [... a huge number of reference leaks reported ...]
>
> FYI, I "reduced" the relatively simple test_bisect's leaks to this
> self-contained program:


Funny, I reduced it to more or less the same thing, except the other way
'round: I suspected exceptions to be the source of the leak, so I kept
adding more complicated exception processing to my leak-test until it
started leaking; it started leaking the moment I added a doctest to it, but
not before. I somewhat doubt it's directly related to exceptions, though.

Michael Hudson also mentioned it on #nfs, and it seems he went a little
further:

00:54 <mwh> i found that test.test_support.check_syntax reliably leaks 5
            references
00:55 <mwh> as does compile('1=1', '', 'exec') in fact
00:58 <mwh> it's leaking a tuple containing two Nones a string and an int, i
            think
01:00 <mwh> oh well, i guess sean and richard know what they changed...

Does 'a tuple containing two Nones, a string and an int' ring a bell to
anyone? :)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From thomas at python.org  Sun May 28 14:18:23 2006
From: thomas at python.org (Thomas Wouters)
Date: Sun, 28 May 2006 14:18:23 +0200
Subject: [Python-Dev] Iterating generator from C (PostgreSQL's pl/python
	RETUN SETOF/RECORD iterator support broken on RedHat buggy libs)
In-Reply-To: <1148144037.3833.89.camel@localhost.localdomain>
References: <1148144037.3833.89.camel@localhost.localdomain>
Message-ID: <9e804ac0605280518i7c1c543evf44a372c0f20568e@mail.gmail.com>

On 5/20/06, Hannu Krosing <hannu at tm.ee> wrote:
>
> I try to move this to -dev as I hope there are more people reading it who
> are competent in internal working :). So please reply to -dev only.


I'm not sure if you have found the problem on another mailinglist then, but
I saw no answers on python-dev.

-------------
>
> The question is about use of generators in embedded v2.4 with asserts
> enabled.
>
> Can somebody explain, why the code in try2.c works with wrappers 2 and 3
> but crashes on buggy exception for all others, that is pure generator
> and wrappers 1,4,5 ?


Your example code does not crash for me, not for any of the 'wrapper_source'
variants, linking against Python 2.4 (debian's python 2.4.3 on amd64, both
in 64-bit and 32-bit mode) or Python 2.5 (current trunk, same hardware.) I
don't know what kind of crash you were expecting, but I do see unchecked
return values, which can cause crashes for the simplest of reasons ;-)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From mwh at python.net  Sun May 28 16:00:12 2006
From: mwh at python.net (Michael Hudson)
Date: Sun, 28 May 2006 15:00:12 +0100
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
 refleak (101)
In-Reply-To: <9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
	(Thomas Wouters's message of "Sun, 28 May 2006 13:35:02 +0200")
References: <20060528090534.GA5026@python.psfb.org>
	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
Message-ID: <2my7wmz1ub.fsf@starship.python.net>

"Thomas Wouters" <thomas at python.org> writes:

> Does 'a tuple containing two Nones, a string and an int' ring a bell to
> anyone? :)

I found this one on the train (look at SyntaxError_init, it's
obvious).  I also found a number of other bugs in the new exceptions.c
code, from leaks:

>>> def f():
...     try: UnicodeEncodeError()
...     except TypeError: pass
... 
>>> for i in range(10):
...     f()
...     print sys.gettotalrefcount()
... 
30816
30821
30826
30831
30836
30841
30846
30851
30856
30861

to potential crashes:

>>> s = SystemExit(); s.code = []; s.__init__(); s.code
[[...]]
Bus error

So I think I'll be reading through exceptions.c pretty carefully.  I
don't think Sean and Richard have acquired as much paranoid
anal-mindedness as I have when hacking on Python C internals yet :)

Cheers,
mwh

-- 
  > Why are we talking about bricks and concrete in a lisp newsgroup?
  After long experiment it was found preferable to talking about why
  Lisp is slower than C++...
                        -- Duane Rettig & Tim Bradshaw, comp.lang.lisp

From andychambers2002 at yahoo.co.uk  Sun May 28 17:27:31 2006
From: andychambers2002 at yahoo.co.uk (Andrew Chambers)
Date: Sun, 28 May 2006 16:27:31 +0100
Subject: [Python-Dev] PEP 42 - New kind of Temporary file
Message-ID: <20060528152731.GB20739@yapgi.hopto.org>

The mailing-list bot said introduce myself so...

I'm Andy Chambers.  I've been lurking on the web-interface to the list
for a while.  I recently started trying to implement one of the items on
PEP 42 so I thought I should join the list and make myself known.

The item I'm working on is the new kind of temporary file that only
turns into a file when required (before which time presumably it's a
StringIO).

Do we still want this?  I suggest it be exposed as a class TempFile in
the tempfile module.  If yes, I'll write the test cases and submit a
patch.
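
For what it's worth, a sketch of how such an object behaves. It assumes the
interface of tempfile.SpooledTemporaryFile as found in current Python, which
is not the TempFile name suggested above; the 16-byte threshold is arbitrary.

```python
import tempfile

# Data stays in memory until more than max_size bytes have been written
with tempfile.SpooledTemporaryFile(max_size=16) as tmp:
    tmp.write(b"small")                          # still an in-memory buffer
    tmp.write(b" but growing past the limit")    # now spills to a real file
    tmp.seek(0)
    data = tmp.read()

print(data)
```

The caller reads and writes the same way before and after the switch, which
is the property the test cases would need to pin down.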

Regards,
Andy



From g.brandl at gmx.net  Sun May 28 18:29:27 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 28 May 2006 16:29:27 +0000
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	refleak (101)
In-Reply-To: <2my7wmz1ub.fsf@starship.python.net>
References: <20060528090534.GA5026@python.psfb.org>	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
	<2my7wmz1ub.fsf@starship.python.net>
Message-ID: <e5cj3n$2hl$1@sea.gmane.org>

Michael Hudson wrote:

> So I think I'll be reading through exceptions.c pretty carefully.  I
> don't think Sean and Richard have acquired as much paranoid
> anal-mindedness and I have when hacking on Python C internals yet :)

I intended to go through the code again today or tomorrow, looking for
things we missed, but I was on the plane until now, so obviously you
were faster.

Thanks for doing our work, though ;)

Georg


From g.brandl at gmx.net  Sun May 28 18:36:23 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 28 May 2006 16:36:23 +0000
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
Message-ID: <e5cjgn$3pg$1@sea.gmane.org>

Neal Norwitz wrote:
> On 5/28/06, Georg Brandl <g.brandl at gmx.net> wrote:
>> In the process of reviewing and possibly extending getargs.c, I stumbled
>> over the "compatibility" flag supposedly used for METH_OLDARGS functions.
>> No code in the core uses this calling convention any more, and it has been
>> deprecated for quite a long time (since 2.2), so would it be appropriate to end
>> support for it in 2.5?
> 
> There's still a ton used under Modules.  Also, if no flag is
> specified, it will default to 0 (ie, METH_OLDARGS).  I wonder how many
> third party modules use METH_OLDARGS directly or more likely
> indirectly.

These modules can be converted.

> I would like to get rid of the flag, but I'm not sure we can do it
> safely until 3.0.

It has been deprecated since 2.2, so it'd be no surprise if it went away. We'd
still have to support PyArg_Parse since it's a public API function and not
deprecated, but it could just convert the arguments to new style and call
PyArg_ParseTuple.

Also, it would be easy to detect METH_OLDARGS in PyCFunction_New and raise
an appropriate exception.

It's not my decision though.

Georg


From mwh at python.net  Sun May 28 18:42:14 2006
From: mwh at python.net (Michael Hudson)
Date: Sun, 28 May 2006 17:42:14 +0100
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
 refleak (101)
In-Reply-To: <e5cj3n$2hl$1@sea.gmane.org> (Georg Brandl's message of "Sun,
	28 May 2006 16:29:27 +0000")
References: <20060528090534.GA5026@python.psfb.org>
	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
	<2my7wmz1ub.fsf@starship.python.net> <e5cj3n$2hl$1@sea.gmane.org>
Message-ID: <2mu07ayuc9.fsf@starship.python.net>

Georg Brandl <g.brandl at gmx.net> writes:

> Michael Hudson wrote:
>
>> So I think I'll be reading through exceptions.c pretty carefully.  I
>> don't think Sean and Richard have acquired as much paranoid
>> anal-mindedness and I have when hacking on Python C internals yet :)
>
> I intended to go through the code again today or tomorrow, looking for
> things we missed, but I was on the plane until now, so obviously you
> were faster.

Oh right.  I'd have preferred you didn't check in unfinished
code to the trunk, I think...

> Thanks for doing our work, though ;)

It has to be said, the degree of the refleaks put me into emergency
fire fighting mode...

Cheers,
mwh

-- 
  You have run into the classic Dmachine problem: your machine has
  become occupied by a malevolent spirit.  Replacing hardware or
  software will not fix this - you need an exorcist. 
                                       -- Tim Bradshaw, comp.lang.lisp

From g.brandl at gmx.net  Sun May 28 19:00:56 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 28 May 2006 17:00:56 +0000
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	refleak (101)
In-Reply-To: <2mu07ayuc9.fsf@starship.python.net>
References: <20060528090534.GA5026@python.psfb.org>	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>	<2my7wmz1ub.fsf@starship.python.net>
	<e5cj3n$2hl$1@sea.gmane.org> <2mu07ayuc9.fsf@starship.python.net>
Message-ID: <e5ckun$7lr$1@sea.gmane.org>

Michael Hudson wrote:
> Georg Brandl <g.brandl at gmx.net> writes:
> 
>> Michael Hudson wrote:
>>
>>> So I think I'll be reading through exceptions.c pretty carefully.  I
>>> don't think Sean and Richard have acquired as much paranoid
>>> anal-mindedness and I have when hacking on Python C internals yet :)
>>
>> I intended to go through the code again today or tomorrow, looking for
>> things we missed, but I was on the plane until now, so obviously you
>> were faster.
> 
> Oh right.  I'd have preferred you didn't check in unfinished
> code to the trunk, I think...

I blame it on Tim since he said OK ;)

>> Thanks for doing our work, though ;)
> 
> It has to be said, the degree of the refleaks put me into emergency
> fire fighting mode...

I hadn't expected so many of them, but I also have to admit that we didn't have
that much time on Saturday and wanted the branch merged for obvious reasons...

[G]


From mwh at python.net  Sun May 28 19:44:40 2006
From: mwh at python.net (Michael Hudson)
Date: Sun, 28 May 2006 18:44:40 +0100
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
 refleak (101)
In-Reply-To: <e5ckun$7lr$1@sea.gmane.org> (Georg Brandl's message of "Sun,
	28 May 2006 17:00:56 +0000")
References: <20060528090534.GA5026@python.psfb.org>
	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
	<2my7wmz1ub.fsf@starship.python.net> <e5cj3n$2hl$1@sea.gmane.org>
	<2mu07ayuc9.fsf@starship.python.net> <e5ckun$7lr$1@sea.gmane.org>
Message-ID: <2mejyeyrg7.fsf@starship.python.net>

Georg Brandl <g.brandl at gmx.net> writes:

> Michael Hudson wrote:
>> Georg Brandl <g.brandl at gmx.net> writes:
>> 
>>> Michael Hudson wrote:
>>>
>>>> So I think I'll be reading through exceptions.c pretty carefully.  I
>>>> don't think Sean and Richard have acquired as much paranoid
>>>> anal-mindedness and I have when hacking on Python C internals yet :)
>>>
>>> I intended to go through the code again today or tomorrow, looking for
>>> things we missed, but I was on the plane until now, so obviously you
>>> were faster.
>> 
>> Oh right.  I'd have preferred you didn't check in unfinished
>> code to the trunk, I think...
>
> I blame it on Tim since he said OK ;)

I hope he feels bad about it too :)

>>> Thanks for doing our work, though ;)
>> 
>> It has to be said, the degree of the refleaks put me into emergency
>> fire fighting mode...
>
> I hadn't expected so many of them, but I also have to admit that we didn't have
> that much time on Saturday and wanted the branch merged for obvious reasons...

You can make it up to me by writing unit tests for the crash cases
I've fixed.

I think I've fixed the refleaks too, but running regrtest -R :: takes
raaaather a while.

Cheers,
mwh

-- 
  QNX... the OS that walks like a duck, quacks like a duck, but is,
  in fact, a platypus. ... the adventures of porting duck software 
  to the platypus were avoidable this time.
                                 -- Chris Klein, alt.sysadmin.recovery

From steven.bethard at gmail.com  Sun May 28 19:51:13 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 28 May 2006 11:51:13 -0600
Subject: [Python-Dev] DRAFT: python-dev summary for 2006-03-16 to 2006-03-31
Message-ID: <d11dcfba0605281051o261b5036w4a8d4861a0c2614e@mail.gmail.com>

Sorry I'm so far behind -- I'm aiming to get mostly caught up this
week.  Please read through the summary if you have the time and send
me any comments or corrections.

Thanks!


=============
Announcements
=============

-------------------
Python 2.5 schedule
-------------------

Python 2.5 is moving forward along its release schedule.  A few issues
still remain; check `PEP 356`_ for details.

.. _PEP 356: http://www.python.org/dev/peps/pep-0356/

Contributing thread:

- `Python 2.5 Schedule
<http://mail.python.org/pipermail/python-dev/2006-March/062560.html>`__

-----------------------------
Python 3000 gets its own list
-----------------------------

Serious work on Python 3000 has started now, with a new `python-3000
mailing list`_ for general discussion, and a `python-3000-checkins
mailing list`_ for checkins to the p3yk branch.  Right now, Guido
wants patches to focus mainly on ripping out old code (e.g. classic
classes) and wants the discussion to focus primarily on formalizing
the processes for introducing Python 3000.

Note that the process is what's important right now.

.. _python-3000 mailing list:
http://mail.python.org/mailman/listinfo/python-3000
.. _python-3000-checkins mailing list:
http://mail.python.org/mailman/listinfo/python-3000-checkins

Contributing threads:

- `Python 3000 Process
<http://mail.python.org/pipermail/python-dev/2006-March/062644.html>`__
- `svn checkins are now split
<http://mail.python.org/pipermail/python-dev/2006-March/062734.html>`__
- `Discussing the Great Library Reorganization
<http://mail.python.org/pipermail/python-dev/2006-March/063052.html>`__

-----------------------------------------------
Buildbot warnings redirected to python-checkins
-----------------------------------------------

The buildbot warnings, which for a short time this month appeared on
python-dev, have been redirected to the python-checkins list.

Contributing thread:

- `[Fwd: buildbot warnings in amd64 gentoo trunk]
<http://mail.python.org/pipermail/python-dev/2006-March/062754.html>`__

----------------------------------------------------------
Checkins that change behavior must be accompanied by tests
----------------------------------------------------------

Though it's always been good practice to check in a test with any
behavior-changing patch, Neal Norwitz has requested extra care in this
area as we approach the release of Python 2.5.  From this point on,
*any* checkin that could change behavior should be accompanied by a
corresponding test.

Contributing thread:

- `improving quality
<http://mail.python.org/pipermail/python-dev/2006-March/062904.html>`__

--------------------------------------
Email 4.0 merged into Python 2.5 trunk
--------------------------------------

Barry Warsaw has merged the Email 4.0 package into the Python 2.5
trunk. There's a minor incompatibility in that email.Parser is now a
placeholder object instead of a module, but this should not affect the
vast majority of users.

Contributing thread:

- `Merging email 4.0 to Python 2.5 svn trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062512.html>`__

---------------------
Python 2.4.3 released
---------------------

`Python 2.4.3`_ was released on March 30th. This fixes over 50 bugs in
Python 2.4.2, so now's the time to upgrade.

.. _Python 2.4.3: http://www.python.org/2.4.3/

Contributing threads:

- `release24-maint FREEZE for 2.4.3c1, this Thursday
<http://mail.python.org/pipermail/python-dev/2006-March/062655.html>`__
- `RELEASED Python 2.4.3, release candidate 1
<http://mail.python.org/pipermail/python-dev/2006-March/062772.html>`__
- `release24-maint unfrozen, kinda
<http://mail.python.org/pipermail/python-dev/2006-March/062774.html>`__
- `BRANCH release24-maint FREEZE from 00:00 UTC on Wednesday 29th
<http://mail.python.org/pipermail/python-dev/2006-March/062852.html>`__
- `RELEASED Python 2.4.3, final.
<http://mail.python.org/pipermail/python-dev/2006-March/063060.html>`__


=========
Summaries
=========

--------------------------------
Including pysqlite in the stdlib
--------------------------------

Anthony Baxter asked if folks were still interested in including
pysqlite_ in Python 2.5 and got a pretty positive response. Gerhard
Häring said he could release pysqlite 2.2 and then merge that into
Python SVN trunk.  In the future, he agreed to synchronize stable
changes in the standalone release with the Python core sqlite module.
Accompanying these agreements was a long discussion about the pros and
cons of including pysqlite, and of course an endless debate about what
the new module should be named.  In the end, Guido approved the
inclusion of pysqlite, and it was given the name sqlite3.

.. _pysqlite: http://initd.org/tracker/pysqlite/

Contributing thread:

- `pysqlite for 2.5?
<http://mail.python.org/pipermail/python-dev/2006-March/062905.html>`__

-----------------------------
Choosing a new tracker system
-----------------------------

After Guido noticed that SourceForge had stopped sending him emails
when a patch was assigned to him, a number of people took up a
discussion about how to replace the SourceForge tracker system. There
was already an existing PSF committee in charge of finding such a
replacement, and they were considering roundup_, trac_ and jira_,
though Bugzilla_ and RT_ had already been vetoed.  There was already
an instance of roundup running on python.org, and Python-Hosting and
Atlassian had existing offers to host Python on Trac and Jira
(respectively).  There was some discussion about setting up one of
these for Python 3000, but Guido thought it was too early for this.

Getting the data out of SourceForge seemed to be the current sticking
point, and so Fredrik Lundh wrote up `some tools`_ to do this and
posted the results to http://effbot.org/tracker-20060403.zip.  The
discussion was then moved to the `infrastructure list`_.

.. _bugzilla: http://www.bugzilla.org/
.. _jira: http://www.atlassian.com/software/jira/
.. _roundup: http://roundup.sourceforge.net/
.. _rt: http://www.bestpractical.com/rt/
.. _trac: http://www.edgewall.com/trac/
.. _python-hosting: http://www.python-hosting.com/freetrac
.. _some tools:
http://effbot.python-hosting.com/browser/stuff/sandbox/sourceforge/
.. _infrastructure list: http://mail.python.org/mailman/listinfo/infrastructure/

Contributing thread:

- `I'm not getting email from SF when assigned a bug/patch
<http://mail.python.org/pipermail/python-dev/2006-March/062876.html>`__

-------------------
The C-level set API
-------------------

Barry Warsaw wanted to add PySet_Clear(), PySet_Update() and
PySet_Next() to the C-level set API. Raymond Hettinger objected,
saying that this functionality was already available through
PyObject_CallMethod(s, "clear", NULL), PyObject_CallMethod(s,
"update", "O", iterable) and PyObject_GetIter(s) respectively (with
the last being notably safer too).  Barry said that he was hoping to
get some static compiler checks and easier debugging that would be
lost by using PyObject_CallMethod, and to get some speedups over
PyObject_GetIter() in the same way that PyDict_Next() had.
Eventually, Raymond agreed to adding PySet_Clear() to the public API
and to adding _PySet_Next() and _PySet_Update() to the private API.
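
For readers more at home in Python than in C, the three generic calls
Raymond mentioned correspond to ordinary set operations.  The sketch
below also shows one plausible reading of "safer" (an interpretation,
not something the thread spells out): the iterator protocol notices
when the set is mutated during iteration.

```python
s = {1, 2, 3}
s.clear()              # PyObject_CallMethod(s, "clear", NULL)
s.update([4, 5, 6])    # PyObject_CallMethod(s, "update", "O", iterable)

mutation_detected = False
try:
    for item in s:     # PyObject_GetIter(s), then repeated next() calls
        s.add(item + 100)   # grow the set while iterating over it
except RuntimeError:
    # The set iterator refuses to continue once the size has changed.
    mutation_detected = True
```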

Contributing threads:

- `PySet API <http://mail.python.org/pipermail/python-dev/2006-March/062601.html>`__
- `PySet_Next (Was: PySet API)
<http://mail.python.org/pipermail/python-dev/2006-March/062842.html>`__

----------------
Class decorators
----------------

Greg Ewing suggested a use-case for class decorators that could not be
easily addressed with metaclasses: adding a class (but not its
subclasses) to a registry.  After Phillip J. Eby explained the hack
that PyProtocols currently uses to make something like class
decorators available, Guido suggested that someone write a PEP to add
class decorators to Python 2.6.  There was some discussion about
moving decorators for classes inside the class for readability
reasons, but Guido felt like this was too magical.  No PEP had yet
been produced at the time of this summary.
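
A minimal sketch of Greg's registry use-case, written with the
decorator syntax that later Python versions accept for classes.  The
names ``registry``, ``register``, ``Base`` and ``Derived`` are
illustrative only:

```python
registry = []


def register(cls):
    """Record exactly the decorated class, then return it unchanged."""
    registry.append(cls)
    return cls


@register
class Base:
    pass


class Derived(Base):
    # Not decorated, so it is not registered; a metaclass-based
    # registry would have picked it up automatically.
    pass
```

Because the decorator runs once, on the class it is applied to,
subclasses stay out of the registry unless they are decorated too.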

Contributing thread:

- `Class decorators
<http://mail.python.org/pipermail/python-dev/2006-March/062836.html>`__

--------------------------------------
Python 3.0: Exception hierarchy issues
--------------------------------------

A question from Nick Coghlan about where the new GeneratorExit
exception should be placed sparked a new discussion about how the
exception hierarchy should look.  Barry Warsaw suggested that
Exception should be at the top of the hierarchy with
KeyboardInterrupt, GeneratorExit, SystemExit, StopIteration, Error and
Warning being the next level down.  All user defined errors would
inherit from Error.  People seemed to like this idea, but it would
break backwards compatibility as users have been told for a while that
their exceptions should derive from Exception, not Error.  Guido voted
for keeping the `PEP 352`_ semantics (with BaseException at the top of
the hierarchy).

Barry promised to address the hierarchy issue again on the Python 3000 list.
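
For reference, the practical effect of the `PEP 352`_ layout can be
checked from Python code.  This sketch assumes the hierarchy as it
stands in current Python 3 releases, where GeneratorExit sits directly
under BaseException; ``classify`` is a hypothetical helper:

```python
# Exceptions that signal control flow sit outside Exception.
control_flow = (KeyboardInterrupt, SystemExit, GeneratorExit)
outside_exception = all(
    issubclass(t, BaseException) and not issubclass(t, Exception)
    for t in control_flow
)


def classify(exc):
    """Report which handler an exception instance ends up in."""
    try:
        raise exc
    except Exception:
        return "ordinary error"
    except BaseException:
        return "control-flow exception"


results = [classify(ValueError("x")), classify(SystemExit(1))]
```

User code that writes ``except Exception:`` therefore no longer
swallows Ctrl-C or interpreter shutdown by accident.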

.. _PEP 352: http://www.python.org/dev/peps/pep-0352/

Contributing thread:

- `GeneratorExit inheriting from Exception
<http://mail.python.org/pipermail/python-dev/2006-March/062568.html>`__

-------------------------------
Python 3.0: Exception statement
-------------------------------

Greg Ewing asked if the except clause in Python 3.0 could be changed to::

    except <type-or-tuple> as <value>:

instead of the current::

    except <type-or-tuple>, <value>:

which can result in confusing behavior when you accidentally write::

    except TypeError, ValueError:

and have your TypeError instance bound to the name ValueError. Guido
approved the proposal, and then the discussion veered off into the
usual syntax debate, this time about whether to use "with" instead,
whether to use commas or "or"s between the exception types, and
whether or not to require parentheses around multiple types. Guido
reiterated his support for the original proposal and shortly
afterwards, the thread died.
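
A short sketch of the approved form, in the spelling Python 3 ended up
with.  Catching several types requires a parenthesized tuple, and the
instance is bound with ``as``; ``parse`` is a hypothetical helper:

```python
def parse(text):
    """Return int(text), or a short description of what went wrong."""
    try:
        return int(text)
    except (TypeError, ValueError) as exc:
        # exc is the exception instance; neither class name is rebound.
        return "could not parse: " + type(exc).__name__


outcomes = [parse("12"), parse("twelve"), parse(None)]
```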

Contributing thread:

- `Py3k: Except clause syntax
<http://mail.python.org/pipermail/python-dev/2006-March/062446.html>`__

-------------------------------------------------
Adding an N-dimensional array interface to Python
-------------------------------------------------

Travis E. Oliphant asked about including Numpy's `N-dimensional array
interface`_ in Python core by adding a __array_struct__ member to
array.array that looks like::

    typedef struct {
        int version;          /* contains the integer 2 as a sanity check */
        int nd;               /* number of dimensions */
        char typekind;        /* kind in array --- character code of typestr */
        int itemsize;         /* size of each element */
        int flags;            /* flags indicating how the data should
                                 be interpreted */
        Py_intptr_t *shape;   /* A length-nd array of shape information */
        Py_intptr_t *strides; /* A length-nd array of stride information */
        void *data;           /* A pointer to the first element of the array */
    } PyArrayInterface;

This struct is basically unchanged over Numpy's last 10 years of
evolution and Travis was hoping to finally get the interface blessed
by official inclusion in Python so that PIL, PyVox, WxPython,
PyOpenGL, etc. could all start using the same interface.  Guido asked
for a PEP, but no one seemed to feel comfortable taking the array
interface definition and producing a PEP out of it.
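
To make the role of ``shape`` and ``strides`` concrete, here is a
rough sketch that mirrors the struct with ctypes and uses it to locate
one element of a 2x3 block of C ints.  It is an illustration only:
``c_ssize_t`` stands in for ``Py_intptr_t``, ``flags`` is simply left
at 0, and ``element_address`` is a hypothetical helper, not part of
any proposed API.

```python
import ctypes


class PyArrayInterface(ctypes.Structure):
    _fields_ = [
        ("version", ctypes.c_int),
        ("nd", ctypes.c_int),
        ("typekind", ctypes.c_char),
        ("itemsize", ctypes.c_int),
        ("flags", ctypes.c_int),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("data", ctypes.c_void_p),
    ]


rows, cols = 2, 3
buf = (ctypes.c_int * (rows * cols))(*range(rows * cols))
itemsize = ctypes.sizeof(ctypes.c_int)
shape = (ctypes.c_ssize_t * 2)(rows, cols)
# Row-major layout: stepping one row skips cols items.
strides = (ctypes.c_ssize_t * 2)(cols * itemsize, itemsize)

iface = PyArrayInterface(
    version=2,
    nd=2,
    typekind=b"i",
    itemsize=itemsize,
    flags=0,
    shape=ctypes.cast(shape, ctypes.POINTER(ctypes.c_ssize_t)),
    strides=ctypes.cast(strides, ctypes.POINTER(ctypes.c_ssize_t)),
    data=ctypes.addressof(buf),
)


def element_address(iface, index):
    """Byte address of the element at a multi-dimensional index."""
    offset = sum(i * iface.strides[d] for d, i in enumerate(index))
    return iface.data + offset


value = ctypes.c_int.from_address(element_address(iface, (1, 2))).value
```

A consumer needs nothing beyond the struct to walk the data, which is
what lets unrelated libraries share arrays without copying.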

.. _N-dimensional array interface: http://numeric.scipy.org/array_interface.html

Contributing thread:

- `Expose the array interface in Python 2.5?
<http://mail.python.org/pipermail/python-dev/2006-March/062501.html>`__

------------------------------------------------
Inconsistent behavior between += and .__iadd__()
------------------------------------------------

Travis Oliphant noticed that ``a.__iadd__(b)`` and ``a += b``
currently do different things depending on the type of their
arguments::

    import numpy

    a = range(5)
    b = numpy.arange(5)
    a.__iadd__(b)
    assert a == [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

    a = range(5)
    b = numpy.arange(5)
    a += b
    assert (a == numpy.array([0, 2, 4, 6, 8])).all()

This apparently occurs because PyNumber_InPlaceAdd coerces ``a`` into
a numeric type (a numpy.array object) before trying
PySequence_InPlaceConcat.  Guido suggested that += (and \*=) should
check both the numeric and sequence slots before trying to coerce
either argument.  Armin Rigo pointed out that this fix would require
``listobject.__add__(intobject)`` to return NotImplemented instead of
immediately raising an exception, but promised to provide the fix when
he could.
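
The same split can be reproduced without numpy.  The sketch below
assumes CPython's operator dispatch and uses a hypothetical
``Elementwise`` class as a stand-in for numpy.array; it defines only
``__iter__`` and ``__radd__``:

```python
class Elementwise:
    """Toy array-like whose reflected add works element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __iter__(self):
        return iter(self.values)

    def __radd__(self, other):
        return Elementwise(x + y for x, y in zip(other, self.values))


a = list(range(5))
a.__iadd__(Elementwise(range(5)))   # direct call: list extends itself
extended = a

a = list(range(5))
a += Elementwise(range(5))          # operator: the numeric slots win
added = a
```

The direct method call goes straight to the list's sequence
concatenation, while the operator tries the numeric slots of both
operands first and so ends up in ``Elementwise.__radd__``.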

Contributing thread:

- `INPLACE_ADD and INPLACE_MULTIPLY oddities in ceval.c
<http://mail.python.org/pipermail/python-dev/2006-March/062885.html>`__

-------------------
Supported platforms
-------------------

A couple of different threads asked about which of the lesser-known
platforms Python is supported on.  Andrew MacIntyre verified that OS/2
is still being supported, Donn Cave and Francois Revol maintain the
BeOS (a.k.a. Zeta OS) port, and Ralf W. Grosse-Kunstleve promised to
make Tru64 machines available for testing Python if people needed
them.

Contributing threads:

- `supported platforms OS2?
<http://mail.python.org/pipermail/python-dev/2006-March/062685.html>`__
- `r43214 - peps/trunk/pep-3000.txt
<http://mail.python.org/pipermail/python-dev/2006-March/062741.html>`__
- `[Fwd: Re: r43214 - peps/trunk/pep-3000.txt]
<http://mail.python.org/pipermail/python-dev/2006-March/062792.html>`__

----------------------------
Definition of sys.executable
----------------------------

Fredrik Lundh pointed out that when Python is embedded, the
documentation for sys.executable seems to suggest that it should
point to the executable used to start the program, not Python
itself.  Since this executable is no longer Python, you can no longer
run sys.executable to start another Python instance.  At least py2exe
and PyXPCOM both depend on the current behavior, so the implementation
can't be changed.  Fredrik suggested adding sys.python_executable so
that Python code would always be able to start another Python
instance, even when embedded, but the discussion concluded without any
final decisions.
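
The idiom at stake looks roughly like this.  It is a sketch that
assumes sys.executable really is a Python interpreter, which is
exactly the assumption that breaks in the embedded case:

```python
import subprocess
import sys

# Start a second interpreter the usual way and capture its output.
completed = subprocess.run(
    [sys.executable, "-c", "print(6 * 7)"],
    capture_output=True,
    text=True,
    check=True,
)
child_output = completed.stdout.strip()
```

In an embedding application the first list element would name the host
program instead, and the ``-c`` argument would mean whatever that
program decides it means.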

Contributing thread:

- `towards a stricter definition of sys.executable
<http://mail.python.org/pipermail/python-dev/2006-March/062453.html>`__

----------------------------------------------------
Location of modules for multi-architecture platforms
----------------------------------------------------

Neal Becker reported that on x86_64, the Twisted package gets split
between an architecture independent directory and an architecture
dependent one, and so when Python searches for the package, it finds
only the first part of the installation. There was then some
discussion about how multi-architecture platforms should be properly
supported.  The conclusion seemed to be that the package should not be
split; instead two versions of the package, one for each architecture,
should be installed to two different directories.  But Martin v. Löwis
indicated that if anyone thought they knew better how this should
work, he would accept patches.
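
On a present-day interpreter the two install locations can be inspected
with the sysconfig module, which postdates this discussion.  A small
sketch, with illustrative variable names:

```python
import sysconfig

# Directory for pure-Python (architecture independent) packages.
purelib = sysconfig.get_path("purelib")
# Directory for packages containing compiled extension modules.
platlib = sysconfig.get_path("platlib")

# When the two differ, a package split across both is only half
# visible from whichever directory comes first on sys.path.
split_layout = purelib != platlib
print(purelib)
print(platlib)
```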

Contributing thread:

- `Problem with module loading on multi-arch?
<http://mail.python.org/pipermail/python-dev/2006-March/062462.html>`__

-----------------
PEP 299: rejected
-----------------

Guido rejected `PEP 299`_, which would have encouraged the definition
of a module-level __main__() function instead of the current::

    if __name__ == '__main__':
        ...

The rejection came after it was clear that writing code to work both
before and after the PEP was going to be difficult, that ``import
__main__`` would have started raising TypeErrors, and that the
proposal didn't particularly gain much over the status quo.

.. _PEP 299: http://www.python.org/dev/peps/pep-0299/

Contributing thread:

- `What about PEP 299?
<http://mail.python.org/pipermail/python-dev/2006-March/062951.html>`__

--------------------------------------
String exceptions in generator.throw()
--------------------------------------

In its implementation at the time, `PEP 342`_'s throw method on
generators did not support string exceptions.  Phillip J. Eby wanted
to add support for them, but suggested that string exceptions only be
allowed if accompanied by a traceback so that string exceptions could
only be passed on, not created.  Guido indicated that he'd rather keep
things simple and just allow string exceptions regardless of the
presence of a traceback.
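
For readers who have not used throw() yet, here is a minimal sketch of
the method with a class-based exception (string exceptions no longer
exist in current Python, so they are not shown).  The generator
``echo`` is a hypothetical example:

```python
def echo():
    """Yield back whatever is sent in; recover from ValueError."""
    received = None
    while True:
        try:
            received = yield received
        except ValueError as exc:
            # The exception passed to throw() is raised at the yield.
            received = "recovered from %s" % exc


gen = echo()
next(gen)                                   # run to the first yield
sent_back = gen.send("hello")
after_throw = gen.throw(ValueError("bad input"))
```

throw() returns the next value the generator yields after handling the
exception, just as send() does.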

.. _PEP 342: http://www.python.org/dev/peps/pep-0342/

Contributing thread:

- `PEP 342 support for string exceptions in throw()
<http://mail.python.org/pipermail/python-dev/2006-March/062798.html>`__

----------------------
Generator memory leaks
----------------------

Thomas Wouters looked into some new memory leaks with generators.
Previously, only instances of old- or new-style classes could have
__del__ methods, but with `PEP 342`_ generators gained them as well.
Tim Peters suggested that the best solution was simply to add a
special test for generators, and leave any generalizations of this
idea until they're actually needed.
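
The reason generators need finalization at all is that PEP 342 allows
``yield`` inside ``try``/``finally``.  A small sketch of the behavior,
with illustrative names and an explicit close() call standing in for
what happens when an unfinished generator is finalized:

```python
events = []


def reader():
    try:
        yield 1
        yield 2
    finally:
        # Must run even if the consumer abandons the generator early.
        events.append("cleanup")


gen = reader()
first = next(gen)   # generator is now paused inside the try block
gen.close()         # raises GeneratorExit at the paused yield
```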

Contributing threads:

- `[Python-checkins] r43358 - python/trunk/Modules/itertoolsmodule.c
<http://mail.python.org/pipermail/python-dev/2006-March/062870.html>`__
- `Fwd: [Python-checkins] r43358 -
python/trunk/Modules/itertoolsmodule.c
<http://mail.python.org/pipermail/python-dev/2006-March/062972.html>`__

-------------------------------------
PyMem and PyObject freeing mismatches
-------------------------------------

When Python's small-object memory allocator was introduced, Python
originally supported obtaining an object through PyObject_{New,NEW}
and freeing it using any one of PyObject_{Free,FREE}, PyMem_{Del,DEL}
or PyMem_{Free,FREE}.  The hacks to support the mismatched (PyMem_*)
frees were pretty horrible, and since using such mismatched calls has
been forbidden for years, Tim Peters started ripping out the support
code.  Turns out that Python core itself had a number of mismatched
calls, but Tim decided to leave the change checked in so that these
kinds of issues could get fixed before the 2.5 release.  Adam DePrince
promised to provide a patch to identify in a debug-build when an
object was freed with a mismatched call.

Contributing thread:

- `Prevalence of low-level memory abuse?
<http://mail.python.org/pipermail/python-dev/2006-March/062832.html>`__


================
Deferred Threads
================
- `reference leaks, __del__, and annotations
<http://mail.python.org/pipermail/python-dev/2006-March/063213.html>`__
- `New-style icons, .desktop file
<http://mail.python.org/pipermail/python-dev/2006-March/063235.html>`__


==================
Previous Summaries
==================
- `Still looking for volunteer to run Windows buildbot
<http://mail.python.org/pipermail/python-dev/2006-March/062445.html>`__
- `Making staticmethod objects callable?
<http://mail.python.org/pipermail/python-dev/2006-March/062450.html>`__
- `Switch to MS VC++ 2005 ?!
<http://mail.python.org/pipermail/python-dev/2006-March/062456.html>`__
- `_bsddb.c ownership
<http://mail.python.org/pipermail/python-dev/2006-March/063117.html>`__


===============
Skipped Threads
===============
- `[Python-checkins] r43041 - python/trunk/Modules/_ctypes/cfield.c
<http://mail.python.org/pipermail/python-dev/2006-March/062444.html>`__
- `open() mode is lax
<http://mail.python.org/pipermail/python-dev/2006-March/062447.html>`__
- `bytes thoughts
<http://mail.python.org/pipermail/python-dev/2006-March/062455.html>`__
- `Python Library Reference top page too long
<http://mail.python.org/pipermail/python-dev/2006-March/062465.html>`__
- `Weekly Python Patch/Bug Summary
<http://mail.python.org/pipermail/python-dev/2006-March/062496.html>`__
- `Bug 1184112 still valid
<http://mail.python.org/pipermail/python-dev/2006-March/062510.html>`__
- `Making runpy.run_module *not* thread-safe
<http://mail.python.org/pipermail/python-dev/2006-March/062517.html>`__
- `Py3K thought: use external library for client-side HTTP
<http://mail.python.org/pipermail/python-dev/2006-March/062543.html>`__
- `dealing with decorators hiding metadata of decorated functions
<http://mail.python.org/pipermail/python-dev/2006-March/062545.html>`__
- `All green! <http://mail.python.org/pipermail/python-dev/2006-March/062551.html>`__
- `Py_ssize_t backwards compatibility
<http://mail.python.org/pipermail/python-dev/2006-March/062561.html>`__
- `PY_SSIZE_T_CLEAN
<http://mail.python.org/pipermail/python-dev/2006-March/062562.html>`__
- `Patch or feature? Tix.Grid working for 2.5
<http://mail.python.org/pipermail/python-dev/2006-March/062577.html>`__
- `__dict__ strangeness
<http://mail.python.org/pipermail/python-dev/2006-March/062578.html>`__
- `Two buildbot slaves wedged
<http://mail.python.org/pipermail/python-dev/2006-March/062586.html>`__
- `pyexpat namespace problem (Was: libbzip2 version?)
<http://mail.python.org/pipermail/python-dev/2006-March/062598.html>`__
- `buildbot failure in sparc solaris10 gcc trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062619.html>`__
- `Test <http://mail.python.org/pipermail/python-dev/2006-March/062620.html>`__
- `Py3K timescale and stdlib philosophy (was: Re: Py3K thought:
use...) <http://mail.python.org/pipermail/python-dev/2006-March/062625.html>`__
- `Py3K timescale and stdlib philosophy
<http://mail.python.org/pipermail/python-dev/2006-March/062630.html>`__
- `Long-time shy failure in test_socket_ssl
<http://mail.python.org/pipermail/python-dev/2006-March/062634.html>`__
- `Bug Day? <http://mail.python.org/pipermail/python-dev/2006-March/062646.html>`__
- `Bug Day on Friday, 31st of March
<http://mail.python.org/pipermail/python-dev/2006-March/062662.html>`__
- `buildbot failure in x86 W2k trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062672.html>`__
- `buildbot failure in x86 XP-2 trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062673.html>`__
- `buildbot failure in x86 XP 2.4
<http://mail.python.org/pipermail/python-dev/2006-March/062676.html>`__
- `Documenting the ssize_t Python C API changes
<http://mail.python.org/pipermail/python-dev/2006-March/062688.html>`__
- `r43041 - python/trunk/Modules/_ctypes/cfield.c
<http://mail.python.org/pipermail/python-dev/2006-March/062689.html>`__
- `buildbot warnings in x86 gentoo trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062694.html>`__
- `segfaults on Mac (was Re: Long-time shy failure in
test_socket_ssl))
<http://mail.python.org/pipermail/python-dev/2006-March/062701.html>`__
- `buildbot warnings in amd64 gentoo trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062714.html>`__
- `[Python-checkins] r43126 - in python/trunk: Doc/lib/libsocket.tex
Lib/socket.py Lib/test/test_socket.py Misc/NEWS Modules/socketmodule.c
<http://mail.python.org/pipermail/python-dev/2006-March/062723.html>`__
- `buildbot warnings in x86 XP-2 trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062727.html>`__
- `buildbot warnings in g4 osx.4 trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062730.html>`__
- `buildbot warnings in sparc solaris10 gcc trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062749.html>`__
- `Building Python for AMD64 (Windows)
<http://mail.python.org/pipermail/python-dev/2006-March/062751.html>`__
- `buildbot warnings in x86 W2k trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062777.html>`__
- `buildbot warnings in x86 XP trunk
<http://mail.python.org/pipermail/python-dev/2006-March/062778.html>`__
- `Patch to add timestamp() method to datetime objects
<http://mail.python.org/pipermail/python-dev/2006-March/062781.html>`__
- `test_quopri, test_wait3, and test_popen2
<http://mail.python.org/pipermail/python-dev/2006-March/062783.html>`__
- `[Python-3000] Iterators for dict keys, values, and items ==
annoying :) <http://mail.python.org/pipermail/python-dev/2006-March/062787.html>`__
- `howto return malloc()ed memory from C -> Python
<http://mail.python.org/pipermail/python-dev/2006-March/062791.html>`__
- `PEP 343: A messy contextmanager corner case
<http://mail.python.org/pipermail/python-dev/2006-March/062797.html>`__
- `Another PEP 343 contextmanager glitch
<http://mail.python.org/pipermail/python-dev/2006-March/062800.html>`__
- `Pickling problems are hard to debug
<http://mail.python.org/pipermail/python-dev/2006-March/062828.html>`__
- `daily releases?
<http://mail.python.org/pipermail/python-dev/2006-March/062849.html>`__
- `Changing -Q to warn for 2.5?
<http://mail.python.org/pipermail/python-dev/2006-March/062850.html>`__
- `TRUNK FREEZE for 2.5a1: 0000 UTC, Thursday 30th
<http://mail.python.org/pipermail/python-dev/2006-March/062853.html>`__
- `refleaks in 2.4
<http://mail.python.org/pipermail/python-dev/2006-March/062856.html>`__
- `Inconsistency in 2.4.3 for __repr__() returning unicode
<http://mail.python.org/pipermail/python-dev/2006-March/062857.html>`__
- `Next PyPy Sprint: Tokyo 23/4 - 29/4
<http://mail.python.org/pipermail/python-dev/2006-March/062859.html>`__
- `[Python-checkins] TRUNK FREEZE for 2.5a1: 0000 UTC, Thursday 30th
<http://mail.python.org/pipermail/python-dev/2006-March/062867.html>`__
- `Libref sections to put new modules under?
<http://mail.python.org/pipermail/python-dev/2006-March/062871.html>`__
- `[Python-checkins] Cancelled! TRUNK FREEZE for 2.5a1: 0000 UTC,
Thursday 30th <http://mail.python.org/pipermail/python-dev/2006-March/062880.html>`__
- `Error msgs for new-style division
<http://mail.python.org/pipermail/python-dev/2006-March/062924.html>`__
- `Reminder: Bug Day this Friday, 31st of March
<http://mail.python.org/pipermail/python-dev/2006-March/062953.html>`__
- `warnings in libffi
<http://mail.python.org/pipermail/python-dev/2006-March/063079.html>`__
- `Name for python package repository
<http://mail.python.org/pipermail/python-dev/2006-March/063095.html>`__
- `alpha problems -- need input
<http://mail.python.org/pipermail/python-dev/2006-March/063116.html>`__
- `unicode vs buffer (array) design issue can crash interpreter
<http://mail.python.org/pipermail/python-dev/2006-March/063127.html>`__
- `building sql queries in python
<http://mail.python.org/pipermail/python-dev/2006-March/063138.html>`__
- `Win64 AMD64 (aka x64) binaries available64
<http://mail.python.org/pipermail/python-dev/2006-March/063143.html>`__
- `_xmlplus fixup for 2.5
<http://mail.python.org/pipermail/python-dev/2006-March/063159.html>`__
- `(finally) getting around to killing __private names from stdlib
<http://mail.python.org/pipermail/python-dev/2006-March/063161.html>`__
- `new article - MapPoint and Python
<http://mail.python.org/pipermail/python-dev/2006-March/063172.html>`__
- `xmlrpclib.binary missing?
<http://mail.python.org/pipermail/python-dev/2006-March/063195.html>`__
- `Fwd: Python 2.5 update
<http://mail.python.org/pipermail/python-dev/2006-March/063199.html>`__
- `2.5 trunk built for Windows available?
<http://mail.python.org/pipermail/python-dev/2006-March/063200.html>`__
- `Nasty trunk test failures
<http://mail.python.org/pipermail/python-dev/2006-March/063202.html>`__
- `x86 trunk MSI preview
<http://mail.python.org/pipermail/python-dev/2006-March/063242.html>`__
- `[Python-checkins] Python Regression Test Failures all (1)
<http://mail.python.org/pipermail/python-dev/2006-March/063243.html>`__
- `gmane.comp.python.devel.3000 has disappeared
<http://mail.python.org/pipermail/python-dev/2006-March/063251.html>`__

From arigo at tunes.org  Sun May 28 21:12:16 2006
From: arigo at tunes.org (Armin Rigo)
Date: Sun, 28 May 2006 21:12:16 +0200
Subject: [Python-Dev] dictionary order
Message-ID: <20060528191216.GA9792@code0.codespeak.net>

Hi all,

I'm playing with dicts that mangle the hash they receive before using it
for hashing.  The goal was to detect obscure dict order dependencies in
my own programs, but I couldn't resist and ran the Python test suite
with various mangling schemes.  As expected -- what is not tested is
broken -- I found and fixed tons of small dependencies in the tests
themselves, plus one in base64.py.

Now I'm stumbling upon this test for urllib2:

    >>> mgr = urllib2.HTTPPasswordMgr()
    >>> add = mgr.add_password
    >>> add("Some Realm", "http://example.com/", "joe", "password")
    >>> add("Some Realm", "http://example.com/ni", "ni", "ni")
    (...)
    
    Currently, we use the highest-level path where more than one
    match:

    >>> mgr.find_user_password("Some Realm", "http://example.com/ni")
    ('joe', 'password')

Returning the outermost path is a bit strange, if you ask me, but I am
no expert here.  Stranger is the fact that the actual implementation
returns not the outermost path at all -- there is no code to do that --
but a random pick, the first match in dictionary order.  The comment in
the test is just misleading.  I believe that urllib2 should be fixed to
always return the *innermost* path, but I need confirmation about
this...
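
To make the proposal concrete, here is a minimal sketch of "innermost
path wins" matching, written for a current Python 3 rather than the 2.5
trunk.  The function name find_innermost and the plain dict mapping
path -> credentials are hypothetical stand-ins for illustration, not
urllib2's actual data structures:

```python
def find_innermost(passwords, url_path):
    """Return the credentials stored under the longest matching path.

    `passwords` maps a path prefix to a (user, password) tuple.
    A stored path matches if it equals url_path or is a parent
    directory of it.  Iteration order of the dict does not matter.
    """
    best = None
    for path, creds in passwords.items():
        is_parent = url_path.startswith(path.rstrip("/") + "/")
        if url_path == path or is_parent:
            # Keep the deepest (longest) matching path seen so far.
            if best is None or len(path) > len(best[0]):
                best = (path, creds)
    return best[1] if best else (None, None)


passwords = {"/": ("joe", "password"), "/ni": ("ni", "ni")}
print(find_innermost(passwords, "/ni"))
print(find_innermost(passwords, "/other"))
```

Because the loop compares path lengths instead of stopping at the first
hit, the answer no longer depends on dictionary order.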


A bientot,

Armin

From mwh at python.net  Sun May 28 22:42:10 2006
From: mwh at python.net (Michael Hudson)
Date: Sun, 28 May 2006 21:42:10 +0100
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
 refleak (101)
In-Reply-To: <2mejyeyrg7.fsf@starship.python.net> (Michael Hudson's message
	of "Sun, 28 May 2006 18:44:40 +0100")
References: <20060528090534.GA5026@python.psfb.org>
	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
	<2my7wmz1ub.fsf@starship.python.net> <e5cj3n$2hl$1@sea.gmane.org>
	<2mu07ayuc9.fsf@starship.python.net> <e5ckun$7lr$1@sea.gmane.org>
	<2mejyeyrg7.fsf@starship.python.net>
Message-ID: <2mac91zxst.fsf@starship.python.net>

Michael Hudson <mwh at python.net> writes:

> I think I've fixed the refleaks too, but running regrtest -R :: takes
> raaaather a while.

I hadn't: test_codecs and test_codeccallbacks both leak, the latter
quite spectacularly (9000+ refleaks a run).  The test_codecs leaks
come from calls to codecs.decode_charmap, I'm not sure that it's to do
with the exceptions.c changes.  The test_codeccallbacks leaks seem
more like they could be exception related: when a unicode decoding
fails, two references are leaked.

I've run out of energy for these tonight.

Cheers,
mwh

-- 
  To summarise the summary of the summary:- people are a problem.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 12

From g.brandl at gmx.net  Sun May 28 23:01:49 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 28 May 2006 21:01:49 +0000
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	refleak (101)
In-Reply-To: <2mac91zxst.fsf@starship.python.net>
References: <20060528090534.GA5026@python.psfb.org>	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>	<2my7wmz1ub.fsf@starship.python.net>
	<e5cj3n$2hl$1@sea.gmane.org>	<2mu07ayuc9.fsf@starship.python.net>
	<e5ckun$7lr$1@sea.gmane.org>	<2mejyeyrg7.fsf@starship.python.net>
	<2mac91zxst.fsf@starship.python.net>
Message-ID: <e5d31v$j6e$1@sea.gmane.org>

Michael Hudson wrote:
> Michael Hudson <mwh at python.net> writes:
> 
>> I think I've fixed the refleaks too, but running regrtest -R :: takes
>> raaaather a while.
> 
> I hadn't: test_codecs and test_codeccallbacks both leak, the latter
> quite spectacularly (9000+ refleaks a run).  The test_codecs leaks
> come from calls to codecs.decode_charmap, I'm not sure that it's to do
> with the exceptions.c changes.  The test_codeccallbacks leaks seem
> more like they could be exception related: when a unicode decoding
> fails, two references are leaked.
> 
> I've run out of energy for these tonight.

I repaired that, it was in the GetStart/GetEnd functions for UnicodeErrors.

Currently running -R ::, so let's see...

Georg


From martin at v.loewis.de  Sun May 28 23:59:52 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 28 May 2006 23:59:52 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5cjgn$3pg$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
Message-ID: <447A1D58.1020302@v.loewis.de>

Georg Brandl wrote:
>> There's still a ton used under Modules.  Also, if no flag is
>> specified, it will default to 0 (ie, METH_OLDARGS).  I wonder how many
>> third party modules use METH_OLDARGS directly or more likely
>> indirectly.
> 
> These modules can be converted.

I think that should be done for 2.5, but nothing else. Or, perhaps
a deprecation warning could be generated if it is still used.

Regards,
Martin

From jjl at pobox.com  Mon May 29 00:53:44 2006
From: jjl at pobox.com (John J Lee)
Date: Sun, 28 May 2006 22:53:44 +0000 (UTC)
Subject: [Python-Dev] dictionary order
In-Reply-To: <20060528191216.GA9792@code0.codespeak.net>
References: <20060528191216.GA9792@code0.codespeak.net>
Message-ID: <Pine.LNX.4.64.0605282245370.17302@localhost>

On Sun, 28 May 2006, Armin Rigo wrote:
[...]
> Now I'm stumbling upon this test for urllib2:
>
>    >>> mgr = urllib2.HTTPPasswordMgr()
>    >>> add = mgr.add_password
>    >>> add("Some Realm", "http://example.com/", "joe", "password")
>    >>> add("Some Realm", "http://example.com/ni", "ni", "ni")
>    (...)
>
>    Currently, we use the highest-level path where more than one
>    match:
>
>    >>> mgr.find_user_password("Some Realm", "http://example.com/ni")
>    ('joe', 'password')
>
> Returning the outermost path is a bit strange, if you ask me, but I am
> no expert here.  Stranger is the fact that the actual implementation
> returns not the outermost path at all -- there is no code to do that --
> but a random pick, the first match in dictionary order.  The comment in
> the test is just misleading.  I believe that urllib2 should be fixed to
> always return the *innermost* path, but I need confirmation about
> this...

I noticed the same things, and in fact I think this was fixed before you 
posted :-)

FWIW, here are the details:

The checkin was from Georg as -r 46509, patch was 
http://python.org/sf/1496206

Part of the comment on the patch:

"""
The patch also comments out one test which was testing
something not actually guaranteed by the code at all --
it was passing by fluke. The code it's trying to test
could do with some review, which is why I left this
test commented out rather than deleting the test (but
that is a long-standing issue unrelated to this patch,
so should not block this patch from being applied).
"""

Recently I have been slowly working my way through urllib2 auth, fixing 
bugs and adding tests as I go.  This particular issue is horribly unclear 
in the RFC, though, and I haven't yet got round to the necessary checking 
of real-world behaviour.


John


From rogermiller at alum.mit.edu  Mon May 29 01:32:04 2006
From: rogermiller at alum.mit.edu (Roger Miller)
Date: Sun, 28 May 2006 13:32:04 -1000
Subject: [Python-Dev] Syntax errors on continuation lines
Message-ID: <447A32F4.4080307@alum.mit.edu>

I am a recent subscriber to this list.  A couple of months ago
I downloaded the Python source out of curiosity, then decided
to see if I could scratch a couple of small itches.

One of them was syntax errors reported on one line but actually
resulting from unbalanced brackets on the previous line. (The
folks on this list may not make such silly mistakes any more,
but several postings on comp.lang.python suggest that I am not
entirely alone in finding it an annoyance.)  I have modified
the parser and error reporting code to provide more explicit
information in such cases, and am now looking for some guidance
as to whether it is worth submitting a patch.

With my change a syntax error on a continuation line looks like
this:

      File "test.py", line 42 (statement started on line 41)
        y = 0
          ^
    SyntaxError: invalid syntax

The patch adds a new attribute to the SyntaxError exception
object (the line number on which the statement started).  I
don't have the experience to judge whether this would be
considered a compatibility problem.  An alternative that does
not require modifying the exception would be to change the
message text without identifying the specific line number:

      File "test.py", line 42
        y = 0
          ^
    SyntaxError: invalid syntax in continuation of prior line

However I think the continuation indication properly belongs
with the location information rather than with the message.
(The loss of the starting line number itself is no big deal;
it will usually be one less than the error line number.)
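
For readers who want to reproduce the situation, here is a small sketch
(current Python 3, not the patched parser described above) that compiles
a snippet with an unclosed bracket and reports where the interpreter
places the error.  The helper name report_syntax_error is hypothetical.
The reported line and message depend on the interpreter version: older
parsers flag the continuation line, while newer ones may point at the
opening bracket instead.

```python
def report_syntax_error(source, filename="test.py"):
    """Compile `source`; return (lineno, msg) for a SyntaxError, else None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# The bracket opened on line 1 is never closed, so line 2 is parsed
# as a continuation of the same statement.
source = "x = (1 + 2\ny = 0\n"
print(report_syntax_error(source))
```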



From bob at redivi.com  Mon May 29 01:48:27 2006
From: bob at redivi.com (Bob Ippolito)
Date: Sun, 28 May 2006 16:48:27 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
Message-ID: <DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>


On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:

>
> I'm seeing a dubious failure of test_gzip and test_tarfile on my  
> AMD64 machine. It's triggered by the recent struct changes, but I'd  
> say it's probably caused by a bug/misfeature in zlibmodule:  
> zlib.crc32 is the result of a zlib 'crc32' functioncall, which  
> returns an unsigned long. zlib.crc32 turns that unsigned long into  
> a (signed) Python int, which means a number beyond 1<<31 goes  
> negative on 32-bit systems and other systems with 32-bit longs, but  
> stays positive on systems with 64-bit longs:
>
> (32-bit)
> >>> zlib.crc32("foobabazr")
> -271938108
>
> (64-bit)
> >>> zlib.crc32("foobabazr")
> 4023029188
>
> The old structmodule coped with that:
> >>> struct.pack("<l", -271938108)
> '\xc4\x8d\xca\xef'
> >>> struct.pack("<l", 4023029188)
> '\xc4\x8d\xca\xef'
>
> The new one does not:
> >>> struct.pack("<l", -271938108)
> '\xc4\x8d\xca\xef'
> >>> struct.pack("<l", 4023029188)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "Lib/struct.py", line 63, in pack
>     return o.pack(*args)
> struct.error: 'l' format requires -2147483647 <= number <= 2147483647
>
> The structmodule should be fixed (and a test added ;) but I'm also  
> wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my  
> suggested fix would be to change the PyInt_FromLong() call to  
> PyLong_FromUnsignedLong(), making zlib always return positive  
> numbers -- it might break some code on 32-bit platforms, but that  
> code is already broken on 64-bit platforms. But I guess I'm okay  
> with the long being changed into an actual 32-bit signed number on  
> 64-bit platforms, too.

The struct module isn't what's broken here. All of the struct types  
have always had well defined bit sizes and alignment if you  
explicitly specify an endian, >I and >L are 32-bits everywhere, and  
 >Q is supported on platforms that don't have long long. The only  
thing that's changed is that it actually checks for errors  
consistently now.

-bob


From thomas at python.org  Mon May 29 02:34:45 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 02:34:45 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
Message-ID: <9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>

On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
>
>
> On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:
> >
> > I'm seeing a dubious failure of test_gzip and test_tarfile on my
> > AMD64 machine. It's triggered by the recent struct changes, but I'd
> > say it's probably caused by a bug/misfeature in zlibmodule:
> > zlib.crc32 is the result of a zlib 'crc32' functioncall, which
> > returns an unsigned long. zlib.crc32 turns that unsigned long into
> > a (signed) Python int, which means a number beyond 1<<31 goes
> > negative on 32-bit systems and other systems with 32-bit longs, but
> > stays positive on systems with 64-bit longs:
> >
> > (32-bit)
> > >>> zlib.crc32("foobabazr")
> > -271938108
> >
> > (64-bit)
> > >>> zlib.crc32("foobabazr")
> > 4023029188
> >
> > The old structmodule coped with that:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > '\xc4\x8d\xca\xef'
> >
> > The new one does not:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "Lib/struct.py", line 63, in pack
> >     return o.pack(*args)
> > struct.error: 'l' format requires -2147483647 <= number <= 2147483647
> >
> > The structmodule should be fixed (and a test added ;) but I'm also
> > wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my
> > suggested fix would be to change the PyInt_FromLong() call to
> > PyLong_FromUnsignedLong(), making zlib always return positive
> > numbers -- it might break some code on 32-bit platforms, but that
> > code is already broken on 64-bit platforms. But I guess I'm okay
> > with the long being changed into an actual 32-bit signed number on
> > 64-bit platforms, too.
>
> The struct module isn't what's broken here. All of the struct types
> have always had well defined bit sizes and alignment if you
> explicitly specify an endian, >I and >L are 32-bits everywhere, and
> >Q is supported on platforms that don't have long long. The only
> thing that's changed is that it actually checks for errors
> consistently now.


Yes. And that breaks things. I'm certain the behaviour is used in real-world
code (and I don't mean just the gzip module.) It has always, as far as I can
remember, accepted 'unsigned' values for the signed versions of ints, longs
and long-longs (but not chars or shorts.) I agree that that's wrong, but I
don't think changing struct to do the right thing should be done in 2.5. I
don't even think it should be done in 2.6 -- although 3.0 is fine.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From nnorwitz at gmail.com  Mon May 29 05:42:02 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 28 May 2006 20:42:02 -0700
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	refleak (101)
In-Reply-To: <2mejyeyrg7.fsf@starship.python.net>
References: <20060528090534.GA5026@python.psfb.org>
	<1f7befae0605280423n1e378af0mf3dcb200c2a58c45@mail.gmail.com>
	<9e804ac0605280435v149c361dt222d155517694bb1@mail.gmail.com>
	<2my7wmz1ub.fsf@starship.python.net> <e5cj3n$2hl$1@sea.gmane.org>
	<2mu07ayuc9.fsf@starship.python.net> <e5ckun$7lr$1@sea.gmane.org>
	<2mejyeyrg7.fsf@starship.python.net>
Message-ID: <ee2a432c0605282042y1f22149ava3ef363b95fc408b@mail.gmail.com>

On 5/28/06, Michael Hudson <mwh at python.net> wrote:
>
> I think I've fixed the refleaks too, but running regrtest -R :: takes
> raaaather a while.

FYI, I typically run -R 4:3: since it seems to do a pretty good job
and takes 20% less time.  I never run -R with -u all, only a normal
run or specific tests.  (-u all and -R can be done by the automated
run.)

n

From nnorwitz at gmail.com  Mon May 29 06:37:34 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 28 May 2006 21:37:34 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5cjgn$3pg$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
Message-ID: <ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>

On 5/28/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Neal Norwitz wrote:
> > On 5/28/06, Georg Brandl <g.brandl at gmx.net> wrote:
> >> In the process of reviewing and possibly extending getargs.c, I stumbled
> >> over the "compatibility" flag supposedly used for METH_OLDARGS functions.
> >> No code in the core uses this calling convention any more, and it has been
> >> deprecated for quite a long time (since 2.2), so would it be appropriate to end
> >> support for it in 2.5?
> >
> > There's still a ton used under Modules.  Also, if no flag is
> > specified, it will default to 0 (ie, METH_OLDARGS).  I wonder how many
> > third party modules use METH_OLDARGS directly or more likely
> > indirectly.
>
> These modules can be converted.

We can convert the ones in Python, but not the third party ones.
It's also very difficult to search for something that's not there.
You need to review every PyMethodDef entry to verify it has a METH_*
flag.  These implicit METH_OLDARGS may well exist in Python,
especially given who did the patch to add them in the first place. :-)

    http://mail.python.org/pipermail/python-dev/2002-March/022009.html
    http://mail.python.org/pipermail/patches/2002-July/009023.html

> > I would like to get rid of the flag, but I'm not sure we can do it
> > safely until 3.0.
>
> It has been deprecated since 2.2, so it'd be no surprise if it went away.

There's a difference between "it'd be no surprise" and "it should be
no surprise".  While I agree in theory they are the same, in practice,
they are not.  That's the difference between theory and practice. :-)

> Also, it would be easy to detect METH_OLDARGS in PyCFunction_New and raise
> an appropriate exception.

I agree with Martin this should raise a deprecation warning in 2.5.

n

From nnorwitz at gmail.com  Mon May 29 08:15:34 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 28 May 2006 23:15:34 -0700
Subject: [Python-Dev] ssize_t question: longs in header files
Message-ID: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>

$ grep long */*.h

minus comments, etc  yields several questions about whether some
values should use Py_ssize_t rather than C longs.  In particular:

 * hash values
Include/abstract.h:     long PyObject_Hash(PyObject *o);  // also in object.h
Include/object.h:typedef long (*hashfunc)(PyObject *);

 * ints:  Include/intobject.h:    long ob_ival;
 * thread values (probably no need to change since platform dependent)

Attached is the whole list.  This is just for C longs.  The int list
is much bigger and coming in a different msg.

n
-------------- next part --------------
A non-text attachment was scrubbed...
Name: header-longs
Type: application/octet-stream
Size: 5349 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20060528/0039611c/attachment.obj 

From nnorwitz at gmail.com  Mon May 29 08:52:13 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 28 May 2006 23:52:13 -0700
Subject: [Python-Dev] ssize_t: ints in header files
Message-ID: <ee2a432c0605282352y418a6e70i6c96350fa373146@mail.gmail.com>

Here's the int questions.  The entire list is too big to post (50k),
so it's here:

http://www.python.org/neal/header-ints

Should the following values be ints (limited to 2G)?

 * dict counts (ma_fill, ma_used, ma_mask)
 * line #s and column #s
 * AST (asdl.h) sequences
 * recursion limit
 * read/write/send/recv results and buf size
 * code object values like # args, # locals, stacksize, # instructions
 * sre (i think i have a patch to fix this somewhere)

Please try to review the APIs and let us know if there are any values
you think should be 64-bits on 64-bit platforms.

n

From michele.simionato at gmail.com  Mon May 29 09:29:53 2006
From: michele.simionato at gmail.com (Michele Simionato)
Date: Mon, 29 May 2006 07:29:53 +0000 (UTC)
Subject: [Python-Dev] feature request: inspect.isgenerator
Message-ID: <loom.20060529T092512-825@post.gmane.org>

Is there still time for new feature requests in Python 2.5?
I am missing an isgenerator function in the inspect module. Right now
I am using

def isgenerator(func):
    return func.func_code.co_flags == 99

but it is rather ugly (horrible indeed).
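
A somewhat less fragile sketch, assuming a current Python 3 where the
code object lives in __code__ rather than func_code: 99 is just
CO_OPTIMIZED | CO_NEWLOCALS | CO_GENERATOR | CO_NOFREE (1 + 2 + 32 + 64),
so the equality test stops working as soon as any other flag is set,
for example when the function takes *args.  Testing only the generator
bit avoids that.  The helper name is_generator_function is hypothetical;
modern versions of the inspect module also ship
inspect.isgeneratorfunction for the same purpose.

```python
import inspect


def is_generator_function(func):
    """Return True if calling `func` would produce a generator."""
    code = getattr(func, "__code__", None)
    # Builtins have no code object; treat them as non-generators.
    if code is None:
        return False
    # Test only the generator bit instead of comparing all flags.
    return bool(code.co_flags & inspect.CO_GENERATOR)


def gen(*args):
    yield 1


def plain():
    return 1


print(is_generator_function(gen), is_generator_function(plain))
```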


       Michele Simionato



From greg.ewing at canterbury.ac.nz  Mon May 29 11:33:45 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 29 May 2006 21:33:45 +1200
Subject: [Python-Dev] ssize_t: ints in header files
In-Reply-To: <ee2a432c0605282352y418a6e70i6c96350fa373146@mail.gmail.com>
References: <ee2a432c0605282352y418a6e70i6c96350fa373146@mail.gmail.com>
Message-ID: <447ABFF9.60104@canterbury.ac.nz>

Neal Norwitz wrote:

> Should the following values be ints (limited to 2G)?
> 
>  * line #s and column #s

Anyone with a source file more than 2G lines long
or 2G chars wide deserves whatever they get. :-)

Not-investing-in-a-1.3e10-pixel-wide-screen-any-time-soon-ly,
Greg

From bob at redivi.com  Mon May 29 11:56:14 2006
From: bob at redivi.com (Bob Ippolito)
Date: Mon, 29 May 2006 02:56:14 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
Message-ID: <E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>

On May 28, 2006, at 5:34 PM, Thomas Wouters wrote:

> On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
>
> > On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:
> >
> > I'm seeing a dubious failure of test_gzip and test_tarfile on my
> > AMD64 machine. It's triggered by the recent struct changes, but I'd
> > say it's probably caused by a bug/misfeature in zlibmodule:
> > zlib.crc32 is the result of a zlib 'crc32' functioncall, which
> > returns an unsigned long. zlib.crc32 turns that unsigned long into
> > a (signed) Python int, which means a number beyond 1<<31 goes
> > negative on 32-bit systems and other systems with 32-bit longs, but
> > stays positive on systems with 64-bit longs:
> >
> > (32-bit)
> > >>> zlib.crc32("foobabazr")
> > -271938108
> >
> > (64-bit)
> > >>> zlib.crc32("foobabazr")
> > 4023029188
> >
> > The old structmodule coped with that:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > '\xc4\x8d\xca\xef'
> >
> > The new one does not:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "Lib/struct.py", line 63, in pack
> >     return o.pack(*args)
> > struct.error: 'l' format requires -2147483647 <= number <= 2147483647
> >
> > The structmodule should be fixed (and a test added ;) but I'm also
> > wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my
> > suggested fix would be to change the PyInt_FromLong() call to
> > PyLong_FromUnsignedLong(), making zlib always return positive
> > numbers -- it might break some code on 32-bit platforms, but that
> > code is already broken on 64-bit platforms. But I guess I'm okay
> > with the long being changed into an actual 32-bit signed number on
> > 64-bit platforms, too.
>
> > The struct module isn't what's broken here. All of the struct types
> > have always had well defined bit sizes and alignment if you
> > explicitly specify an endian, >I and >L are 32-bits everywhere, and
> > >Q is supported on platforms that don't have long long. The only
> > thing that's changed is that it actually checks for errors
> > consistently now.
>
> Yes. And that breaks things. I'm certain the behaviour is used in  
> real-world code (and I don't mean just the gzip module.) It has  
> always, as far as I can remember, accepted 'unsigned' values for  
> the signed versions of ints, longs and long-longs (but not chars or  
> shorts.) I agree that that's wrong, but I don't think changing  
> struct to do the right thing should be done in 2.5. I don't even  
> think it should be done in 2.6 -- although 3.0 is fine.
>

Well, the behavior change is in response to a bug
<http://python.org/sf/1229380>. If nothing else, we should at least fix
the standard library such that it doesn't depend on struct bugs. This
is the only way to find them :)

Basically the struct module previously only checked for errors if you  
don't specify an endian. That's really strange and leads to very  
confusing results. The only code that really should be broken by this  
additional check is code that existed before Python had a long type  
and only signed values were available.
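
One way the gzip-style caller could stop depending on the old behaviour
is sketched below, for a current Python 3.  It is an illustration, not
the fix that was checked in: mask the checksum to 32 bits so it is
non-negative on every platform, then pack it with the unsigned format.
The helper name pack_crc32 is hypothetical.

```python
import struct
import zlib


def pack_crc32(data):
    """Return the CRC-32 of `data` as 4 little-endian bytes."""
    # Masking gives the same unsigned value whether crc32() returned
    # a negative or a positive number on this platform.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    # "<L" is an unsigned 32-bit field, so the range check passes.
    return struct.pack("<L", crc)


print(pack_crc32(b"foobabazr"))
```

The same mask applied to the signed value from the 32-bit example,
-271938108 & 0xFFFFFFFF, gives 4023029188, which is the value reported
on the 64-bit machine.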

-bob


From thomas at python.org  Mon May 29 12:14:54 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 12:14:54 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
Message-ID: <9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>

On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
>
> Well, the behavior change is in response to a bug <http://python.org/sf/1229380>. If
> nothing else, we should at least fix the standard library such that it
> doesn't depend on struct bugs. This is the only way to find them :)
>

Feel free to comment how the zlib.crc32/gzip co-operation should be fixed. I
don't see an obviously correct fix. The trunk is currently failing tests it
shouldn't fail. Also note that the error isn't with feeding signed values to
unsigned formats (which is what the bug is about) but the other way 'round,
although I do believe both should be accepted for the time being, while
generating a warning.

> Basically the struct module previously only checked for errors if you don't
> specify an endian. That's really strange and leads to very confusing
> results. The only code that really should be broken by this additional check
> is code that existed before Python had a long type and only signed values
> were available.
>

Alas, reality is different. The fundamental difference between types in
Python and in C causes this, and code using struct is usually meant
specifically to bridge those two worlds. Furthermore, struct is often used
to *fix* that issue, by flipping sign bits if necessary:

>>> struct.unpack("<l", struct.pack("<l", 3221225472))
(-1073741824,)
>>> struct.unpack("<l", struct.pack("<L", 3221225472))
(-1073741824,)
>>> struct.unpack("<l", struct.pack("<l", -1073741824))
(-1073741824,)
>>> struct.unpack("<l", struct.pack("<L", -1073741824))
(-1073741824,)

Before this change, you didn't have to check whether the value is negative
before the struct.unpack/pack dance, regardless of which format character
you used. This misfeature is used (and many would consider it convenient,
even Pythonic, for struct to DWIM); breaking it suddenly is bad.
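
For comparison, here is a sketch of what callers have to write once the
range check is enforced.  It assumes a current Python 3, where the
check is strict, and the helper names to_signed32 and to_unsigned32 are
hypothetical.  The sign flip that struct used to do implicitly becomes
an explicit step before packing:

```python
import struct


def to_unsigned32(n):
    """Reinterpret the low 32 bits of `n` as an unsigned value."""
    return n & 0xFFFFFFFF


def to_signed32(n):
    """Reinterpret the low 32 bits of `n` as a two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


# Either spelling of the same bit pattern packs to the same bytes.
a = struct.pack("<l", to_signed32(3221225472))
b = struct.pack("<L", to_unsigned32(-1073741824))
print(a == b, struct.unpack("<l", a))
```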

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From g.brandl at gmx.net  Mon May 29 16:11:42 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 29 May 2006 16:11:42 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
Message-ID: <e5evds$34d$1@sea.gmane.org>

Neal Norwitz wrote:

>> These modules can be converted.
> 
> We can convert the ones in Python, but not the third party ones.
> It's also very difficult to search for something that's not there.
> You need to review every PyMethodDef entry to verify it has a METH_*
> flag.  These implicit METH_OLDARGS may well exist in Python,
> especially given who did the patch to add them in the first place. :-)
> 
>     http://mail.python.org/pipermail/python-dev/2002-March/022009.html
>     http://mail.python.org/pipermail/patches/2002-July/009023.html

I'll gradually try to convert them.
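
A rough way to find candidates for review is to scan the C sources for
PyMethodDef entries whose third field is missing or carries no METH_* flag.
This is only a heuristic sketch: the regular expression and the function
name find_implicit_oldargs are assumptions made for illustration, and every
hit still needs checking by hand.

```python
import re

# Matches entries such as {"name", func} or {"name", func, flags, ...}.
# The sentinel {NULL, NULL} is skipped because its first field is not quoted.
ENTRY = re.compile(
    r'\{\s*"(\w+)"\s*,\s*([^,{}]+?)\s*(?:,\s*([^,{}]+?)\s*)?(?:,|\})'
)

def find_implicit_oldargs(c_source):
    """Return method names whose flags field is absent or has no METH_* flag."""
    suspects = []
    for match in ENTRY.finditer(c_source):
        name, _func, flags = match.groups()
        if flags is None or "METH_" not in flags:
            suspects.append(name)
    return suspects

sample = '''
static PyMethodDef spam_methods[] = {
    {"old_style", (PyCFunction)spam_old},
    {"zero_flag", spam_zero, 0},
    {"modern", spam_modern, METH_VARARGS, "doc"},
    {NULL, NULL}
};
'''
suspects = find_implicit_oldargs(sample)
```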

>> > I would like to get rid of the flag, but I'm not sure we can do it
>> > safely until 3.0.
>>
>> It has been deprecated since 2.2, so it'd be no surprise if it went away.
> 
> There's a difference between "it'd be no surprise" and "it should be
> no surprise".  While I agree in theory they are the same, in practice,
> they are not.  That's the difference between theory and practice. :-)

Yep, you're right.

>> Also, it would be easy to detect METH_OLDARGS in PyCFunction_New and raise
>> an appropriate exception.
> 
> I agree with Martin this should raise a deprecation warning in 2.5.

Patch is at #1496957 (another one for converting Tkinter is at #1496952).

Georg


From thomas at python.org  Mon May 29 17:17:41 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 17:17:41 +0200
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
Message-ID: <9e804ac0605290817m1251ef5q448815d135d4da25@mail.gmail.com>

On 5/29/06, Neal Norwitz <nnorwitz at gmail.com> wrote:

> * ints:  Include/intobject.h:    long ob_ival;


I considered asking about this before, as it would give '64-bit power' to
Win64 integers. It's a rather big change, though (lots of code assumes
PyInts fit in signed longs, which would be untrue then.)

On the other hand, if a separate type was used for PyInts (Py_pyint_t or
whatever, defaulting to long, except on LLP64 systems), EWT's request for a
64-bit integer represented by C 'long long's would be a simple configure
switch.
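
For anyone who wants to check which data model their platform uses, ctypes
can report the relevant sizes. A small sketch; the variable names are only
for illustration:

```python
import ctypes

long_bits = ctypes.sizeof(ctypes.c_long) * 8
long_long_bits = ctypes.sizeof(ctypes.c_longlong) * 8
pointer_bits = ctypes.sizeof(ctypes.c_void_p) * 8

# On LLP64 systems (Win64), pointers are 64-bit while C longs stay 32-bit,
# so a PyInt backed by a C long cannot hold a full 64-bit value there.
is_llp64 = pointer_bits == 64 and long_bits == 32

print("long=%d long long=%d pointer=%d LLP64=%s"
      % (long_bits, long_long_bits, pointer_bits, is_llp64))
```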

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From bob at redivi.com  Mon May 29 17:45:55 2006
From: bob at redivi.com (Bob Ippolito)
Date: Mon, 29 May 2006 08:45:55 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
Message-ID: <B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>


On May 29, 2006, at 3:14 AM, Thomas Wouters wrote:

>
>
> On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
> Well, the behavior change is in response to a bug
> <http://python.org/sf/1229380>. If nothing else, we should at least fix the
> standard library such that it doesn't depend on struct bugs. This
> is the only way to find them :)
>
> Feel free to comment how the zlib.crc32/gzip co-operation should be  
> fixed. I don't see an obviously correct fix. The trunk is currently  
> failing tests it shouldn't fail. Also note that the error isn't  
> with feeding signed values to unsigned formats (which is what the  
> bug is about) but the other way 'round, although I do believe both  
> should be accepted for the time being, while generating a warning.

Well, first I'm going to just correct the modules that are broken  
(zlib, gzip, tarfile, binhex and probably one or two others).

> Basically the struct module previously only checked for errors if  
> you don't specify an endian. That's really strange and leads to  
> very confusing results. The only code that really should be broken  
> by this additional check is code that existed before Python had a  
> long type and only signed values were available.
>
> Alas, reality is different. The fundamental difference between  
> types in Python and in C causes this, and code using struct is  
> usually meant specifically to bridge those two worlds. Furthermore,  
> struct is often used to *fix* that issue, by flipping sign bits if  
> necessary:

Well, in C you get a compiler warning for stuff like this.

> >>> struct.unpack("<l", struct.pack("<l", 3221225472))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<L", 3221225472))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<l", -1073741824))
> (-1073741824,)
> >>> struct.unpack("<l", struct.pack("<L", -1073741824))
> (-1073741824,)
>
> Before this change, you didn't have to check whether the value is  
> negative before the struct.unpack/pack dance, regardless of which  
> format character you used. This misfeature is used (and many would  
> consider it convenient, even Pythonic, for struct to DWIM),  
> breaking it suddenly is bad.

struct doesn't really DWIM anyway, since integers are up-converted to  
longs and will overflow past what the (old or new) struct module will  
accept. Before there was a long type or automatic up-converting, the  
sign agnosticism worked, but it doesn't really work correctly these  
days.

We have two choices, either fix it to behave consistently broken  
everywhere for numbers of every size (modulo every number that comes  
in so that it fits), or have it do proper range checking. A  
compromise is to do proper range checking as a warning, and do the  
modulo math anyway... but is that what we really want?
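
Roughly, that compromise would behave like the sketch below. It is written
as a wrapper around struct.pack purely for illustration: the name
pack_uint32_compat and the choice of DeprecationWarning are assumptions,
not what the struct module does.

```python
import struct
import warnings

def pack_uint32_compat(value):
    """Pack as little-endian unsigned 32-bit: warn when out of range, then wrap."""
    if not 0 <= value <= 0xFFFFFFFF:
        warnings.warn(
            "%r is out of range for format 'L'; wrapping modulo 2**32" % (value,),
            DeprecationWarning,
            stacklevel=2,
        )
        value &= 0xFFFFFFFF  # the "modulo math": keep only the low 32 bits
    return struct.pack("<L", value)
```

In-range values pack silently; out-of-range values still produce the bytes
that old code relied on, but the caller gets told about it.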

-bob


From thomas at python.org  Mon May 29 18:03:57 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 18:03:57 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
Message-ID: <9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>

On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
>
> A compromise is to do proper range checking as a warning, and do the
> modulo math anyway... but is that what we really want?
>

I don't know about the rest of 'us', but that's what I want, yes: backward
compatibility, and a warning to tell people to fix their code 'or else'. The
prevalence of the warnings (outside of the stdlib) should give us a clue
whether to make it an exception in 2.6 or wait for 2.7/3.0.

Perhaps more people could chime in? Am I being too anal about backward
compatibility here?

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From aahz at pythoncraft.com  Mon May 29 18:05:43 2006
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 29 May 2006 09:05:43 -0700
Subject: [Python-Dev] Syntax errors on continuation lines
In-Reply-To: <447A32F4.4080307@alum.mit.edu>
References: <447A32F4.4080307@alum.mit.edu>
Message-ID: <20060529160543.GA22567@panix.com>

On Sun, May 28, 2006, Roger Miller wrote:
>
> I am a recent subscriber to this list.  A couple of months ago
> I downloaded the Python source out of curiosity, then decided
> to see if I could scratch a couple of small itches.

Please post your patches to SourceForge, then post links back to the
list.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"I saw `cout' being shifted "Hello world" times to the left and stopped
right there."  --Steve Gonedes

From fuzzyman at voidspace.org.uk  Mon May 29 18:25:59 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 29 May 2006 17:25:59 +0100
Subject: [Python-Dev] Syntax errors on continuation lines
In-Reply-To: <20060529160543.GA22567@panix.com>
References: <447A32F4.4080307@alum.mit.edu> <20060529160543.GA22567@panix.com>
Message-ID: <447B2097.20207@voidspace.org.uk>

Aahz wrote:
> On Sun, May 28, 2006, Roger Miller wrote:
>   
>> I am a recent subscriber to this list.  A couple of months ago
>> I downloaded the Python source out of curiosity, then decided
>> to see if I could scratch a couple of small itches.
>>     
+1 on the change, the error messages that report the wrong line can be 
very confusing.

Michael Foord

>
> Please post your patches to SourceForge, then post links back to the
> list.
>   


From steven.bethard at gmail.com  Mon May 29 19:05:32 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 29 May 2006 11:05:32 -0600
Subject: [Python-Dev] DRAFT: python-dev summary for 2006-04-01 to 2006-04-15
Message-ID: <d11dcfba0605291005q2e3cece0u80a1cead2bce963c@mail.gmail.com>

Ok, here comes the first half of April.  Again, please let me know if
you have any comments or corrections (and thanks to those who already
got back to me on the last one).  I'm gonna try to keep these coming
over the next week or so until I'm all caught up.



=============
Announcements
=============

---------------------
Python 2.5a1 Released
---------------------

Python 2.5 alpha 1 was released on April 5th.  Please download it and
try it out, particularly if you are an extension writer or you embed
Python -- you may want to change things to support 64-bit sequences,
and if you have been using mismatched PyMem_* and PyObject_*
allocation calls, you will need to fix these as well.

Contributing threads:

- `TRUNK FREEZE. 2.5a1, 00:00 UTC, Wednesday 5th of April.
<http://mail.python.org/pipermail/python-dev/2006-April/063324.html>`__
- `outstanding items for 2.5
<http://mail.python.org/pipermail/python-dev/2006-April/063328.html>`__
- `chilling svn for 2.5a1
<http://mail.python.org/pipermail/python-dev/2006-April/063403.html>`__
- `Reminder: TRUNK FREEZE. 2.5a1, 00:00 UTC, Wednesday 5th of April.
<http://mail.python.org/pipermail/python-dev/2006-April/063426.html>`__
- `RELEASED Python 2.5 (alpha 1)
<http://mail.python.org/pipermail/python-dev/2006-April/063441.html>`__
- `TRUNK is UNFROZEN
<http://mail.python.org/pipermail/python-dev/2006-April/063443.html>`__
- `segfault (double free?) when '''-string crosses line
<http://mail.python.org/pipermail/python-dev/2006-April/063572.html>`__
- `pdb segfaults in 2.5 trunk?
<http://mail.python.org/pipermail/python-dev/2006-April/063591.html>`__
- `IMPORTANT 2.5 API changes for C Extension Modules and Embedders
<http://mail.python.org/pipermail/python-dev/2006-April/063637.html>`__

----------------------
Contributor agreements
----------------------

If you're listed in the `Python acknowledgments`_, and you haven't
signed a `contributor agreement`_, please submit one as soon as
possible.  Note that this includes even the folks that have been
around forever -- the PSF would like to be as careful as possible on
this one.

.. _Python acknowledgments:
http://svn.python.org/projects/python/trunk/Misc/ACKS
.. _contributor agreement: http://www.python.org/psf/contrib-form-python.html

Contributing threads:

- `PSF Contributor Agreement for pysqlite
<http://mail.python.org/pipermail/python-dev/2006-April/063578.html>`__

---------------------------
Firefox Python bug searches
---------------------------

Anthony Baxter has created a `Firefox searchbar`_ for finding Python
bugs by their SourceForge IDs.  There are also two Firefox sidebar
options: `Mark Hammond's`_ and `Daniel Lundin's`_.

.. _Firefox searchbar: http://www.python.org/~anthony/searchbar/
.. _Mark Hammond's: http://starship.python.net/~skippy/mozilla/
.. _Daniel Lundin's: http://projects.edgewall.com/python-sidebar/

Contributing thread:

- `Firefox searchbar engine for Python bugs
<http://mail.python.org/pipermail/python-dev/2006-April/063285.html>`__

-------------------------------------------------
Building Python with the free MS Toolkit compiler
-------------------------------------------------

Paul Moore documented how to build Python on Windows with the free MS
Toolkit C++ compiler `on the wiki`_.  The most up-to-date version of
these instructions is in `PCbuild/readme.txt`_.

.. _on the wiki:
http://wiki.python.org/moin/Building_Python_with_the_free_MS_C_Toolkit
.. _PCbuild/readme.txt:
http://svn.python.org/projects/python/trunk/PCbuild/readme.txt

Contributing thread:

- `Building Python with the free MS Toolkit compiler
<http://mail.python.org/pipermail/python-dev/2006-April/063719.html>`__

----------------------------------------------
Please login to the wiki when you make changes
----------------------------------------------

Skip Montanaro has requested that anyone posting to the wiki sign in
first, as this makes things easier for those monitoring changes to the
wiki. When you're logged in, your changes appear with your name, and
so can be immediately recognized as not being wiki-spam.

Contributing thread:

- `Suggestion: Please login to wiki when you make changes
<http://mail.python.org/pipermail/python-dev/2006-April/063447.html>`__

---------------------------
Checking an older blamelist
---------------------------

While the buildbot blamelist may scroll off the `main page`_ in a
matter of hours, you can still see the blamelist for a particular
revision by checking a particular build number, e.g. to see build 800
of the trunk on the G4 OSX, you could check
http://www.python.org/dev/buildbot/trunk/g4%20osx.4%20trunk/builds/800.

.. _main page: http://www.python.org/dev/buildbot/all/

Contributing threads:

- `Preserving the blamelist
<http://mail.python.org/pipermail/python-dev/2006-April/063633.html>`__
- `TODO Wiki (was: Preserving the blamelist)
<http://mail.python.org/pipermail/python-dev/2006-April/063673.html>`__

=========
Summaries
=========

-------------------------
Garbage collection issues
-------------------------

One of the problems with garbage-collecting generators (now that they
have a __del__ method) was that GC normally assumes that because the
*type* has a __del__ method, the *instance* needs finalization. Many
generator instances may not need finalization even though their type
has a __del__ slot, so generators have been special-cased in the
garbage-collection code so that GC can sometimes tell that a generator
does not need its finalizer called.

As a side note, Tim Peters pointed out that if you're worried about
cyclic-gc and __del__ methods, one of the ways to avoid problems is to
remove the __del__ method from the object you think might be included
in a cycle, and add an attribute to that object that holds a "closing"
object with a __del__ method that simply closes all your resources.
Since your main object won't have a __del__, it will be easily
collected, which should make the closing object collectable too.
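
The pattern Tim described can be sketched as follows. All of the class
names here (``_Closer``, ``Connection``, ``FakeResource``) are hypothetical
and exist only for this illustration:

```python
import gc

class _Closer(object):
    """Holds the resource; its __del__ does nothing but close it."""
    def __init__(self, resource):
        self.resource = resource

    def __del__(self):
        self.resource.close()

class Connection(object):
    """Main object: may end up in a cycle, so it defines no __del__ itself."""
    def __init__(self, resource):
        self._closer = _Closer(resource)
        self.peer = None

class FakeResource(object):
    """Stand-in for a file or socket."""
    closed = False

    def close(self):
        self.closed = True

resource_a, resource_b = FakeResource(), FakeResource()
a, b = Connection(resource_a), Connection(resource_b)
a.peer, b.peer = b, a   # build a reference cycle between the main objects
del a, b
gc.collect()            # the cycle is collected; the closers then run
```

The closing objects hold no reference back to their owners, so they never
become part of the cycle themselves.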

Contributing threads:

- `reference leaks, __del__, and annotations
<http://mail.python.org/pipermail/python-dev/2006-March/063213.html>`__
- `reference leaks, __del__, and annotations
<http://mail.python.org/pipermail/python-dev/2006-April/063257.html>`__
- `Debugging opportunity :-)
<http://mail.python.org/pipermail/python-dev/2006-April/063746.html>`__

------------------
ElementTree naming
------------------

The elementtree package was included in Python 2.5 as xml.etree.
There were some complaints about the naming schemes (which aren't
quite `PEP 8`_-compliant) but changing these while elementtree is
still distributed as a standalone package seemed like a bad idea.
People generally felt that style-motivated renamings should all wait
until Python 3000.

.. _PEP 8: http://www.python.org/dev/peps/pep-0008/

Contributing thread:

- `elementtree in stdlib
<http://mail.python.org/pipermail/python-dev/2006-April/063477.html>`__

-----------------------------
Externally maintained modules
-----------------------------

Brett Cannon was putting together a PEP for externally maintained code
in the stdlib (e.g. ctypes and pysqlite).  The PEP will list all
modules and packages within the stdlib that are maintained externally,
as well as the contact info for their maintainers and the locations
where bugs and patches should be reported.  At the time of this
summary, it had not yet been assigned a PEP number.

Contributing threads:

- `PEP to list externally maintained modules and where to report bugs?
<http://mail.python.org/pipermail/python-dev/2006-April/063279.html>`__
- `need info for externally maintained modules PEP
<http://mail.python.org/pipermail/python-dev/2006-April/063549.html>`__

--------------------------------------
String prefix for internationalization
--------------------------------------

Martin Blais proposed an ``i`` prefix for internationalized strings to
get rid of the ``_()`` required by pygettext.  This would allow
pygettext to more easily identify internationalized strings, and
reduce the number of parentheses necessary in internationalized code.
However, it would only have saved two key-strokes, it would have
required the introduction of ``iu`` and ``ir`` prefixes, and it would
have forced some rewriting of the tools that currently do string
extraction using ``_()``, so the idea was rejected.
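
For reference, the existing convention looks like the sketch below; the
proposed prefix form is shown only in a comment because it is not valid
Python. The message text is an arbitrary example:

```python
import gettext

# With no translation catalog installed, gettext.gettext returns the
# message unchanged, which is enough to show the calling convention.
_ = gettext.gettext

greeting = _("Hello, world")

# The proposed spelling would have been:
#     greeting = i"Hello, world"
# which drops the "_(" and ")" in exchange for a one-letter prefix.
```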

Contributing thread:

- `The "i" string-prefix: I18n'ed strings
<http://mail.python.org/pipermail/python-dev/2006-April/063488.html>`__

---------------------------
PEP 359: The make statement
---------------------------

Steven Bethard introduced `PEP 359`_ for the make statement, which would
have made the statement::

    make <callable> <name> <tuple>:
        <block>

syntactic sugar for::

    class <name> <tuple>:
        __metaclass__ = <callable>
        <block>

The goal was to allow the creation of non-class objects from a
namespace without requiring a misleading ``class`` statement and
``__metaclass__`` hook.  With appropriately defined objects, the make
statement would have supported syntax like::

    make block_property x:
        '''The x of the frobulation'''
        def fget(self):
            ...
        def fset(self, value):
            ...

    make Schema registration:
        make String name:
            max_length = 100
            not_empty = True
        make PostalCode postal_code:
            not_empty = True
        make Int age:
            min = 18

In current Python these would have to be created using class
statements which misleadingly created objects that were not classes.
Responses were mixed, and the discussion continued on into the next
fortnight.
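
For comparison, the class-statement workaround can be sketched as below.
The sketch uses the Python 3 ``metaclass=`` keyword spelling of the
``__metaclass__`` hook so that it runs on current interpreters, and
``block_property`` and ``Frobulator`` are hypothetical names written for
this illustration rather than code taken from the PEP:

```python
def block_property(name, bases, namespace):
    """Called in place of a metaclass; builds a property, not a class."""
    return property(
        namespace.get("fget"),
        namespace.get("fset"),
        namespace.get("fdel"),
        namespace.get("__doc__"),
    )

class Frobulator(object):
    def __init__(self):
        self._x = 0

    # Reads like a class definition, but the name x ends up bound
    # to a property object.
    class x(metaclass=block_property):
        """The x of the frobulation"""
        def fget(self):
            return self._x

        def fset(self, value):
            self._x = value

f = Frobulator()
f.x = 42
```

The ``class`` keyword here never produces a class, which is the mismatch
the make statement was meant to remove.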

.. _PEP 359: http://www.python.org/dev/peps/pep-0359/

Contributing thread:

- `PEP 359: The "make" Statement
<http://mail.python.org/pipermail/python-dev/2006-April/063694.html>`__

---------------------------------------------
Formatting exceptions with their module names
---------------------------------------------

Georg Brandl checked in a patch to make
traceback.format_exception_only() prepend the exception's module name
in the same way the interpreter does.  This caused a number of
doctests to fail because the exception module names were not included.
After some discussion, it seemed like people agreed that even though
some doctests would be broken, it was more important to have the
interpreter and traceback.format_exception_only() produce the same
output.
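
The behavior is easy to observe with any exception defined outside the
builtins. The sketch below uses ``shutil.Error`` as an arbitrary example;
on a current Python the formatted line starts with the module-qualified
name:

```python
import shutil
import traceback

exc = shutil.Error("copy failed")

# format_exception_only returns a list of strings; for ordinary
# exceptions the last one is the "Name: message" line.
lines = traceback.format_exception_only(type(exc), exc)
last_line = lines[-1]
print(last_line, end="")
```

A doctest that expects the bare exception name on that line is the kind
that broke.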

Contributing thread:

- `[Python-checkins] r45321 - in python/trunk:
Lib/test/test_traceback.py Lib/traceback.py Misc/NEWS
<http://mail.python.org/pipermail/python-dev/2006-April/063662.html>`__

-------------------
Benchmarking python
-------------------

Benji York and a few others ran Python 2.5a1 through pystone and found
it mostly comparable to 2.4.2.  However, Raymond Hettinger pointed out
that pystone isn't really an appropriate benchmark for comparing
across versions -- it's intended more for comparing across
architectures and compilers.

Contributing threads:

- `2.5a1 Performance
<http://mail.python.org/pipermail/python-dev/2006-April/063450.html>`__
- `Buildbot slave locks (Was: 2.5a1 Performance)
<http://mail.python.org/pipermail/python-dev/2006-April/063481.html>`__

------------------------------------------------------
PySocketSockObject, socketmodule.c, _ssl.c and Windows
------------------------------------------------------

Tim Peters noticed that on Windows, socketmodule.c and _ssl.c
disagreed about the offset of the ``sock_timeout`` member of a
``PySocketSockObject``.  Turns out that since _socket was built by a
.vcproj but _ssl was built by _ssl.mak (which had forgotten to define
WIN32), doubles were aligned to an 8-byte boundary when socketmodule.c
was compiled but a 4-byte boundary when _ssl was compiled.

Contributing thread:

- `Who understands _ssl.c on Windows?
<http://mail.python.org/pipermail/python-dev/2006-April/063543.html>`__

------------------------------
Getting a dictionary of counts
------------------------------

Alex Martelli proposed adding a ``tally()`` function to the
collections module which would count the number of each value in an
iterable and produce a dictionary.  So for example::

    tally('abacaab') == {'a': 4, 'c': 1, 'b': 2}

People generally thought the function would be useful, but there was
some discussion as to whether or not it would be better to provide a
collections.bag object instead.  The discussion fizzled out without
any patches being produced.
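
A minimal sketch of what such a function might look like; the name and the
example input follow the proposal, but the body is an assumption since no
patch was produced:

```python
def tally(iterable):
    """Count how many times each value occurs in iterable."""
    counts = {}
    for item in iterable:
        counts[item] = counts.get(item, 0) + 1
    return counts

result = tally('abacaab')
```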

Contributing thread:

- `tally (and other accumulators)
<http://mail.python.org/pipermail/python-dev/2006-April/063382.html>`__

------------------------
Building Python with C++
------------------------

Anthony Baxter has donated some of his recent time to getting Python
to compile with g++.  He got Python core to compile correctly, but
there were lots of errors in the Modules code that wasn't C++ safe.
If you'd like to help Anthony out and fix some bugs, try it yourself
using ``CC=g++ ./configure --with-cxx=g++``.

Contributing thread:

- `building with C++
<http://mail.python.org/pipermail/python-dev/2006-April/063632.html>`__

----------------------
Adding PEP 302 support
----------------------

Phillip J. Eby has been working on adding `PEP 302`_ import loader
support to the necessary modules around Python.  In the process, he
noticed that both runpy and test.test_importhooks reimplement the base
`PEP 302`_ algorithm, and suggested adding functions to pkgutil that
would allow such modules to all share the same code for this kind of
thing.  The code appears to have been checked in, but docs do not
appear to be available yet.

.. _PEP 302: http://www.python.org/dev/peps/pep-0302/

Contributing threads:

- `PEP 302 support for traceback, inspect, site, warnings, doctest,
and linecache <http://mail.python.org/pipermail/python-dev/2006-April/063586.html>`__
- `Proposal: expose PEP 302 facilities via 'imp' and 'pkgutil'
<http://mail.python.org/pipermail/python-dev/2006-April/063623.html>`__
- `PEP 302 support for pydoc, runpy, etc.
<http://mail.python.org/pipermail/python-dev/2006-April/063724.html>`__

-----------------------------------------
Having Python use dlopen() on Darwin/OS X
-----------------------------------------

Zachary Pincus asked about using the normal code path for Unix-like
systems through dlopen() for Darwin/OS X instead of the officially
discouraged NeXT-derived APIs that Python was using at the time.  Bob
Ippolito approved the patch, and OS X users should hopefully see the
change in Python 2.5.

Contributing thread:

- `Use dlopen() on Darwin/OS X to load extensions?
<http://mail.python.org/pipermail/python-dev/2006-April/063336.html>`__

-------------------------------
Saving the hash value of tuples
-------------------------------

Noam Raphael suggested `caching the hash value of tuples`_ in a
similar way to what is done for strings now.  But without any
measurements showing that this improved performance, and with the
relative rareness of hashing tuples, Guido thought that this was a bad
idea.

.. _caching the hash value of tuples: http://bugs.python.org/1462796

Contributing thread:

- `Saving the hash value of tuples
<http://mail.python.org/pipermail/python-dev/2006-April/063275.html>`__

------------------
sqlite3.dll issues
------------------

If a Windows user tries ``import sqlite3`` and Python finds SQLite's
sqlite3.dll before it finds pysqlite's sqlite.py module, Python will
end up incorrectly raising an ImportError.  Martin v. Löwis suggested
that maybe Python should stop treating .dll files as extension modules
so conflicts like this could be avoided.  It was unclear at the end of
the thread if this (or any other) route was being pursued.

Contributing thread:

- `Renaming sqlite3
<http://mail.python.org/pipermail/python-dev/2006-April/063327.html>`__

----------------
New Python icons
----------------

Andrew Clover made some `new Python icons`_ available based on the
logo on the new website.  People on python-dev really liked them, and
it looked like they'd probably make it into Python 2.5.

.. _new Python icons: http://doxdesk.com/img/software/py/icons2.png

Contributing threads:

- `New-style icons, .desktop file
<http://mail.python.org/pipermail/python-dev/2006-March/063235.html>`__
- `New-style icons, .desktop file
<http://mail.python.org/pipermail/python-dev/2006-April/063517.html>`__

----------------------------------------
Py_Initialize/Py_Finalize leaking memory
----------------------------------------

Martin v. Löwis corrected some documentation errors that claimed that
Py_Finalize would release all memory (it can't be guaranteed to do
so).  In the process, Martin and Tim Peters discussed a `recent bug`_
where running Py_Initialize/Py_Finalize in a loop left more and more
objects behind each time.  The hope was to get the number of added
objects after a Py_Initialize/Py_Finalize pair back down to zero if
possible, and Martin found at least one leak, but it was unclear at
the end of the thread how close they had gotten.

.. _recent bug: http://bugs.python.org/1445210

Contributing thread:

- `Py_Finalize does not release all memory, not even closely
<http://mail.python.org/pipermail/python-dev/2006-April/063618.html>`__

----------------------
Removing PyObject_REPR
----------------------

Thomas Wouters noticed that the ``PyObject_REPR()`` macro, which was
originally introduced as an internal debugging API, leaks a PyString
object.  It looked like the macro would either be removed entirely, or
redefined to call Py_DECREF appropriately and return a newly allocated
(and thus freeable) string.

Contributing thread:

- `PyObject_REPR()
<http://mail.python.org/pipermail/python-dev/2006-April/063628.html>`__

------------------------------------------
uriparse module to replace urlparse module
------------------------------------------

Paul Jimenez offered up his `uriparse module`_ which improves on
urlparse.  Currently, urlparse doesn't comply with STD66 (a.k.a.
RFC3986), as it hard-codes some URI schemes instead of applying the
same syntax to all of them.  Martin v. Löwis asked for more
documentation, and John J Lee suggested deprecating a few functions
from urllib and putting RFC-compliant versions in uriparse.  The
discussion then moved to the tracker, but at the time of this summary,
the remaining issues had not yet been resolved.
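
To illustrate what applying the same syntax to every scheme means, the
sketch below splits a URI with the generic regular expression given in
RFC 3986, Appendix B. It is not taken from Paul Jimenez's module, and the
name ``split_uri`` is made up for this example:

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri(uri):
    """Return (scheme, authority, path, query, fragment) for any scheme."""
    m = URI_RE.match(uri)
    return (m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))

parts = split_uri("foo://example.com:8042/over/there?name=ferret#nose")
```

Because nothing in the expression depends on the scheme name, an unknown
scheme is split exactly like ``http``.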

.. _uriparse module: http://bugs.python.org/1462525

Contributing thread:

- `New uriparse.py
<http://mail.python.org/pipermail/python-dev/2006-April/063294.html>`__

--------------------------------------
Line numbers with the new AST compiler
--------------------------------------

Jeremy Hylton noticed that with the new AST-based compiler, the line
numbers for things like the implicit ``return None`` at the end of a
function were occasionally different from previous versions of Python.
The changes looked harmless, so Guido said it was fine to leave the
code as it was.  Jeremy promised to look into some of the other
special cases for alpha 2.

Contributing thread:

- `line numbers, pass statements, implicit returns
<http://mail.python.org/pipermail/python-dev/2006-April/063272.html>`__


================
Deferred Threads
================
- `PY_FORMAT_SIZE_T warnings on OS X
<http://mail.python.org/pipermail/python-dev/2006-April/063280.html>`__


==================
Previous Summaries
==================
- `Class decorators
<http://mail.python.org/pipermail/python-dev/2006-April/063253.html>`__
- `I'm not getting email from SF when assigned a bug/patch
<http://mail.python.org/pipermail/python-dev/2006-April/063255.html>`__
- `improving quality
<http://mail.python.org/pipermail/python-dev/2006-April/063269.html>`__


===============
Skipped Threads
===============
- `gmane.comp.python.devel.3000 has disappeared
<http://mail.python.org/pipermail/python-dev/2006-April/063256.html>`__
- `refleaks in 2.4
<http://mail.python.org/pipermail/python-dev/2006-April/063268.html>`__
- `Name for python package repository
<http://mail.python.org/pipermail/python-dev/2006-April/063271.html>`__
- `[Python-checkins] r43545 - in python/trunk: Doc/lib/libcalendar.tex
Lib/calendar.py
<http://mail.python.org/pipermail/python-dev/2006-April/063278.html>`__
- `Bug Day on Friday, 31st of March
<http://mail.python.org/pipermail/python-dev/2006-April/063298.html>`__
- `String formating in python 3000
<http://mail.python.org/pipermail/python-dev/2006-April/063305.html>`__
- `SF #1462485 - StopIteration raised in body of 'with' statement
suppressed <http://mail.python.org/pipermail/python-dev/2006-April/063312.html>`__
- `SF #1462700 - Errors in PCbuild
<http://mail.python.org/pipermail/python-dev/2006-April/063313.html>`__
- `Whole bunch of test failures on OSX
<http://mail.python.org/pipermail/python-dev/2006-April/063318.html>`__
- `SF:1463370 add .format() method to str and unicode
<http://mail.python.org/pipermail/python-dev/2006-April/063329.html>`__
- `Need Py3k group in trackers
<http://mail.python.org/pipermail/python-dev/2006-April/063344.html>`__
- `posixmodule.c patch- revision 43586
<http://mail.python.org/pipermail/python-dev/2006-April/063349.html>`__
- `Twisted and Python 2.5a0r43587
<http://mail.python.org/pipermail/python-dev/2006-April/063370.html>`__
- `r43613 - python/trunk/Doc/lib/libcsv.tex
<http://mail.python.org/pipermail/python-dev/2006-April/063386.html>`__
- `current 2.5 status
<http://mail.python.org/pipermail/python-dev/2006-April/063411.html>`__
- `Should issubclass() be more like isinstance()?
<http://mail.python.org/pipermail/python-dev/2006-April/063422.html>`__
- `The "Need for Speed" Sprint, Reykjavik, Iceland, May
21-28, 2006 <http://mail.python.org/pipermail/python-dev/2006-April/063428.html>`__
- `suggest: nowait option in subprocess.communicate
<http://mail.python.org/pipermail/python-dev/2006-April/063445.html>`__
- `strftime/strptime locale funnies...
<http://mail.python.org/pipermail/python-dev/2006-April/063448.html>`__
- `PY_SSIZE_T_MIN?
<http://mail.python.org/pipermail/python-dev/2006-April/063449.html>`__
- `How to determine if char is signed or unsigned?
<http://mail.python.org/pipermail/python-dev/2006-April/063456.html>`__
- `Possible issue with 2.5a1 Win32 binary
<http://mail.python.org/pipermail/python-dev/2006-April/063469.html>`__
- `Don Beaudry <http://mail.python.org/pipermail/python-dev/2006-April/063485.html>`__
- `Default Locale, was; Re: strftime/strptime locale funnies...
<http://mail.python.org/pipermail/python-dev/2006-April/063487.html>`__
- `dis module and new-style classes
<http://mail.python.org/pipermail/python-dev/2006-April/063489.html>`__
- `module aliasing
<http://mail.python.org/pipermail/python-dev/2006-April/063491.html>`__
- `str.partition?
<http://mail.python.org/pipermail/python-dev/2006-April/063499.html>`__
- `subprocess maintenance - SVN write access
<http://mail.python.org/pipermail/python-dev/2006-April/063502.html>`__
- `[IronPython] base64 module
<http://mail.python.org/pipermail/python-dev/2006-April/063503.html>`__
- `base64 module
<http://mail.python.org/pipermail/python-dev/2006-April/063504.html>`__
- `packaging/bootstrap issue
<http://mail.python.org/pipermail/python-dev/2006-April/063511.html>`__
- `Patch or feature? Tix.Grid working for 2.5
<http://mail.python.org/pipermail/python-dev/2006-April/063532.html>`__
- `Weekly Python Patch/Bug Summary
<http://mail.python.org/pipermail/python-dev/2006-April/063542.html>`__
- `Subversion downtime today
<http://mail.python.org/pipermail/python-dev/2006-April/063557.html>`__
- `Win64 AMD64 (aka x64) binaries available64
<http://mail.python.org/pipermail/python-dev/2006-April/063558.html>`__
- `int()'s ValueError behaviour
<http://mail.python.org/pipermail/python-dev/2006-April/063559.html>`__
- `threadless brownian.py
<http://mail.python.org/pipermail/python-dev/2006-April/063563.html>`__
- `Subversion repository back up
<http://mail.python.org/pipermail/python-dev/2006-April/063565.html>`__
- `DRAFT: python-dev summary for 2006-02-01 to 2006-02-15
<http://mail.python.org/pipermail/python-dev/2006-April/063588.html>`__
- `Failing "inspect" test: test_getargspec_sublistofone
<http://mail.python.org/pipermail/python-dev/2006-April/063589.html>`__
- `updating PyExpat (Was: need info for externally maintained modules
PEP) <http://mail.python.org/pipermail/python-dev/2006-April/063600.html>`__
- `DRAFT: python-dev summary for 2006-02-16 to 2006-02-28
<http://mail.python.org/pipermail/python-dev/2006-April/063605.html>`__
- `DRAFT: python-dev summary for 2006-03-01 to 2006-03-15
<http://mail.python.org/pipermail/python-dev/2006-April/063611.html>`__
- `Checking assigned bugs/patches
<http://mail.python.org/pipermail/python-dev/2006-April/063640.html>`__
- `String initialization (was: The "i" string-prefix:
I18n'ed strings)
<http://mail.python.org/pipermail/python-dev/2006-April/063644.html>`__
- `Request for review
<http://mail.python.org/pipermail/python-dev/2006-April/063652.html>`__
- `cleanup list
<http://mail.python.org/pipermail/python-dev/2006-April/063655.html>`__
- `unicode vs buffer (array) design issue can crash interpreter
<http://mail.python.org/pipermail/python-dev/2006-April/063675.html>`__
- `int vs ssize_t in unicode
<http://mail.python.org/pipermail/python-dev/2006-April/063676.html>`__
- `Any reason that any()/all() do not take a predicate argument?
<http://mail.python.org/pipermail/python-dev/2006-April/063680.html>`__
- `Any reason that any()/all() do not take a predicateargument?
<http://mail.python.org/pipermail/python-dev/2006-April/063688.html>`__
- `Exceptions doctest Re: Request for review
<http://mail.python.org/pipermail/python-dev/2006-April/063689.html>`__
- `Is test_sundry really expected to succeed on Windows?
<http://mail.python.org/pipermail/python-dev/2006-April/063690.html>`__
- `Checkin 45232: Patch #1429775
<http://mail.python.org/pipermail/python-dev/2006-April/063701.html>`__
- `[Python-checkins] r45334 -
python/trunk/Lib/test/leakers/test_gen1.py
python/trunk/Lib/test/leakers/test_generator_cycle.py
python/trunk/Lib/test/leakers/test_tee.py
<http://mail.python.org/pipermail/python-dev/2006-April/063702.html>`__
- `ia64 debian buildbot
<http://mail.python.org/pipermail/python-dev/2006-April/063704.html>`__
- `[Python-3000] Removing 'self' from method definitions
<http://mail.python.org/pipermail/python-dev/2006-April/063709.html>`__
- `Procedure for sandbox branches in SVN?
<http://mail.python.org/pipermail/python-dev/2006-April/063725.html>`__
- `Calling Thomas Heller and Richard Jones
<http://mail.python.org/pipermail/python-dev/2006-April/063729.html>`__

From barry at barrys-emacs.org  Mon May 29 19:09:13 2006
From: barry at barrys-emacs.org (Barry Scott)
Date: Mon, 29 May 2006 18:09:13 +0100
Subject: [Python-Dev] epoll implementation
In-Reply-To: <D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>
References: <20060526163415.GD18740@snurgle.org>	<ca471dc20605261007s67df2b29r557f712df9a0659e@mail.gmail.com>	<e57dsb$vjd$1@sea.gmane.org>
	<4477A9E6.2020108@canterbury.ac.nz> <e589te$nbd$1@sea.gmane.org>
	<D493B3B9-53D9-42F8-B2EB-AF04F10E34DC@gmail.com>
Message-ID: <2902885C-7F07-4192-8AFD-76D9F76313A2@barrys-emacs.org>


On May 27, 2006, at 04:59, Alex Martelli wrote:

>
> I believe it's: kqueue on FreeBSD (for recent-enough versions
> thereof), otherwise epoll where available and nonbuggy, otherwise
> poll ditto, otherwise select -- that's roughly what Twisted uses for

kqueue is not always faster. It depends on your application. The number
of FDs and the fd values affect the point at which kqueue wins over select.

For a few FDs with low values select was always faster in my apps.
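
A minimal sketch of the "best available mechanism" fallback discussed
above, assuming a current Python 3 where the standard selectors module
exists (it did not in 2006).  DefaultSelector picks whichever of
kqueue/epoll/poll/select the platform offers; it does not try to
measure which is actually faster for a given set of FDs, so the caveat
above still applies.

```python
import selectors
import socket

# DefaultSelector is an alias for the most efficient implementation
# available on this platform (kqueue, epoll, poll or select).
sel = selectors.DefaultSelector()
print(type(sel).__name__)

# A connected pair of sockets gives us one fd to watch and one to write to.
writer, reader = socket.socketpair()
sel.register(reader, selectors.EVENT_READ)

writer.send(b"x")
ready = sel.select(timeout=1.0)   # list of (SelectorKey, events) tuples
ready_files = [key.fileobj for key, events in ready]

sel.unregister(reader)
sel.close()
writer.close()
reader.close()
```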

Barry


From arigo at tunes.org  Mon May 29 19:11:48 2006
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 29 May 2006 19:11:48 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
Message-ID: <20060529171147.GA1717@code0.codespeak.net>

Hi all,

I've finally come around to writing a patch that stops dict lookup from
eating all exceptions that occur during lookup, like rare bugs in user
__eq__() methods.  After another two-hour debugging session that
turned out to be caused by that, I had a lot of motivation.

  http://python.org/sf/1497053

The patch doesn't change the PyDict_GetItem() interface, which is the
historical core of the problem.  It works around this issue by just
moving the exception-eating bit there instead of in lookdict(), so it
gets away with changing only dictobject.c (plus ceval.c's direct usage
of ma_lookup for LOAD_GLOBAL).  The benefit of this patch is that all
other ways to work with dicts now correctly propagate exceptions, and
this includes all the direct manipulation from Python code (including
'x=d[key]').
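
A rough illustration of the user-visible effect, as a sketch only: it
is written for a current Python 3 interpreter, not for the patch
itself, and the names BrokenKey and lookup_outcome are made up for the
example.  A key with a buggy __eq__() makes an ordinary lookup raise
the real exception instead of behaving as if the key were missing.

```python
class BrokenKey(object):
    """Hypothetical key type whose __eq__() has a bug."""

    def __hash__(self):
        # Every instance hashes alike, so a lookup must call __eq__().
        return 1

    def __eq__(self, other):
        raise RuntimeError("bug in user __eq__()")


def lookup_outcome(d, key):
    """Describe what d[key] does without letting the exception escape."""
    try:
        d[key]
    except KeyError:
        return "KeyError"
    except RuntimeError as exc:
        return "propagated: %s" % exc
    return "found"


d = {BrokenKey(): "value"}
# A different instance with the same hash forces a call to __eq__().
print(lookup_outcome(d, BrokenKey()))
```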

The reason I bring this up here is that I'm going to check it in 2.5,
unless someone seriously objects.  About the objection "we need a better
fix, PyDict_GetItem() should really be fixed and all its usages
changed": this would be good, and also require some careful
compatibility considerations, and quite some work in total; it would
also give a patch which is basically a superset of mine, so I don't
think I'm going in the wrong direction there.


A bientot,

Armin

From tim.peters at gmail.com  Mon May 29 19:37:08 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 29 May 2006 13:37:08 -0400
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <9e804ac0605290817m1251ef5q448815d135d4da25@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
	<9e804ac0605290817m1251ef5q448815d135d4da25@mail.gmail.com>
Message-ID: <1f7befae0605291037i42d7fa1m589a51c76880daa5@mail.gmail.com>

[Neal Norwitz]
>> * ints:  Include/intobject.h:    long ob_ival;

[Thomas Wouters]
> I considered asking about this before, as it would give '64-bit power' to
> Win64 integers. It's a rather big change, though (lots of code assumes
> PyInts fit in signed longs, which would be untrue then.)

I expect that "rather big" is an understatement -- more that the sheer
bulk of it would make the Py_ssize_t changes look like a warmup
exercise.  Too late in the release cycle, IMO (and note that we're
still stumbling into Py_ssize_t glitches despite that it's been in
place for months).

> On the other hand, if a separate type was used for PyInts (Py_pyint_t or
> whatever, defaulting to long, except on LLP64 systems), EWT's request for a
> 64-bit integer represented by C 'long long's would be a simple configure
> switch.

But then _all_ "short ints" in Python would be 64 bits, and EWT did
not ask for a slowdown on all short int operations <0.5 wink>.

From tim.peters at gmail.com  Mon May 29 19:46:14 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 29 May 2006 13:46:14 -0400
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
Message-ID: <1f7befae0605291046v65a11862w70c000f5e9cceabf@mail.gmail.com>

[Neal Norwitz]
>  * hash values
> Include/abstract.h:     long PyObject_Hash(PyObject *o);  // also in object.h
> Include/object.h:typedef long (*hashfunc)(PyObject *);

We should leave these alone for now.  There's no real connection
between the width of a hash value and the number of elements in a
container, and Py_ssize_t is conceptually only related to the latter.

As you noted later, there is an unusually strong connection in the
dict implementation, and it's been on my todo list for months to look
into that.  Certainly the struct _dictobject members need to change
from int to Py_ssize_t (including ma_mask), but that can't be done
mechanically (code changes may also be required, and this one is
much more math-driven than compiler-warning-driven).

I'll note in passing that the debug-build obmalloc has no idea what to
do with byte counts that don't fit in 4 bytes (it will probably
left-truncate them in its internal debugging records).

From tim.peters at gmail.com  Mon May 29 20:05:39 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 29 May 2006 14:05:39 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
Message-ID: <1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>

[Thomas Wouters]
> ...
> Perhaps more people could chime in? Am I being too anal about backward
> compatibility here?

Yes and no ;-)  Backward compatibility _is_ important, but there seems
no way to know in this case whether struct's range-checking sloppiness
was accidental or deliberate.  Having fixed bugs in the old code
several times, and been reduced to writing crap like this in the old
test_struct.py:

    # XXX Most std integer modes fail to test for out-of-range.
    # The "i" and "l" codes appear to range-check OK on 32-bit boxes, but
    # fail to check correctly on some 64-bit ones (Tru64 Unix + Compaq C
    # reported by Mark Favas).
    BUGGY_RANGE_CHECK = "bBhHiIlL"

I can't help but note several things:

- If it _was_ intended that range-checking be sloppy, nobody
  bothered to document it.

- Or even to write a comment in the code explaining that obscure
  intent.

- When I implemented the Q (8-byte int) format code, I added
  correct range-checking in all cases, and nobody ever complained
  about that.

- As noted in the comment above, we have gotten complaints about
  failures of struct range-checking at other integer widths.

OTOH,  BUGGY_RANGE_CHECK existed because I was too timid to risk
making broken user code visibly broken.

So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that
we need to warn about it before repairing it.  Since you (Thomas) want
warnings, and in theory it only affects the lightly-used "standard"
modes, I do lean in favor of leaving the standard modes that _are_
broken (as above, not all are) broken in 2.5 but warning that this
will change in 2.6.

From jcarlson at uci.edu  Mon May 29 20:11:59 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 29 May 2006 11:11:59 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
References: <B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
Message-ID: <20060529110917.6945.JCARLSON@uci.edu>


"Thomas Wouters" <thomas at python.org> wrote:
> On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
> >
> > A compromise is to do proper range checking as a warning, and do the
> > modulo math anyway... but is that what we really want?
> >
> 
> I don't know about the rest of 'us', but that's what I want, yes: backward
> compatibility, and a warning to tell people to fix their code 'or else'. The
> prevalence of the warnings (outside of the stdlib) should give us a clue
> whether to make it an exception in 2.6 or wait for 2.7/3.0.
> 
> Perhaps more people could chime in? Am I being too anal about backward
> compatibility here?

As a fairly heavy user of struct, I personally don't use struct to do
modulos and/or sign manipulation (I mask before I pass), but a change in
behavior seems foolish if people use that behavior.  So far, I'm not
aware of anyone complaining about Python 2.4's use, so it would seem to
suggest that the current behavior is not incorrect.

 - Josiah


From guido at python.org  Mon May 29 20:25:39 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 11:25:39 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
Message-ID: <ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>

On 5/29/06, Tim Peters <tim.peters at gmail.com> wrote:
> [Thomas Wouters]
> > ...
> > Perhaps more people could chime in? Am I being too anal about backward
> > compatibility here?
>
> Yes and no ;-)  Backward compatibility _is_ important, but there seems
> no way to know in this case whether struct's range-checking sloppiness
> was accidental or deliberate.

I'm pretty sure it was deliberate. I'm more than likely the original
author of this code (since the struct module was originally mine), and
I know I put in things like that in a few places to cope with hex/oct
constants pasted from C headers, and the general messiness that ensued
because of Python's lack of an unsigned int type.

This is probably a case that was missed by PEP 237.

I think we should do as Thomas proposes: plan to make it an error in
2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept
it with a warning in 2.5.

> - If it _was_ intended that range-checking be sloppy, nobody
>   bothered to document it.

Mea culpa. In those days we didn't document a lot of things.

> - Or even to write a comment in the code explaining that obscure
>   intent.

It never occurred to me that it was obscure; we did this all over the
place (in PyArg_Parse too).

> - When I implemented the Q (8-byte int) format code, I added
>   correct range-checking in all cases, and nobody ever complained
>   about that.

It's really only a practical concern for 32-bit values on 32-bit
machines, where reasonable people can disagree over whether 0xffffffff
is -1 or 4294967295.
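
To make the ambiguity concrete, here is a small sketch (current
Python 3 syntax, which is an assumption relative to the 2.4/2.5 code
under discussion): the same four bytes decode to either value
depending only on whether the format code is signed or unsigned.

```python
import struct

raw = b"\xff\xff\xff\xff"

# Same bit pattern, two readings.
signed, = struct.unpack(">i", raw)     # two's-complement signed 32-bit
unsigned, = struct.unpack(">I", raw)   # unsigned 32-bit

print(signed, unsigned)

# Masking maps the signed reading onto the unsigned one.
print((signed & 0xffffffff) == unsigned)
```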

> - As noted in the comment above, we have gotten complaints about
>   failures of struct range-checking at other integer widths.
>
> OTOH,  BUGGY_RANGE_CHECK existed because I was too timid to risk
> making broken user code visibly broken.
>
> So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that
> we need to warn about it before repairing it.  Since you (Thomas) want
> warnings, and in theory it only affects the lightly-used "standard"
> modes, I do lean in favor of leaving the standard modes that _are_
> broken (as above, not all are) broken in 2.5 but warning that this
> will change in 2.6.

I'm not sure what we gain by leaving other std modules depending on
struct's brokenness broken. But I may be misinterpreting which modules
you're referring to.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From raymond.hettinger at verizon.net  Mon May 29 21:20:44 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 29 May 2006 12:20:44 -0700
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
References: <20060529171147.GA1717@code0.codespeak.net>
Message-ID: <008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>

From: "Armin Rigo" <arigo at tunes.org>
 
> I've finally come around to writing a patch that stops dict lookup from
> eating all exceptions that occur during lookup, like rare bugs in user
> __eq__() methods. 

Is there a performance impact?


Raymond


From tim.peters at gmail.com  Mon May 29 21:27:57 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 29 May 2006 15:27:57 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
Message-ID: <1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>

[Guido]
> ...
> It's really only a practical concern for 32-bit values on 32-bit
> machines, where reasonable people can disagree over whether 0xffffffff
> is -1 or 4294967295.

Then maybe we should only let that one slide <0.5 wink>.

...

[Tim]
>> So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that
>> we need to warn about it before repairing it.  Since you (Thomas) want
>> warnings, and in theory it only affects the lightly-used "standard"
>> modes, I do lean in favor of leaving the standard modes that _are_
>> broken (as above, not all are) broken in 2.5 but warning that this
>> will change in 2.6.

> I'm not sure what we gain by leaving other std modules depending on
> struct's brokenness broken. But I may be misinterpreting which modules
> you're referring to.

I think you're just reading "module" where I wrote "mode".  "Standard
mode" is struct-module terminology, as in

    ">b"
    "!b"
    "<b"

are standard modes but

    "b"

is not a standard mode (it's "native mode").  But I got it backwards
-- or maybe not ;-)  It's confusing because it's so inconsistent (this
under 2.4.3 on 32-bit Windows):

>>> struct.pack(">B", -32) # std mode doesn't complain
'\xe0'
>>> struct.pack("B", -32)  # native mode does
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
struct.error: ubyte format requires 0<=number<=255

>>> struct.pack(">b", 255)  # std mode doesn't complain
'\xff'
>>> struct.pack("b", 255) # native mode does
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
struct.error: byte format requires -128<=number<=127

On the other hand, as I noted last time, some standard modes _do_
range-check -- but not correctly on some 64-bit boxes -- and not
consistently across positive and negative out-of-range values, or
across input types.  Like:

>>> struct.pack(">i", 2**32-1)  # std and native modes complain
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int
>>> struct.pack("i", 2**32-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int

>>> struct.pack("I", -1)  # neither std nor native modes complain
'\xff\xff\xff\xff'
>>> struct.pack(">I", -1)
'\xff\xff\xff\xff'

>>> struct.pack("I", -1L)  # but both complain if the input is long
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long
>>> struct.pack(">I", -1L)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long

In short, there's no way to explain what struct checks for in 2.4.3
short of drawing up an exhaustive table of standard-vs-native mode,
format code, "which direction" a value may be out of range, and
whether the value is given as a Python int or a long.
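
For comparison, a sketch of what complete range-checking looks like
from the caller's side.  This assumes a current Python 3 interpreter,
not 2.4.3 or the 2.5 work in progress, and try_pack is a helper
invented for the example.  There, out-of-range values are rejected in
both standard and native modes, and code that wants the old wraparound
has to mask explicitly.

```python
import struct


def try_pack(fmt, value):
    """Return the packed bytes, or "rejected" if the value is out of range."""
    try:
        return struct.pack(fmt, value)
    except (struct.error, OverflowError):
        # Current versions raise struct.error; OverflowError is caught
        # as well purely as a precaution.
        return "rejected"


cases = [(">B", -32), ("B", -32), (">b", 255), ("b", 255),
         (">I", -1), ("I", -1)]
for fmt, value in cases:
    print(fmt, value, try_pack(fmt, value))

# Explicit masking reproduces the old modulo behaviour on purpose.
print(try_pack(">I", -1 & 0xffffffff))
```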

At the sprint, I encouraged Bob to do complete range-checking.  That's
explainable.  If we have to back off from that, then since the new
code is consistent, I'm sure any warts he puts back in will clearly
look like warts ;-)

> I think we should do as Thomas proposes: plan to make it an error in
> 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept
> it with a warning in 2.5.

That's what I arrived at, although 2.4.3's checking behavior is
actually so inconsistent that "it" needs some defining (what exactly
are we trying to still accept?  e.g., that -1 doesn't trigger "I"
complaints but that -1L does above?  that one's surely a bug).   To be
clear, Thomas proposed "accepting it" (whatever that means) until 3.0.

From guido at python.org  Mon May 29 21:34:30 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 12:34:30 -0700
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529171147.GA1717@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>
Message-ID: <ca471dc20605291234h171901b2xbcbec00db7b9f003@mail.gmail.com>

On 5/29/06, Armin Rigo <arigo at tunes.org> wrote:
> Hi all,
>
> I've finally come around to writing a patch that stops dict lookup from
> eating all exceptions that occur during lookup, like rare bugs in user
> __eq__() methods.  After another two-hour debugging session that
> turned out to be caused by that, I had a lot of motivation.
>
>   http://python.org/sf/1497053
>
> The patch doesn't change the PyDict_GetItem() interface, which is the
> historical core of the problem.  It works around this issue by just
> moving the exception-eating bit there instead of in lookdict(), so it
> gets away with changing only dictobject.c (plus ceval.c's direct usage
> of ma_lookup for LOAD_GLOBAL).  The benefit of this patch is that all
> other ways to work with dicts now correctly propagate exceptions, and
> this includes all the direct manipulation from Python code (including
> 'x=d[key]').
>
> The reason I bring this up here is that I'm going to check it in 2.5,
> unless someone seriously objects.

+1, as long as (as you seem to imply) PyDict_GetItem() still swallows
all exceptions.

> About the objection "we need a better
> fix, PyDict_GetItem() should really be fixed and all its usages
> changed": this would be good, and also require some careful
> compatibility considerations, and quite some work in total; it would
> also give a patch which is basically a superset of mine, so I don't
> think I'm going in the wrong direction there.

Fixing PyDict_GetItem() is a py3k issue, I think. Until then, there
are way too many uses. I wouldn't be surprised if after INCREF and
DECREF it's the most commonly used C API method...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tjreedy at udel.edu  Mon May 29 21:35:08 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 29 May 2006 15:35:08 -0400
Subject: [Python-Dev] feature request: inspect.isgenerator
References: <loom.20060529T092512-825@post.gmane.org>
Message-ID: <e5fidd$aod$1@sea.gmane.org>


"Michele Simionato" <michele.simionato at gmail.com> wrote in message 
news:loom.20060529T092512-825 at post.gmane.org...
> Is there still time for new feature requests in Python 2.5?
> I am missing a isgenerator function in the inspect module. Right now
> I am using
>
> def isgenerator(func):
>    return func.func_code.co_flags == 99
>
> but it is rather ugly (horrible indeed).

To me, part of the beauty of Python is that it is so easy to write such things
so compactly, so that we do not need megaAPIs with hundreds of functions 
and methods.
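
One caveat about the quoted one-liner, with a sketch in current
Python 3 syntax (so func.__code__ rather than func.func_code, which is
an assumption relative to 2.5): 99 is 0x63, i.e. CO_OPTIMIZED |
CO_NEWLOCALS | CO_GENERATOR | CO_NOFREE, so comparing for equality
misses any generator function whose other flag bits differ (for
example one taking *args).  Testing only the generator bit avoids
that; later Python versions also provide inspect.isgeneratorfunction()
for the same purpose.

```python
import inspect


def isgenerator(func):
    """Return True if func is a generator function (checks one flag bit)."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


def gen():
    yield 1


def gen_with_varargs(*args):
    # Sets CO_VARARGS as well, so an equality test against one fixed
    # flag value would not recognise it.
    for arg in args:
        yield arg


def plain():
    return 1


for f in (gen, gen_with_varargs, plain):
    print(f.__name__, isgenerator(f))
```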

tjr




From guido at python.org  Mon May 29 21:44:45 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 12:44:45 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
Message-ID: <ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>

On 5/29/06, Tim Peters <tim.peters at gmail.com> wrote:
> > I think we should do as Thomas proposes: plan to make it an error in
> > 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept
> > it with a warning in 2.5.
>
> That's what I arrived at, although 2.4.3's checking behavior is
> actually so inconsistent that "it" needs some defining (what exactly
> are we trying to still accept?  e.g., that -1 doesn't trigger "I"
> complaints but that -1L does above?  that one's surely a bug).

No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
0xffffffffL was 4294967295L.

> To be
> clear, Thomas proposed "accepting it" (whatever that means) until 3.0.

Fine with me.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tjreedy at udel.edu  Mon May 29 21:45:39 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 29 May 2006 15:45:39 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com><DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com><9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com><E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com><9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com><B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
Message-ID: <e5fj14$cua$1@sea.gmane.org>

> Perhaps more people could chime in? Am I being too anal about backward
> compatibility here?

As a sometimes bug report reviewer, I would like the reported discrepancy 
between the public docs and visible code behavior fixed one way or the 
other (by changing the docs or code) since that is my working definition 
for whether something is a bug or not.

Terry Jan Reedy





From guido at python.org  Mon May 29 21:47:30 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 12:47:30 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <447A1D58.1020302@v.loewis.de>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org> <447A1D58.1020302@v.loewis.de>
Message-ID: <ca471dc20605291247m3412dabat8d1fb48043682431@mail.gmail.com>

On 5/28/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Georg Brandl wrote:
> >> There's still a ton used under Modules.  Also, if no flag is
> >> specified, it will default to 0 (ie, METH_OLDARGS).  I wonder how many
> >> third party modules use METH_OLDARGS directly or more likely
> >> indirectly.
> >
> > These modules can be converted.
>
> I think that should be done for 2.5, but nothing else. Or, perhaps
> a deprecation warning could be generated if it is still used.

I'll let Martin decide this one.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org  Mon May 29 21:50:50 2006
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 29 May 2006 21:50:50 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <ca471dc20605291234h171901b2xbcbec00db7b9f003@mail.gmail.com>
References: <20060529171147.GA1717@code0.codespeak.net>
	<ca471dc20605291234h171901b2xbcbec00db7b9f003@mail.gmail.com>
Message-ID: <20060529195049.GA14908@code0.codespeak.net>

Hi Guido,

On Mon, May 29, 2006 at 12:34:30PM -0700, Guido van Rossum wrote:
> +1, as long as (as you seem to imply) PyDict_GetItem() still swallows
> all exceptions.

Yes.

> Fixing PyDict_GetItem() is a py3k issue, I think. Until then, there
> are way too many uses. I wouldn't be surprised if after INCREF and
> DECREF it's the most commonly used C API method...

Alternatively, we could add a new C API function, and gradually replace
PyDict_GetItem() uses with the new one.  I can't think of an obvious
name, though...

Maybe code should just start using PyMapping_GetItem() instead.  It's
not incredibly slower than PyDict_GetItem(), at least in the
non-KeyError case.


A bientot,

Armin

From arigo at tunes.org  Mon May 29 21:54:53 2006
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 29 May 2006 21:54:53 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
Message-ID: <20060529195453.GB14908@code0.codespeak.net>

Hi Raymond,

On Mon, May 29, 2006 at 12:20:44PM -0700, Raymond Hettinger wrote:
> > I've finally come around to writing a patch that stops dict lookup from
> > eating all exceptions that occur during lookup, like rare bugs in user
> > __eq__() methods. 
> 
> Is there a performance impact?

I believe that this patch is good anyway, because I consider my (and
anybody's) debugging hours worth more than a few seconds of a
long-running process.  You get *really* obscure bugs this way.

I would also point out that this is the kind of feature that should not
be traded off for performance, otherwise we'd lose much of the point of
Python.  IMHO.

As it turns out, I measured only 0.5% performance loss in Pystone.


A bientot,

Armin

From guido at python.org  Mon May 29 21:59:32 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 12:59:32 -0700
Subject: [Python-Dev] [Python-checkins] r46300 - in python/trunk:
	Lib/socket.py Lib/test/test_socket.py Lib/test/test_struct.py
	Modules/_struct.c Modules/arraymodule.c Modules/socketmodule.c
In-Reply-To: <CD67B679-50C4-4398-81BF-40F3ED6B106B@redivi.com>
References: <20060526120329.9A9671E400C@bag.python.org>
	<ca471dc20605260956t35aafa71tcad19b926c0c4b41@mail.gmail.com>
	<CD67B679-50C4-4398-81BF-40F3ED6B106B@redivi.com>
Message-ID: <ca471dc20605291259h561526cek9dad74288ae0976b@mail.gmail.com>

[python-checkins]
> >> * Added socket.recv_buf() and socket.recvfrom_buf() methods, that
> >> use the buffer
> >>   protocol (send and sendto already did).
> >>
> >> * Added struct.pack_to(), that is the corresponding buffer
> >> compatible method to
> >>   unpack_from().

[Guido]
> > Hm... The file object has a similar method readinto(). Perhaps the
> > methods introduced here could follow that lead instead of using two
> > different new naming conventions?

On 5/27/06, Bob Ippolito <bob at redivi.com> wrote:
> (speaking specifically about struct and not socket)
>
> pack_to and unpack_from are named as such because they work with objects
> that support the buffer API (not file-like-objects). I couldn't find any
> existing convention for objects that manipulate buffers in such a way.

Actually, <fileobject>.readinto(<bufferobject>) is the convention I
started long ago (at the time the buffer was introduced).

> If there is an existing convention then I'd be happy to rename these.
>
> "readinto" seems to imply that some kind of position is being
> incremented.

No -- it's like read() but instead of returning a new object it reads
"into" an object you pass.

> Grammatically it only works if it's implemented on all buffer
> objects, but
> in this case it's implemented on the Struct type.

It looks like <structobject>.pack_to(<bufferobject>, <other args>) is
similar to <fileobject>.readinto(<bufferobject>) in that the buffer
object receives the result of packing / reading.

(Files don't have a writefrom() method because write() can be
polymorphic. read() can't be due to the asymmetry in the API. I guess
struct objects are different.)
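
A short sketch of the parallel drawn above, written against the struct
module as it was eventually released, where the method discussed here
as pack_to() is spelled pack_into(). The buffer size, format string and
offsets are arbitrary choices for illustration:

```python
import io
import struct

buf = bytearray(8)                 # writable object supporting the buffer API

# File-like object: read "into" the buffer instead of returning a new object.
src = io.BytesIO(b"abcd")
count = src.readinto(buf)          # number of bytes stored in buf

# Struct object: pack "into" the same buffer, starting at offset 4.
header = struct.Struct("<I")
header.pack_into(buf, 4, 1)

# The reverse direction: unpack from the buffer at the same offset.
(value,) = header.unpack_from(buf, 4)
```

In both cases the caller supplies the buffer and the method fills it,
rather than allocating and returning a new object.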

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com  Mon May 29 21:57:43 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 29 May 2006 21:57:43 +0200
Subject: [Python-Dev] feature request: inspect.isgenerator
In-Reply-To: <e5fidd$aod$1@sea.gmane.org>
References: <loom.20060529T092512-825@post.gmane.org>
	<e5fidd$aod$1@sea.gmane.org>
Message-ID: <e5fjnq$f8j$1@sea.gmane.org>

Terry Reedy wrote:

>> def isgenerator(func):
>>    return func.func_code.co_flags == 99
>>
>> but it is rather ugly (horrible indeed).
> 
> To me, part of the beauty of Python is that it is so easy to write such things 
> so compactly, so that we do not need megaAPIs with hundreds of functions 
> and methods.

so what's so "easy" about the magic constant 99 ?

is there some natural and obvious connection between generators and that 
number that I'm missing, or is that constant perhaps best hidden inside 
some introspection support module?

(I'm still not sure about the use case, though.  shouldn't a program 
treat a generator just like any other callable that returns an iterator?)

</F>


From guido at python.org  Mon May 29 22:02:51 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 13:02:51 -0700
Subject: [Python-Dev] PEP 42 - New kind of Temporary file
In-Reply-To: <20060528152731.GB20739@yapgi.hopto.org>
References: <20060528152731.GB20739@yapgi.hopto.org>
Message-ID: <ca471dc20605291302u6a163e70s9b45deb8c0dd9857@mail.gmail.com>

I think it's a very minor feature at best -- but I do admit to having
implemented this at least once so I'm not against adding it as a new
feature to the tempfile module. It's getting awfully late for 2.5
though... (feature freeze is at 2.5b1 which is planned for June 14
IIRC).
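
For illustration only: later releases of the tempfile module provide
SpooledTemporaryFile, which behaves along the lines requested (data is
held in memory until it exceeds max_size, then moved to a real file).
This sketch uses that class and does not show the patch discussed
here; the sizes are arbitrary:

```python
import tempfile

# max_size is the number of bytes kept in memory before spilling to disk.
with tempfile.SpooledTemporaryFile(max_size=16) as tmp:
    tmp.write(b"small")          # still fits in the in-memory buffer
    tmp.write(b"x" * 64)         # exceeds max_size, so a real file is used
    tmp.seek(0)
    data = tmp.read()
```

The caller uses the same file API throughout and never has to know
when the switch to a real file happened.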

--Guido

On 5/28/06, Andrew Chambers <andychambers2002 at yahoo.co.uk> wrote:
> The mailing-list bot said introduce myself so...
>
> I'm Andy Chambers.  I've been lurking on the web-interface to the list
> for a while.  I recently started trying to implement one of the items on
> PEP 42 so I thought I should join the list and make myself known.
>
> The item I'm working on is the new kind of temporary file that only
> turns into a file when required (before which time presumably it's a
> StringIO).
>
> Do we still want this?  I suggest it be exposed as a class TempFile in
> the tempfile module.  If yes, I'll write the test cases and submit a
> patch.
>
> Regards,
> Andy


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com  Mon May 29 22:06:45 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 29 May 2006 22:06:45 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
Message-ID: <e5fk8j$h8f$1@sea.gmane.org>

Neal Norwitz wrote:

>> Also, it would be easy to detect METH_OLDARGS in PyCFunction_New and raise
>> an appropriate exception.
> 
> I agree with Martin this should raise a deprecation warning in 2.5.

why?

this is a clear case of unnecessary meddling.  removing it won't remove 
much code (a whopping 11 lines is dedicated to this), nor give any speed 
ups whatsoever; all you're doing is generating extra work and support 
issues for a bunch of third-party developers.  trust me, we have better 
things to do with our time.

-1 on meddling with this before 3.0.

</F>


From g.brandl at gmx.net  Mon May 29 22:09:21 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 29 May 2006 22:09:21 +0200
Subject: [Python-Dev] feature request: inspect.isgenerator
In-Reply-To: <e5fjnq$f8j$1@sea.gmane.org>
References: <loom.20060529T092512-825@post.gmane.org>	<e5fidd$aod$1@sea.gmane.org>
	<e5fjnq$f8j$1@sea.gmane.org>
Message-ID: <e5fkdh$gps$1@sea.gmane.org>

Fredrik Lundh wrote:
> Terry Reedy wrote:
> 
>>> def isgenerator(func):
>>>    return func.func_code.co_flags == 99
>>>
>>> but it is rather ugly (horrible indeed).
>> 
>> To me, part of the beauty of Python is that it is so easy to write such things 
>> so compactly, so that we do not need megaAPIs with hundreds of functions 
>> and methods.
> 
> so what's so "easy" about the magic constant 99 ?
> 
> is there some natural and obvious connection between generators and that 
> number that I'm missing, or is that constant perhaps best hidden inside 
> some introspection support module?

It seems to be a combination of CO_NOFREE, CO_GENERATOR, CO_OPTIMIZED and
CO_NEWLOCALS.

The first four CO_ constants are already in inspect.py, the newer ones
(like CO_GENERATOR) aren't.

I wonder whether a check shouldn't just return (co_flags & 0x20), which
is CO_GENERATOR.
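
A minimal sketch of such a check, assuming an interpreter whose inspect
module exports CO_GENERATOR and whose functions expose __code__ (spelled
func_code in the snippet quoted above). The helper and the two demo
functions are hypothetical names:

```python
import inspect


def is_generator_function(func):
    """Return True if calling func would produce a generator."""
    # Test only the generator bit instead of comparing against 99.
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


def count_up(limit):
    for i in range(limit):
        yield i


def plain(limit):
    return list(range(limit))
```

Masking with the named flag keeps working when unrelated flags such as
CO_NOFREE are set or cleared, which the comparison with 99 does not.
Later versions of inspect also ship isgeneratorfunction() for this.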

Georg


From fredrik at pythonware.com  Mon May 29 22:09:30 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 29 May 2006 22:09:30 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529195453.GB14908@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
Message-ID: <e5fkdo$h8f$2@sea.gmane.org>

Armin Rigo wrote:

> As it turns out, I measured only 0.5% performance loss in Pystone.

umm.  does Pystone even call lookdict?

</F>


From bob at redivi.com  Mon May 29 22:18:10 2006
From: bob at redivi.com (Bob Ippolito)
Date: Mon, 29 May 2006 13:18:10 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
Message-ID: <9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>


On May 29, 2006, at 12:44 PM, Guido van Rossum wrote:

> On 5/29/06, Tim Peters <tim.peters at gmail.com> wrote:
>>> I think we should do as Thomas proposes: plan to make it an error in
>>> 2.6 (or 2.7 if there's a big outcry, which I don't expect) and  
>>> accept
>>> it with a warning in 2.5.
>>
>> That's what I arrived at, although 2.4.3's checking behavior is
>> actually so inconsistent that "it" needs some defining (what exactly
>> are we trying to still accept?  e.g., that -1 doesn't trigger "I"
>> complaints but that -1L does above?  that one's surely a bug).
>
> No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
> 0xffffffffL was 4294967295L.

Python 2.3 did a FutureWarning on 0xffffffff but its value was -1.

Anyway, my plan is to make it such that all non-native format codes  
will behave exactly like C casting, but will do a DeprecationWarning  
for input numbers that were initially out of bounds. This behavior  
will be consistent across (python) int and long, and will be easy  
enough to explain in the docs (but still more complicated than  
"values not representable by this data type will raise struct.error").

This means that I'm also changing it so that struct.pack will not  
raise OverflowError for some longs, it will always raise struct.error  
or do a warning (as long as the input is int or long).

Pseudocode looks kinda like this:

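import warnings

def wrap_unsigned(x, ctype_max):
	# ctype_max is the largest value the unsigned C type holds, e.g. 0xffffffff
	if not (0 <= x <= ctype_max):
		# warn, then behave like a C cast by keeping only the low bits
		warnings.warn("integer out of range for format code", DeprecationWarning)
		x &= ctype_max
	return x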

Actually, should this be a FutureWarning or a DeprecationWarning?

-bob


From tjreedy at udel.edu  Mon May 29 22:23:01 2006
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 29 May 2006 16:23:01 -0400
Subject: [Python-Dev] Contributor agreements (was Re: DRAFT: python-dev
	summary for 2006-04-01 to 2006-04-15)
References: <d11dcfba0605291005q2e3cece0u80a1cead2bce963c@mail.gmail.com>
Message-ID: <e5fl76$k7o$1@sea.gmane.org>

----------------------
Contributor agreements
----------------------

If you're listed in the `Python acknowledgments`_, and you haven't
signed a `contributor agreement`_, please submit one as soon as
possible.  Note that this includes even the folks that have been
around forever -- the PSF would like to be as careful as possible on
this one.
-----------
I believe I did submit something some time ago, but I am not positive, and 
even if I did, I don't know whether it was a recent enough version.  So I 
suggest that

>.. _Python acknowledgments:
>  http://svn.python.org/projects/python/trunk/Misc/ACKS

put a star beside names for which there is a recent enough form.
Another problem with that list is that it apparently includes people who 
have not actually contributed text or code in the distribution and who 
therefore do not (?) need to send in an agreement.

Also, it says 'not in any useful order', which is weird, because it is 
alphabetical, which is quite useful for finding a particular name (or not 
;-).

>.. _contributor agreement: 
>http://www.python.org/psf/contrib-form-python.html

By itself, this link is inadequate since there is a blank for 'Initial 
License' with no choices indicated.  So the summary needs a link to an 
instruction page also.  The form is also a bit confusing since it appears 
to talk about future contributions but the last line tacked on to the 
bottom refers to the past.  If it means both, then why not say so in the 
main text in a way that non-lawyers can understand?

Terry Jan Reedy




From tim.peters at gmail.com  Mon May 29 22:31:08 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 29 May 2006 16:31:08 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
Message-ID: <1f7befae0605291331o593307a8kf2c54cfc1ef95eeb@mail.gmail.com>

[Guido]
>>> I think we should do as Thomas proposes: plan to make it an error in
>>> 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept
>>> it with a warning in 2.5.

[Tim]
>> That's what I arrived at, although 2.4.3's checking behavior is
>> actually so inconsistent that "it" needs some defining (what exactly
>> are we trying to still accept?  e.g., that -1 doesn't trigger "I"
>> complaints but that -1L does above?  that one's surely a bug).

[Guido]
> No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
> 0xffffffffL was 4294967295L.

But is it "a bug" _now_?  That's what I mean by "a bug".  To me, this
is simply inexplicable in 2.4 (and 2.5+):

>>> struct.pack("I", -1)  # neither std nor native modes complain
'\xff\xff\xff\xff'
>>> struct.pack(">I", -1)
'\xff\xff\xff\xff'

>>> struct.pack("I", -1L)  # but both complain if the input is long
Traceback (most recent call last):
 File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long
>>> struct.pack(">I", -1L)
Traceback (most recent call last):
 File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long

Particularly for the standard modes, the behavior also varies across
platforms (i.e., it wasn't true in any version of Python that
0xffffffff == -1 on most 64-bit boxes, and to have "standard mode"
behavior vary according to platform isn't particularly standard :-)).

>> To be clear, Thomas proposed "accepting it" (whatever that means) until 3.0.

> Fine with me.

So who has a definition for what "it" means?  I don't.  Does every
glitch have to be reproduced across the cross product of platform X
format-code X input-type X native-vs-standard X
direction-of-out-of-range?

Or would people be happier if struct simply never checked for
out-of-bounds?  At least the latter is doable with finite effort, and
is also explainable ("for an integral code of N bytes, pack() stores
the least-significant N bytes of the input viewed as a 2's-complement
integer").  I'd be happier with that than with something that can't be
explained short of exhaustive listing of cases across 5 dimensions.

Ditto with saying that for an integral type using N bytes, pack()
accepts any integer in -(2**(8*N-1)) through 2**(8*N)-1, complains
outside that range, and makes no complaining distinctions based on
platform, input type, standard-vs-native mode, signed-or-unsigned
format code, or direction of out-of-range.  That's also explainable.
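
The two explainable rules above can be sketched in pure Python. This is
only a model of the arithmetic, not struct's implementation: the
function names are made up, the result is the unsigned bit pattern
rather than a packed string, and ValueError stands in for struct.error:

```python
def pack_truncating(value, nbytes):
    """Rule 1: never complain; keep the least-significant nbytes bytes."""
    return value & ((1 << (8 * nbytes)) - 1)


def pack_checked(value, nbytes):
    """Rule 2: accept -(2**(8*N-1)) through 2**(8*N)-1, complain otherwise."""
    low = -(1 << (8 * nbytes - 1))
    high = (1 << (8 * nbytes)) - 1
    if not (low <= value <= high):
        raise ValueError("%d does not fit in %d bytes" % (value, nbytes))
    # Inside the accepted range, the stored bits match rule 1.
    return value & high
```

Either rule depends only on the value and the byte count, so neither
platform, input type, mode, nor signedness of the code enters into it.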

From ncoghlan at gmail.com  Mon May 29 22:38:45 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 30 May 2006 06:38:45 +1000
Subject: [Python-Dev] Contributor agreements (was Re: DRAFT: python-dev
 summary for 2006-04-01 to 2006-04-15)
In-Reply-To: <e5fl76$k7o$1@sea.gmane.org>
References: <d11dcfba0605291005q2e3cece0u80a1cead2bce963c@mail.gmail.com>
	<e5fl76$k7o$1@sea.gmane.org>
Message-ID: <447B5BD5.2000805@gmail.com>

Terry Reedy wrote:
>> .. _contributor agreement: 
>> http://www.python.org/psf/contrib-form-python.html
> 
> By itself, this link is inadequate since there is a blank for 'Initial 
> License' with no choices indicated.  So the summary needs a link to an 
> instruction page also.  The form is also a bit confusing since it appears 
> to talk about future contributions but the last line tacked on to the 
> bottom refers to the past.  If it means both, then why not say so in the 
> main text in a way that non-lawyers can understand?

That's the modified version of the form for people like me that have 
contributed stuff before sending the form in (I have one sitting on my desk 
that will be going in the mail real soon now, honest. . .).

There's another page somewhere in the PSF section that talks about contributor 
agreements in general - maybe that would be a better link for the summary?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From thomas at python.org  Mon May 29 22:45:09 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 22:45:09 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <1f7befae0605291331o593307a8kf2c54cfc1ef95eeb@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<1f7befae0605291331o593307a8kf2c54cfc1ef95eeb@mail.gmail.com>
Message-ID: <9e804ac0605291345x303f1182p2d186846665dcb64@mail.gmail.com>

On 5/29/06, Tim Peters <tim.peters at gmail.com> wrote:
>
> [Tim]
> >> To be clear, Thomas proposed "accepting it" (whatever that means) until
> 3.0.
> [Guido]
> > Fine with me.
>
> So who has a definition for what "it" means?


I know which 'it' I meant: the same 'it' as struct already accepts in Python
2.4 and before. Yes, it's inconsistent between formatcodes and valuetypes --
fixing that is the whole point of the change -- but that's how you define
'compatibility'; struct, by default, should do what it did for Python 2.4,
for all operating modes. It doesn't have to be more liberal than 2.4 (and
preferably shouldn't, as that could break backward compatibility of some
code -- much less common, though.)

Making a list of which formatcodes accept what values (for what valuetypes)
for 2.4 is easy enough (and should be added to the test suite, too ;-) -- I
can do that in a few days if no one gets to it.
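
A rough sketch of how such a list could be generated, assuming it is
enough to probe the edges of the signed and unsigned ranges for each
code. The function name, the choice of codes and the probe values are
illustrative, and the result reflects whichever interpreter runs it:

```python
import struct


def acceptance_table(codes=("b", "B", "h", "H", "i", "I"), prefix="<"):
    """Map (format, value) to True if struct.pack accepts the value."""
    table = {}
    for code in codes:
        fmt = prefix + code
        bits = 8 * struct.calcsize(fmt)
        # Probe the edges of both the signed and the unsigned range.
        probes = (-(1 << (bits - 1)) - 1, -(1 << (bits - 1)), -1, 0,
                  (1 << (bits - 1)) - 1, 1 << (bits - 1),
                  (1 << bits) - 1, 1 << bits)
        for value in probes:
            try:
                struct.pack(fmt, value)
            except (struct.error, OverflowError):
                table[(fmt, value)] = False
            else:
                table[(fmt, value)] = True
    return table
```

Running the same function under two releases and comparing the
dictionaries would show exactly where the accepted values differ.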

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From thomas at python.org  Mon May 29 22:50:50 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 22:50:50 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5fk8j$h8f$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
	<e5fk8j$h8f$1@sea.gmane.org>
Message-ID: <9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>

On 5/29/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
>
>
> this is a clear case of unnecessary meddling.  removing it won't remove
> much code (a whopping 11 lines is dedicated to this), nor give any speed
> ups whatsoever; all you're doing is generating extra work and support
> issues for a bunch of third-party developers.  trust me, we have better
> things to do with our time.
>
> -1 on meddling with this before 3.0.


-1 from me, too, for the same reason. It would be nice if the use of
PyArg_Parse could generate a (C) compile-time warning, but since it can't, I
think a runtime warning for this minor case is just overkill. (The C
compile-time warning would be useful in other cases, too... hmm, perhaps we
can do some post-processing of .o files, looking for deprecated symbols left
undefined...)

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From fredrik at pythonware.com  Mon May 29 22:52:03 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 29 May 2006 22:52:03 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9e804ac0605291345x303f1182p2d186846665dcb64@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>	<1f7befae0605291331o593307a8kf2c54cfc1ef95eeb@mail.gmail.com>
	<9e804ac0605291345x303f1182p2d186846665dcb64@mail.gmail.com>
Message-ID: <e5fmth$ouu$1@sea.gmane.org>

Thomas Wouters wrote:

> I know which 'it' I meant: the same 'it' as struct already accepts in 
> Python 2.4 and before. Yes, it's inconsistent between formatcodes and 
> valuetypes -- fixing that is the whole point of the change -- but that's 
> how you define 'compatibility'; struct, by default, should do what it 
> did for Python 2.4, for all operating modes.

that's how you define compatibility for bug fix releases in Python.  2.5 
is not 2.4.4.

</F>


From g.brandl at gmx.net  Mon May 29 22:57:57 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 29 May 2006 22:57:57 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>
	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
Message-ID: <e5fn8l$ppo$1@sea.gmane.org>

Thomas Wouters wrote:
> 
> On 5/29/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> 
> 
>     this is a clear case of unnecessary meddling.  removing it won't remove
>     much code (a whopping 11 lines is dedicated to this), nor give any speed
>     ups whatsoever; all you're doing is generating extra work and support
>     issues for a bunch of third-party developers.  trust me, we have better
>     things to do with our time.
> 
>     -1 on meddling with this before 3.0.
> 
> 
> -1 from me, too, for the same reason. It would be nice if the use of
> PyArg_Parse could generate a (C) compile-time warning, but since it
> can't, I think a runtime warning for this minor case is just overkill.
> (The C compile-time warning would be useful in other cases, too... hmm,
> perhaps we can do some post-processing of .o files, looking for
> deprecated symbols left undefined...)

Isn't there at least a GCC __attribute__((deprecated))?

Georg


From thomas at python.org  Mon May 29 23:00:07 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 23:00:07 +0200
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <e5fmth$ouu$1@sea.gmane.org>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<1f7befae0605291331o593307a8kf2c54cfc1ef95eeb@mail.gmail.com>
	<9e804ac0605291345x303f1182p2d186846665dcb64@mail.gmail.com>
	<e5fmth$ouu$1@sea.gmane.org>
Message-ID: <9e804ac0605291400k28a05947hd953a1c1f3ed2245@mail.gmail.com>

On 5/29/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
>
> Thomas Wouters wrote:
>
> > I know which 'it' I meant: the same 'it' as struct already accepts in
> > Python 2.4 and before. Yes, it's inconsistent between formatcodes and
> > valuetypes -- fixing that is the whole point of the change -- but that's
> > how you define 'compatibility'; struct, by default, should do what it
> > did for Python 2.4, for all operating modes.
>
> that's how you define compatibility for bug fix releases in Python.  2.5
> is not 2.4.4.


Correct, and it is also not 2.6. Breaking perfectly working (and more to the
point, non-complaining) code, even if it is for a good reason, is bad. It
means, for instance, that I can not upgrade Python on any of the servers I
manage, where Python gets used by clients. This is not a hypothetical
problem, I've had it happen too often (although fortunately not often with
Python, because of the sane policy on backward compatibility.)

If 2.5 warns and does the old thing, the upgrade path is easy and
defendable. This is also why there are future statements -- I distinctly
recall making the same argument back then :-) The cost of keeping the
misfeatures in struct for one more release does not outweigh the cost of
breaking compatibility without warning.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From raymond.hettinger at verizon.net  Mon May 29 23:02:25 2006
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 29 May 2006 14:02:25 -0700
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
Message-ID: <00de01c68363$308d37c0$dc00000a@RaymondLaptop1>

> On Mon, May 29, 2006 at 12:20:44PM -0700, Raymond Hettinger wrote:
>> > I've finally come around to writing a patch that stops dict lookup from
>> > eating all exceptions that occur during lookup, like rare bugs in user
>> > __eq__() methods.
>>
>> Is there a performance impact?
>
> I believe that this patch is good anyway, because I consider my (and
> anybody's) debugging hours worth more than a few seconds of a
> long-running process.  You get *really* obscure bugs this way.
>
> I would also point out that this is the kind of feature that should not
> be traded off for performance, otherwise we'd lose much of the point of
> Python.  IMHO.
>
> As it turns out, I measured only 0.5% performance loss in Pystone.

Please run some better benchmarks and do more extensive assessments on the 
performance impact.

The kind of obscure bug you're trying to kill does not affect 99.9% of Python 
users; however, loss of performance will affect everyone.  This is arguably the 
most actively exercised part of Python and should not be changed without 
carefully weighing purity vs practicality.

FWIW, I applied a similar patch to set objects in Py2.5 so that they wouldn't 
eat exceptions.  So, I'm truly sympathetic to the cause.  However, dicts are 
much more critical.  There needs to be a careful judgement based on measurements 
and assessments of whether there are real benefits for everyday Python users.


Raymond

From martin at v.loewis.de  Mon May 29 23:16:32 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 May 2006 23:16:32 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5fn8l$ppo$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
	<e5fn8l$ppo$1@sea.gmane.org>
Message-ID: <447B64B0.3080107@v.loewis.de>

Georg Brandl wrote:
> Isn't there at least a GCC __attribute__((deprecated))?

Yes, but it applies to functions only. I guess you could make
a __deprecated__ function, and then expand METH_OLDARGS to a
call of that function. That won't help in cases where there are
no flags given at all in the method structures.

Regards,
Martin

From martin at v.loewis.de  Mon May 29 23:25:11 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 May 2006 23:25:11 +0200
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
Message-ID: <447B66B7.90107@v.loewis.de>

Neal Norwitz wrote:
> minus comments, etc  yields several questions about whether some
> values should use Py_ssize_t rather than C longs.  In particular:
> 
> * hash values
> Include/abstract.h:     long PyObject_Hash(PyObject *o);  // also in
> object.h
> Include/object.h:typedef long (*hashfunc)(PyObject *);

I don't think these should be Py_ssize_t. Of course, a pointer type
cannot naturally convert to a long on some systems, so where we use
a pointer as a hash, we should do so carefully if sizeof(void*) >
sizeof(long); an xor of low and high word is probably good enough.

Currently, the code does something way more complex (actually
creating a long object); that should be replaced either by inlining
long_hash, or by using an even simpler hash function for 64-bit
pointers.
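
The xor idea can be modelled in Python as below. This is a sketch of
the arithmetic only: the real change would be C code in the
interpreter, the function name is made up, and the sign of a C long is
ignored by working with unsigned values:

```python
def fold_pointer(address, long_bits=32):
    """XOR the low and high words of a pointer wider than a C long."""
    mask = (1 << long_bits) - 1
    low = address & mask
    high = (address >> long_bits) & mask
    return low ^ high
```

Pointers that already fit in a long hash to themselves, and wider ones
still let every bit of the address influence the result.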

> * ints:  Include/intobject.h:    long ob_ival;

As Tim says, this is way out of scope for 2.5. Guido said it is ok
to change this to 64-bit ints in 2.6, but I expect some embedded
system developers will start screaming when they hear that: 64-bit
arithmetic is expensive on a 32-bit machine.

Regards,
Martin

From g.brandl at gmx.net  Mon May 29 23:26:07 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 29 May 2006 23:26:07 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <447B64B0.3080107@v.loewis.de>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<e5fn8l$ppo$1@sea.gmane.org>
	<447B64B0.3080107@v.loewis.de>
Message-ID: <e5fotf$v27$1@sea.gmane.org>

Martin v. Löwis wrote:
> Georg Brandl wrote:
>> Isn't there at least a GCC __attribute__((deprecated))?
> 
> Yes, but it applies to functions only. I guess you could make
> a __deprecated__ function, and then expand METH_OLDARGS to a
> call of that function. That won't help in cases where there are
> no flags given at all in the method structures.

I was thinking more of marking PyArg_Parse as __deprecated__.

If you look at the patch on SF, it's creating warnings whenever
a new CFunction is created with a METH_OLDARGS flag. I'm not sure
this is the right way since it may mean that users of extensions
using METH_OLDARGS might get a warning each time they get a
bound method using it.

Georg


From arigo at tunes.org  Mon May 29 23:34:28 2006
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 29 May 2006 23:34:28 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
Message-ID: <20060529213428.GA20141@code0.codespeak.net>

Hi Raymond,

On Mon, May 29, 2006 at 02:02:25PM -0700, Raymond Hettinger wrote:
> Please run some better benchmarks and do more extensive assessments on the 
> performance impact.

At the moment, I'm trying to, but 2.5 HEAD keeps failing mysteriously on
the tests I try to time, and even going into an infinite loop consuming
all my memory - since the NFS sprint.  Am I allowed to be grumpy here,
and repeat that speed should not be used to justify bugs?  I'm proposing
a bug fix, I honestly don't care about 0.5% of speed.

With a benchmark that passed, and that heavily uses instance and class
attribute look-ups (richards), I don't even see any relevant difference.

> and assessments of whether there are real benefits for everyday Python
> users.

It would have saved me two hours-long debugging sessions, and I consider
myself an everyday Python user, so yes, I think so.


Grumpy-ly yours,

Armin

From thomas at python.org  Mon May 29 23:33:36 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 23:33:36 +0200
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <447B66B7.90107@v.loewis.de>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
	<447B66B7.90107@v.loewis.de>
Message-ID: <9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>

On 5/29/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
> Neal Norwitz wrote:
> > minus comments, etc  yields several questions about whether some
> > values should use Py_ssize_t rather than C longs.  In particular:
>
> > * ints:  Include/intobject.h:    long ob_ival;
>
> As Tim says, this is way out of scope for 2.5. Guido said it is ok
> to change this to 64-bit ints in 2.6, but I expect some embedded
> system developers will start screaming when they hear that: 64-bit
> arithmetic is expensive on a 32-bit machine.


Well, those systems shouldn't have a 64-bit Py_ssize_t anyway, should they?

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From martin at v.loewis.de  Mon May 29 23:35:45 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 May 2006 23:35:45 +0200
Subject: [Python-Dev] ssize_t: ints in header files
In-Reply-To: <ee2a432c0605282352y418a6e70i6c96350fa373146@mail.gmail.com>
References: <ee2a432c0605282352y418a6e70i6c96350fa373146@mail.gmail.com>
Message-ID: <447B6931.9010207@v.loewis.de>

Neal Norwitz wrote:
> Should the following values be ints (limited to 2G)?
> 
>  * dict counts (ma_fill, ma_used, ma_mask)

I think Tim said he'll look into them.

>  * line #s and column #s

I think we should really formalize a limit, and then enforce it
throughout. column numbers shouldn't exceed 16-bits, and line #s
shouldn't exceed 31 bits.

>  * AST (asdl.h) sequences

we should first limit all the others, and then it should come
out naturally that AST sequences can be happily limited to 31 bits.

>  * recursion limit

This should be Py_ssize_t. While the stack is typically more limited,
it should be possible to configure it to exceed 4GiB on a 64-bit
machine.

>  * read/write/send/recv results and buf size

This is very tricky. Often, the underlying C library has int there
(e.g. msvcrt). Eventually, we should get rid of msvcrt, and then
hope that the underlying system API can deal with larger buffers.

>  * code object values like # args, # locals, stacksize, # instructions

IMO, they should all get formally limited to 15 bits (i.e. short).
I think some are already thus limited through the byte code format.
Somebody would have to check, and document what the limits are.

Either we end up with different limits for each one, or, more likely,
the same limit, in which case I would suggest to introduce another
symbolic type (e.g. Py_codesize_t or some such). Then we should
consistently reject source code that exceeds such a limit, and
elsewhere rely on the guarantee that the values will be always
limited much more than the underlying data structures.

>  * sre (i think i have a patch to fix this somewhere)

This is a huge set of changes, I think. SRE *should* support strings
larger than 4GiB. I could happily accept limitations for the regexes
themselves (e.g. size of the compiled expression, number of repeats,
etc).

Regards,
Martin

From steven.bethard at gmail.com  Mon May 29 23:40:57 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 29 May 2006 15:40:57 -0600
Subject: [Python-Dev] Contributor agreements (was Re: DRAFT: python-dev
	summary for 2006-04-01 to 2006-04-15)
In-Reply-To: <447B5BD5.2000805@gmail.com>
References: <d11dcfba0605291005q2e3cece0u80a1cead2bce963c@mail.gmail.com>
	<e5fl76$k7o$1@sea.gmane.org> <447B5BD5.2000805@gmail.com>
Message-ID: <d11dcfba0605291440n4c236ffh8cb3cf3a1ba7deef@mail.gmail.com>

On 5/29/06, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Terry Reedy wrote:
> >> .. _contributor agreement:
> >> http://www.python.org/psf/contrib-form-python.html
> >
> > By itself, this link is inadequate since there is a blank for 'Initial
> > License' with no choices indicated.  So the summary needs a link to an
> > instruction page also.  The form is also a bit confusing since it appears
> > to talk about future contributions but the last line tacked on to the
> > bottom refers to the past.  If it means both, then why not say so in the
> > main text in a way that non-lawyers can understand?
>
> That's the modified version of the form for people like me that have
> contributed stuff before sending the form in (I have one sitting on my desk
> that will be going in the mail real soon now, honest. . .).
>
> There's another page somewhere in the PSF section that talks about contributor
> agreements in general - maybe that would be a better link for the summary?

Sounds like it.  (That wasn't anywhere in the thread.)  Does
http://www.python.org/psf/contrib/ seem like the right page to link
to?

Thanks,

STeVe
-- 
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From martin at v.loewis.de  Mon May 29 23:43:54 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 29 May 2006 23:43:54 +0200
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>	<447B66B7.90107@v.loewis.de>
	<9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>
Message-ID: <447B6B1A.2020107@v.loewis.de>

Thomas Wouters wrote:
>     Neal Norwitz wrote:
>     > minus comments, etc  yields several questions about whether some
>     > values should use Py_ssize_t rather than C longs.  In particular:
> 
>     > * ints:  Include/intobject.h:    long ob_ival;
> 
>     As Tim says, this is way out of scope for 2.5. Guido said it is ok
>     to change this to 64-bit ints in 2.6, but I expect some embedded
>     system developers will start screaming when they hear that: 64-bit
>     arithmetic is expensive on a 32-bit machine. 
> 
> 
> Well, those systems shouldn't have a 64-bit Py_ssize_t anyway, should they?

No - but Guido said he wanted Python int to have a 64-bit representation
everywhere, not a Py_ssize_t representation.

I agree using Py_ssize_t would be a "smaller" change, and one that
likely has less performance impact. It would still be a large change,
and should be only done if we know for sure we don't want it to be
a 64-bit type always the next day.

Regards,
Martin

From martin at v.loewis.de  Mon May 29 23:45:27 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 May 2006 23:45:27 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5fotf$v27$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<e5fn8l$ppo$1@sea.gmane.org>	<447B64B0.3080107@v.loewis.de>
	<e5fotf$v27$1@sea.gmane.org>
Message-ID: <447B6B77.3050608@v.loewis.de>

Georg Brandl wrote:
> I thought more about PyArg_Parse for __deprecated__.

Ah, PyArg_Parse can, of course. Having it actually do that should not
hurt, either - but you need a reliable check whether the compiler
supports __deprecated__.

Regards,
Martin

From martin at v.loewis.de  Mon May 29 23:50:53 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 May 2006 23:50:53 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ca471dc20605291247m3412dabat8d1fb48043682431@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>
	<447A1D58.1020302@v.loewis.de>
	<ca471dc20605291247m3412dabat8d1fb48043682431@mail.gmail.com>
Message-ID: <447B6CBD.4060100@v.loewis.de>

Guido van Rossum wrote:
>> I think that should be done for 2.5, but nothing else. Or, perhaps
>> a deprecation warning could be generated if it is still used.
> 
> I'll let Martin decide this one.

I just sent a message to some that producing a DeprecationWarning is
fine, but now I read some opposition; given that this really just needs
very little code to support it, and given that run-time warnings annoy
the users, not the module authors, I agree that this should stay
silently until 3.0.

We should still continue to remove all uses we have ourselves.

Regards,
Martin

From martin at v.loewis.de  Mon May 29 23:51:47 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 29 May 2006 23:51:47 +0200
Subject: [Python-Dev] Contributor agreements (was Re: DRAFT: python-dev
 summary for 2006-04-01 to 2006-04-15)
In-Reply-To: <d11dcfba0605291440n4c236ffh8cb3cf3a1ba7deef@mail.gmail.com>
References: <d11dcfba0605291005q2e3cece0u80a1cead2bce963c@mail.gmail.com>	<e5fl76$k7o$1@sea.gmane.org>
	<447B5BD5.2000805@gmail.com>
	<d11dcfba0605291440n4c236ffh8cb3cf3a1ba7deef@mail.gmail.com>
Message-ID: <447B6CF3.6070808@v.loewis.de>

Steven Bethard wrote:
>> There's another page somewhere in the PSF section that talks about contributor
>> agreements in general - maybe that would be a better link for the summary?
> 
> Sounds like it.  (That wasn't anywhere in the thread.)  Does
> http://www.python.org/psf/contrib/ seem like the right page to link
> to?

That is the page giving instructions on how to fill it out, yes.

Regards,
Martin

From thomas at python.org  Mon May 29 23:57:37 2006
From: thomas at python.org (Thomas Wouters)
Date: Mon, 29 May 2006 23:57:37 +0200
Subject: [Python-Dev] ssize_t question: longs in header files
In-Reply-To: <447B6B1A.2020107@v.loewis.de>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
	<447B66B7.90107@v.loewis.de>
	<9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>
	<447B6B1A.2020107@v.loewis.de>
Message-ID: <9e804ac0605291457s784e6089ge84e6315feb2c3cd@mail.gmail.com>

On 5/29/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> I agree using Py_ssize_t would be a "smaller" change, and one that
> likely has less performance impact. It would still be a large change,
> and should be only done if we know for sure we don't want it to be
> a 64-bit type always the next day.


Well, implementing PyInt in terms of a new symbolic type, even if it
defaults to 64-bit, should make choosing it to be a 32-bit type a lot
easier, shouldn't it?
Not that I think defaulting PyInt to use 'long long' is a good thing,
considering we still only compile 'long long' support optionally. Better to
stick with a common native type until all machines have 64-bit registers.
Better for performance, too.

But switching PyInts to use (a symbolic type of the same size as) Py_ssize_t
means that, when the time comes that 32-bit architectures are rare, Win64
isn't left as the only platform (barring other LLP64 systems) that has slow
33-to-64-bit Python numbers (because they'd be PyLongs there, even though
the platform has 64-bit registers.) Given the timeframe and the impact,
though, perhaps we should just do it -- now -- in the p3yk branch and forget
about 2.x; gives people all the more reason to switch, two years from now.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!

From fredrik at pythonware.com  Tue May 30 00:01:46 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 00:01:46 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529213428.GA20141@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>	<20060529195453.GB14908@code0.codespeak.net>	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
Message-ID: <e5fr07$6ak$1@sea.gmane.org>

Armin Rigo wrote:

> At the moment, I'm trying to, but 2.5 HEAD keeps failing mysteriously on
> the tests I try to time, and even going into an infinite loop consuming
> all my memory - since the NFS sprint.  Am I allowed to be grumpy here,
> and repeat that speed should not be used to justify bugs?

not unless you can produce some code.  unfounded accusations don't 
belong on this list (it's not like the sprinters didn't test the code on 
a whole bunch of platforms), and neither does lousy benchmarks (why are 
you repeating the 0.5% figure when pystone doesn't even test non-string 
dictionary behaviour?  PyString_Eq cannot fail...)

</F>


From arigo at tunes.org  Tue May 30 00:04:38 2006
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 30 May 2006 00:04:38 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529213428.GA20141@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
Message-ID: <20060529220438.GA22838@code0.codespeak.net>

Re-hi,

On Mon, May 29, 2006 at 11:34:28PM +0200, Armin Rigo wrote:
> At the moment, I'm trying to, but 2.5 HEAD keeps failing mysteriously on
> the tests I try to time, and even going into an infinite loop consuming
> all my memory

Ah, it's a corner case of str.find() whose behavior just changed.
Previously, 'abc'.find('', 100) would return -1, and now it returns 100.
Just to confuse matters, the same test with unicode returns 100, and has
always done so in the past.  (Oh well, one of these again...)

So, we need to decide which behavior is "right".  One could argue that
the current 2.5 HEAD now has a consistent behavior, so it could be kept;
but there is an opposite argument as well, which is that some existing
programs like the one I was testing are now thrown into annoying
infinite loops because str.find() never returns -1 any more, even with
larger and larger start arguments.  I believe that it's a pattern that
could be common in string-processing scripts, trying to match substrings
at various points in a string trusting that str.find() will eventually
return -1.  It's harder to think of a case where a program previously
relied on unicode.find('',n) never returning -1.  Also, introducing a
new way for programs to be caught in an infinite loop is probably not a
good idea.
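
The scanning pattern described above can be sketched as follows.  This
is only an illustration, not code from the program being tested: the
helper name find_all is hypothetical, and the explicit bound on pos is
an added guard so that the loop terminates whichever way find() treats
an empty substring with an out-of-range start.  It is written for
current Python 3.

```python
def find_all(s, sub):
    """Return every index at which sub occurs in s, scanning left to right."""
    positions = []
    pos = 0
    # Guard: never search past len(s), so termination does not depend on
    # find() returning -1 for an empty substring and a large start value.
    while pos <= len(s):
        pos = s.find(sub, pos)
        if pos == -1:
            break
        positions.append(pos)
        pos += 1  # step forward so the same match is not reported twice
    return positions
```

Without the bound on pos, the loop relies entirely on find() eventually
returning -1, which is the assumption the behavior change breaks for an
empty substring.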

Hum, my apologies for my grumpy comments about the NFS sprint.  At
least, the unification of the string and unicode algorithm that was
started there is a good move, also because it exposes pre-existing
inconsistencies.


A bientot,

Armin.

From martin at v.loewis.de  Tue May 30 00:11:21 2006
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 30 May 2006 00:11:21 +0200
Subject: [Python-Dev] Integer representation (Was: ssize_t question:
 longs in header files)
In-Reply-To: <9e804ac0605291457s784e6089ge84e6315feb2c3cd@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>	
	<447B66B7.90107@v.loewis.de>	
	<9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>	
	<447B6B1A.2020107@v.loewis.de>
	<9e804ac0605291457s784e6089ge84e6315feb2c3cd@mail.gmail.com>
Message-ID: <447B7189.30408@v.loewis.de>

Thomas Wouters wrote:
> But switching PyInts to use (a symbolic type of the same size as)
> Py_ssize_t means that, when the time comes that 32-bit architectures are
> rare, Win64 isn't left as the only platform (barring other LLP64
> systems) that has slow 33-to-64-bit Python numbers (because they'd be
> PyLongs there, even though the platform has 64-bit registers.) Given the
> timeframe and the impact, though, perhaps we should just do it -- now --
> in the p3yk branch and forget about 2.x; gives people all the more
> reason to switch, two years from now.

I thought Py3k will have a single integer type whose representation
varies depending on the value being represented.

I haven't seen an actual proposal for such a type, so let me make
one:

struct PyInt{
  struct PyObject ob;
  Py_ssize_t value_or_size;
  char is_long;
  digit ob_digit[1];
};

If is_long is false, then value_or_size is the value (represented
as Py_ssize_t), else the value is in ob_digit, and value_or_size
is the size.

PyLong_* will be synonyms for PyInt_*. PyInt_FromLong/AsLong will
continue to exist; PyInt_AsLong will indicate an overflow with -1.
Likewise, PyArg_ParseTuple "i" will continue to produce int, and
raise an exception (OverflowError?) when the value is out of range.

C code can then decide whether to parse a Python integer as
C int, long, long long, or ssize_t.
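
The two-mode layout proposed above can be modelled in a few lines of
Python 3, purely as a sketch.  Several details are assumptions that are
not part of the proposal: a 64-bit Py_ssize_t, 15-bit digits, and the
convention (borrowed from the existing long type) that the sign of the
size field carries the sign of the value.  The names IntModel,
from_python_int and to_python_int are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

SSIZE_MIN = -2**63      # assumption: 64-bit Py_ssize_t
SSIZE_MAX = 2**63 - 1
DIGIT_BITS = 15         # assumption: 15-bit digits
DIGIT_MASK = (1 << DIGIT_BITS) - 1


@dataclass
class IntModel:
    value_or_size: int          # the value itself, or the signed digit count
    is_long: bool               # selects which interpretation applies
    digits: List[int] = field(default_factory=list)  # least significant first


def from_python_int(n):
    # Small values are stored directly in value_or_size.
    if SSIZE_MIN <= n <= SSIZE_MAX:
        return IntModel(n, False)
    # Larger values are split into digits; the sign goes into the size.
    magnitude = abs(n)
    digits = []
    while magnitude:
        digits.append(magnitude & DIGIT_MASK)
        magnitude >>= DIGIT_BITS
    size = len(digits) if n > 0 else -len(digits)
    return IntModel(size, True, digits)


def to_python_int(x):
    if not x.is_long:
        return x.value_or_size
    magnitude = 0
    for d in reversed(x.digits):
        magnitude = (magnitude << DIGIT_BITS) | d
    return magnitude if x.value_or_size > 0 else -magnitude
```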

Regards,
Martin

From g.brandl at gmx.net  Tue May 30 00:15:12 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 30 May 2006 00:15:12 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <447B6CBD.4060100@v.loewis.de>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<447A1D58.1020302@v.loewis.de>	<ca471dc20605291247m3412dabat8d1fb48043682431@mail.gmail.com>
	<447B6CBD.4060100@v.loewis.de>
Message-ID: <e5frpg$7p7$1@sea.gmane.org>

Martin v. Löwis wrote:
> Guido van Rossum wrote:
>>> I think that should be done for 2.5, but nothing else. Or, perhaps
>>> a deprecation warning could be generated if it is still used.
>> 
>> I'll let Martin decide this one.
> 
> I just sent a message to some that producing a DeprecationWarning is
> fine, but now I read some opposition; given that this really just needs
> very little code to support it, and given that run-time warnings annoy
> the users, not the module authors, I agree that this should stay
> silently until 3.0.
> 
> We should still continue to remove all uses we have ourselves.

The only modules METH_OLDARGS is now used in are fl, sv, gl, cl
and Tkinter (for the latter there is my patch on SF for review).

sv and cl are listed in PEP 356 as possibly being removed, I don't know
whether the other ones are used anymore.

Georg


From fredrik at pythonware.com  Tue May 30 00:23:04 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 00:23:04 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529220438.GA22838@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>	<20060529195453.GB14908@code0.codespeak.net>	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>	<20060529213428.GA20141@code0.codespeak.net>
	<20060529220438.GA22838@code0.codespeak.net>
Message-ID: <e5fs86$a47$1@sea.gmane.org>

Armin Rigo wrote:

> Ah, it's a corner case of str.find() whose behavior just changed.
> Previously, 'abc'.find('', 100) would return -1, and now it returns 100.
> Just to confuse matters, the same test with unicode returns 100, and has
> always done so in the past.  (Oh well, one of these again...)
> 
> So, we need to decide which behavior is "right".  One could argue that
> the current 2.5 HEAD now has a consistent behavior, so it could be kept;
> but there is an opposite argument as well, which is that some existing
> programs like the one I was testing are now thrown into annoying
> infinite loops because str.find() never returns -1 any more, even with
> larger and larger start arguments.  I believe that it's a pattern that
> could be common in string-processing scripts, trying to match substrings
> at various points in a string trusting that str.find() will eventually
> return -1.

well, the empty string is a valid substring of all possible strings 
(there are no "null" strings in Python).  you get the same behaviour 
from slicing, the "in" operator, "replace" (this was discussed on the 
list last week), "count", etc.

if you're actually searching for a *non-empty* string, find() will 
always return -1 sooner or later.

> Hum, my apologies for my grumpy comments about the NFS sprint.  At
> least, the unification of the string and unicode algorithm that was
> started there is a good move, also because it exposes pre-existing
> inconsistencies.

here's my current favourite:

 >>> "abc".count("", 100)
-96
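
The -96 above equals len("abc") - 100 + 1.  That suggests (this is an
inference from the number, not something stated in the thread) that the
count of empty matches was computed as length - start + 1 without
clamping at zero.  A small sketch of that arithmetic, with hypothetical
helper names:

```python
def unclamped_empty_count(length, start):
    # One empty match at each position start..length; only meaningful
    # when start <= length, otherwise the result goes negative.
    return length - start + 1


def clamped_empty_count(length, start):
    # Same formula, but an out-of-range start yields no matches.
    return max(0, length - start + 1)
```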

</F>


From arigo at tunes.org  Tue May 30 00:27:09 2006
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 30 May 2006 00:27:09 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <e5fr07$6ak$1@sea.gmane.org>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<e5fr07$6ak$1@sea.gmane.org>
Message-ID: <20060529222709.GB22838@code0.codespeak.net>

Hi Fredrik,

On Tue, May 30, 2006 at 12:01:46AM +0200, Fredrik Lundh wrote:
> not unless you can produce some code.  unfounded accusations don't 
> belong on this list (it's not like the sprinters didn't test the code on 
> a whole bunch of platforms), and neither does lousy benchmarks (why are 
> you repeating the 0.5% figure when pystone doesn't even test non-string 
> dictionary behaviour?  PyString_Eq cannot fail...)

Sorry, I do apologize for my wording.  I must admit that I was a bit
appalled by the amount of reference leaks that Michael had to fix after
the sprint, so jumped to conclusions a bit too fast when I saw my 1GB
laptop swap after less than a minute.  See my e-mail, which crossed
yours, for the explanation.

The reason I did not quote examples involving non-string dicts is that
my patch makes the non-string case simpler, so -- as I expected, and as
I have now measured -- marginally faster.  All in all it's hard to say
if there is a global consistent performance change.  At this point I'd
rather like to spend my time more interestingly; this might be by
defending my point of view that very minor performance hits should not
get in the way of fixes that avoid very obscure bugs, even only
occasionally-occurring but still very obscure bugs.


A bientot,

Armin.

From rhettinger at ewtllc.com  Tue May 30 00:30:57 2006
From: rhettinger at ewtllc.com (Raymond Hettinger)
Date: Mon, 29 May 2006 15:30:57 -0700
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529213428.GA20141@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>	<20060529195453.GB14908@code0.codespeak.net>	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
Message-ID: <447B7621.3040100@ewtllc.com>

An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060529/75273bd4/attachment.htm 

From arigo at tunes.org  Tue May 30 00:35:57 2006
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 30 May 2006 00:35:57 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <e5fs86$a47$1@sea.gmane.org>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<20060529220438.GA22838@code0.codespeak.net>
	<e5fs86$a47$1@sea.gmane.org>
Message-ID: <20060529223557.GC22838@code0.codespeak.net>

Hi Fredrik,

On Tue, May 30, 2006 at 12:23:04AM +0200, Fredrik Lundh wrote:
> well, the empty string is a valid substring of all possible strings 
> (there are no "null" strings in Python).  you get the same behaviour 
> from slicing, the "in" operator, "replace" (this was discussed on the 
> list last week), "count", etc.
> 
> if you're actually searching for a *non-empty* string, find() will 
> always return -1 sooner or later.

I know this.  These corner cases are debatable and different answers
could be seen as correct, as I think is the case for find().  My point
was different: I was worrying that the recent change in str.find() would
needlessly send existing and working programs into infinite loops, which
can be a particularly bad kind of failure for some applications.

At least, it gave a 100% performance loss on the benchmark I was trying
to run :-)


A bientot,

Armin.

From bob at redivi.com  Tue May 30 01:06:48 2006
From: bob at redivi.com (Bob Ippolito)
Date: Mon, 29 May 2006 16:06:48 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<DB7C2662-6966-4157-A119-F216A5BC8221@redivi.com>
	<9e804ac0605281734r490d5dd5xe958e6f39b67073d@mail.gmail.com>
	<E575899E-744E-4033-884E-DBD26BBDFA11@redivi.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
Message-ID: <7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>


On May 29, 2006, at 1:18 PM, Bob Ippolito wrote:

>
> On May 29, 2006, at 12:44 PM, Guido van Rossum wrote:
>
>> On 5/29/06, Tim Peters <tim.peters at gmail.com> wrote:
>>>> I think we should do as Thomas proposes: plan to make it an  
>>>> error in
>>>> 2.6 (or 2.7 if there's a big outcry, which I don't expect) and
>>>> accept
>>>> it with a warning in 2.5.
>>>
>>> That's what I arrived at, although 2.4.3's checking behavior is
>>> actually so inconsistent that "it" needs some defining (what exactly
>>> are we trying to still accept?  e.g., that -1 doesn't trigger "I"
>>> complaints but that -1L does above?  that one's surely a bug).
>>
>> No, it reflects that (up to 2.3 I believe) 0xffffffff was -1 but
>> 0xffffffffL was 4294967295L.
>
> Python 2.3 did a FutureWarning on 0xffffffff but its value was -1.
>
> Anyway, my plan is to make it such that all non-native format codes
> will behave exactly like C casting, but will do a DeprecationWarning
> for input numbers that were initially out of bounds. This behavior
> will be consistent across (python) int and long, and will be easy
> enough to explain in the docs (but still more complicated than
> "values not representable by this data type will raise struct.error").
>
> This means that I'm also changing it so that struct.pack will not
> raise OverflowError for some longs, it will always raise struct.error
> or do a warning (as long as the input is int or long).
>
> Pseudocode looks kinda like this:
>
> def wrap_unsigned(x, CTYPE):
> 	if not (0 <= x <= CTYPE_MAX):
> 		DeprecationWarning()
> 		x &= CTYPE_MAX
> 	return x
>
> Actually, should this be a FutureWarning or a DeprecationWarning?

OK, this behavior is implemented in revision 46537:

(this is from ./python.exe -Wall)

 >>> import struct
 >>> struct.pack('B', 256)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
     return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
 >>> struct.pack('B', -1)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/Users/bob/src/python/Lib/struct.py", line 63, in pack
     return o.pack(*args)
struct.error: ubyte format requires 0 <= number <= 255
 >>> struct.pack('<B', 256)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\x00'
 >>> struct.pack('<B', -1)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct  
integer wrapping is deprecated
   return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\xff'
 >>> struct.pack('<B', 256L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\x00'
 >>> struct.pack('<B', -1L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct  
integer wrapping is deprecated
   return o.pack(*args)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'  
format requires 0 <= number <= 255
   return o.pack(*args)
'\xff'

In _struct.c, getting rid of the #define PY_STRUCT_WRAPPING 1 will  
turn off this warning+wrapping nonsense and just raise errors for out  
of range values. It'll also enable some additional performance hacks  
(swapping out the host-endian table's pack and unpack functions with  
the faster native versions).
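
For reference, the wrap_unsigned pseudocode quoted earlier in this
message can be made runnable roughly as follows.  This is a Python 3
sketch and not the C code in _struct.c; passing the width as a bit
count and the wording of the warning are assumptions made for the
example.

```python
import warnings


def wrap_unsigned(x, bits):
    """Wrap x into an unsigned field of the given width, as a C cast would."""
    ctype_max = (1 << bits) - 1
    if not (0 <= x <= ctype_max):
        # Out-of-range input is accepted for now, but flagged.
        warnings.warn(
            "value out of range for %d-bit unsigned field" % bits,
            DeprecationWarning,
            stacklevel=2,
        )
        x &= ctype_max  # keep only the low bits
    return x
```

With bits=8 this maps 256 to 0 and -1 to 255, matching the '\x00' and
'\xff' results in the session above.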

-bob


From tdelaney at avaya.com  Tue May 30 01:35:26 2006
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 30 May 2006 09:35:26 +1000
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1E72C@au3010avexu1.global.avaya.com>

Thomas Wouters wrote:

> If 2.5 warns and does the old thing, the upgrade path is easy and
> defendable. This is also why there are future statements -- I
> distinctly recall making the same argument back then :-) The cost of
> continuing the misfeatures in struct for one release does not weigh
> up to the cost of breaking compatibility unwarned.

This really sounds to me like a __future__ import would be useful to get
the fixed behaviour.

Tim Delaney

From scott+python-dev at scottdial.com  Tue May 30 01:48:56 2006
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Mon, 29 May 2006 19:48:56 -0400
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <447B7621.3040100@ewtllc.com>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>	<20060529195453.GB14908@code0.codespeak.net>	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>	<20060529213428.GA20141@code0.codespeak.net>
	<447B7621.3040100@ewtllc.com>
Message-ID: <447B8868.6030203@scottdial.com>

Raymond Hettinger wrote:
> If it is really 0.5%, then we're fine.  Just remember that PyStone is an 
> amazingly uninformative and crappy benchmark.

Since Armin seems to not like having to justify his patch with any 
performance testing, I wrote a handful of dict insertion exercises and 
could find no real performance impact. In the case of an exception, it 
was much faster, but that is because it is not inserting anything into 
the dictionary.

IOW, the question is whether it is a bad change in behavior. Previously,
the exception was swallowed and the key was assumed to be new despite the
exception. This is an obscure use case that could rear its ugly head.

class K(int):
    def __cmp__(self, o): raise Exception()

d = {}
for i in xrange(10):
    d[K()] = i
for k in d.keys():
    print d[k]

Despite the incomparability, this throws no error in previous versions 
and the dict is still usable for the expected purpose.
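
For comparison, here is a minimal sketch of the same experiment on an
interpreter where dict lookup no longer swallows the exception. It is
written for Python 3, so it assumes __eq__ and __hash__ in place of
__cmp__; the names Incomparable and try_insert are made up for
illustration and are not part of any patch.

```python
class Incomparable(object):
    """Hypothetical key type: every instance collides and refuses comparison."""

    def __hash__(self):
        # Same hash for every instance, so the dict has to fall back to __eq__.
        return 0

    def __eq__(self, other):
        raise RuntimeError("keys cannot be compared")


def try_insert(d, key, value):
    """Insert key into d; return False if the comparison error propagated."""
    try:
        d[key] = value
    except RuntimeError:
        return False
    return True


d = {}
first = Incomparable()
inserted_first = try_insert(d, first, 0)            # empty dict, nothing to compare
inserted_second = try_insert(d, Incomparable(), 1)  # collides, so __eq__ runs and raises
```

Looking up d[first] still works here, because the lookup checks object
identity before it calls __eq__.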

-- 
Scott Dial
scott at scottdial.com
scodial at indiana.edu

From bob at redivi.com  Tue May 30 02:17:58 2006
From: bob at redivi.com (Bob Ippolito)
Date: Mon, 29 May 2006 17:17:58 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1E72C@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5FF1E72C@au3010avexu1.global.avaya.com>
Message-ID: <CA694BC5-43C8-4CD7-9C0E-D8311E95A45F@redivi.com>


On May 29, 2006, at 4:35 PM, Delaney, Timothy (Tim) wrote:

> Thomas Wouters wrote:
>
>> If 2.5 warns and does the old thing, the upgrade path is easy and
>> defendable. This is also why there are future statements -- I
>> distinctly recall making the same argument back then :-) The cost of
>> continuing the misfeatures in struct for one release does not weigh
>> up to the cost of breaking compatibility unwarned.
>
> This really sounds to me like a __future__ import would be useful  
> to get
> the fixed behaviour.

That would be potentially useful but is definitely not implementable.  
__future__ statements only really affect the compiler, and the struct  
module doesn't have anything to do with the compiler.
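
To illustrate why a __future__ statement is a compiler matter, here is a
small sketch. It assumes a Python 3.7 or later interpreter and uses the
"annotations" feature purely as a convenient example; the helper name
annotations_of_f is hypothetical. The same source text is compiled twice,
and only the compiler flag differs.

```python
import __future__

SOURCE = "def f(x: int): pass\n"


def annotations_of_f(flags):
    """Compile SOURCE with the given compiler flags and return f's annotations."""
    namespace = {}
    code = compile(SOURCE, "<sketch>", "exec", flags=flags, dont_inherit=True)
    exec(code, namespace)
    return namespace["f"].__annotations__


# Without the flag the annotation is the int type itself.
plain = annotations_of_f(0)
# With the flag the compiler stores the annotation as the string "int".
future = annotations_of_f(__future__.annotations.compiler_flag)
```

A library function such as struct.pack() never sees these flags, which is
the reason a __future__ import cannot switch its behaviour.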

The only alternative is to have an entirely separate API (or another  
module altogether) and then try to convince people to use that.  
Warning on incorrect input is a pretty good compromise.

The pre-2.5 behavior is totally confusing and totally undefined by  
the docs. The trunk behavior is (should be) completely consistent and  
backwards compatible. It actually allows even more broken code to  
work than the 2.4.x struct module, since it provides consistent  
behavior across 32-bit and 64-bit platforms and does the same thing  
whether you give it long or int. The only real downside is that it  
rather harmlessly tells you when some code is taking advantage of  
struct misfeatures.

-bob


From greg.ewing at canterbury.ac.nz  Tue May 30 02:33:27 2006
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 30 May 2006 12:33:27 +1200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <e5fs86$a47$1@sea.gmane.org>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<20060529220438.GA22838@code0.codespeak.net>
	<e5fs86$a47$1@sea.gmane.org>
Message-ID: <447B92D7.1060808@canterbury.ac.nz>

Fredrik Lundh wrote:

> well, the empty string is a valid substring of all possible strings 
> (there are no "null" strings in Python).  you get the same behaviour 
> from slicing, the "in" operator, "replace" (this was discussed on the 
> list last week), "count", etc.

Although Tim pointed out that replace() only regards
n+1 empty strings as existing in a string of length
n. So for consistency, find() should only find them
in those places, too.

--
Greg

From tim.peters at gmail.com  Tue May 30 05:00:09 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 29 May 2006 23:00:09 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
Message-ID: <1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>

[Bob Ippolito]
>> ...
>> Actually, should this be a FutureWarning or a DeprecationWarning?

Since it was never documented, UndocumentedBugGoingAwayError ;-)
Short of that, yes, DeprecationWarning.  FutureWarning is for changes
in non-exceptional behavior (e.g., if we swapped the meanings of "<"
and ">" in struct format codes, that would rate a FutureWarning
subclass, like InsaneFutureWarning).

> OK, this behavior is implemented in revision 46537:
>
> (this is from ./python.exe -Wall)
>
>  >>> import struct

...

>  >>> struct.pack('<B', -1)
> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct
> integer wrapping is deprecated
>    return o.pack(*args)
> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'
> format requires 0 <= number <= 255
>    return o.pack(*args)
> '\xff'

We certainly don't want to see two deprecation warnings for a single
deprecated behavior.  I suggest eliminating the "struct integer
wrapping" warning, mostly because I had no idea what it _meant_
before reading the comments in _struct.c  ("wrapping" is used most
often in a proxy or delegation context in Python these days).   "'B'
format requires 0 <= number <= 255" is perfectly clear all by itself.
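
As a sketch of where this ends up once the deprecation period is over:
on a Python 3 interpreter (an assumption; the session above is from the
2.5 trunk) an out-of-range value is rejected with struct.error, and code
that really wants the wrapped value has to mask explicitly. The helper
name pack_unsigned_byte is made up for illustration.

```python
import struct


def pack_unsigned_byte(value):
    """Pack value with the 'B' format; return None if it is out of range."""
    try:
        return struct.pack("<B", value)
    except struct.error:
        return None


rejected = pack_unsigned_byte(-1)        # -1 is outside 0 <= number <= 255
wrapped = pack_unsigned_byte(-1 & 0xFF)  # explicit wrap: -1 & 0xFF == 255
```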

> ...

From guido at python.org  Tue May 30 05:23:29 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 20:23:29 -0700
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <447B92D7.1060808@canterbury.ac.nz>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<20060529220438.GA22838@code0.codespeak.net>
	<e5fs86$a47$1@sea.gmane.org> <447B92D7.1060808@canterbury.ac.nz>
Message-ID: <ca471dc20605292023m6f5371bfo6a1e90c80f2386b0@mail.gmail.com>

On 5/29/06, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Fredrik Lundh wrote:
>
> > well, the empty string is a valid substring of all possible strings
> > (there are no "null" strings in Python).  you get the same behaviour
> > from slicing, the "in" operator, "replace" (this was discussed on the
> > list last week), "count", etc.
>
> Although Tim pointed out that replace() only regards
> n+1 empty strings as existing in a string of length
> n. So for consistency, find() should only find them
> in those places, too.

And "abc".count("") should return 4.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue May 30 06:00:04 2006
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 May 2006 21:00:04 -0700
Subject: [Python-Dev] Integer representation (Was: ssize_t question:
	longs in header files)
In-Reply-To: <447B7189.30408@v.loewis.de>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>
	<447B66B7.90107@v.loewis.de>
	<9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>
	<447B6B1A.2020107@v.loewis.de>
	<9e804ac0605291457s784e6089ge84e6315feb2c3cd@mail.gmail.com>
	<447B7189.30408@v.loewis.de>
Message-ID: <ca471dc20605292100q16f57fe1qd645596ad1492dae@mail.gmail.com>

[Adding the py3k list; please remove python-dev in followups.]

On 5/29/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> I thought Py3k will have a single integer type whose representation
> varies depending on the value being represented.

That's one proposal. Another is to have an abstract 'int' type with
two concrete subtypes, e.g. 'short' and 'long', corresponding to
today's int and long. At the C level the API should be unified so C
programmers are isolated from the difference (they aren't today).

> I haven't seen an actual proposal for such a type,

I'm not sure that my proposal above has ever been said out loud. I'm
also not partial; I think we may have to do an experiment to decide.

> so let me make one:
>
> struct PyInt{
>   struct PyObject ob;
>   Py_ssize_t value_or_size;
>   char is_long;
>   digit ob_digit[1];
> };
>
> If is_long is false, then value_or_size is the value (represented
> as Py_ssize_t), else the value is in ob_digit, and value_or_size
> is the size.

Nice. I guess if we store the long value in big-endian order we could
drop is_long, since the first digit of the long would always be
nonzero. This would save a byte (on average) for the longs, but it
would do nothing for the wasted space for short ints.

> PyLong_* will be synonyms for PyInt_*.

Why do we need to keep the PyLong_* APIs at all? Even at the Python
level we're not planning any backward compatibility features; at the C
level I like even more freedom to break things.

> PyInt_FromLong/AsLong will
> continue to exist; PyInt_AsLong will indicate an overflow with -1.
> Likewise, PyArg_ParseTuple "i" will continue to produce int, and
> raise an exception (OverflowError?) when the value is out of range.
>
> C code can then decide whether to parse a Python integer as
> C int, long, long long, or ssize_t.

Nice. I like the unified API and I like using Py_ssize_t instead of
long for the value; this ensures that an int can hold a pointer (if we
allow for signed pointers) and matches the native word size better on
Windows (I guess it makes no difference for any other platform, where
ssize_t and long already have the same size).

I worry about all the wasted space for alignment caused by the extra
flag byte though. That would be 4 bytes per integer on 32-bit machines
(where they are currently 12 bytes) and 8 bytes on 64-bit machines
(where they are currently 24 bytes).
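
A rough way to see the padding is to model just the payload fields with
ctypes. This is only a sketch under stated assumptions: the PyObject
header is left out, "digit" is taken to be an unsigned short, and the
class names PlainPayload and FlaggedPayload are invented here. The exact
numbers depend on the platform ABI.

```python
import ctypes


class PlainPayload(ctypes.Structure):
    """Payload of today's short int: a single machine word."""
    _fields_ = [("value", ctypes.c_ssize_t)]


class FlaggedPayload(ctypes.Structure):
    """Payload of the proposed layout: word, flag byte, first digit."""
    _fields_ = [
        ("value_or_size", ctypes.c_ssize_t),
        ("is_long", ctypes.c_char),
        ("ob_digit", ctypes.c_ushort * 1),  # assumes digit is unsigned short
    ]


plain_size = ctypes.sizeof(PlainPayload)
flagged_size = ctypes.sizeof(FlaggedPayload)
# Extra bytes per object: flag byte, digit slot, and alignment padding.
overhead = flagged_size - plain_size
```

On common 32-bit and 64-bit ABIs I would expect overhead to come out as
one full machine word, which matches the 4 and 8 bytes quoted above.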

That's why I'd like my alternative proposal (int as ABC and two
subclasses that may remain anonymous to the Python user); it'll save
the alignment waste for short ints and will let us use a smaller int
type for the size for long ints (if we care about the latter).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Tue May 30 06:06:30 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 29 May 2006 21:06:30 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <447B6B77.3050608@v.loewis.de>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
	<e5fk8j$h8f$1@sea.gmane.org>
	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
	<e5fn8l$ppo$1@sea.gmane.org> <447B64B0.3080107@v.loewis.de>
	<e5fotf$v27$1@sea.gmane.org> <447B6B77.3050608@v.loewis.de>
Message-ID: <ee2a432c0605292106r57a44579h414ca2c6b8ed3ed8@mail.gmail.com>

On 5/29/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Georg Brandl wrote:
> > I thought more about PyArg_Parse for __deprecated__.
>
> Ah, PyArg_Parse can, of course. Having it actually do that should not
> hurt, either - but you need a reliable check whether the compiler
> supports __deprecated__.

Already done for gcc, see Py_DEPRECATED in pyport.h.  Would be nice if
someone could add support on Windows.

n

From tim.peters at gmail.com  Tue May 30 06:10:08 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 May 2006 00:10:08 -0400
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <ca471dc20605292023m6f5371bfo6a1e90c80f2386b0@mail.gmail.com>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<20060529220438.GA22838@code0.codespeak.net>
	<e5fs86$a47$1@sea.gmane.org> <447B92D7.1060808@canterbury.ac.nz>
	<ca471dc20605292023m6f5371bfo6a1e90c80f2386b0@mail.gmail.com>
Message-ID: <1f7befae0605292110s12a7d709pbbb6a4486cb09146@mail.gmail.com>

[Greg Ewing]
>> Although Tim pointed out that replace() only regards
>> n+1 empty strings as existing in a string of length
>> n. So for consistency, find() should only find them
>> in those places, too.

[Guido]
> And "abc".count("") should return 4.

And it does, but too much context was missing in Greg's reply to make
sense of his comment.  In context Greg was seeming to say that

    "abc".count("", 100)

should return 0, and

    "abc".find("", 100)

should return -1, since "the only" empty substrings of "abc" are at
slices 0:0, 1:1, 2:2 and 3:3.  In fact we have

>>> "abc".count("", 100)
1
>>> u"abc".count("", 100)
1
>>> "abc".find("", 100)
100
>>> u"abc".find("", 100)
100

today, although the idea that find() can return an index that doesn't
exist in the string is particularly jarring.  Since we also have:

>>> "abc".rfind("", 100)
3
>>> u"abc".rfind("", 100)
3

it's twice as jarring that "searching from the right" finds the target
at a _smaller_ index than "searching from the left".
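
For reference, a sketch of the "n+1 empty substrings" reading, checked
against a Python 3 interpreter (an assumption; the session above is from
the 2.5 trunk, where the results differ). The variable names are only
for illustration.

```python
text = "abc"

# One empty match before each character plus one at the end: n + 1 in total.
empty_count = text.count("")
replaced = text.replace("", "-")

# A start offset past the end of the string leaves nothing to search.
count_past_end = text.count("", 100)
find_past_end = text.find("", 100)
rfind_past_end = text.rfind("", 100)

# The last valid slice position, 3:3, still matches.
find_at_end = text.find("", len(text))
```

Under that reading find() and rfind() agree with each other, and neither
returns an index that does not exist in the string.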

From nnorwitz at gmail.com  Tue May 30 06:15:53 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 29 May 2006 21:15:53 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
	<e5fk8j$h8f$1@sea.gmane.org>
	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
Message-ID: <ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>

On 5/29/06, Thomas Wouters <thomas at python.org> wrote:
> On 5/29/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> >
> > this is a clear case of unnecessary meddling.  removing it won't remove
> > much code (a whopping 11 lines is dedicated to this), nor give any speed
> > ups whatsoever; all you're doing is generating extra work and support
> > issues for a bunch of third-party developers.  trust me, we have better
> > things to do with our time.
> >
> > -1 on meddling with this before 3.0.
>
> -1 from me, too, for the same reason. It would be nice if the use of
> PyArg_Parse could generate a (C) compile-time warning, but since it can't, I
> think a runtime warning for this minor case is just overkill. (The C
> compile-time warning would be useful in other cases, too... hmm, perhaps we
> can do some post-processing of .o files, looking for deprecated symbols left
> undefined...)

How can users find the implicit use of METH_OLDARGS in code like this:

 static struct PyMethodDef gestalt_methods[] = {
       {"gestalt", gestalt_gestalt},
       {NULL, NULL} /* Sentinel */
 };

 static PyMethodDef SwiMethods[]= {
 { "swi", swi_swi,0},
  { NULL, NULL}
};

The first one was taken straight from one of Georg's checkins, the
second one was modified.
How many people knew we were still using METH_OLDARGS in these places?
 I know this can be done with some simple scripts, but that won't
catch all cases.  And many people won't run such a script, much less
write one.
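
To give an idea of what such a script might look like, here is a
deliberately naive sketch. It assumes method-table entries are written
as literal {"name", func, flags} initializers on the page, so it will
miss tables built through macros and will be confused by docstrings
that contain commas or braces. The names ENTRY and implicit_oldargs are
made up for illustration.

```python
import re

# Matches one brace-delimited entry such as {"swi", swi_swi, 0}.
ENTRY = re.compile(r'\{\s*"(?P<name>[^"]+)"\s*,(?P<rest>[^{}]*)\}')


def implicit_oldargs(c_source):
    """Return names of method-table entries whose flags field is absent or 0."""
    hits = []
    for match in ENTRY.finditer(c_source):
        fields = [f.strip() for f in match.group("rest").split(",")]
        # fields[0] is the C function; fields[1], if present, is the flags value.
        flags = fields[1] if len(fields) > 1 else ""
        if flags in ("", "0"):  # METH_OLDARGS is 0, the default when omitted
            hits.append(match.group("name"))
    return hits


SAMPLE = """
static struct PyMethodDef gestalt_methods[] = {
    {"gestalt", gestalt_gestalt},
    {NULL, NULL} /* Sentinel */
};
static PyMethodDef SwiMethods[] = {
    { "swi", swi_swi,0},
    { "block", swi_block, METH_VARARGS},
    { NULL, NULL}
};
"""

found = implicit_oldargs(SAMPLE)
```

The "block" entry is an invented control case; it is not taken from any
checkin.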

OTOH, perhaps a deprecation warning on PyArg_Parse() is sufficient?
What about that?  It doesn't address other cases where OLDARGS are
used without PyArg_Parse though.

n

From kbk at shore.net  Tue May 30 06:23:03 2006
From: kbk at shore.net (Kurt B. Kaiser)
Date: Tue, 30 May 2006 00:23:03 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200605300423.k4U4N3tN024355@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  375 open ( -3) /  3264 closed (+26) /  3639 total (+23)
Bugs    :  910 open ( +3) /  5851 closed (+20) /  6761 total (+23)
RFE     :  217 open ( -1) /   220 closed ( +3) /   437 total ( +2)

New / Reopened Patches
______________________

Minor Correction to urllib2 HOWTO  (2006-05-20)
CLOSED http://python.org/sf/1492147  opened by  Mike Foord

None missing from keyword module  (2006-05-20)
CLOSED http://python.org/sf/1492218  opened by  Žiga Seilnacht

Socket-object convenience function: getpeercred().  (2006-05-20)
       http://python.org/sf/1492240  opened by  Heiko Wundram

urllib2 HOWTO - Further (minor) Corrections  (2006-05-20)
CLOSED http://python.org/sf/1492255  opened by  Mike Foord

Windows CE support (part 1)  (2006-05-21)
CLOSED http://python.org/sf/1492356  opened by  Luke Dunstan

Windows CE support (part 2)  (2006-05-27)
       http://python.org/sf/1495999  opened by  Luke Dunstan

Unification of list-comp and for syntax  (2006-05-21)
       http://python.org/sf/1492509  opened by  Heiko Wundram

distinct error type from shutil.move()  (2006-05-22)
       http://python.org/sf/1492704  opened by  Zooko O'Whielacronx

Improvements to ceval.c  (2006-05-22)
       http://python.org/sf/1492828  opened by  mrjbq7

Speed up gzip.readline (~40%)  (2005-09-04)
CLOSED http://python.org/sf/1281707  reopened by  marumari

Speed up gzip.readline (~40%)  (2005-09-04)
CLOSED http://python.org/sf/1281707  reopened by  marumari

Allow build without tracing  (2006-05-22)
CLOSED http://python.org/sf/1493102  opened by  Steve Holden

Performance enhancements for struct module  (2006-05-23)
CLOSED http://python.org/sf/1493701  opened by  Bob Ippolito

Documentation for new Struct object  (2006-05-24)
       http://python.org/sf/1494140  opened by  Bob Ippolito

PyUnicode_Resize cannot resize shared unicode object  (2006-05-25)
CLOSED http://python.org/sf/1494487  opened by  Hirokazu Yamamoto

Numeric characters not recognized.  (2006-05-24)
CLOSED http://python.org/sf/1494554  opened by  Anders Chrigström

BaseWidget.destroy updates master's childern too early  (2006-05-24)
       http://python.org/sf/1494750  opened by  Greg Couch

Scalable zipfile extension  (2003-09-27)
CLOSED http://python.org/sf/813436  reopened by  jafo

Remove types.InstanceType and new.instance  (2006-05-26)
CLOSED http://python.org/sf/1495675  opened by  Collin Winter

Fix test_exceptions.py  (2006-05-27)
       http://python.org/sf/1496135  opened by  Collin Winter

urllib2 HTTPPasswordMgr: default ports  (2006-05-28)
CLOSED http://python.org/sf/1496206  opened by  John J Lee

Convert Tkinter to METH_VARARGS style  (2006-05-29)
       http://python.org/sf/1496952  opened by  Georg Brandl

deprecate METH_OLDARGS  (2006-05-29)
CLOSED http://python.org/sf/1496957  opened by  Georg Brandl

urllib2: ensure digest auth happens in preference to basic  (2006-05-29)
CLOSED http://python.org/sf/1497027  opened by  John J Lee

Let dicts propagate the exceptions in user __eq__  (2006-05-29)
       http://python.org/sf/1497053  opened by  Armin Rigo

Patches Closed
______________

Minor Correction to urllib2 HOWTO  (2006-05-21)
       http://python.org/sf/1492147  closed by  quiver

None missing from keyword module  (2006-05-20)
       http://python.org/sf/1492218  closed by  gbrandl

urllib2 HOWTO - Further (minor) Corrections  (2006-05-21)
       http://python.org/sf/1492255  closed by  quiver

Windows CE support (part 1)  (2006-05-21)
       http://python.org/sf/1492356  closed by  loewis

PC new-logo-based icon set  (2006-05-17)
       http://python.org/sf/1490384  closed by  loewis

Cleaned up 16x16px icons for windows.  (2006-05-03)
       http://python.org/sf/1481304  closed by  loewis

property to get the docstring from fget  (2004-08-08)
       http://python.org/sf/1005461  closed by  gbrandl

Speed up gzip.readline (~40%)  (2005-09-04)
       http://python.org/sf/1281707  closed by  etrepum

Speed up gzip.readline (~40%)  (2005-09-04)
       http://python.org/sf/1281707  closed by  etrepum

Speed up gzip.readline (~40%)  (2005-09-04)
       http://python.org/sf/1281707  closed by  etrepum

scary frame speed hacks  (2004-01-13)
       http://python.org/sf/876206  closed by  tim_one

Allow build without tracing  (2006-05-22)
       http://python.org/sf/1493102  closed by  gbrandl

MacOSX: distutils support for -arch and -isysroot flags  (2006-05-13)
       http://python.org/sf/1488098  closed by  ronaldoussoren

Performance enhancements for struct module  (2006-05-23)
       http://python.org/sf/1493701  closed by  etrepum

Fix for int(string, base) wrong answers (take 2)  (2005-10-24)
       http://python.org/sf/1335972  closed by  tim_one

remove 4 ints from PyFrameObject  (2005-10-25)
       http://python.org/sf/1337051  closed by  tim_one

PyUnicode_Resize cannot resize shared unicode object  (2006-05-24)
       http://python.org/sf/1494487  closed by  doerwalter

Numeric characters not recognized.  (2006-05-24)
       http://python.org/sf/1494554  closed by  loewis

PyLong_FromString optimization  (2006-03-04)
       http://python.org/sf/1442927  closed by  tim_one

Remove some invariant conditions and assert in ceval  (2005-02-20)
       http://python.org/sf/1145039  closed by  tim_one

Scalable zipfile extension  (2003-09-27)
       http://python.org/sf/813436  closed by  jafo

Reduce number of open calls on startup  (2004-03-23)
       http://python.org/sf/921466  closed by  gbrandl

Remove types.InstanceType and new.instance  (2006-05-26)
       http://python.org/sf/1495675  closed by  gvanrossum

urllib2 HTTPPasswordMgr: default ports  (2006-05-27)
       http://python.org/sf/1496206  closed by  gbrandl

urllib2 handler naming convention collision  (2004-06-13)
       http://python.org/sf/972322  closed by  gbrandl

Rename functional to functools  (2006-04-29)
       http://python.org/sf/1478788  closed by  ncoghlan

deprecate METH_OLDARGS  (2006-05-29)
       http://python.org/sf/1496957  closed by  gbrandl

urllib2: ensure digest auth happens in preference to basic  (2006-05-29)
       http://python.org/sf/1497027  closed by  gbrandl

New / Reopened Bugs
___________________

keyword and topic help broken in Pythonwin IDE  (2006-05-15)
CLOSED http://python.org/sf/1489051  reopened by  gbrandl

Wierd Floating Point on FreeBSD4   (2006-05-20)
CLOSED http://python.org/sf/1492293  opened by  David Abrahams

bsddb: db never opened for writing forgets its size  (2006-05-22)
       http://python.org/sf/1493322  opened by  Steven Taschuk

time.strftime() %z error  (2006-05-23)
CLOSED http://python.org/sf/1493676  opened by  Cillian Sharkey

Setting a socket timeout breaks SSL  (2006-05-24)
CLOSED http://python.org/sf/1493922  opened by  Andrew Pollock

Cannot use high-numbered sockets in 2.4.3  (2006-05-24)
       http://python.org/sf/1494314  opened by  Michael Smith

SVN longobject.c compiler warnings  (2006-05-24)
CLOSED http://python.org/sf/1494387  opened by  Joel Konkle-Parker

unchecked PyObject_SetAttrString() in PyErr_SyntaxLocation()  (2006-05-24)
CLOSED http://python.org/sf/1494605  opened by  Brett Cannon

PyOS_ascii_strtod() uses malloc()  (2006-05-24)
CLOSED http://python.org/sf/1494671  opened by  Brett Cannon

PyOS_ascii_strtod() uses malloc() directly  (2006-05-25)
CLOSED http://python.org/sf/1494671  reopened by  gbrandl

pyclbr add whitespace chars to base class name  (2006-05-25)
CLOSED http://python.org/sf/1494787  opened by  Dmitry Vasiliev

int/long assume that the buffer ends in \0 => CRASH  (2006-05-25)
       http://python.org/sf/1495033  opened by  James Y Knight

sys.getfilesystemencoding  (2006-05-25)
CLOSED http://python.org/sf/1495089  opened by  Felix Wiemann

OverflowWarning still in 2.5 despite docstring  (2006-05-25)
CLOSED http://python.org/sf/1495175  opened by  Jim Jewett

Install under osx 10.4.6 breaks shell.  (2006-05-25)
       http://python.org/sf/1495210  opened by  ephantom

W3C <-> Python DOM type mapping docs need updating  (2006-05-25)
       http://python.org/sf/1495229  opened by  Mike Brown

make altinstall installs pydoc  (2006-05-26)
       http://python.org/sf/1495488  opened by  Tim Heaney

os.listdir(): inconsistent behavior with trailing spaces  (2006-05-26)
CLOSED http://python.org/sf/1495754  opened by  Kenneth J. Pronovici

Still a problem  (2006-05-26)
       http://python.org/sf/1495802  opened by  jpsc986_hj

test_float segfaults with SIGFPE on FreeBSD 6.0 / Alpha  (2006-05-27)
       http://python.org/sf/1496032  opened by  Bob Ippolito

Incorrect error message related to **kwargs  (2006-05-28)
CLOSED http://python.org/sf/1496278  opened by  Collin Winter

strptime: wrong default values used to fill in missing data  (2006-05-28)
       http://python.org/sf/1496315  opened by  Markus Gritsch

tarfile.py: dict order dependency  (2006-05-28)
       http://python.org/sf/1496501  opened by  Armin Rigo

Modules in cwd can't be imported in interactive mode  (2006-05-28)
CLOSED http://python.org/sf/1496539  opened by  Žiga Seilnacht

Building Python 2.4.3 on Solaris 9/10 with Sun Studio 11  (2006-05-28)
       http://python.org/sf/1496561  opened by  Andreas

Bugs Closed
___________

Segmentation fault while using Tkinter  (2006-05-01)
       http://python.org/sf/1479586  closed by  loewis

keyword and topic help broken in Pythonwin IDE  (2006-05-15)
       http://python.org/sf/1489051  closed by  gbrandl

Wierd Floating Point on FreeBSD4   (2006-05-21)
       http://python.org/sf/1492293  closed by  loewis

time.strftime() %z error  (2006-05-23)
       http://python.org/sf/1493676  closed by  bcannon

int(string, base) wrong answers  (2005-10-22)
       http://python.org/sf/1334662  closed by  tim_one

Setting a socket timeout breaks SSL  (2006-05-24)
       http://python.org/sf/1493922  closed by  loewis

SVN longobject.c compiler warnings  (2006-05-24)
       http://python.org/sf/1494387  closed by  tim_one

unchecked PyObject_SetAttrString() in PyErr_SyntaxLocation()  (2006-05-24)
       http://python.org/sf/1494605  closed by  gbrandl

PyOS_ascii_strtod() uses malloc() directly  (2006-05-25)
       http://python.org/sf/1494671  closed by  gbrandl

PyOS_ascii_strtod() uses malloc() directly  (2006-05-24)
       http://python.org/sf/1494671  closed by  bcannon

Erroneous \url command in python.sty  (2005-09-03)
       http://python.org/sf/1281291  closed by  fdrake

pyclbr add whitespace chars to base class name  (2006-05-25)
       http://python.org/sf/1494787  closed by  gbrandl

Setting socket timeout crashes SSL  (2005-02-27)
       http://python.org/sf/1153016  closed by  twouters

Multiple dots in relative import statement raise SyntaxError  (2006-05-15)
       http://python.org/sf/1488915  closed by  twouters

sys.getfilesystemencoding  (2006-05-25)
       http://python.org/sf/1495089  closed by  gbrandl

OverflowWarning still in 2.5 despite docstring  (2006-05-25)
       http://python.org/sf/1495175  closed by  gbrandl

os.listdir(): inconsistent behavior with trailing spaces  (2006-05-26)
       http://python.org/sf/1495754  closed by  gbrandl

urllib2 Request.get_host and proxies  (2003-03-05)
       http://python.org/sf/698374  closed by  jjlee

Incorrect error message related to **kwargs  (2006-05-28)
       http://python.org/sf/1496278  closed by  gbrandl

Modules in cwd can't be imported in interactive mode  (2006-05-28)
       http://python.org/sf/1496539  closed by  gbrandl

can't pickle slice objects  (2006-05-12)
       http://python.org/sf/1487420  closed by  gbrandl

New / Reopened RFE
__________________

Create "Performance" category?  (2006-05-22)
CLOSED http://python.org/sf/1492856  opened by  Sean Reifschneider

Integer bit operations performance improvement.  (2006-05-22)
       http://python.org/sf/1492860  opened by  Sean Reifschneider

sys.getrefcount should be in gc  (2006-05-24)
       http://python.org/sf/1494595  opened by  Jim Jewett

RFE Closed
__________

Create "Performance" category?  (2006-05-22)
       http://python.org/sf/1492856  closed by  tim_one

Create a fat build on darwin  (2005-06-10)
       http://python.org/sf/1218333  closed by  ronaldoussoren

Add encoding to DocFileSuite  (2004-12-08)
       http://python.org/sf/1080727  closed by  quiver


From tim.peters at gmail.com  Tue May 30 06:36:32 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 May 2006 00:36:32 -0400
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529213428.GA20141@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
Message-ID: <1f7befae0605292136h5dd6a37v79a84188bac957af@mail.gmail.com>

[Armin Rigo]
> ...
> ...
> Am I allowed to be grumpy here, and repeat that speed should not be
> used to justify bugs?

As a matter of fact, you are.  OTOH, nobody at the sprint made that
argument, so nobody actually feels shame on that count :-)

I apologize for the insufficiently reviewed exception-rework patch
anyway.  The massive try/raise/except slowdown in 2.5a2 (compared to
2.4.3) was depressing and surprising, and figuring it out & repairing
it consumed various people's time across the whole week.  By the time
Saturday came around, I was fried enough that "well, people have
hacked on this all week, the slowdown finally got replaced by a
speedup, and all the tests pass -- enough already, let's go home!"
ruled.  So, as everyone suspected all along, it's Neal's fault for not
running his refleak test more often ;-)

> ...
> It would have saved me two hours-long debugging sessions, and I consider
> myself an everyday Python user, so yes, I think so.

I'll try to review your patch carefully Tuesday, but I'm willing to
accept a slowdown for this too (although from a quick look I doubt
there's a measurable one, and your approach is wholly sane).

From nnorwitz at gmail.com  Tue May 30 06:48:00 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 29 May 2006 21:48:00 -0700
Subject: [Python-Dev] feature request: inspect.isgenerator
In-Reply-To: <e5fkdh$gps$1@sea.gmane.org>
References: <loom.20060529T092512-825@post.gmane.org>
	<e5fidd$aod$1@sea.gmane.org> <e5fjnq$f8j$1@sea.gmane.org>
	<e5fkdh$gps$1@sea.gmane.org>
Message-ID: <ee2a432c0605292148l1acd5d87t903261106b3eb7c9@mail.gmail.com>

On 5/29/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Fredrik Lundh wrote:
> >
> > is there some natural and obvious connection between generators and that
> > number that I'm missing, or is that constant perhaps best hidden inside
> > some introspection support module?
>
> It seems to be a combination of CO_NOFREE, CO_GENERATOR, CO_OPTIMIZED and
> CO_NEWLOCALS.
>
> The first four CO_ constants are already in inspect.py, the newer ones
> (like CO_GENERATOR) aren't.

All these constants are declared in compiler.consts.  Some are also
defined in __future__.  It would be better to have them all in a
single place.

> I wonder whether a check shouldn't just return (co_flags & 0x20), which
> is CO_GENERATOR.

Makes more sense.
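
A minimal sketch of that check, assuming a Python 3 interpreter where
inspect exposes CO_GENERATOR and functions carry a __code__ attribute.
The function names below are invented for the example.

```python
import inspect


def is_generator_function(func):
    """True if func's code object carries the CO_GENERATOR flag."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


def counter(n):
    # Contains a yield, so the compiler sets CO_GENERATOR on its code object.
    for i in range(n):
        yield i


def plain(n):
    return list(range(n))


counter_is_generator = is_generator_function(counter)
plain_is_generator = is_generator_function(plain)
```

Testing the single bit avoids depending on the other flags (CO_OPTIMIZED,
CO_NEWLOCALS, CO_NOFREE) that happen to be set alongside it.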

n

From martin at v.loewis.de  Tue May 30 07:26:10 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 30 May 2006 07:26:10 +0200
Subject: [Python-Dev] Integer representation (Was: ssize_t question:
 longs in header files)
In-Reply-To: <ca471dc20605292100q16f57fe1qd645596ad1492dae@mail.gmail.com>
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>	
	<447B66B7.90107@v.loewis.de>	
	<9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>	
	<447B6B1A.2020107@v.loewis.de>	
	<9e804ac0605291457s784e6089ge84e6315feb2c3cd@mail.gmail.com>	
	<447B7189.30408@v.loewis.de>
	<ca471dc20605292100q16f57fe1qd645596ad1492dae@mail.gmail.com>
Message-ID: <447BD772.3090303@v.loewis.de>

Guido van Rossum wrote:
>> struct PyInt{
>>   struct PyObject ob;
>>   Py_ssize_t value_or_size;
>>   char is_long;
>>   digit ob_digit[1];
>> };
>>
> 
> Nice. I guess if we store the long value in big-endian order we could
> drop is_long, since the first digit of the long would always be
> nonzero. This would save a byte (on average) for the longs, but it
> would do nothing for the wasted space for short ints.

Right; alternatively, the top-most bit of ob_digit[0] could also be
used, as longs have currently 15-bit digits.

> Why do we need to keep the PyLong_* APIs at all? Even at the Python
> level we're not planning any backward compatibility features; at the C
> level I like even more freedom to break things.

Indeed, they should get dropped.

> I worry about all the wasted space for alignment caused by the extra
> flag byte though. That would be 4 byte per integer on 32-bit machines
> (where they are currently 12 bytes) and 8 bytes on 64-bit machines
> (where they are currently 24 bytes).

I think ints should get managed by PyMalloc in Py3k. With my proposal,
an int has 16 bytes on a 32-bit machine, so there wouldn't be any
wastage for PyMalloc (which allocates 16 bytes for 12-byte objects,
anyway). On a 64-bit machine, it would indeed waste 8 bytes per
int.

> That's why I'd like my alternative proposal (int as ABC and two
> subclasses that may remain anonymous to the Python user); it'll save
> the alignment waste for short ints and will let us use a smaller int
> type for the size for long ints (if we care about the latter).

I doubt they can remain anonymous. People often dispatch by type
(e.g. pickle, xmlrpclib, ...), and need to put the type into a
dictionary. If the type is anonymous, they will do

   dispatch[type(0)] = marshal_int
   dispatch[type(sys.maxint+1)] = marshal_int

Plus, their current code has

   dispatch[int] = marshal_int

which will silently break (although it won't be silent if they also
have dispatch[long] = marshal_long).
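
A small sketch of the exact-type lookup problem, written for present-day Python. SmallInt and BigInt are hypothetical stand-ins for the two anonymous implementation types, and marshal_int and marshal are illustrative names, not taken from pickle or xmlrpclib.

```python
class SmallInt(int):
    """Hypothetical stand-in for an anonymous short-int implementation type."""


class BigInt(int):
    """Hypothetical stand-in for an anonymous long-int implementation type."""


def marshal_int(value):
    return "<int>%d</int>" % value


# The table is keyed on the exact type, as in dispatch[int] = marshal_int.
dispatch = {int: marshal_int}


def marshal(value):
    # type(value) is SmallInt or BigInt for the stand-ins, so the int key
    # is never found even though both are integers.
    handler = dispatch.get(type(value))
    if handler is None:
        raise TypeError("cannot marshal %s objects" % type(value).__name__)
    return handler(value)
```

Here the miss surfaces as a TypeError; a table with a catch-all fallback would instead take the wrong branch without any error.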

Regards,
Martin


From martin at v.loewis.de  Tue May 30 07:45:02 2006
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 30 May 2006 07:45:02 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>
Message-ID: <447BDBDE.4080001@v.loewis.de>

Neal Norwitz wrote:
> How can users find the implicit use of METH_OLDARGS in code like this:
> 
>  static struct PyMethodDef gestalt_methods[] = {
>        {"gestalt", gestalt_gestalt},
>        {NULL, NULL} /* Sentinel */
>  };
> 
>  static PyMethodDef SwiMethods[]= {
>  { "swi", swi_swi,0},
>   { NULL, NULL}
> };

They can't know they do. Of course, if they do, they likely also use
PyArg_Parse to process the arguments.

> OTOH, perhaps a deprecation warning on PyArg_Parse() is sufficient?
> What about that?  It doesn't address other cases where OLDARGS are
> used without PyArg_Parse though.

What other cases remain? People might have complex argument processing
procedure not involving PyArg_Parse, these would just break with a
runtime error in Py3k. If the module is maintained, it should be easy
to fix it. If the module is unmaintained, producing a warning now
might not help, either.

Regards,
Martin

From fredrik at pythonware.com  Tue May 30 07:46:46 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 07:46:46 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <ca471dc20605292023m6f5371bfo6a1e90c80f2386b0@mail.gmail.com>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>	<20060529195453.GB14908@code0.codespeak.net>	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>	<20060529213428.GA20141@code0.codespeak.net>	<20060529220438.GA22838@code0.codespeak.net>	<e5fs86$a47$1@sea.gmane.org>
	<447B92D7.1060808@canterbury.ac.nz>
	<ca471dc20605292023m6f5371bfo6a1e90c80f2386b0@mail.gmail.com>
Message-ID: <e5gm85$j1c$1@sea.gmane.org>

Guido van Rossum wrote:

>>> well, the empty string is a valid substring of all possible strings
>>> (there are no "null" strings in Python).  you get the same behaviour
>>> from slicing, the "in" operator, "replace" (this was discussed on the
>>> list last week), "count", etc.
 >
>> Although Tim pointed out that replace() only regards
>> n+1 empty strings as existing in a string of length
>> n. So for consistency, find() should only find them
>> in those places, too.

depends on how you interpret the reference to "slices" in the docs. 
"abc"[100:] is an empty string, and so is "abc"[100:100].

> And "abc".count("") should return 4.

it does, and has always done (afaik).  "abc".count("", 100) did use to 
return -96, though, which is hard to explain in terms of anything else.
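
The "n+1 empty strings" counting can be sketched as follows; the helper name is hypothetical, and only behaviour that is uncontroversial in current Python is exercised (the out-of-range start cases are left out on purpose).

```python
def empty_match_positions(s):
    """Indices i in 0..len(s) at which the zero-length slice s[i:i] is empty."""
    return [i for i in range(len(s) + 1) if s[i:i] == ""]


s = "abc"
positions = empty_match_positions(s)      # one slot before each char, plus the end
dashed = s.replace("", "-")               # replace() touches those same slots
beyond_end = (s[100:], s[100:100])        # out-of-range slices are simply empty
```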

</F>


From fredrik at pythonware.com  Tue May 30 07:48:50 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 07:48:50 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <20060529223557.GC22838@code0.codespeak.net>
References: <20060529171147.GA1717@code0.codespeak.net>	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>	<20060529195453.GB14908@code0.codespeak.net>	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>	<20060529213428.GA20141@code0.codespeak.net>	<20060529220438.GA22838@code0.codespeak.net>	<e5fs86$a47$1@sea.gmane.org>
	<20060529223557.GC22838@code0.codespeak.net>
Message-ID: <e5gmc0$j1c$2@sea.gmane.org>

Armin Rigo wrote:

> I know this.  These corner cases are debatable and different answers
> could be seen as correct, as I think is the case for find().  My point
> was different: I was worrying that the recent change in str.find() would
> needlessly send existing and working programs into infinite loops, which
> can be a particularly bad kind of failure for some applications.

since "abc".find("", 0) == 0, I would have thought that a program that 
searched for an empty string in a loop wouldn't get anywhere at all.

</F>


From nnorwitz at gmail.com  Tue May 30 08:14:33 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 29 May 2006 23:14:33 -0700
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <447BDBDE.4080001@v.loewis.de>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
	<e5fk8j$h8f$1@sea.gmane.org>
	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>
	<447BDBDE.4080001@v.loewis.de>
Message-ID: <ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>

On 5/29/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
> > OTOH, perhaps a deprecation warning on PyArg_Parse() is sufficient?
> > What about that?  It doesn't address other cases where OLDARGS are
> > used without PyArg_Parse though.
>
> What other cases remain? People might have complex argument processing
> procedure not involving PyArg_Parse, these would just break with a
> runtime error in Py3k. If the module is maintained, it should be easy
> to fix it. If the module is unmaintained, producing a warning now
> might not help, either.

One case would be if the function doesn't take any args, but the args
aren't checked, just ignored.  Another case I was thinking about was
PyArg_NoArgs which is a macro, but that just calls PyArg_Parse, so it
would get flagged.  I can't imagine there would be enough of these
cases to matter.

I'd be satisfied with a deprecation warning for PyArg_Parse, though we
(*) should really figure out how to make it work on Windows.  I
haven't seen anyone object to the C compiler deprecation warning.

n
--
* Where we means someone with a Windows box of course. :-)

From nnorwitz at gmail.com  Tue May 30 08:54:10 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 29 May 2006 23:54:10 -0700
Subject: [Python-Dev] fixing buildbots
Message-ID: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>

I've been starting to get some of the buildbots working again.  There
was some massive problem on May 25 where a ton of extra files were
left around.  I can't remember if I saw something about that at the
NFS sprint or not.

There is a lingering problem that I can't fix on all the boxes.  Namely:

  cp Modules/Setup{.dist,}

Should we always do that step before we build on the buildbots?  The
warning about Setup.dist being newer than Setup is displayed.  I can
understand why we wouldn't want to unconditionally overwrite a user's
modified Setup.  However, for the buildbots, it seems safer to always
overwrite it.

Any objections?  Any ideas for the best way to implement this?  A
separate BB step for Unix clients?   In the makefile?

Martin, I would have fixed it on your Solaris box, but I don't think I
can get access to the buildbot's account.

n

From g.brandl at gmx.net  Tue May 30 09:19:39 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 30 May 2006 09:19:39 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>	<447BDBDE.4080001@v.loewis.de>
	<ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>
Message-ID: <e5grmb$21r$1@sea.gmane.org>

Neal Norwitz wrote:
> On 5/29/06, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>
>> > OTOH, perhaps a deprecation warning on PyArg_Parse() is sufficient?
>> > What about that?  It doesn't address other cases where OLDARGS are
>> > used without PyArg_Parse though.
>>
>> What other cases remain? People might have complex argument processing
>> procedure not involving PyArg_Parse, these would just break with a
>> runtime error in Py3k. If the module is maintained, it should be easy
>> to fix it. If the module is unmaintained, producing a warning now
>> might not help, either.
> 
> One case would be if the function doesn't take any args, but the args
> aren't checked, just ignored.  Another case I was thinking about was
> PyArg_NoArgs which is a macro, but that just calls PyArg_Parse, so it
> would get flagged.  I can't imagine there would be enough of these
> cases to matter.
> 
> I'd be satisfied with a deprecation warning for PyArg_Parse, though we
> (*) should really figure out how to make it work on Windows.  I
> haven't seen anyone object to the C compiler deprecation warning.

There is something at
http://msdn2.microsoft.com/en-us/library/044swk7y.aspx

but it says only "C++".

Georg


From toon.knapen at fft.be  Tue May 30 09:54:46 2006
From: toon.knapen at fft.be (Toon Knapen)
Date: Tue, 30 May 2006 09:54:46 +0200
Subject: [Python-Dev] problem with compilation flags used by distutils
Message-ID: <447BFA46.3060504@fft.be>

We are compiling python on a multitude of platforms using a multitude of 
compilers and encountered problems when building python and 
python-extensions when specific flags need to be used by the compiler.

For instance, currently we're building the trunk on a Sun Solaris 9 
machine using Sun Studio 11. To make python compile in 64bit we execute 
the following sequence:

<commands>
export CC=cc
export CFLAGS="-xarch=v9"
export EXTRA_CFLAGS="-xarch=v9"
export CXX=CC
export CXXFLAGS="-xarch=v9"
export F77=f77
export FFLAGS="-xarch=v9"
export LDFLAGS="-xarch=v9"
./configure
</commands>

Unless we define 'EXTRA_CFLAGS' (which is not documented in `configure 
--help`) we need to add the "-xarch=v9" command-line option also 
manually to the generated Makefile to make sure python compiles with 
this option.

Next when compiling python-extensions (e.g. numarray) using distutils, 
distutils does not know it needs to pass the "-xarch=v9" option when 
linking a dynamic library. Distutils will still consult the LDFLAGS
environment variable, but IMO it also needs to use the value LDFLAGS had
when python itself was compiled, to ensure that extensions are built
compatibly with python itself.
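
As a rough sketch of where the build-time values can be read back from: the snippet below uses the standard-library sysconfig module of current Python (at the time of this thread the equivalent lived in distutils.sysconfig). The helper name and the choice of variables are assumptions for illustration; some entries may be None on platforms that do not record them.

```python
import sysconfig


def build_time_flags(names=("CC", "CFLAGS", "LDFLAGS", "LDSHARED")):
    """Return the values recorded when the running interpreter was built."""
    return {name: sysconfig.get_config_var(name) for name in names}


flags = build_time_flags()
for name, value in flags.items():
    print(name, "=", value)
```

An extension build that merged these values with the current environment would pick up an option such as -xarch=v9 without the user having to repeat it.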


toon



From arigo at tunes.org  Tue May 30 11:14:45 2006
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 30 May 2006 11:14:45 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <e5gmc0$j1c$2@sea.gmane.org>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<20060529220438.GA22838@code0.codespeak.net>
	<e5fs86$a47$1@sea.gmane.org>
	<20060529223557.GC22838@code0.codespeak.net>
	<e5gmc0$j1c$2@sea.gmane.org>
Message-ID: <20060530091445.GA2055@code0.codespeak.net>

Hi Fredrik,

On Tue, May 30, 2006 at 07:48:50AM +0200, Fredrik Lundh wrote:
> since "abc".find("", 0) == 0, I would have thought that a program that 
> searched for an empty string in a loop wouldn't get anywhere at all.

Indeed.  And when this bug was found in the program in question, a
natural fix was to add 1 to the start position if the searched string
was empty, which used to ensure that the loop terminates.
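
A sketch of the kind of loop being discussed, with the "add 1 when the needle is empty" fix applied. The function name is hypothetical, and the explicit bound on start is an extra safeguard added here so that termination does not depend on what find() returns past the end of the string.

```python
def find_all(haystack, needle):
    """Return every non-overlapping index at which needle occurs in haystack."""
    positions = []
    start = 0
    # The explicit bound keeps the loop finite even if find() were to
    # report an empty match beyond the end of the string.
    while start <= len(haystack):
        index = haystack.find(needle, start)
        if index < 0 or index > len(haystack):
            break
        positions.append(index)
        # Advance by at least one so an empty needle cannot stall the loop.
        start = index + max(len(needle), 1)
    return positions
```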


A bientot,

Armin

From bob at redivi.com  Tue May 30 11:16:09 2006
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 30 May 2006 02:16:09 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>
	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
Message-ID: <0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>


On May 29, 2006, at 8:00 PM, Tim Peters wrote:

> [Bob Ippolito]
>>> ...
>>> Actually, should this be a FutureWarning or a DeprecationWarning?
>
> Since it was never documented, UndocumentedBugGoingAwayError ;-)
> Short of that, yes, DeprecationWarning.  FutureWarning is for changes
> in non-exceptional behavior (e.g., if we swapped the meanings of "<"
> and ">" in struct format codes, that would rate a FutureWarning
> subclass, like InsaneFutureWarning).
>
>> OK, this behavior is implemented in revision 46537:
>>
>> (this is from ./python.exe -Wall)
>>
>>  >>> import struct
>
> ...
>
>>  >>> struct.pack('<B', -1)
>> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: struct
>> integer wrapping is deprecated
>>    return o.pack(*args)
>> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B'
>> format requires 0 <= number <= 255
>>    return o.pack(*args)
>> '\xff'
>
> We certainly don't want to see two deprecation warnings for a single
> deprecated behavior.  I suggest eliminating the "struct integer
> wrapping" warning, mostly because I had no idea what it _meant_
> before reading the comments in _struct.c  ("wrapping" is used most
> often in a proxy or delegation context in Python these days).   "'B'
> format requires 0 <= number <= 255" is perfectly clear all by itself.

What should it be called instead of wrapping? When it says it's  
wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force a  
number into meeting the expected range.
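
To make the masking concrete, a minimal sketch with n as the field width in bytes and the power of two written as a shift. The helper name is hypothetical; it only illustrates the arithmetic, not the C code in _struct.c.

```python
def mask_to_width(x, n):
    """Keep only the low 8*n bits of x (n is the field width in bytes)."""
    return x & ((1 << (8 * n)) - 1)


wrapped_negative = mask_to_width(-1, 1)    # -1 becomes 0xFF for a one-byte field
wrapped_overflow = mask_to_width(256, 1)   # 256 becomes 0x00 for a one-byte field
```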

Reducing it to one warning instead of two is kinda difficult. Is it  
worth the trouble?

-bob


From ncoghlan at gmail.com  Tue May 30 11:41:23 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 30 May 2006 19:41:23 +1000
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
	<0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
Message-ID: <447C1343.3020903@gmail.com>

Bob Ippolito wrote:
> On May 29, 2006, at 8:00 PM, Tim Peters wrote:
>> We certainly don't want to see two deprecation warnings for a single
>> deprecated behavior.  I suggest eliminating the "struct integer
>> wrapping" warning, mostly because I had no idea what it _meant_
>> before reading the comments in _struct.c  ("wrapping" is used most
>> often in a proxy or delegation context in Python these days).   "'B'
>> format requires 0 <= number <= 255" is perfectly clear all by itself.
> 
> What should it be called instead of wrapping? When it says it's  
> wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force a  
> number into meeting the expected range.

"integer overflow masking" perhaps?

> Reducing it to one warning instead of two is kinda difficult. Is it  
> worth the trouble?

If there are cases where only one warning or the other triggers, it doesn't 
seem worth the effort to try and suppress one of them when they both trigger.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From p.f.moore at gmail.com  Tue May 30 12:14:36 2006
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 30 May 2006 11:14:36 +0100
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5grmb$21r$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>
	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>
	<e5cjgn$3pg$1@sea.gmane.org>
	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>
	<e5fk8j$h8f$1@sea.gmane.org>
	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>
	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>
	<447BDBDE.4080001@v.loewis.de>
	<ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>
	<e5grmb$21r$1@sea.gmane.org>
Message-ID: <79990c6b0605300314ra840bdeg422c3067249a2c88@mail.gmail.com>

On 5/30/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Neal Norwitz wrote:
> > I'd be satisfied with a deprecation warning for PyArg_Parse, though we
> > (*) should really figure out how to make it work on Windows.  I
> > haven't seen anyone object to the C compiler deprecation warning.
>
> There is something at
> http://msdn2.microsoft.com/en-us/library/044swk7y.aspx
>
> but it says only "C++".

I checked, and it works with C as well. (VC 7.1 toolkit compiler, but
I can't believe it makes a difference...)

Paul.

From fredrik at pythonware.com  Tue May 30 13:05:55 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 13:05:55 +0200
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
References: <20060529171147.GA1717@code0.codespeak.net><008f01c68354$fc01dd70$dc00000a@RaymondLaptop1><20060529195453.GB14908@code0.codespeak.net><00de01c68363$308d37c0$dc00000a@RaymondLaptop1><20060529213428.GA20141@code0.codespeak.net><20060529220438.GA22838@code0.codespeak.net><e5fs86$a47$1@sea.gmane.org>
	<447B92D7.1060808@canterbury.ac.nz><ca471dc20605292023m6f5371bfo6a1e90c80f2386b0@mail.gmail.com>
	<1f7befae0605292110s12a7d709pbbb6a4486cb09146@mail.gmail.com>
Message-ID: <e5h8uj$ivl$1@sea.gmane.org>

Tim Peters wrote:

>>>> "abc".count("", 100)
> 1
>>>> u"abc".count("", 100)
> 1

which is the same as

>>> "abc"[100:].count("")
1

>>>> "abc".find("", 100)
> 100
>>>> u"abc".find("", 100)
> 100
>
> today, although the idea that find() can return an index that doesn't
> exist in the string is particularly jarring.  Since we also have:
>
>>>> "abc".rfind("", 100)
> 3
>>>> u"abc".rfind("", 100)
> 3
>
> it's twice as jarring that "searching from the right" finds the target
> at a _smaller_ index than "searching from the left".

well, string[pos:pos+len(substring)] == substring is true in both cases.

would "abc".find("", 100) == 3 be okay?  or should we switch to treating the optional
start and end positions as "return value boundaries" (used to filter the result) rather than
"slice directives" (used to process the source string before the operation)?  it's all trivial
to implement, and has no performance implications, but I'm not sure what the consensus
really is...

</F> 




From fredrik at pythonware.com  Tue May 30 13:17:58 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 13:17:58 +0200
Subject: [Python-Dev] fixing buildbots
References: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
Message-ID: <e5h9l6$lfm$1@sea.gmane.org>

Neal Norwitz wrote:

> I've been starting to get some of the buildbots working again.  There
> was some massive problem on May 25 where a ton of extra files were
> left around.  I can't remember if I saw something about that at the
> NFS sprint or not.

bob's new _struct module was checked in on May 25, and required some
changes to the build procedure (structmodule.c was replaced by _struct.c).

I'm not sure why such a relatively straightforward change would break the
buildbots, though...

</F> 




From fredrik at pythonware.com  Tue May 30 13:24:08 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 13:24:08 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>	<447BDBDE.4080001@v.loewis.de><ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>
	<e5grmb$21r$1@sea.gmane.org>
Message-ID: <e5ha0o$mpd$1@sea.gmane.org>

Georg Brandl wrote:

> > I'd be satisfied with a deprecation warning for PyArg_Parse, though we
> > (*) should really figure out how to make it work on Windows.  I
> > haven't seen anyone object to the C compiler deprecation warning.
>
> There is something at
> http://msdn2.microsoft.com/en-us/library/044swk7y.aspx
>
> but it says only "C++".

and links to

    http://msdn2.microsoft.com/en-us/library/c8xdzzhh.aspx

which provides a pragma that does the same thing, and is documented to work
for both C and C++, and also works for macros.

</F> 




From fredrik at pythonware.com  Tue May 30 13:39:17 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 13:39:17 +0200
Subject: [Python-Dev] Integer representation (Was: ssize_t question:
	longs in header files)
References: <ee2a432c0605282315s6cb5743bj4a37009a03ebbca2@mail.gmail.com>	<447B66B7.90107@v.loewis.de>	<9e804ac0605291433x54dd64e7o10acfc3076f5ffb0@mail.gmail.com>	<447B6B1A.2020107@v.loewis.de>	<9e804ac0605291457s784e6089ge84e6315feb2c3cd@mail.gmail.com>	<447B7189.30408@v.loewis.de><ca471dc20605292100q16f57fe1qd645596ad1492dae@mail.gmail.com>
	<447BD772.3090303@v.loewis.de>
Message-ID: <e5hat5$pt0$1@sea.gmane.org>

Martin v. Löwis wrote:

>> That's why I'd like my alternative proposal (int as ABC and two
>> subclasses that may remain anonymous to the Python user); it'll save
>> the alignment waste for short ints and will let us use a smaller int
>> type for the size for long ints (if we care about the latter).
>
> I doubt they can remain anonymous. People often dispatch by type
> (e.g. pickle, xmlrpclib, ...), and need to put the type into a
> dictionary. If the type is anonymous, they will do
>
>   dispatch[type(0)] = marshal_int
>   dispatch[type(sys.maxint+1)] = marshal_int

"may remain anonymous" may also mean that type(v) is int for all integers.

but I think isinstance(v, int) should be good enough for Py3K; I really hope all
the talk about typing and multimethods and stuff results in tools that can be used
to implement explicit dispatch machineries such as the one in XML-RPC (why
not turn all the marshal methods into a single multimethod?)
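
One way such a dispatch tool can look today is functools.singledispatch from the standard library, which did not exist when this was written and is used here only as an illustration. The marshal function and its output format are hypothetical, not the xmlrpclib implementation.

```python
from functools import singledispatch


@singledispatch
def marshal(value):
    # Fallback for types with no registered handler.
    raise TypeError("cannot marshal %s objects" % type(value).__name__)


@marshal.register(int)
def _marshal_int(value):
    return "<int>%d</int>" % value


@marshal.register(str)
def _marshal_str(value):
    return "<string>%s</string>" % value


class MyInt(int):
    """Hypothetical integer subtype; it needs no registration of its own."""
```

Dispatch follows the class hierarchy, much like isinstance(v, int), so subtypes of int reach the int handler without being listed in a table.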

</F> 




From benji at benjiyork.com  Tue May 30 15:19:47 2006
From: benji at benjiyork.com (Benji York)
Date: Tue, 30 May 2006 09:19:47 -0400
Subject: [Python-Dev] fixing buildbots
In-Reply-To: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
References: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
Message-ID: <447C4673.6050408@benjiyork.com>

Neal Norwitz wrote:
 > I can
 > understand why we wouldn't want to unconditionally overwrite a user's
 > modified Setup.  However, for the buildbots, it seems safer to always
 > overwrite it.

FWIW, with the buildbots I run, I use "clobber" mode, with which the
entire build directory is removed and rebuilt on each run.  It slows
things down a bit, but the results have been more consistent and it's 
easier to administer.
--
Benji York

From scott+python-dev at scottdial.com  Tue May 30 18:03:48 2006
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Tue, 30 May 2006 12:03:48 -0400
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <ee2a432c0605292106r57a44579h414ca2c6b8ed3ed8@mail.gmail.com>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<e5fn8l$ppo$1@sea.gmane.org>
	<447B64B0.3080107@v.loewis.de>	<e5fotf$v27$1@sea.gmane.org>
	<447B6B77.3050608@v.loewis.de>
	<ee2a432c0605292106r57a44579h414ca2c6b8ed3ed8@mail.gmail.com>
Message-ID: <447C6CE4.5020209@scottdial.com>

Neal Norwitz wrote:
> Already done for gcc, see Py_DEPRECATED in pyport.h.  Would be nice if
> someone could add support on Windows.
> 

The way that macro is used can't be made to work with the VC
compiler. I admit to not having done an extensive search for the usage
of Py_DEPRECATED, but to take from object.h:

typedef PyObject *(*intargfunc)(PyObject *, int) Py_DEPRECATED(2.5);

In GCC, you tag on __attribute__((__deprecated__)) but there is no 
equivalent tagging method in VC. In VC, you should instead put a pragma 
for the identifier: #pragma deprecated(intargfunc). AFAIK, you can't put 
a #pragma in a #define, so it seems wise to only mark functions 
deprecated (which you can do via __declspec(deprecated)).

-- 
Scott Dial
scott at scottdial.com
scodial at indiana.edu

From bob at redivi.com  Tue May 30 19:15:48 2006
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 30 May 2006 10:15:48 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <447C1343.3020903@gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>	<9e804ac0605290314j26bcd22fsb93e649af47a76f9@mail.gmail.com>	<B6FD8FE9-93FE-43F0-8FBB-B3D1D6859E85@redivi.com>	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
	<0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
	<447C1343.3020903@gmail.com>
Message-ID: <0BF5CE8B-EBF2-4E87-8549-B8DE2E50DEEA@redivi.com>

On May 30, 2006, at 2:41 AM, Nick Coghlan wrote:

> Bob Ippolito wrote:
>> On May 29, 2006, at 8:00 PM, Tim Peters wrote:
>>> We certainly don't want to see two deprecation warnings for a single
>>> deprecated behavior.  I suggest eliminating the "struct integer
>>> wrapping" warning, mostly because I had no idea what it _meant_
>>> before reading the comments in _struct.c  ("wrapping" is used most
>>> often in a proxy or delegation context in Python these days).   "'B'
>>> format requires 0 <= number <= 255" is perfectly clear all by  
>>> itself.
>> What should it be called instead of wrapping? When it says it's   
>> wrapping, it means that it's doing x &= (2 ** (8 * n)) - 1 to force  
>> a  number into meeting the expected range.
>
> "integer overflow masking" perhaps?

Sounds good enough, I'll go ahead and change the wording to that.

>> Reducing it to one warning instead of two is kinda difficult. Is  
>> it  worth the trouble?
>
> If there are cases where only one warning or the other triggers, it  
> doesn't seem worth the effort to try and suppress one of them when  
> they both trigger.

It works kinda like this:

def get_ulong(x):
	ulong_mask = (sys.maxint << 1L) | 1
	if is_unsigned and ((unsigned)x) > ulong_mask:
		x &= ulong_mask
		warning('integer overflow masking is deprecated')
	return x

def pack_ubyte(x):
	x = get_ulong(x)
	if not (0 <= x <= 255):
		warning("'B' format requires 0 <= number <= 255")
		x &= 0xff
	return chr(x)

Given the implementation, it will warn twice if sizeof(format) <  
sizeof(long) AND one of the following:
1. Negative numbers are given for an unsigned format
2. Input value is greater than ((sys.maxint << 1) | 1) for an  
unsigned format
3. Input value is not ((-sys.maxint - 1) <= x <= sys.maxint) for a  
signed format
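
The two-stage behaviour described above can be imitated in runnable form. This is only a sketch in current Python: ULONG_MASK assumes a 32-bit C unsigned long, the helpers mirror the pseudocode rather than the real C code, and warnings_for is a hypothetical helper for collecting the messages.

```python
import warnings

ULONG_MASK = (1 << 32) - 1  # assumption: 32-bit C unsigned long


def get_ulong(x):
    """First stage: force x into the unsigned long range, warning if it changes."""
    if not 0 <= x <= ULONG_MASK:
        warnings.warn("integer overflow masking is deprecated",
                      DeprecationWarning, stacklevel=2)
        x &= ULONG_MASK
    return x


def pack_ubyte(x):
    """Second stage: range check for the 'B' format, then mask to one byte."""
    x = get_ulong(x)
    if not 0 <= x <= 255:
        warnings.warn("'B' format requires 0 <= number <= 255",
                      DeprecationWarning, stacklevel=2)
        x &= 0xFF
    return bytes([x])


def warnings_for(value):
    """Pack value and return (packed bytes, list of warning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        packed = pack_ubyte(value)
    return packed, [str(w.message) for w in caught]
```

With these definitions, -1 trips both stages, while 256 passes the first stage untouched and trips only the range check.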

-bob


From bob at redivi.com  Tue May 30 19:37:04 2006
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 30 May 2006 10:37:04 -0700
Subject: [Python-Dev] Converting crc32 functions to use unsigned
Message-ID: <4CA734D0-6FEC-4A88-A00F-AB755716485D@redivi.com>

It seems that we should convert the crc32 functions in binascii,
zlib, etc. to deal with unsigned integers. Currently it seems that
32-bit and 64-bit platforms are going to have different results for
these functions.

Should we do the same as the struct module, and issue a DeprecationWarning
when the input value is < 0? Do we have a PyArg_ParseTuple format
code or a converter that would be suitable for this purpose?

None of the unit tests seem to exercise values where 32-bit and 64-bit
platforms would have differing results, but that's easy enough to
fix...
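
A sketch of how a checksum can be normalised so that it compares equal regardless of whether a platform reported it as signed or unsigned. The two helper names are hypothetical; zlib.crc32 and binascii.crc32 are the real standard-library functions.

```python
import binascii
import zlib


def to_unsigned32(value):
    """Map any integer onto the unsigned 32-bit range 0..2**32-1."""
    return value & 0xFFFFFFFF


def to_signed32(value):
    """Map any integer onto the signed 32-bit range -2**31..2**31-1."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


data = b"python-dev"
checksum = to_unsigned32(zlib.crc32(data))
same_checksum = to_unsigned32(binascii.crc32(data))
```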

-bob



From tim.peters at gmail.com  Tue May 30 19:47:04 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 May 2006 13:47:04 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
	<0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
Message-ID: <1f7befae0605301047h5a3afc25ibb412b8a6df73335@mail.gmail.com>

[Bob Ippolito]
> What should it be called instead of wrapping?

I don't know -- I don't know what it's trying to _say_ that isn't
already said by saying that the input is out of bounds for the format
code.

> When it says it's wrapping, it means that it's doing x &= (2 ^ (8 * n)) - 1 to force
> a number into meeting the expected range.

How is that different from what it does in this case?:

>>> struct.pack('<B', 256L)
/Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format
requires 0 <= number <= 255
  return o.pack(*args)
'\x00'

That looks like "wrapping" to me too (256 & (2**(8*1)-1)== 0x00), but
in this case there is no deprecation warning about wrapping.  Because
of that, I'm afraid you're drawing distinctions that can't make sense
to users.

> Reducing it to one warning instead of two is kinda difficult. Is it
> worth the trouble?

I don't understand.  Every example you gave that showed a wrapping
warning also showed a "format requires i <= number <= j" warning.  Are
there cases in which a wrapping warning is given but not a "format
requires i <= number <= j" warning?  If so, I simply haven't seen one
(but I haven't tried all possible inputs ;-)).

Since the implementation appears (to judge from the examples) to
"wrap" in every case in which any warning is given (or are there cases
in which it doesn't?), I don't understand the point of distinguishing
between wrapping warnings and  "format requires i <= number <= j"
warnings either.  The latter are crystal clear.
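
A minimal sketch of the masking under discussion, written in present-day
Python syntax. The helper name mask_to_width is made up for illustration
and is not part of the struct module. Note that in Python `^` is XOR, so
the mask has to be spelled with `**` or a shift:

```python
def mask_to_width(x, n):
    """Keep only the low n bytes of x (hypothetical helper, not struct's API)."""
    return x & ((1 << (8 * n)) - 1)

# The 'B' example above: 256 masked to one byte is 0x00.
assert mask_to_width(256, 1) == 0x00
assert mask_to_width(257, 1) == 0x01
# Negative inputs wrap around to the top of the unsigned range.
assert mask_to_width(-1, 1) == 0xFF
```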

From jcarlson at uci.edu  Tue May 30 19:58:51 2006
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 30 May 2006 10:58:51 -0700
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <e5h8uj$ivl$1@sea.gmane.org>
References: <1f7befae0605292110s12a7d709pbbb6a4486cb09146@mail.gmail.com>
	<e5h8uj$ivl$1@sea.gmane.org>
Message-ID: <20060530105503.6962.JCARLSON@uci.edu>


"Fredrik Lundh" <fredrik at pythonware.com> wrote:
> Tim Peters wrote:
> 
> >>>> "abc".count("", 100)
> > 1
> >>>> u"abc".count("", 100)
> > 1
> 
> which is the same as
> 
> >>> "abc"[100:].count("")
> 1
> 
> >>>> "abc".find("", 100)
> > 100
> >>>> u"abc".find("", 100)
> > 100
> >
> > today, although the idea that find() can return an index that doesn't
> > exist in the string is particularly jarring.  Since we also have:
> >
> >>>> "abc".rfind("", 100)
> > 3
> >>>> u"abc".rfind("", 100)
> > 3
> >
> > it's twice as jarring that "searching from the right" finds the target
> > at a _smaller_ index than "searching from the left".
> 
> well, string[pos:pos+len(substring)] == substring is true in both cases.
> 
> would "abc".find("", 100) == 3 be okay?

I would argue no.  Based on my experience with find, I would expect that
find produces values >= start, except when not found, then -1. (note
that I am talking about find and not rfind)


> or should we switch to treating the optional
> start and end positions as "return value boundaries" (used to filter the result) rather than
> "slice directives" (used to process the source string before the operation)?

Return value boundaries seem the more appealing of the two.
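
To make the two readings concrete, here is a rough sketch in present-day
Python. Both function names are invented for this illustration, only
non-negative start values are handled, and neither is how str.find is
actually implemented:

```python
def find_as_slice(s, sub, start=0):
    """'Slice directive': cut the string first, then search the remainder."""
    clamped = min(start, len(s))          # assumes start >= 0
    i = s[clamped:].find(sub)
    return -1 if i < 0 else clamped + i

def find_as_filter(s, sub, start=0):
    """'Return value boundary': search everything, keep only results >= start."""
    for i in range(len(s) - len(sub) + 1):
        if i >= start and s[i:i + len(sub)] == sub:
            return i
    return -1

# The empty-substring corner case from this thread:
assert find_as_slice("abc", "", 100) == 3     # index clamped to len(s)
assert find_as_filter("abc", "", 100) == -1   # no valid index >= 100
# For ordinary searches the two readings agree:
assert find_as_slice("abc", "c", 1) == find_as_filter("abc", "c", 1) == 2
```

Under the filtering reading, a result is never smaller than start, which
matches the expectation described above.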

 - Josiah


From bob at redivi.com  Tue May 30 20:00:04 2006
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 30 May 2006 11:00:04 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <1f7befae0605301047h5a3afc25ibb412b8a6df73335@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<9e804ac0605290903o2a52adb4y4086a9e5b92cb026@mail.gmail.com>
	<1f7befae0605291105t1c3e4265reeae0b771c171d4e@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
	<0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
	<1f7befae0605301047h5a3afc25ibb412b8a6df73335@mail.gmail.com>
Message-ID: <21F670FB-82CD-401C-92DC-23391ECCCE59@redivi.com>


On May 30, 2006, at 10:47 AM, Tim Peters wrote:

> [Bob Ippolito]
>> What should it be called instead of wrapping?
>
> I don't know -- I don't know what it's trying to _say_ that isn't
> already said by saying that the input is out of bounds for the format
> code.

The wrapping (now overflow masking) warning happens during conversion  
of PyObject* to long or unsigned long. It has no idea what the  
destination packing format is beyond whether it's signed or unsigned.  
If the packing format happens to be the same size as a long, it can't  
possibly trigger a range warning (unless range checks are moved up  
the stack and all of the function signatures and code get changed to  
accommodate that).

>> When it says it's wrapping, it means that it's doing x &= (2 ^ (8  
>> * n)) - 1 to force
>> a number into meeting the expected range.
>
> How is that different from what it does in this case?:
>
>>>> struct.pack('<B', 256L)
> /Users/bob/src/python/Lib/struct.py:63: DeprecationWarning: 'B' format
> requires 0 <= number <= 255
>  return o.pack(*args)
> '\x00'
>
> That looks like "wrapping" to me too (256 & (2**(8*1)-1)== 0x00), but
> in this case there is no deprecation warning about wrapping.  Because
> of that, I'm afraid you're drawing distinctions that can't make sense
> to users.

When it says "integer wrapping" it means that it's wrapping to fit in  
a long or unsigned long. n in this case is always 4 or 8 depending on  
the platform. The format-specific range check is separate. My  
description wasn't very good in the last email.
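
A simplified model of the two separate stages, for unsigned formats only.
This is an illustrative sketch in present-day Python, not the C code in
_struct.c; the function name, the warning labels, and the native_bytes
parameter are all assumptions made for the example:

```python
def check_unsigned(x, fmt_bytes, native_bytes=8):
    """Return (packed_value, warnings) for a hypothetical two-stage pack."""
    warnings = []
    # Stage 1: conversion to a native unsigned long. It knows nothing
    # about the destination format, only the platform width (4 or 8).
    native_mask = (1 << (8 * native_bytes)) - 1
    if not 0 <= x <= native_mask:
        warnings.append("overflow masking")
        x &= native_mask
    # Stage 2: the format-specific range check.
    fmt_max = (1 << (8 * fmt_bytes)) - 1
    if x > fmt_max:
        warnings.append("range")
        x &= fmt_max
    return x, warnings

# Fits in a long but not in a byte: only the range warning.
assert check_unsigned(256, 1) == (0, ["range"])
# Too big for a long and for a byte: both warnings.
assert check_unsigned(2**64 + 256, 1) == (0, ["overflow masking", "range"])
# Format as wide as a long: stage 2 can never fire.
assert check_unsigned(2**64 + 5, 8) == (5, ["overflow masking"])
```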

>> Reducing it to one warning instead of two is kinda difficult. Is it
>> worth the trouble?
>
> I don't understand.  Every example you gave that showed a wrapping
> warning also showed a "format requires i <= number <= j" warning.  Are
> there cases in which a wrapping warning is given but not a "format
> requires i <= number <= j" warning?  If so, I simply haven't seen one
> (but I haven't tried all possible inputs ;-)).
>
> Since the implementation appears (to judge from the examples) to
> "wrap" in every case in which any warning is given (or are there cases
> in which it doesn't?), I don't understand the point of distinguishing
> between wrapping warnings and  "format requires i <= number <= j"
> warnings either.  The latter are crystal clear.

A later email in this thread enumerates exactly which circumstances
should cause two warnings with the current implementation.

-bob


From rasky at develer.com  Tue May 30 20:12:17 2006
From: rasky at develer.com (Giovanni Bajo)
Date: Tue, 30 May 2006 20:12:17 +0200
Subject: [Python-Dev] Converting crc32 functions to use unsigned
References: <4CA734D0-6FEC-4A88-A00F-AB755716485D@redivi.com>
Message-ID: <015a01c68414$95f2af80$bf03030a@trilan>

Bob Ippolito wrote:

> It seems that we should convert the crc32 functions in binascii,
> zlib, etc. to deal with unsigned integers. 

+1!!
-- 
Giovanni Bajo

From nnorwitz at gmail.com  Tue May 30 20:13:24 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Tue, 30 May 2006 11:13:24 -0700
Subject: [Python-Dev] fixing buildbots
In-Reply-To: <e5h9l6$lfm$1@sea.gmane.org>
References: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
	<e5h9l6$lfm$1@sea.gmane.org>
Message-ID: <ee2a432c0605301113w3b075ec6n82d666200ed92b3c@mail.gmail.com>

On 5/30/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> Neal Norwitz wrote:
>
> > I've been starting to get some of the buildbots working again.  There
> > was some massive problem on May 25 where a ton of extra files were
> > left around.  I can't remember if I saw something about that at the
> > NFS sprint or not.
>
> bob's new _struct module was checked in on May 25, and required some
> changes to the build procedure (structmodule.c was replaced by _struct.c).
>
> I'm not sure why such a relatively straightforward change would break the
> buildbots, though...

It wasn't the buildbots exactly, but rather the test suite not being
robust enough.  There are several tests that create files/directories
on setup and remove them on teardown.  However, if there's some
exception and teardown doesn't do its job completely, on the next
run, setup will fail.  I fixed one of these in test_repr last night
IIRC.  test_glob I tried to fix with a change to the Makefile to rm -f
areallylong*.  Not sure why the problems didn't appear on Windows.

The tests should really be made more robust.
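
One way a test can avoid leaving files behind, sketched with present-day
unittest (addCleanup did not exist when this was written, and the class
and file names below are invented for the example). The cleanup is
registered right after the directory is created, so it runs even if the
rest of setUp or the test itself raises:

```python
import os
import shutil
import tempfile
import unittest

class ScratchDirTest(unittest.TestCase):
    def setUp(self):
        # A fresh directory per test, so leftovers from a crashed run
        # can never collide with this one.
        self.scratch = tempfile.mkdtemp()
        self.addCleanup(shutil.rmtree, self.scratch, ignore_errors=True)

    def test_create_file(self):
        path = os.path.join(self.scratch, "areallylongname.txt")
        with open(path, "w") as f:
            f.write("data")
        self.assertTrue(os.path.exists(path))
```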

n

From guido at python.org  Tue May 30 20:19:28 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 May 2006 11:19:28 -0700
Subject: [Python-Dev] Converting crc32 functions to use unsigned
In-Reply-To: <015a01c68414$95f2af80$bf03030a@trilan>
References: <4CA734D0-6FEC-4A88-A00F-AB755716485D@redivi.com>
	<015a01c68414$95f2af80$bf03030a@trilan>
Message-ID: <ca471dc20605301119u7aead937x7de7f4680ef85de5@mail.gmail.com>

Seems ok, except I don't know what the backwards incompatibilities would be...

On 5/30/06, Giovanni Bajo <rasky at develer.com> wrote:
> Bob Ippolito wrote:
>
> > It seems that we should convert the crc32 functions in binascii,
> > zlib, etc. to deal with unsigned integers.
>
> +1!!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Tue May 30 20:35:27 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 30 May 2006 20:35:27 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5ha0o$mpd$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>	<447BDBDE.4080001@v.loewis.de><ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>	<e5grmb$21r$1@sea.gmane.org>
	<e5ha0o$mpd$1@sea.gmane.org>
Message-ID: <e5i389$vmi$1@sea.gmane.org>

Fredrik Lundh wrote:
> Georg Brandl wrote:
> 
>> > I'd be satisfied with a deprecation warning for PyArg_Parse, though we
>> > (*) should really figure out how to make it work on Windows.  I
>> > haven't seen anyone object to the C compiler deprecation warning.
>>
>> There is something at
>> http://msdn2.microsoft.com/en-us/library/044swk7y.aspx
>>
>> but it says only "C++".
> 
> and links to
> 
>     http://msdn2.microsoft.com/en-us/library/c8xdzzhh.aspx
> 
> which provides a pragma that does the same thing, and is documented to work
> for both C and C++, and also works for macros.

But we'd have to use an #ifdef for every deprecated function.

Of course, there aren't so many of them.

Georg


From fredrik at pythonware.com  Tue May 30 20:40:28 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 30 May 2006 20:40:28 +0200
Subject: [Python-Dev] Remove METH_OLDARGS?
In-Reply-To: <e5i389$vmi$1@sea.gmane.org>
References: <e5bhs7$509$1@sea.gmane.org>	<ee2a432c0605280014s7f1dd85bo657d4df15eae1779@mail.gmail.com>	<e5cjgn$3pg$1@sea.gmane.org>	<ee2a432c0605282137j589191e7j713f5382ac6c77c4@mail.gmail.com>	<e5fk8j$h8f$1@sea.gmane.org>	<9e804ac0605291350v65769f3by45b2431a29c53e72@mail.gmail.com>	<ee2a432c0605292115h125fa64kea216256f96f35e@mail.gmail.com>	<447BDBDE.4080001@v.loewis.de><ee2a432c0605292314y6bcc9dadhcd02b8f3b3144d3e@mail.gmail.com>	<e5grmb$21r$1@sea.gmane.org>	<e5ha0o$mpd$1@sea.gmane.org>
	<e5i389$vmi$1@sea.gmane.org>
Message-ID: <e5i3iq$20v$1@sea.gmane.org>

Georg Brandl wrote:

>> and links to
>>
>>     http://msdn2.microsoft.com/en-us/library/c8xdzzhh.aspx
>>
>> which provides a pragma that does the same thing, and is documented to work
>> for both C and C++, and also works for macros.
> 
> But we'd have to use an #ifdef for every deprecated function.

or a pydeprecated.h file with all the relevant functions and macros 
inside single ifdef.

</F>



From bob at redivi.com  Tue May 30 21:05:51 2006
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 30 May 2006 12:05:51 -0700
Subject: [Python-Dev] Converting crc32 functions to use unsigned
In-Reply-To: <ca471dc20605301119u7aead937x7de7f4680ef85de5@mail.gmail.com>
References: <4CA734D0-6FEC-4A88-A00F-AB755716485D@redivi.com>
	<015a01c68414$95f2af80$bf03030a@trilan>
	<ca471dc20605301119u7aead937x7de7f4680ef85de5@mail.gmail.com>
Message-ID: <393BF803-E543-4058-84F6-537A5AB38B56@redivi.com>

On May 30, 2006, at 11:19 AM, Guido van Rossum wrote:

> On 5/30/06, Giovanni Bajo <rasky at develer.com> wrote:
>> Bob Ippolito wrote:
>>
>> > It seems that we should convert the crc32 functions in binascii,
>> > zlib, etc. to deal with unsigned integers.
>>
>> +1!!
>
> Seems ok, except I don't know what the backwards incompatibilities  
> would be...
>

I think the only compatibility issues we're going to run into are  
with the struct module in the way of DeprecationWarning. If people  
are depending on specific negative values for these, then their code  
should already be broken on 64-bit platforms. The only potential  
breakage I can see is if they're passing these values to other  
functions written in C that expect PyInt_AsLong(n) to work with the  
values (on 32-bit platforms). I can't think of a use case for that  
beyond the functions themselves and the struct module.
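
For code that has to work with either convention, the two views of a
32-bit checksum can be converted with a mask. This is a small sketch in
present-day Python; the helper names are made up for illustration:

```python
import binascii

def to_unsigned32(n):
    """Map a possibly negative 32-bit value to the range 0 .. 2**32 - 1."""
    return n & 0xFFFFFFFF

def to_signed32(n):
    """Map a value to the range -2**31 .. 2**31 - 1."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n & 0x80000000 else n

crc = to_unsigned32(binascii.crc32(b"hello world"))
assert 0 <= crc <= 0xFFFFFFFF
# Round-tripping through the signed view loses nothing.
assert to_unsigned32(to_signed32(crc)) == crc
```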

-bob


From python at copton.net  Tue May 30 20:56:54 2006
From: python at copton.net (Alexander Bernauer)
Date: Tue, 30 May 2006 20:56:54 +0200
Subject: [Python-Dev] bug in PEP 318
Message-ID: <20060530185654.GG17288@copton.net>

Hi

I found two bugs in example 4 of the PEP 318 [1]. People on #python
pointed me to this list. So here is my report. Additionally I appended
an afaics correct implementation for this task.

[1] http://www.python.org/dev/peps/pep-0318/

Bug 1)
The decorator "accepts" gets the function which is returned by the
decorator "returns". This is the function "new_f" which is defined
differently from the function "func". Because the first has an
argument count of zero, the assertion on line 3 is wrong.

Bug 2)
The assertion on line 6 does not work correctly for tuples. If the
second argument of "isinstance" is a tuple, the function returns true,
if the first argument is an instance of either type of the tuple.
Unfortunately the example uses tuples.


Here is my proposal for these decorators. Feel free to do anything you
like with it.

---8<---
def checktype(o, t):
    """check if the type of the object "o" is "t" """
    # in case of a tuple descend
    if isinstance(t, tuple):
        assert isinstance(o, tuple), str(type(o))
        assert len(o) == len(t), "%d %d" % (len(o), len(t))
        for (o2, t2) in zip(o, t):
            checktype(o2, t2)

    # isinstance(None, None) raises a TypeError.
    # so we need extra handling for this case
    elif t is None:
        assert o is None
    else:
        assert isinstance(o, t), "%s %s" % (str(type(o)), str(t))

class accepts(object):
    def __init__(self, *types):
        self.types = types

    def __call__(self, func):
        self.func = func

        def check(*args, **kwds):
            checktype(args, self.types)
            self.func(*args, **kwds)
     
        return check

class returns(object):
    def __init__(self, *types):
        self.types = types

    def __call__(self, func):
        self.func = func

        def check(*args, **kwds):
            value = self.func(*args, **kwds)
            # if the function returns only one object, that object is not a tuple,
            # so this case needs extra handling here
            if isinstance(value, tuple):
                checktype(value, self.types)
            else:
                checktype((value,), self.types)
            return value

        return check

--->8---

To be honest, I didn't understand what the purpose of setting
"func_name" is, so I left it out.  If it's necessary please feel free to
correct me.

In contrast to tuples, lists and dictionaries are not inspected. The
reason is that I don't know how to express: "the function accepts a list
of 3 or more integers" or the like. Perhaps somebody has an idea for this.

Here are some examples of what is possible:

---8<---
@accepts(int)
@returns(int)
def foo(a):
    return a

foo(3)


@accepts(int, (float, int))
@returns(int, str)
def foo(a, b):
    return (1, "asdf")

foo(1, (2.0, 1))


@returns(None)
def foo():
    pass

foo()


@accepts(int, int)
def foo(a, *args):
    pass

foo(3,3)


@accepts(tuple)
@returns(list)
def foo(a):
   return list(a)

foo((1,1))

--->8---

I wonder, how it can be, that those imho obvious mistakes go into a PEP
and stay undetected there for almost 3 years. Perhaps I misunderstood
something completely. In this case, please excuse me and ignore my mail.


The welcome message to this list asked me to introduce myself. Well,
there is nothing special to say about me concerning python. Let's say,
I'm just a fan.

cu

-- 
Alexander Bernauer

From brett at python.org  Tue May 30 22:11:51 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 30 May 2006 13:11:51 -0700
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
References: <44716940.9000300@acm.org> <1148318480.19392.6.camel@fsol>
	<ca471dc20605221028y91d50e0n4dcc2ed7b186fc70@mail.gmail.com>
	<4472B196.7070506@acm.org>
	<ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>
	<447BC126.8050107@acm.org>
	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>
	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>
	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>
	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
Message-ID: <bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>

On 5/30/06, Steven Bethard <steven.bethard at gmail.com> wrote:
> On 5/30/06, Brett Cannon <brett at python.org> wrote:
> > So, first step in my mind is settling if we want to add one more depth to
> > the stdlib, and if so, how we want to group (not specific groupings, just
> > general guidelines).
>
> I think that having a package level that exactly matches the divisions
> in the Library Reference (http://docs.python.org/lib/lib.html ) would
> be great.

Makes sense to me.

> Currently, that would mean packages for:
>
> 3. Python Runtime Services
> 4. String Services
> 5. Miscellaneous Services

I don't think we necessarily want a misc package.  Should stuff that falls
into here just be at the root level? Besides, some stuff, such as heapq,
bisect, collections, and the User* modules could go into a data structure
package.  I also think that a testing package would make sense.  Could also
have a math package.

> 6. Generic Operating System Services
> 7. Optional Operating System Services

This includes socket, which I would think would belong more in a
networking-centric package (not including web-specific stuff).  Plus I
believe a threading/concurrency package has been proposed before (which
included hiding 'thread' so that people wouldn't use such low-level stuff).

> 8. Unix Specific Services
> 9. The Python Debugger
> 10. The Python Profiler

Can't pdb and the profiler go into a developer package?

> 11. Internet Protocols and Support

Should xmlrpclib be in here, or in something more in line with RPC and
"concurrency"?

> 12. Internet Data Handling

Should we merge this with a more generic Internet/Web package?  Have a
separate email package that includes 'email', smtp, etc?

> 13. Structured Markup Processing Tools
> 14. Multimedia Services
> 15. Cryptographic Services
> 16. Graphical User Interfaces with Tk
> 17. Restricted Execution

=)  This section's not really valid anymore (although I will be fixing that
at some point).

> 18. Python Language Services
> 19. Python compiler package
> 20. SGI IRIX Specific Services
> 21. SunOS Specific Services
> 22. MS Windows Specific Services

From pje at telecommunity.com  Tue May 30 23:17:53 2006
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 30 May 2006 17:17:53 -0400
Subject: [Python-Dev] bug in PEP 318
In-Reply-To: <20060530185654.GG17288@copton.net>
Message-ID: <5.1.1.6.0.20060530170728.0462d220@mail.telecommunity.com>

At 08:56 PM 5/30/2006 +0200, Alexander Bernauer wrote:
>Hi
>
>I found two bugs in example 4 of the PEP 318 [1]. People on #python
>pointed me to this list. So here is my report. Additionally I appended
>an afaics correct implementation for this task.
>
>[1] http://www.python.org/dev/peps/pep-0318/
>
>Bug 1)
>The decorator "accepts" gets the function which is returned by the
>decorator "returns". This is the function "new_f" which is defined
>differently from the function "func". Because the first has an
>argument count of zero, the assertion on line 3 is wrong.

The simplest fix for this would be to require that returns() be used 
*before* accepts().


>Bug 2)
>The assertion on line 6 does not work correctly for tuples. If the
>second argument of "isinstance" is a tuple, the function returns true,
>if the first argument is an instance of either type of the tuple.

This is intentional, to allow saying that the given argument may be (for 
example) an int or a float.  What it doesn't support is nested-tuple 
arguments, but that's a reasonable omission given the examples' nature as 
examples.
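
For reference, this is how isinstance treats a tuple as its second
argument. The check below is a plain illustration in present-day Python,
not part of the PEP's example:

```python
# A tuple of types means "any one of these types".
assert isinstance(3, (int, float))
assert isinstance(2.5, (int, float))
assert not isinstance("3", (int, float))

# It does not describe the shape of a tuple argument: the value (2.0, 1)
# is itself neither a float nor an int, so this check is False.
assert not isinstance((2.0, 1), (float, int))
```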


>         def check(*args, **kwds):
>             checktype(args, self.types)
>             self.func(*args, **kwds)

This needs a 'return', since it otherwise loses the function's return value.


>To be honest, I didn't understand what the purpose of setting
>"func_name" is, so I left it out.  If it's necessary please feel free to
>correct me.

It's needed so that Python documentation tools such as help(), pydoc, and
so on display something more correct for the decorated function, although
copying the __doc__ attribute would also be required for that.
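
A sketch of a wrapper that both returns the wrapped function's result and
carries over its name and docstring. It uses functools.wraps (available
from Python 2.5 on) and is written in present-day syntax; it is a
simplified illustration with flat positional type checks only, not a
replacement for the PEP example:

```python
import functools

def accepts(*types):
    """Check positional argument types (flat checks only, no nesting)."""
    def decorate(func):
        @functools.wraps(func)        # copies __name__, __doc__, etc.
        def check(*args, **kwds):
            for arg, expected in zip(args, types):
                assert isinstance(arg, expected), \
                    "%r is not %r" % (arg, expected)
            return func(*args, **kwds)   # keep the return value
        return check
    return decorate

@accepts(int, (int, float))
def scale(a, b):
    """Multiply a by b."""
    return a * b

assert scale(2, 1.5) == 3.0
assert scale.__name__ == "scale"
assert scale.__doc__ == "Multiply a by b."
```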


>In contrast to tuples, lists and dictionaries are not inspected. The
>reason is that I don't know how to express: "the function accepts a list
>of 3 or more integers" or the like. Perhaps somebody has an idea for this.

I think perhaps you've mistaken the PEP examples for an attempt to 
implement some kind of typechecking feature.  They are merely examples to 
show an idea of what is possible with decorators, nothing more.  They are 
not even intended for anybody to actually use!


>I wonder, how it can be, that those imho obvious mistakes go into a PEP
>and stay undetected there for almost 3 years.

That's because nobody uses them; a PEP example is not intended or required 
to be a robust production implementation of the idea it sketches.  They are 
proofs-of-concept, not source code for a utility.  If they were intended to 
be used, they would be in a reference library or in the standard library, 
or otherwise offered in executable form.


From t-bruch at microsoft.com  Tue May 30 23:02:36 2006
From: t-bruch at microsoft.com (Bruce Christensen)
Date: Tue, 30 May 2006 14:02:36 -0700
Subject: [Python-Dev] Socket module corner cases
Message-ID: <3581AA168D87A2479D88EA319BDF7D325CA909@RED-MSG-80.redmond.corp.microsoft.com>

Hello,

I'm working on implementing a socket module for IronPython that aims to
be compatible with the standard CPython module documented at
http://docs.python.org/lib/module-socket.html. I have a few questions
about some corner cases that I've found. CPython results below are from
Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on
win32.

Without further ado, the questions:

 * getfqdn(): The module docs specify that if no FQDN can be found,
socket.getfqdn() should return the hostname as returned by
gethostname(). However, CPython seems to return the passed-in hostname
rather than the local machine's hostname (as would be expected from
gethostname()). What's the correct behavior?
>>> s.getfqdn(' asdlfk asdfsadf ')
'asdlfk asdfsadf'
# expected 'mybox.mydomain.com'

 * getfqdn(): The function seems to not always return the FQDN. For
example, if I run the following code from 'mybox.mydomain.com', I get
strange output. Does getfqdn() remove the common domain between my
hostname and the one that I'm looking up?
>>> socket.getfqdn('otherbox')
'OTHERBOX'
# expected 'otherbox.mydomain.com'
 
 * gethostbyname_ex(): The CPython implementation doesn't seem to
respect the '' == INADDR_ANY and '<broadcast>' == INADDR_BROADCAST
forms. '' is treated as localhost, and '<broadcast>' raises a "host not
found" error. Is this intentional? A quick check seems to reveal that
gethostbyname() is the only function that respects '' and '<broadcast>'.
Are the docs or the implementation wrong?

 * getprotobyname(): Only a few protocols seem to be supported. Why?
>>> for p in [a[8:] for a in dir(socket) if a.startswith('IPPROTO_')]:
...     try:
...             print p,
...             print socket.getprotobyname(p)
...     except socket.error:
...             print "(not handled)"
...
AH (not handled)
DSTOPTS (not handled)
ESP (not handled)
FRAGMENT (not handled)
GGP 3
HOPOPTS (not handled)
ICMP 1
ICMPV6 (not handled)
IDP (not handled)
IGMP (not handled)
IP 0
IPV4 (not handled)
IPV6 (not handled)
MAX (not handled)
ND (not handled)
NONE (not handled)
PUP 12
RAW (not handled)
ROUTING (not handled)
TCP 6
UDP 17

Thanks for your help!

--Bruce


From martin at v.loewis.de  Tue May 30 23:34:29 2006
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Tue, 30 May 2006 23:34:29 +0200
Subject: [Python-Dev] fixing buildbots
In-Reply-To: <447C4673.6050408@benjiyork.com>
References: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
	<447C4673.6050408@benjiyork.com>
Message-ID: <1149024869.447cba650ff0d@www.domainfactory-webmail.de>

Zitat von Benji York <benji at benjiyork.com>:

> FWIW, with the buildbots I run, I use "clobber" mode, with which the
> entire build directory is removed and rebuilt on each run.  It slows
> things down a bit, but the results have been more consistent and it's
> easier to administer.

We should not do this. Performing a checkout each time is too expensive.

Regards,
Martin




From steven.bethard at gmail.com  Tue May 30 23:36:02 2006
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue, 30 May 2006 15:36:02 -0600
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>
References: <44716940.9000300@acm.org>
	<ca471dc20605221028y91d50e0n4dcc2ed7b186fc70@mail.gmail.com>
	<4472B196.7070506@acm.org>
	<ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>
	<447BC126.8050107@acm.org>
	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>
	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>
	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>
	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
	<bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>
Message-ID: <d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>

[Steven Bethard]
> I think that having a package level that exactly matches the divisions
> in the Library Reference (http://docs.python.org/lib/lib.html) would
> be great.

[Brett Cannon]
> Makes sense to me.
>
[snip]
> > 5. Miscellaneous Services
>
> I don't think we necessarily want a misc package.  Should stuff that falls
> into here just be at the root level? Besides, some stuff, such as heapq,
> bisect, collections, and the User* modules could go into a data structure
> package.  I also think that a testing package would make sense.  Could also
> have a math package.

That sounds about reasonable.  One possible grouping:

Testing Stuff:
    * 5.2 doctest -- Test interactive Python examples
    * 5.3 unittest -- Unit testing framework
    * 5.4 test -- Regression tests package for Python
    * 5.5 test.test_support -- Utility functions for tests

Math/Numeric Stuff:
    * 5.6 decimal -- Decimal floating point arithmetic
    * 5.7 math -- Mathematical functions
    * 5.8 cmath -- Mathematical functions for complex numbers
    * 5.9 random -- Generate pseudo-random numbers
    * 5.10 whrandom -- Pseudo-random number generator

Data Structures/Collections Stuff:
    * 5.11 bisect -- Array bisection algorithm
    * 5.12 collections -- High-performance container datatypes
    * 5.13 heapq -- Heap queue algorithm
    * 5.14 array -- Efficient arrays of numeric values
    * 5.15 sets -- Unordered collections of unique elements
    * 5.16 itertools -- Functions creating iterators for efficient looping

Stuff I still don't know what to do with:
    * 5.1 pydoc -- Documentation generator and online help system
    * 5.17 ConfigParser -- Configuration file parser
    * 5.18 fileinput -- Iterate over lines from multiple input streams
    * 5.19 calendar -- General calendar-related functions
    * 5.20 cmd -- Support for line-oriented command interpreters
    * 5.21 shlex -- Simple lexical analysis

> > 8. Unix Specific Services
> > 9. The Python Debugger
> > 10. The Python Profiler
>
> Can't pdb and the profiler go into a developer package?

I thought the same thing when I copied the list over the first time.

Steve
-- 
Grammar am for people who can't think for myself.
        --- Bucky Katt, Get Fuzzy

From pj at place.org  Tue May 30 23:41:12 2006
From: pj at place.org (Paul Jimenez)
Date: Tue, 30 May 2006 16:41:12 -0500
Subject: [Python-Dev] uriparsing library (Patch #1462525)
Message-ID: <20060530214112.77805179C63@place.org>


http://sourceforge.net/tracker/index.php?func=detail&aid=1462525&group_id=5470&atid=305470 

This is just a note to ask when the best time to try and get this in is
- I've seen other new/changed libs going in for 2.5 and wanted to make
sure this didn't fall off the radar. If now's the wrong time, please let
me know when the right time is so I can stick my head up again then.

Thanks,

  --pj



From martin at v.loewis.de  Tue May 30 23:23:23 2006
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Tue, 30 May 2006 23:23:23 +0200
Subject: [Python-Dev] fixing buildbots
In-Reply-To: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
References: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
Message-ID: <1149024203.447cb7cbd5f92@www.domainfactory-webmail.de>

Zitat von Neal Norwitz <nnorwitz at gmail.com>:

> I've been starting to get some of the buildbots working again.  There
> was some massive problem on May 25 where a ton of extra files were
> left around.  I can't remember if I saw something about that at the
> NFS sprint or not.

You can clean each buildbot remotely by forcing a build on a
(non-existing) branch. That removes the build tree, and then
tries to build the branch (which fails if the branch doesn't
exist). The next build will then start with a fresh checkout
of the trunk.

> There is a lingering problem that I can't fix on all the boxes.  Namely:
>
>   cp Modules/Setup{.dist,}
>
> Should we always do that step before we build on the buildbots?

If so, the easiest way to do it would be to "make distclean" in
the clean step: that removes Modules/Setup as well.

> Any objections?

See above. Rather than copying Modules/Setup over, I'd prefer to
see a distclean.

Regards,
Martin



From brett at python.org  Wed May 31 01:09:15 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 30 May 2006 16:09:15 -0700
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
References: <44716940.9000300@acm.org> <4472B196.7070506@acm.org>
	<ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>
	<447BC126.8050107@acm.org>
	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>
	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>
	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>
	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
	<bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>
	<d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
Message-ID: <bbaeab100605301609o294222cag7dbb2a4b68e02655@mail.gmail.com>

On 5/30/06, Steven Bethard <steven.bethard at gmail.com> wrote:
>
> [Steven Bethard]
> > I think that having a package level that exactly matches the divisions
> > in the Library Reference (http://docs.python.org/lib/lib.html) would
> > be great.
>
> [Brett Cannon]
> > Makes sense to me.
> >
> [snip]
> > > 5. Miscellaneous Services
> >
> > I don't think we necessarily want a misc package.  Should stuff that
> > falls into here just be at the root level? Besides, some stuff, such as
> > heapq, bisect, collections, and the User* modules could go into a data
> > structure package.  I also think that a testing package would make
> > sense.  Could also have a math package.
>
> That sounds about reasonable.  One possible grouping:
>
> Testing Stuff:
>     * 5.2 doctest -- Test interactive Python examples
>     * 5.3 unittest -- Unit testing framework
>     * 5.4 test -- Regression tests package for Python
>     * 5.5 test.test_support -- Utility functions for tests


Yep, that's what I was thinking, but possibly renaming 'test' to something
else or putting its functions at the package top level.

> Math/Numeric Stuff:
>     * 5.6 decimal -- Decimal floating point arithmetic
>     * 5.7 math -- Mathematical functions
>     * 5.8 cmath -- Mathematical functions for complex numbers
>     * 5.9 random -- Generate pseudo-random numbers
>     * 5.10 whrandom -- Pseudo-random number generator
>
> Data Structures/Collections Stuff:
>     * 5.11 bisect -- Array bisection algorithm
>     * 5.12 collections -- High-performance container datatypes
>     * 5.13 heapq -- Heap queue algorithm
>     * 5.14 array -- Efficient arrays of numeric values
>     * 5.15 sets -- Unordered collections of unique elements
>     * 5.16 itertools -- Functions creating iterators for efficient looping
>
> Stuff I still don't know what to do with:
>     * 5.1 pydoc -- Documentation generator and online help system
>     * 5.17 ConfigParser -- Configuration file parser
>     * 5.18 fileinput -- Iterate over lines from multiple input streams
>     * 5.19 calendar -- General calendar-related functions
>     * 5.20 cmd -- Support for line-oriented command interpreters
>     * 5.21 shlex -- Simple lexical analysis


There are just going to be some things that are odd.  Those will probably
just have to sit at the top level, where they get proper visibility, rather
than being stuck in misc, which would force people to hunt in there every
time they can't find a specific module that does what they want.


> > > 8. Unix Specific Services
> > > 9. The Python Debugger
> > > 10. The Python Profiler
> >
> > Can't pdb and the profiler go into a developer package?
>
> I thought the same thing when I copied the list over the first time.



Well, there you go then.  It's just *that* good of an idea.  =)

-Brett


> Steve
> --
> Grammar am for people who can't think for myself.
>         --- Bucky Katt, Get Fuzzy

From amk at amk.ca  Wed May 31 01:16:14 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 30 May 2006 19:16:14 -0400
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
References: <ca471dc20605221028y91d50e0n4dcc2ed7b186fc70@mail.gmail.com>
	<4472B196.7070506@acm.org>
	<ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>
	<447BC126.8050107@acm.org>
	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>
	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>
	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>
	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
	<bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>
	<d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
Message-ID: <20060530231614.GA415@Andrew-iBook2.local>

On Tue, May 30, 2006 at 03:36:02PM -0600, Steven Bethard wrote:
> That sounds about reasonable.  One possible grouping:

Note that 2.5's library reference has a different chapter organization
from 2.4's.  See <http://docs.python.org/dev/lib/lib.html>.

--amk

From brett at python.org  Wed May 31 01:27:13 2006
From: brett at python.org (Brett Cannon)
Date: Tue, 30 May 2006 16:27:13 -0700
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <20060530231614.GA415@Andrew-iBook2.local>
References: <ca471dc20605221028y91d50e0n4dcc2ed7b186fc70@mail.gmail.com>
	<ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>
	<447BC126.8050107@acm.org>
	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>
	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>
	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>
	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
	<bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>
	<d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
	<20060530231614.GA415@Andrew-iBook2.local>
Message-ID: <bbaeab100605301627i66dfdef4jcfe220b6866826d0@mail.gmail.com>

On 5/30/06, A.M. Kuchling <amk at amk.ca> wrote:
>
> On Tue, May 30, 2006 at 03:36:02PM -0600, Steven Bethard wrote:
> > That sounds about reasonable.  One possible grouping:
>
> Note that 2.5's library reference has a different chapter organization
> from 2.4's.  See <http://docs.python.org/dev/lib/lib.html>.



That's better.  Here are my issues with it:

I don't know if I like 'data types' containing stuff like datetime, which
provides information rather than helping deal with data.  And other stuff
like sched doesn't seem to really have anything to do with data directly.

I think pdb and profile can be moved to developer tools.

I realize that the misc. category is there just to have a place to stick
'formatter' since it doesn't need its own root-level spot, but I think in
terms of a package I don't like the idea of a misc package.


-Brett

From andrewm at object-craft.com.au  Wed May 31 02:25:22 2006
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed, 31 May 2006 10:25:22 +1000
Subject: [Python-Dev] Socket module corner cases
In-Reply-To: <3581AA168D87A2479D88EA319BDF7D325CA909@RED-MSG-80.redmond.corp.microsoft.com>
References: <3581AA168D87A2479D88EA319BDF7D325CA909@RED-MSG-80.redmond.corp.microsoft.com>
Message-ID: <20060531002523.1208C6F4C5A@longblack.object-craft.com.au>

>Without further ado, the questions:
>
> * getfqdn(): The module docs specify that if no FQDN can be found,
>socket.getfqdn() should return the hostname as returned by
>gethostname(). However, CPython seems to return the passed-in hostname
>rather than the local machine's hostname (as would be expected from
>gethostname()). What's the correct behavior?
>>>> s.getfqdn(' asdlfk asdfsadf ')
>'asdlfk asdfsadf'
># expected 'mybox.mydomain.com'

I would suggest the documentation is wrong and the CPython code is right
in this case: if you supply the optional /name/ argument, then you don't
want it returning your own name (but returning gethostname() is desirable
if no /name/ is supplied).

> * getfqdn(): The function seems to not always return the FQDN. For
>example, if I run the following code from 'mybox.mydomain.com', I get
>strange output. Does getfqdn() remove the common domain between my
>hostname and the one that I'm looking up?
>>>> socket.getfqdn('otherbox')
>'OTHERBOX'
># expected 'otherbox.mydomain.com'

getfqdn() calls the system library gethostbyaddr() and searches the
result for the first name that contains a '.'.  If no name contains a dot,
it returns the canonical name (as defined by gethostbyaddr() and the
system host name resolver libraries: hosts file, DNS, NMB).
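
A rough sketch of the lookup order described above, written so that it can
be exercised without a real resolver.  The names pick_fqdn, getfqdn_sketch
and the resolver parameter are made up for illustration; this is only an
approximation of the behaviour discussed in this thread, not the code in
Lib/socket.py.

```python
import socket


def pick_fqdn(hostname, aliases):
    """Return the first name containing a dot, else the canonical name."""
    for candidate in [hostname] + list(aliases):
        if '.' in candidate:
            return candidate
    return hostname


def getfqdn_sketch(name='', resolver=socket.gethostbyaddr):
    """Approximate getfqdn(); 'resolver' is a hypothetical hook for testing."""
    name = name.strip()
    if not name:
        # No name supplied: start from the local machine's host name.
        name = socket.gethostname()
    try:
        hostname, aliases, _addresses = resolver(name)
    except socket.error:
        # Lookup failed: hand back the (stripped) name that was passed in.
        return name
    return pick_fqdn(hostname, aliases)
```

Under these assumptions a failed lookup returns the stripped input, which
would explain the ' asdlfk asdfsadf ' result quoted earlier, and a resolver
that only knows a dotless name returns that name as-is.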

> * getprotobyname(): Only a few protocols seem to be supported. Why?
>>>> for p in [a[8:] for a in dir(socket) if a.startswith('IPPROTO_')]:
>...     try:
>...             print p,
>...             print socket.getprotobyname(p)
>...     except socket.error:
>...             print "(not handled)"
>...

getprotobyname() looks up the /etc/protocols file (on a Unix system -
I don't know about Windows), whereas the socket.IPPROTO_* constants are
populated from the #defines in netinet/in.h at compile time.

Personally, I think /etc/protocols and the associated library functions
are a historical mistake (getprotobynumber() is marginally useful -
but Python doesn't expose it!).
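
To see the mismatch between the two sources on a given machine, something
like the following sketch can help.  The function name probe_protocols is
hypothetical, and the names are lower-cased on the assumption that the
system database lists its primary names in lower case; the output depends
entirely on the local protocols database.

```python
import socket


def probe_protocols():
    """Map each IPPROTO_* constant name to what getprotobyname() reports.

    The value is the protocol number from the system database, or None
    when the database has no entry under that name.
    """
    results = {}
    for attr in dir(socket):
        if not attr.startswith('IPPROTO_'):
            continue
        name = attr[len('IPPROTO_'):].lower()
        try:
            results[name] = socket.getprotobyname(name)
        except socket.error:
            # Compiled-in constant with no matching database entry.
            results[name] = None
    return results
```

Printing sorted(probe_protocols().items()) shows which compile-time
constants the system database knows about and which it does not.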

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

From tim.peters at gmail.com  Wed May 31 03:24:05 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 30 May 2006 21:24:05 -0400
Subject: [Python-Dev] Converting crc32 functions to use unsigned
In-Reply-To: <4CA734D0-6FEC-4A88-A00F-AB755716485D@redivi.com>
References: <4CA734D0-6FEC-4A88-A00F-AB755716485D@redivi.com>
Message-ID: <1f7befae0605301824k541d4588vcf5dc29f4b1d9514@mail.gmail.com>

[Bob Ippolito]
> It seems that we should convert the crc32 functions in binascii,
> zlib, etc. to deal with unsigned integers. Currently it seems that 32-
> bit and 64-bit platforms are going to have different results for
> these functions.

binascii.crc32 very deliberately intends to return the same signed
integer values on all platforms.  That's what this section at the end
is trying to do:

#if SIZEOF_LONG > 4
	/* Extend the sign bit.  This is one way to ensure the result is the
	 * same across platforms.  The other way would be to return an
	 * unbounded unsigned long, but the evidence suggests that lots of
	 * code outside this treats the result as if it were a signed 4-byte
	 * integer.
	 */
	result |= -(result & (1L << 31));
#endif
	return PyInt_FromLong(result);

I wouldn't try to fight that, unless that code is buggy (it looks
correct to me).  I don't remember "the evidence" now, but Barry & I
found "lots of code" relying on it at the time ;-)
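
For illustration, the same sign extension can be written in Python, where
integers are unbounded.  The helper names below are hypothetical and not
part of binascii or zlib; this is only a sketch of the bit manipulation in
the C snippet above.

```python
def sign_extend_32(value):
    """Interpret the low 32 bits of value as a signed 4-byte integer."""
    value &= 0xFFFFFFFF              # keep only the low 32 bits
    value |= -(value & (1 << 31))    # if bit 31 is set, set every bit above it
    return value


def to_unsigned_32(value):
    """The same 32 bits read as an unsigned integer."""
    return value & 0xFFFFFFFF
```

Code that wants an always-positive value can mask with 0xFFFFFFFF, and
code that wants the signed 4-byte view can sign-extend, whichever of the
two forms a function happens to return.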

> Should we do the same as the struct module, and do DeprecationWarning
> when the input value is < 0?

If the _result_ is going to change from sometimes-negative to
always-positive, then FutureWarning is most appropriate.

> ...
> None of the unit tests seem to exercise values where 32-bit and 64-
> bit platforms would have differing results, but that's easy enough to
> fix...

As above, it shouldn't be possible to concoct a case where
binascii.crc32's results differ across boxes.  That function also
intends to treat negative inputs the same way across boxes.

From blais at furius.ca  Wed May 31 03:32:25 2006
From: blais at furius.ca (Martin Blais)
Date: Tue, 30 May 2006 21:32:25 -0400
Subject: [Python-Dev] Let's stop eating exceptions in dict lookup
In-Reply-To: <447B7621.3040100@ewtllc.com>
References: <20060529171147.GA1717@code0.codespeak.net>
	<008f01c68354$fc01dd70$dc00000a@RaymondLaptop1>
	<20060529195453.GB14908@code0.codespeak.net>
	<00de01c68363$308d37c0$dc00000a@RaymondLaptop1>
	<20060529213428.GA20141@code0.codespeak.net>
	<447B7621.3040100@ewtllc.com>
Message-ID: <8393fff0605301832s1776723dnfc63620631e7c5fd@mail.gmail.com>

On 5/29/06, Raymond Hettinger <rhettinger at ewtllc.com> wrote:
>
>  If it is really 0.5%, then we're fine.  Just remember that PyStone is an
> amazingly uninformative and crappy benchmark.

I'm still looking for a benchmark that is not amazingly uninformative
and crappy.  I've been looking around all day, I even looked under the
bed, I cannot find it.  I've also been looking around all day as well,
even looked for it shooting out of the Iceland geysers, of all
places--it's always all day out here it seems, day and day-- and I
still can't find it.  (In the process however, I found Thule beer and
strangely dressed vikings, which makes it all worthwhile.)

cheers,

From t-bruch at microsoft.com  Wed May 31 03:59:58 2006
From: t-bruch at microsoft.com (Bruce Christensen)
Date: Tue, 30 May 2006 18:59:58 -0700
Subject: [Python-Dev] Socket module corner cases
In-Reply-To: <20060531002523.1208C6F4C5A@longblack.object-craft.com.au>
References: <3581AA168D87A2479D88EA319BDF7D325CA909@RED-MSG-80.redmond.corp.microsoft.com>
	<20060531002523.1208C6F4C5A@longblack.object-craft.com.au>
Message-ID: <3581AA168D87A2479D88EA319BDF7D325CACD0@RED-MSG-80.redmond.corp.microsoft.com>

> > * getfqdn(): The function seems to not always return the FQDN. For
> >example, if I run the following code from 'mybox.mydomain.com', I get
> >strange output. Does getfqdn() remove the common domain between my
> >hostname and the one that I'm looking up?
> >>>> socket.getfqdn('otherbox')
> >'OTHERBOX'
> ># expected 'otherbox.mydomain.com'
>
> getfqdn() calls the system library gethostbyaddr(), and searches the
> result for the first name that contains '.' or if no name contains dot,
> it returns the canonical name (as defined by gethostbyaddr() and the
> system host name resolver libraries (hosts file, DNS, NMB)).

After a little more investigation here, it appears that getfqdn() returns
the name unchanged (or perhaps run through the system's resolver libs) if
there's no reverse DNS PTR entry. In the case given above,
otherbox.mydomain.com didn't have a reverse DNS entry, so getfqdn()
returned 'OTHERBOX'. However, when getfqdn() is called with a name whose
IP *does* have a PTR record, it returns the correct FQDN.

Thanks for the help. Now, a couple more questions:

getnameinfo() accepts the NI_NAMEREQD flag. It appears, though, that a name
lookup (and associated error if the lookup fails) occurs independent of
whether the flag is specified. Does it actually do anything?

Does getnameinfo() support IPv6? It appears to fail (with a socket.error
that says "sockaddr resolved to multiple addresses") if both IPv4 and IPv6
are enabled.

--Bruce

From jafo at tummy.com  Sat May 27 11:37:32 2006
From: jafo at tummy.com (Sean Reifschneider)
Date: Sat, 27 May 2006 03:37:32 -0600
Subject: [Python-Dev] Need for Speed Sprint status
In-Reply-To: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
References: <ee2a432c0605262151s72c83e3rb387a21fb5480ef9@mail.gmail.com>
Message-ID: <20060527093731.GN32617@tummy.com>

On Fri, May 26, 2006 at 09:51:55PM -0700, Neal Norwitz wrote:
>I just hope to hell you're done, because I can't keep up! :-)

Yeah, sucks to be you.  :-)

>It would help me enormously if someone could summarize the status and
>everything that went on.  These are the things that would help me the
>most.

I'm hoping that before we break tonight, that we can do an overview of the
changes, and make sure that everything is on the Wiki pages that Raymond
pointed to.

I've also been writing reports of activity on my work blog at
http://www.tummy.com/journals/users/jafo/

Thanks,
Sean
-- 
 Unix actually IS user friendly -- it's just very picky about whom it
 calls its friend.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From jafo at tummy.com  Tue May 30 00:36:07 2006
From: jafo at tummy.com (Sean Reifschneider)
Date: Mon, 29 May 2006 16:36:07 -0600
Subject: [Python-Dev] 2.5a2 try/except slow-down: Convert to type?
In-Reply-To: <20060524102441.GD32617@tummy.com>
References: <20060524102441.GD32617@tummy.com>
Message-ID: <20060529223607.GO32617@tummy.com>

Changes have been checked into the trunk for converting exceptions to types
instead of classes.  According to pybench, 2.5a2 was 60% slower at
raising exceptions than 2.4.3, and now trunk is 30% faster than 2.4.3.
Yay!

Thanks,
Sean
-- 
 "I was on IRC once and got mistaken for Dan Bernstein. I still have
 nightmares."  -- Donnie Barnes
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability


From Andreas.Floeter at web.de  Tue May 30 19:53:38 2006
From: Andreas.Floeter at web.de (Andreas Flöter)
Date: Tue, 30 May 2006 19:53:38 +0200
Subject: [Python-Dev] Segmentation fault of Python if build on Solaris 9 or
	10 with Sun Studio 11
Message-ID: <200605301953.38638.Andreas.Floeter@web.de>

I have filed a bug report "Building Python 2.4.3 on Solaris 9/10 with Sun
Studio 11" (1496561) in the Python tracker. The problem I have encountered
is that some of the unit tests of the application roundup fail with Python
producing a segmentation fault and dumping core, if Python was built with
Sun Studio 11 (Sun C 5.8). In fact, not only do some of the unit tests fail,
but so does the application roundup itself at certain steps.

If gcc is used, everything works fine. As Richard Jones suggests, it might be
a problem in the anydbm module. I would prefer to use the native compiler of
a platform. To name only two reasons: distributing the application is easier
(dynamic library dependencies are most likely met on the target system) and
Sun is maintaining the reference native libraries.

Help would be appreciated, thanks
Andreas

From tca at gnu.org  Tue May 30 22:32:11 2006
From: tca at gnu.org (Tom Cato Amundsen)
Date: Tue, 30 May 2006 22:32:11 +0200
Subject: [Python-Dev] optparse and unicode
Message-ID: <20060530203211.GA13434@localhost.localdomain>

optparse
========

Using unicode strings with non-ascii chars.[1]

I'm working around this by subclassing OptionParser. Below is a
workaround I use in GNU Solfege. Should something like this be
included in python 2.5?

(Please CC me any answer.)

Tom Cato

--- orig/src/mainwin.py
+++ mod/src/mainwin.py
@@ -30,7 +30,13 @@
 
 import optparse
 
-opt_parser = optparse.OptionParser()
+class MyOptionParser(optparse.OptionParser):
+    def print_help(self, file=None):
+        if file is None:
+            file = sys.stdout
+        file.write(self.format_help().encode(file.encoding, 'replace'))
+
+opt_parser = MyOptionParser()
 opt_parser.add_option('-v', '--version', action='store_true', dest='version')
 opt_parser.add_option('-w', '--warranty', action='store_true', dest='warranty',
     help=_('Show warranty and copyright.'))

[1]

Traceback (most recent call last):
  File "./solfege.py", line 43, in ?
    import src.mainwin
  File "/home/tom/src/solfege-mcnv/src/mainwin.py", line 70, in ?
    options, args = opt_parser.parse_args()
  File "/usr/lib/python2.3/optparse.py", line 1129, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "/usr/lib/python2.3/optparse.py", line 1169, in _process_args
    self._process_long_opt(rargs, values)
  File "/usr/lib/python2.3/optparse.py", line 1244, in _process_long_opt
    option.process(opt, value, values, self)
  File "/usr/lib/python2.3/optparse.py", line 611, in process
    return self.take_action(
  File "/usr/lib/python2.3/optparse.py", line 632, in take_action
    parser.print_help()
  File "/usr/lib/python2.3/optparse.py", line 1370, in print_help
    file.write(self.format_help())
UnicodeEncodeError: 'ascii' codec can't encode characters in position 200-202: ordinal not in range(128)
-- 
Tom Cato Amundsen <tca at gnu.org>                 http://www.solfege.org/
GNU Solfege - free ear training    http://www.gnu.org/software/solfege/

From andrewm at object-craft.com.au  Wed May 31 04:15:16 2006
From: andrewm at object-craft.com.au (Andrew McNamara)
Date: Wed, 31 May 2006 12:15:16 +1000
Subject: [Python-Dev] Socket module corner cases
In-Reply-To: Your message of "Tue, 30 May 2006 18:59:58 MST."
	<3581AA168D87A2479D88EA319BDF7D325CACD0@RED-MSG-80.redmond.corp.microsoft.com>
Message-ID: <20060531021516.34C8D6F4C83@longblack.object-craft.com.au>

>After a little more investigation here, it appears that getfqdn() returns
>the name unchanged (or perhaps run through the system's resolver libs) if
>there's no reverse DNS PTR entry. In the case given above,
>otherbox.mydomain.com didn't have a reverse DNS entry, so getfqdn()
>returned 'OTHERBOX'. However, when getfqdn() is called with a name whose
>IP *does* have a PTR record, it returns the correct FQDN.

That sounds entirely plausible.

Many of these name resolver functions pre-date DNS, and show their
/etc/hosts heritage somewhat (gethostbyaddr returning ip, names and
aliases in one hit is a classic example - this isn't easy with DNS).

>Thanks for the help. Now, couple more questions:
>
>getnameinfo() accepts the NI_NAMEREQD flag. It appears, though that a name
>lookup (and associated error if the lookup fails) occurs independent of
>whether the flag is specified. Does it actually do anything?
>
>Does getnameinfo() support IPv6? It appears to fail (with a socket.error
>that says "sockaddr resolved to multiple addresses") if both IPv4 and IPv6
>are enabled.

Someone more knowledgeable will have to answer these.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/

From guido at python.org  Wed May 31 05:10:07 2006
From: guido at python.org (Guido van Rossum)
Date: Tue, 30 May 2006 20:10:07 -0700
Subject: [Python-Dev] Segmentation fault of Python if build on Solaris 9
	or 10 with Sun Studio 11
In-Reply-To: <200605301953.38638.Andreas.Floeter@web.de>
References: <200605301953.38638.Andreas.Floeter@web.de>
Message-ID: <ca471dc20605302010y410ccaa9o990369af19f4bc8b@mail.gmail.com>

On 5/30/06, Andreas Flöter <Andreas.Floeter at web.de> wrote:
> I have filed a bug report "Building Python 2.4.3 on Solaris 9/10 with Sun
> Studio 11" (1496561) in the Python tracker. The problem I have encountered
> is that some of the unit tests of the application roundup fail with Python
> producing a segmentation fault and dumping core, if Python was built with
> Sun Studio 11 (Sun C 5.8). In fact, not only do some of the unit tests fail,
> but so does the application roundup itself at certain steps.
>
> If gcc is used, everything works fine. As Richard Jones suggests, it might be
> a problem in the anydbm module. I would prefer to use the native compiler of
> a platform. To name only two reasons: distributing the application is easier
> (dynamic library dependencies are most likely met on the target system) and
> Sun is maintaining the reference native libraries.

On the other hand, there is the fact that the binaries produced by the
Sun compiler segfault... :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ronaldoussoren at mac.com  Wed May 31 08:03:55 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 31 May 2006 08:03:55 +0200
Subject: [Python-Dev] optparse and unicode
In-Reply-To: <20060530203211.GA13434@localhost.localdomain>
References: <20060530203211.GA13434@localhost.localdomain>
Message-ID: <E40D59FF-BBC4-4C09-B4C5-5095CC481A14@mac.com>


On 30 May 2006, at 22:32, Tom Cato Amundsen wrote:

> optparse
> ========
>
> Using unicode strings with non-ascii chars.[1]
>
> I'm working around this by subclassing OptionParser. Below is a
> workaround I use in GNU Solfege. Should something like this be
> included in python 2.5?

Could you please file a bug or feature request for this on SF? Your  
report might get lost otherwise.

Ronald


From talin at acm.org  Wed May 31 08:46:06 2006
From: talin at acm.org (Talin)
Date: Tue, 30 May 2006 23:46:06 -0700
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <20060530231614.GA415@Andrew-iBook2.local>
References: <ca471dc20605221028y91d50e0n4dcc2ed7b186fc70@mail.gmail.com>	<4472B196.7070506@acm.org>	<ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>	<447BC126.8050107@acm.org>	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>	<bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>	<d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
	<20060530231614.GA415@Andrew-iBook2.local>
Message-ID: <447D3BAE.9040004@acm.org>

A.M. Kuchling wrote:
> On Tue, May 30, 2006 at 03:36:02PM -0600, Steven Bethard wrote:
> 
>>That sounds about reasonable.  One possible grouping:
> 
> 
> Note that 2.5's library reference has a different chapter organization
> from 2.4's.  See <http://docs.python.org/dev/lib/lib.html>.

I like it. It's a much cleaner organization than the 2.4 libs. I would
like to see it used as a starting point for a reorg of the standard lib 
namespace.

One question that is raised is whether the categories should map 
directly to package names in all cases. For example, I can envision a 
desire that 'sys' would stay a top-level name, rather than 'rt.sys'. 
Certain modules are so fundamental that they deserve IMHO to live in the 
root namespace.

-- Talin

From nnorwitz at gmail.com  Wed May 31 09:49:48 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 31 May 2006 00:49:48 -0700
Subject: [Python-Dev] test_struct failure on 64 bit platforms
Message-ID: <ee2a432c0605310049r16ffb3f3j1a7c5b0ceba31401@mail.gmail.com>

Bob,

There are a couple of things I don't understand about the new struct.
Below is a test that fails.

$ ./python ./Lib/test/regrtest.py test_tarfile test_struct
test_tarfile
/home/pybot/test-trunk/build/Lib/struct.py:63: DeprecationWarning: 'l'
format requires -2147483648 <= number <= 2147483647
  return o.pack(*args)
test_struct
test test_struct failed -- pack('>l', -2147483649) did not raise error
1 test OK.
1 test failed:
    test_struct

####

I fixed the error message (the min value was off by one before).  I
think I fixed a few ssize_t issues too.

The remaining issues I know of are:
  * The warning only appears on 64-bit platforms.
  * The warning doesn't seem correct for 64-bit platforms (l is 8 bytes, not 4).
  * test_struct only fails if run after test_tarfile.
  * The msg from test_struct doesn't seem correct for 64-bit platforms.

I tracked the problem down to trying to write the gzip tar file.  Can
you fix this?

n

From bob at redivi.com  Wed May 31 09:59:15 2006
From: bob at redivi.com (Bob Ippolito)
Date: Wed, 31 May 2006 00:59:15 -0700
Subject: [Python-Dev] test_struct failure on 64 bit platforms
In-Reply-To: <ee2a432c0605310049r16ffb3f3j1a7c5b0ceba31401@mail.gmail.com>
References: <ee2a432c0605310049r16ffb3f3j1a7c5b0ceba31401@mail.gmail.com>
Message-ID: <4ED0CBC7-B364-4C6C-9E59-5A476582737D@redivi.com>


On May 31, 2006, at 12:49 AM, Neal Norwitz wrote:

> Bob,
>
> There are a couple of things I don't understand about the new struct.
> Below is a test that fails.
>
> $ ./python ./Lib/test/regrtest.py test_tarfile test_struct
> test_tarfile
> /home/pybot/test-trunk/build/Lib/struct.py:63: DeprecationWarning: 'l'
> format requires -2147483648 <= number <= 2147483647
>  return o.pack(*args)
> test_struct
> test test_struct failed -- pack('>l', -2147483649) did not raise error
> 1 test OK.
> 1 test failed:
>    test_struct
>
> ####
>
> I fixed the error message (the min value was off by one before).  I
> think I fixed a few ssize_t issues too.
>
> The remaining issues I know of are:
>  * The warning only appears on 64-bit platforms.
>  * The warning doesn't seem correct for 64-bit platforms (l is 8  
> bytes, not 4).
>  * test_struct only fails if run after test_tarfile.
>  * The msg from test_struct doesn't seem correct for 64-bit platforms.
>
> I tracked the problem down to trying to write the gzip tar file.  Can
> you fix this?

The warning is correct, and so is the size. Only native formats have  
native sizes; l and i are exactly 4 bytes on all platforms when using  
=, >, <, or !. That's what "std size and alignment" means.
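
A small sketch of that distinction, using struct.calcsize().  The helper
name int_sizes is made up for illustration; the native results depend on
the platform and C compiler, so only the standard-size ones are fixed.

```python
import struct

# Prefixes that select standard size and alignment.
STANDARD_PREFIXES = ('=', '>', '<', '!')


def int_sizes():
    """Return the byte size of 'i' and 'l' for each byte-order prefix."""
    sizes = {}
    # '' and '@' both mean native size and alignment.
    for prefix in ('', '@') + STANDARD_PREFIXES:
        for code in ('i', 'l'):
            sizes[prefix + code] = struct.calcsize(prefix + code)
    return sizes
```

On a typical 64-bit Unix box int_sizes()['l'] would be expected to be 8
while int_sizes()['>l'] stays at 4, which is why the range in the warning
is the 4-byte one.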

It looks like the only thing that's broken here is the test. The  
behavior changed to consistently allow any integer whatsoever to be  
passed to struct for all formats (except q and Q which have always  
done proper range checking). Previously, the range checking was  
inconsistent across platforms (32-bit and 64-bit anyway) and when  
using int vs. long.

Unfortunately I don't have a 64-bit platform easily accessible and I  
have no idea which test it is that's raising the warning. Could you  
isolate it?

-bob


From nnorwitz at gmail.com  Wed May 31 10:03:05 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 31 May 2006 01:03:05 -0700
Subject: [Python-Dev] fixing buildbots
In-Reply-To: <1149024203.447cb7cbd5f92@www.domainfactory-webmail.de>
References: <ee2a432c0605292354j4d454c1ehc55c8ee29dac9293@mail.gmail.com>
	<1149024203.447cb7cbd5f92@www.domainfactory-webmail.de>
Message-ID: <ee2a432c0605310103l68dcd006x13678dc2c21165db@mail.gmail.com>

On 5/30/06, martin at v.loewis.de <martin at v.loewis.de> wrote:
> Quoting Neal Norwitz <nnorwitz at gmail.com>:
>
> > I've been starting to get some of the buildbots working again.  There
> > was some massive problem on May 25 where a ton of extra files were
> > left around.  I can't remember if I saw something about that at the
> > NFS sprint or not.
>
> You can clean each buildbot remotely by forcing a build on a
> (non-existing) branch. That removes the build tree, and then
> tries to build the branch (which fails if the branch doesn't
> exist). The next build will then start with a fresh checkout
> of the trunk.

That's good to know.  Thanks.  In this case there were also bogus
files/directories created that were siblings of the build area, so I
don't think that those could have been cleaned up.  I'm pretty sure it
would have fixed the problem though.

> > There is a lingering problem that I can't fix on all the boxes.  Namely:
> >
> >   cp Modules/Setup{.dist,}
> >
> > Should we always do that step before we build on the buildbots?
>
> If so, the easiest way to do it would be to "make distclean" in
> the clean step: that removes Modules/Setup as well.

I agree this is best.  I updated Makefile.pre.in and master.cfg.  I
didn't restart the buildbot yet though.  I wanted to wait for a quiet
period.

n

From nnorwitz at gmail.com  Wed May 31 11:04:31 2006
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 31 May 2006 02:04:31 -0700
Subject: [Python-Dev] test_struct failure on 64 bit platforms
In-Reply-To: <4ED0CBC7-B364-4C6C-9E59-5A476582737D@redivi.com>
References: <ee2a432c0605310049r16ffb3f3j1a7c5b0ceba31401@mail.gmail.com>
	<4ED0CBC7-B364-4C6C-9E59-5A476582737D@redivi.com>
Message-ID: <ee2a432c0605310204v2baebb1w38a894b8dbda3b1@mail.gmail.com>

On 5/31/06, Bob Ippolito <bob at redivi.com> wrote:
>
> On May 31, 2006, at 12:49 AM, Neal Norwitz wrote:
>
> > Bob,
> >
> > There are a couple of things I don't understand about the new struct.
> > Below is a test that fails.
> >
> > $ ./python ./Lib/test/regrtest.py test_tarfile test_struct
> > test_tarfile
> > /home/pybot/test-trunk/build/Lib/struct.py:63: DeprecationWarning: 'l'
> > format requires -2147483648 <= number <= 2147483647
> >  return o.pack(*args)
> > test_struct
> > test test_struct failed -- pack('>l', -2147483649) did not raise error
> > 1 test OK.
> > 1 test failed:
> >    test_struct
> >
> > ####
> >
> > I fixed the error message (the min value was off by one before).  I
> > think I fixed a few ssize_t issues too.
> >
> > The remaining issues I know of are:
> >  * The warning only appears on 64-bit platforms.
> >  * The warning doesn't seem correct for 64-bit platforms (l is 8
> > bytes, not 4).
> >  * test_struct only fails if run after test_tarfile.
> >  * The msg from test_struct doesn't seem correct for 64-bit platforms.
> >
> > I tracked the problem down to trying to write the gzip tar file.  Can
> > you fix this?
>
> The warning is correct, and so is the size. Only native formats have
> native sizes; l and i are exactly 4 bytes on all platforms when using
> =, >, <, or !. That's what "std size and alignment" means.

Ah, you are correct.  I see this is the behaviour in 2.4.  Though I
wouldn't call 4 bytes a standard size on a 64-bit platform.

> Unfortunately I don't have a 64-bit platform easily accessible and I
> have no idea which test it is that's raising the warning. Could you
> isolate it?

I wasted sleep for that?  Damn and I gotta get up early again tomorrow
too.  See the checkin for answer.  Would someone augment the warnings
module to make testing more reasonable?

n

From blais at furius.ca  Wed May 31 12:35:39 2006
From: blais at furius.ca (Martin Blais)
Date: Wed, 31 May 2006 06:35:39 -0400
Subject: [Python-Dev] [Python-checkins] r46300 - in python/trunk:
	Lib/socket.py Lib/test/test_socket.py Lib/test/test_struct.py
	Modules/_struct.c Modules/arraymodule.c Modules/socketmodule.c
In-Reply-To: <ca471dc20605291259h561526cek9dad74288ae0976b@mail.gmail.com>
References: <20060526120329.9A9671E400C@bag.python.org>
	<ca471dc20605260956t35aafa71tcad19b926c0c4b41@mail.gmail.com>
	<CD67B679-50C4-4398-81BF-40F3ED6B106B@redivi.com>
	<ca471dc20605291259h561526cek9dad74288ae0976b@mail.gmail.com>
Message-ID: <8393fff0605310335o12b2b26ew43ba63cfd69c26e9@mail.gmail.com>

On 5/29/06, Guido van Rossum <guido at python.org> wrote:
> [python-checkins]
> > >> * Added socket.recv_buf() and socket.recvfrom_buf() methods, that
> > >> use the buffer
> > >>   protocol (send and sendto already did).
> > >>
> > >> * Added struct.pack_to(), that is the corresponding buffer
> > >> compatible method to
> > >>   unpack_from().
>
> [Guido]
> > > Hm... The file object has a similar method readinto(). Perhaps the
> > > methods introduced here could follow that lead instead of using two
> > > different new naming conventions?
>
> On 5/27/06, Bob Ippolito <bob at redivi.com> wrote:
> > (speaking specifically about struct and not socket)
> >
> > pack_to and unpack_from are named as such because they work with objects
> > that support the buffer API (not file-like-objects). I couldn't find any
> > existing convention for objects that manipulate buffers in such a way.
>
> Actually, <fileobject>.readinto(<bufferobject>) is the convention I
> started long ago (at the time the buffer was introduced).
>
> > If there is an existing convention then I'd be happy to rename these.
> >
> > "readinto" seems to imply that some kind of position is being
> > incremented.
>
> No -- it's like read() but instead of returning a new object it reads
> "into" an object you pass.
>
> > Grammatically it only works if it's implemented on all buffer
> > objects, but
> > in this case it's implemented on the Struct type.
>
> It looks like <structobject>.pack_to(<bufferobject>, <other args>) is
> similar to <fileobject>.readinto(<bufferobject>) in that the buffer
> object receives the result of packing / reading.

So I assume you're suggesting the following renames:

  pack_to -> packinto
  recv_buf -> recvinto
  recvfrom_buf -> recvfrominto

(I don't like that last one very much.
I'll go ahead and make those renames once I return.)

From ncoghlan at gmail.com  Wed May 31 12:33:10 2006
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 31 May 2006 20:33:10 +1000
Subject: [Python-Dev] Reporting unexpected import failures as test failures
	in regrtest.py
In-Reply-To: <1f7befae0605301835r4f62fb65udf3c144a69eccfce@mail.gmail.com>
References: <1f7befae0605301835r4f62fb65udf3c144a69eccfce@mail.gmail.com>
Message-ID: <447D70E6.5020500@gmail.com>

Some background for those not watching python-checkins:

I neglected to do "svn add" for the new functools Python module when 
converting functional->functools. The buildbots stayed green because the 
ImportError triggered by the line "import functools" in test_functools was 
treated as a TestSkipped by regrtest.py.

Georg noticed the file missing from the checkin message, but this is the 
second time I (and the buildbots) have missed a regression due to this 
behaviour. (As I recall, last time I checked in some broken code because I 
didn't notice the additional entry appearing in the list of unexpected skips 
in my local testing)

Tim Peters wrote:
> [Nick Coghlan]
>> ... (we should probably do something about that misleading ImportError ->
>> TestSkipped -> green buildbot behaviour. . . )
> 
> I looked at that briefly a few weeks back and gave up.  Seemed the
> sanest thing was to entirely stop treating ImportError as "test
> skipped", and rewrite tests that legitimately _may_ get skipped to catch
> expected ImportErrors and change them to TestSkipped themselves.
> 
> A bit of framework might help; e.g., a test that expects to get
> skipped due to failing imports on some platforms could define a
> module-level list bound to a conventional name containing the names of
> the modules whose import failure should be treated as TestSkipped, and
> then regrtest.py could be taught to check import errors against the
> test module's list (if any).
> 
> In the case du jour, test_functools.py presumably wouldn't define that
> list, so that any ImportError it raised would be correctly treated as
> test failure.

What if we appended unexpected skips to the list of bad tests so that they get 
rerun in verbose mode and the return value becomes non-zero?

     print count(len(surprise), "skip"), \
                       "unexpected on", plat + ":"
     printlist(surprise)
     # Add the next line after the previous two in regrtest.py
     bad.extend(surprise)

(This happens after the count of failed tests has been printed, so we don't 
affect that output)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

From blais at furius.ca  Wed May 31 13:02:02 2006
From: blais at furius.ca (Martin Blais)
Date: Wed, 31 May 2006 07:02:02 -0400
Subject: [Python-Dev] A can of worms... (does Python C code have a new C
	style?)
Message-ID: <8393fff0605310402l4d45101agedbb47b3d8d2c12a@mail.gmail.com>

Hi all

I'd like to know what the policy is on the source code indentation for
C code in the interpreter.  At the Need-for-Speed sprints, there was
consensus that there is a "new" indentation style for the Python C
source files, with

* indentation (emacs: c-basic-offset): 4 chars
* no tabs (so tab-width does not matter)
* 79 chars max for lines (I think)

(Correct me if any of this is wrong.)  I cannot find where this
discussion comes from, PEP 7 seems to imply that the new style only
applies to Python 3k.  Is this correct?

However, people were maintaining the existing styles when they were
editing part of existing files (I set up emacs users with two c-styles,
"python" and "python-new", so they could switch per-file), but using
the new guidelines for new files.  I look at the source code, and
there is a bit of a mix of the two styles in some places.

Is there a "grand plan" to convert all the code at once at some point?
If not, how is it that these new guidelines are supposed to take
effect?

I could easily contribute to the vacuuming here, by writing a script
that will run emacs in batch mode on all the C code with the
appropriate new style to do the following for each C file present:

* re-indent the entire file according to the new guidelines
* convert all the tabs to whitespace
* delete trailing whitespace on all lines
* remove those pesky CR if there are any
* (anything else you'd like that's easily done from emacs?)
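
The purely mechanical items in that list (tabs, trailing whitespace,
stray CRs) can be sketched without emacs.  What follows is only an
illustration in Python, not the script proposed above: the function
name is made up, the tab width of 8 is an assumption, and it does no
re-indentation at all, which is the part that really needs cc-mode.

```python
def normalize_whitespace(text, tabsize=8):
    """Expand tabs, strip trailing whitespace, and drop CR characters.

    Hypothetical helper for illustration; tabsize=8 is an assumption
    about how the existing tab-indented sources are meant to be viewed.
    """
    # Normalise CRLF and lone CR line endings to LF first.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    # Expand tabs to spaces, then remove whitespace at end of line.
    cleaned = [line.expandtabs(tabsize).rstrip() for line in lines]
    return "\n".join(cleaned)


sample = "\tif (x) {  \r\n\t\treturn;\t\r\n}\r\n"
print(repr(normalize_whitespace(sample)))
```

A check-only variant (compare the output with the input and complain
if they differ) would be the natural shape for a commit hook.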

The problem is that if someone were to check in the result of this
automation, there would be a huge wave of complaints about merge
conflicts from people who have checkouts with lots of uncommitted
changes.  I don't see any painless way to do this.  To ease the pain,
however, it could be done at a fixed time, giving everyone a wide
margin of chance to commit beforehand.

In addition, should this be applied, we should probably create a
Subversion hook that will automatically convert tabs to spaces, or
fail the commit if the files don't comply.

I realize this is potentially opening a can of worms, but on the one
hand I'd like to know what's the low-down on the indentation, and on
the other hand I won't spend a second on this if this is only going to
lead to trouble.

Then again, maybe I misunderstood and the new rules apply only to
Python 3k, in which case, well, press delete and move on.

cheers,

From barry at python.org  Wed May 31 14:10:34 2006
From: barry at python.org (Barry Warsaw)
Date: Wed, 31 May 2006 08:10:34 -0400
Subject: [Python-Dev] A can of worms... (does Python C code have a new C
 style?)
In-Reply-To: <8393fff0605310402l4d45101agedbb47b3d8d2c12a@mail.gmail.com>
References: <8393fff0605310402l4d45101agedbb47b3d8d2c12a@mail.gmail.com>
Message-ID: <20060531081034.08fc6518@geddy.wooz.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 31 May 2006 07:02:02 -0400
"Martin Blais" <blais at furius.ca> wrote:

> I'd like to know what the policy is on the source code indentation for
> C code in the interpreter.  At the Need-for-Speed sprints, there was
> consensus that there is a "new" indentation style for the Python C
> source files, with
> 
> * indentation (emacs: c-basic-offset): 4 chars
> * no tabs (so tab-width does not matter)
> * 79 chars max for lines (I think)
> 
> (Correct me if any of this is wrong.)  I cannot find where this
> discussion comes from, PEP 7 seems to imply that the new style only
> applies to Python 3k.  Is this correct?

AFAIK, yes it only applies to Py3K.  There are no plans to re-indent the
Python 2.x C code.  Or maybe, there have been plans to do so for > 10
years and in Py3K those plans might finally come to fruition. :)

BTW, A while back I think I posted a "py3k" cc-mode style for you
X/Emacs hackers out there.  It's based on the standard "python" style
but changes c-basic-offset and indent-tabs-mode.  Let me know if you
can't find it in the archives.

- -Barry
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iQCVAwUBRH2HvnEjvBPtnXfVAQJv/wQAhKBvh49xW59JgKv6tq9O/QvU/jvZJSPw
qHMjOi55IGdFUG4zrSOH08U0VSkkM/mhoBgAqggNnsWMsFjtEu0NeOcroKIPBmLK
RU1B4sw78RQFj/phEBpCvgObYRoI8lEVJYnXKFAp6yY9qGdJRGIRGhVX5nnBz/as
4BLr5tADpHo=
=IFzC
-----END PGP SIGNATURE-----

From amk at amk.ca  Wed May 31 14:54:35 2006
From: amk at amk.ca (A.M. Kuchling)
Date: Wed, 31 May 2006 08:54:35 -0400
Subject: [Python-Dev] [Python-3000] stdlib reorganization
In-Reply-To: <447D3BAE.9040004@acm.org>
References: <ca471dc20605230817x331241e6r45e63c4c1c0eb8ed@mail.gmail.com>
	<447BC126.8050107@acm.org>
	<bbaeab100605300925k151a1437gea18eeaafe5c8068@mail.gmail.com>
	<d11dcfba0605301032m647094d2tcfe814ba190bfa21@mail.gmail.com>
	<bbaeab100605301054y73ff6741u6a388643b27083aa@mail.gmail.com>
	<d11dcfba0605301230k5e40cf41j5491de8727e6f6a2@mail.gmail.com>
	<bbaeab100605301311w14c3b811x5d939c947f4eaafc@mail.gmail.com>
	<d11dcfba0605301436o4058a383y41894e030410958b@mail.gmail.com>
	<20060530231614.GA415@Andrew-iBook2.local>
	<447D3BAE.9040004@acm.org>
Message-ID: <20060531125435.GA14561@rogue.amk.ca>

On Tue, May 30, 2006 at 11:46:06PM -0700, Talin wrote:
> I like it. It's a much cleaner organization than the 2.4 libs. I would 
> like to see it used as a starting point for a reorg of the standard lib 
> namespace.

I'm not convinced that the chapter organization of a book is
necessarily the best choice for the actual package layout of the
standard library.  A grouping that's useful for reference may not be
ideal for staying in one's memory.

--amk

From tim.peters at gmail.com  Wed May 31 15:05:52 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 31 May 2006 09:05:52 -0400
Subject: [Python-Dev] test_struct failure on 64 bit platforms
In-Reply-To: <ee2a432c0605310204v2baebb1w38a894b8dbda3b1@mail.gmail.com>
References: <ee2a432c0605310049r16ffb3f3j1a7c5b0ceba31401@mail.gmail.com>
	<4ED0CBC7-B364-4C6C-9E59-5A476582737D@redivi.com>
	<ee2a432c0605310204v2baebb1w38a894b8dbda3b1@mail.gmail.com>
Message-ID: <1f7befae0605310605l71cddd63hecba38dc92478895@mail.gmail.com>

[Bob]
>> The warning is correct, and so is the size. Only native formats have
>> native sizes; l and i are exactly 4 bytes on all platforms when using
>> =, >, <, or !. That's what "std size and alignment" means.

[Neal]
> Ah, you are correct.  I see this is the behaviour in 2.4.  Though I
> wouldn't call 4 bytes a standard size on a 64-bit platform.

"standard" is a technical word with precise meaning here, and is
defined by the struct module docs, in contrast to "native".  It means
whatever they say it means :-)  "Portable" may have been a more
intuitive word than "standard" here -- read "standard" in the struct
context in the sense of "standardized, regardless of native platform
size or alignment or endian quirks".
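
A minimal sketch of the standard/native distinction, assuming a
current Python 3 interpreter rather than the 2.5 trunk under
discussion (there an out-of-range value raises struct.error instead
of warning):

```python
import struct

# With an explicit byte-order prefix, 'i' and 'l' use the standard
# (portable) size of 4 bytes on every platform.
standard_sizes = {
    prefix: (struct.calcsize(prefix + "i"), struct.calcsize(prefix + "l"))
    for prefix in ("=", "<", ">", "!")
}

# Without a prefix the native size applies; for 'l' that is
# platform-dependent (commonly 4 or 8 bytes).
native_long = struct.calcsize("l")

# One below the 32-bit minimum does not fit in a standard-size 'l'.
try:
    struct.pack(">l", -2147483649)
    rejected = False
except struct.error:
    rejected = True

print(standard_sizes, native_long, rejected)
```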

> Would someone augment the warnings module to make testing
> more reasonable?

What's required?  I know of two things:

1. There's no advertised way to save+restore the internal
   filter list, or to remove a filter entry, so tests that want
   to make a temporary change have to break into the internals.

2. There's no advertised way to disable "only gripe once per source
   line" behavior.  This gets in the way of testing that warnings get
   raised when running tests more than once, or using a common
   function to raise warnings from multiple call sites.

These get in the way of Zope and ZODB testing too, BTW.
Unfortunately, looks like the new test_struct code bumped into both of
them at once.
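
For comparison, both gaps can be worked around in current Python with
warnings.catch_warnings, which did not exist when this was written.
The sketch below assumes Python 3; noisy() and count_warnings() are
made-up names for illustration.

```python
import warnings


def noisy():
    # Always warns from the same source line.
    warnings.warn("old API", DeprecationWarning)


def count_warnings(calls):
    # catch_warnings saves the filter list on entry and restores it on
    # exit; "always" disables the once-per-source-line suppression.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        for _ in range(calls):
            noisy()
    return len(caught)


filters_before = list(warnings.filters)
seen = count_warnings(3)
filters_after = list(warnings.filters)
print(seen, filters_before == filters_after)
```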

From tim.peters at gmail.com  Wed May 31 15:37:05 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 31 May 2006 09:37:05 -0400
Subject: [Python-Dev] Reporting unexpected import failures as test
	failures in regrtest.py
In-Reply-To: <447D70E6.5020500@gmail.com>
References: <1f7befae0605301835r4f62fb65udf3c144a69eccfce@mail.gmail.com>
	<447D70E6.5020500@gmail.com>
Message-ID: <1f7befae0605310637w7d47618bx4b5854fc04ac5d20@mail.gmail.com>

[Nick Coghlan]
> What if we appended unexpected skips to the list of bad tests so that they get
> rerun in verbose mode and the return value becomes non-zero?
>
>      print count(len(surprise), "skip"), \
>                        "unexpected on", plat + ":"
>      printlist(surprise)
>      # Add the next line after the previous two in regrtest.py
>      bad.extend(surprise)
>
> (This happens after the count of failed tests has been printed, so we don't
> affect that output)

While "expected skips" is reliable on Windows, it's just a guess
everywhere else.  That's because the Windows build is unique in
supplying and compiling all the external packages Python ships with on
Windows.  On non-Windows boxes, which skips "are expected" also
depends on which external packages happen to be installed on the box,
and how Python was configured, and regrtest.py has no knowledge about
those.

For example, on the "sparc solaris10 gcc trunk" buildbot, 6 tests show
up as "unexpected skips", and there's just no way for a
non-platform-expert to guess which of those are and aren't pointing at
problems.  It's easy enough to guess that test_zipfile is skipped
because the box doesn't have an external zlib installed, but why is
test_ioctl skipped?  I have no idea (and neither does regrtest.py),
but it's presumably obvious to a Sun expert.

In any case, if "unexpected skips" got appended to `bad`, the non-zero
exit code would cause that buildbot slave to fail every time.  It's
not unique, either; e.g., the "x86 gentoo trunk" slave always skips
test_tcl, presumably because Tcl isn't installed on that box.

In short, there's simply no way to _guess_ which import errors are and
aren't "expected".  Improving that requires adding more knowledge to
the testing machinery.  The most important thing is to add knowledge
about which import errors are _never_ expected (perhaps by adding
knowledge about which imports _may_ legitimately fail), since we can
know that for sure about x-platform modules, and that's exactly the
problem that keeps burning us.  Confusing ImportError with TestSkipped
is an admirably lazy idea that unfortunately didn't turn out to work
very well.
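
The "module-level list" idea could look roughly like the sketch below.
Everything here is hypothetical: the attribute name may_fail_imports,
the helper, and the return values are invented for illustration, and
it relies on the `name` attribute that ImportError carries in current
Python 3, not on anything regrtest.py does today.

```python
import types


def classify_import_error(test_module, exc):
    """Return "skipped" only if the test module declared the import optional."""
    # Hypothetical convention: a list of module names whose import
    # failure should count as a skip rather than a failure.
    allowed = getattr(test_module, "may_fail_imports", ())
    missing = getattr(exc, "name", None)
    return "skipped" if missing in allowed else "failed"


# A test that legitimately depends on an optional extension module.
test_tcl = types.ModuleType("test_tcl")
test_tcl.may_fail_imports = ["_tkinter"]

# A test for a cross-platform module declares nothing.
test_functools = types.ModuleType("test_functools")

print(classify_import_error(
    test_tcl, ImportError("No module named '_tkinter'", name="_tkinter")))
print(classify_import_error(
    test_functools, ImportError("No module named 'functools'", name="functools")))
```

With this shape, a forgotten "svn add" for a cross-platform module
shows up as a failure instead of a skip, because test_functools never
opts in.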

From ronaldoussoren at mac.com  Wed May 31 16:05:55 2006
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 31 May 2006 16:05:55 +0200
Subject: [Python-Dev] test_struct failure on 64 bit platforms
In-Reply-To: <1f7befae0605310605l71cddd63hecba38dc92478895@mail.gmail.com>
References: <ee2a432c0605310049r16ffb3f3j1a7c5b0ceba31401@mail.gmail.com>
	<4ED0CBC7-B364-4C6C-9E59-5A476582737D@redivi.com>
	<ee2a432c0605310204v2baebb1w38a894b8dbda3b1@mail.gmail.com>
	<1f7befae0605310605l71cddd63hecba38dc92478895@mail.gmail.com>
Message-ID: <2402938.1149084355602.JavaMail.ronaldoussoren@mac.com>

 
On Wednesday, May 31, 2006, at 03:06PM, Tim Peters <tim.peters at gmail.com> wrote:
>> Would someone augment the warnings module to make testing
>> more reasonable?
>
>What's required?  I know of two things:
>
>1. There's no advertised way to save+restore the internal
>   filter list, or to remove a filter entry, so tests that want
>   to make a temporary change have to break into the internals.
>
>2. There's no advertised way to disable "only gripe once per source
>   line" behavior.  This gets in the way of testing that warnings get
>   raised when running tests more than once, or using a common
>   function to raise warnings from multiple call sites.
>
>These get in the way of Zope and ZODB testing too, BTW.
>Unfortunately, looks like the new test_struct code bumped into both of
>them at once.

The really annoying part of the new struct warnings is that the warning
line mentions a line in struct.py instead of the caller of struct.pack.
That makes it hard to find the source of the warning without telling the
warnings module to raise an exception for DeprecationWarnings.

Ronald

From guido at python.org  Wed May 31 16:10:22 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 31 May 2006 07:10:22 -0700
Subject: [Python-Dev] [Python-checkins] r46300 - in python/trunk:
	Lib/socket.py Lib/test/test_socket.py Lib/test/test_struct.py
	Modules/_struct.c Modules/arraymodule.c Modules/socketmodule.c
In-Reply-To: <8393fff0605310335o12b2b26ew43ba63cfd69c26e9@mail.gmail.com>
References: <20060526120329.9A9671E400C@bag.python.org>
	<ca471dc20605260956t35aafa71tcad19b926c0c4b41@mail.gmail.com>
	<CD67B679-50C4-4398-81BF-40F3ED6B106B@redivi.com>
	<ca471dc20605291259h561526cek9dad74288ae0976b@mail.gmail.com>
	<8393fff0605310335o12b2b26ew43ba63cfd69c26e9@mail.gmail.com>
Message-ID: <ca471dc20605310710s197e412bv9bd21ff62a829548@mail.gmail.com>

On 5/31/06, Martin Blais <blais at furius.ca> wrote:
> So I assume you're suggesting the following renames:
>
>   pack_to -> packinto
>   recv_buf -> recvinto
>   recvfrom_buf -> recvfrominto
>
> (I don't like that last one very much.
> I'll go ahead and make those renames once I return.)

You could add an underscore before _into.
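
For readers following along today: the underscore spelling is what
current Python uses (struct.pack_into, socket.recv_into,
socket.recvfrom_into).  A small sketch of the "write into a buffer you
pass" convention, assuming Python 3 and using only struct and an
in-memory file:

```python
import io
import struct

# pack_into writes into an existing writable buffer at an offset,
# the counterpart of unpack_from.
buf = bytearray(8)
struct.pack_into("<HH", buf, 2, 1, 2)
first, second = struct.unpack_from("<HH", buf, 2)

# readinto fills a caller-supplied buffer instead of returning a new
# object, and reports how many bytes it wrote.
source = io.BytesIO(b"abcd")
target = bytearray(4)
count = source.readinto(target)

print(bytes(buf), (first, second), count, bytes(target))
```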

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From g.brandl at gmx.net  Wed May 31 16:14:45 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 31 May 2006 16:14:45 +0200
Subject: [Python-Dev] A can of worms... (does Python C code have a new C
	style?)
In-Reply-To: <8393fff0605310402l4d45101agedbb47b3d8d2c12a@mail.gmail.com>
References: <8393fff0605310402l4d45101agedbb47b3d8d2c12a@mail.gmail.com>
Message-ID: <e5k8cm$gmu$1@sea.gmane.org>

Martin Blais wrote:
> Hi all
> 
> I'd like to know what the policy is on the source code indentation for
> C code in the interpreter.  At the Need-for-Speed sprints, there was
> consensus that there is a "new" indentation style for the Python C
> source files, with
> 
> * indentation (emacs: c-basic-offset): 4 chars
> * no tabs (so tab-width does not matter)
> * 79 chars max for lines (I think)
> 
> (Correct me if any of this is wrong.)  I cannot find where this
> discussion comes from, PEP 7 seems to imply that the new style only
> applies to Python 3k.  Is this correct?

The consensus at NFS was that it also applies to newly written C files.
I did update the PEP, but that doesn't seem to have propagated to the
web site yet.

> However, people were maintaining the existing styles when they were
> editing part of existing files (I set up emacs users with two c-styles,
> "python" and "python-new", so they could switch per-file), but using
> the new guidelines for new files.  I look at the source code, and
> there is a bit of a mix of the two styles in some places.

That's bad, but is the way the code was written and should not be
changed for the sake of changing.

> Is there a "grand plan" to convert all the code at once at some point?
> If not, how is it that these new guidelines are supposed to take
> effect?

AFAIK not before Python 3.0 since it would unnecessarily break,
for instance, svn blame and make merging more difficult.

[...]

> In addition, should this be applied, we should probably create a
> Subversion hook that will automatically convert tabs to spaces, or
> fail the commit if the files don't comply.

For the future (=Py3k), I think this would be nice.


Georg


From hannu at tm.ee  Wed May 31 16:17:18 2006
From: hannu at tm.ee (Hannu Krosing)
Date: Wed, 31 May 2006 17:17:18 +0300
Subject: [Python-Dev] Iterating generator from C (PostgreSQL's	pl/python
	RETUN SETOF/RECORD iterator support broken on RedHat buggy libs)
In-Reply-To: <9e804ac0605280518i7c1c543evf44a372c0f20568e@mail.gmail.com>
References: <1148144037.3833.89.camel@localhost.localdomain>
	<9e804ac0605280518i7c1c543evf44a372c0f20568e@mail.gmail.com>
Message-ID: <1149085038.4691.11.camel@localhost.localdomain>

On Sun, 2006-05-28 at 14:18, Thomas Wouters wrote:
> 
> On 5/20/06, Hannu Krosing <hannu at tm.ee> wrote:
>         I try to move this to -dev as I hope there are more people
>         reading it who
>         are competent in internal working :). So please reply to -dev
>         only.
> 
> I'm not sure if you have found the problem on another mailinglist
> then, but I saw no answers on python-dev. 
> 
> 
>         -------------
>         
>         The question is about use of generators in embedded v2.4 with
>         asserts 
>         enabled.
>         
>         Can somebody explain, why the code in try2.c works with
>         wrappers 2 and 3
>         but crashes on buggy exception for all others, that is pure
>         generator
>         and wrappers 1,4,5 ?
> 
> Your example code does not crash for me, not for any of the
> 'wrapper_source' variants, linking against Python 2.4 (debian's python
> 2.4.3 on amd64, both in 64-bit and 32-bit mode) or Python 2.5 (current
> trunk, same hardware.) I don't know what kind of crash you were
> expecting, but I do see unchecked return values, which can cause
> crashes for the simplest of reasons ;-) 

Fedora Core distributes its python libs with asserts on, and this code
triggers an assert, both without a wrapper and with most wrappers

[hannu at lap plpython]$ ./try2
one
try2: Objects/genobject.c:53: gen_iternext: Assertion `f->f_back !=
((void *)0)' failed.
Aborted

This is reported to be fixed for 2.5 (they changed the assert), but I'd
like to have a workaround which would not trigger the old buggy assert.

> -- 
> Thomas Wouters <thomas at python.org>
> 
> Hi! I'm a .signature virus! copy me into your .signature file to help
> me spread!

From tim.peters at gmail.com  Wed May 31 16:42:43 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 31 May 2006 10:42:43 -0400
Subject: [Python-Dev] Add new PyErr_WarnEx() to 2.5?
Message-ID: <1f7befae0605310742y7ac1f7b3i6e9c2e5cc20affbe@mail.gmail.com>

[Ronald Oussoren, hijacking the "test_struct failure on 64 bit platforms"
 thread]
> The really annoying part of the new struct warnings is that the
> warning line mentions a line in struct.py instead of the caller of
> struct.pack. That makes it hard to find the source of the
> warning without telling the warnings module to raise an
> exception for DeprecationWarnings.

The problem seems to be that Python's C API apparently gives no simple
way to supply a value for warnings.warn's optional `stacklevel`
argument.  The C-level signature is:

    int PyErr_Warn(PyObject *category, char *message);

and that's what _struct.c calls.  I think it would be good to add a new

    int PyErr_WarnEx(PyObject *category, char *message, long stacklevel);

C API function, change PyErr_Warn to call that forcing stacklevel to
1, and change _struct.c to call PyErr_WarnEx with stacklevel=2.  Then
it would point at struct.pack()'s caller.
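
The effect of `stacklevel` is easiest to see from the Python side.
This is a sketch only, assuming Python 3; pack_like() is a made-up
stand-in for a wrapper such as struct.pack(), not the real function.

```python
import warnings


def pack_like(stacklevel):
    # Stand-in for a library wrapper that issues the warning.
    warnings.warn("value out of range", DeprecationWarning, stacklevel=stacklevel)


def reported_line(stacklevel):
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        pack_like(stacklevel)  # this is the caller's line
    return caught[0].lineno


inside_wrapper = reported_line(1)  # line of warnings.warn() itself
at_call_site = reported_line(2)    # line of the pack_like() call above
print(inside_wrapper, at_call_site)
```

With stacklevel=1 the report points into the wrapper, which is the
complaint quoted above; with stacklevel=2 it points at the caller.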

From tim.peters at gmail.com  Wed May 31 17:31:35 2006
From: tim.peters at gmail.com (Tim Peters)
Date: Wed, 31 May 2006 11:31:35 -0400
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <21F670FB-82CD-401C-92DC-23391ECCCE59@redivi.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
	<0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
	<1f7befae0605301047h5a3afc25ibb412b8a6df73335@mail.gmail.com>
	<21F670FB-82CD-401C-92DC-23391ECCCE59@redivi.com>
Message-ID: <1f7befae0605310831s3a09ea82nfa2587e74c7012b4@mail.gmail.com>

I'm afraid a sabbatical year isn't long enough to understand what the
struct module did or intends to do by way of range checking <0.7
wink>.

Is this intended?  This is on a 32-bit Windows box with current trunk:

>>> from struct import pack as p
>>> p("I", 2**32 + 2343)
C:\Code\python\lib\struct.py:63: DeprecationWarning: 'I' format
requires 0 <= number <= 4294967295
  return o.pack(*args)
'\x00\x00\x00\x00'

The warning makes sense, but the result doesn't make sense to me.  In
Python 2.4.3, that example raised OverflowError, which seems better
than throwing away all the bits without an exception.

From niko at alum.mit.edu  Wed May 31 17:35:39 2006
From: niko at alum.mit.edu (Niko Matsakis)
Date: Wed, 31 May 2006 17:35:39 +0200
Subject: [Python-Dev] Python Benchmarks
Message-ID: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>

Hello,

After reading through recent Python mail regarding dictionaries and  
exceptions, I wondered, what is the current state of the art in  
Python benchmarks? I've tried before to find a definite set of Python  
benchmarks but failed.  There doesn't seem to be an up to date  
reference, though if there is one and I didn't find it I would be  
very happy to be proven wrong. In any case, I would appreciate advice  
from the experts regarding what's available and what it tests.


thank you,
Niko Matsakis

From bob at redivi.com  Wed May 31 17:52:54 2006
From: bob at redivi.com (Bob Ippolito)
Date: Wed, 31 May 2006 08:52:54 -0700
Subject: [Python-Dev] test_gzip/test_tarfile failure om AMD64
In-Reply-To: <1f7befae0605310831s3a09ea82nfa2587e74c7012b4@mail.gmail.com>
References: <9e804ac0605280431i6dc22956yee5387985a53997d@mail.gmail.com>
	<ca471dc20605291125k231f5ea8r5545fc63ad0a6c8b@mail.gmail.com>
	<1f7befae0605291227w109b30b6g293885e4e1316568@mail.gmail.com>
	<ca471dc20605291244g734d9bc6v7f6131f93dd2db0f@mail.gmail.com>
	<9ED42665-07AC-4819-B2B2-D899A56038B1@redivi.com>
	<7BD6DC02-C281-4C52-B9C7-6A05E11E8A17@redivi.com>
	<1f7befae0605292000k3ac8daa8n355fb03ffb74e00@mail.gmail.com>
	<0486A093-ECAF-49DB-962B-9CFF47D7AAAB@redivi.com>
	<1f7befae0605301047h5a3afc25ibb412b8a6df73335@mail.gmail.com>
	<21F670FB-82CD-401C-92DC-23391ECCCE59@redivi.com>
	<1f7befae0605310831s3a09ea82nfa2587e74c7012b4@mail.gmail.com>
Message-ID: <DEAC54D6-C2D0-48BF-8EE0-87C312CD1024@redivi.com>


On May 31, 2006, at 8:31 AM, Tim Peters wrote:

> I'm afraid a sabbatical year isn't long enough to understand what the
> struct module did or intends to do by way of range checking <0.7
> wink>.
>
> Is this intended?  This is on a 32-bit Windows box with current trunk:
>
>>>> from struct import pack as p
>>>> p("I", 2**32 + 2343)
> C:\Code\python\lib\struct.py:63: DeprecationWarning: 'I' format
> requires 0 <= number <= 4294967295
>  return o.pack(*args)
> '\x00\x00\x00\x00'
>
> The warning makes sense, but the result doesn't make sense to me.  In
> Python 2.4.3, that example raised OverflowError, which seems better
> than throwing away all the bits without an exception.

Throwing away all the bits is a bug, it's supposed to mask with  
0xffffffffL
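
The masking described here is plain arithmetic, sketched below under
the assumption of a current Python 3 interpreter.  Note that current
Python does not mask silently; an out-of-range value raises
struct.error, so the mask has to be applied by the caller.

```python
import struct

value = 2**32 + 2343

# Keep only the low 32 bits, as the deprecated path was meant to do.
masked = value & 0xFFFFFFFF

# Little-endian standard-size unsigned int, so the bytes are the same
# on every platform.
packed = struct.pack("<I", masked)

# The unmasked value does not fit and is rejected outright.
try:
    struct.pack("<I", value)
    rejected = False
except struct.error:
    rejected = True

print(masked, packed, rejected)
```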

-bob


From skip at pobox.com  Wed May 31 18:00:46 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 31 May 2006 11:00:46 -0500
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
Message-ID: <17533.48558.364179.717911@montanaro.dyndns.org>


(This is more appropriate for comp.lang.python/python-list at python.org.)

    Niko> After reading through recent Python mail regarding dictionaries
    Niko> and exceptions, I wondered, what is the current state of the art
    Niko> in Python benchmarks?

Pybench was recently added to the repository and will be in 2.5.  It works
with Python as far back as 1.5.2 though.  As a result of last week's
NeedForSpeed sprint some questions were raised about the efficacy of its
string/unicode tests; however, it seems to be the best available tool at the
moment.

Skip

From mal at egenix.com  Wed May 31 18:25:27 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 31 May 2006 18:25:27 +0200
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <17533.48558.364179.717911@montanaro.dyndns.org>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
	<17533.48558.364179.717911@montanaro.dyndns.org>
Message-ID: <447DC377.4000504@egenix.com>

skip at pobox.com wrote:
> (This is more appropriate for comp.lang.python/python-list at python.org.)
> 
>     Niko> After reading through recent Python mail regarding dictionaries
>     Niko> and exceptions, I wondered, what is the current state of the art
>     Niko> in Python benchmarks?
> 
> Pybench was recently added to the repository and will be in 2.5.  It works
> with Python as far back as 1.5.2 though.  As a result of last week's
> NeedForSpeed sprint some questions were raised about the efficacy of its
> string/unicode tests; however, it seems to be the best available tool at the
> moment.

Could you please forward such questions to me ?

Steve Holden has added lots of good features to pybench during
the sprint and I'm working on having it use more accurate
timers (see pybench/systimes.py).

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 31 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From skip at pobox.com  Wed May 31 18:43:04 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 31 May 2006 11:43:04 -0500
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <447DC377.4000504@egenix.com>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
	<17533.48558.364179.717911@montanaro.dyndns.org>
	<447DC377.4000504@egenix.com>
Message-ID: <17533.51096.462552.451772@montanaro.dyndns.org>


    MAL> Could you please forward such questions to me ?

I suppose, though what question were you referring to?  I was referring to
Fredrik's thread about stringbench vs pybench for string/unicode tests,
which I thought was posted to python-dev.  I assumed you were aware of the
issue.

Skip

From g.brandl at gmx.net  Wed May 31 19:22:33 2006
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 31 May 2006 19:22:33 +0200
Subject: [Python-Dev] Arguments and PyInt_AsLong
Message-ID: <e5kjcu$rnt$1@sea.gmane.org>

Looking at #1153226, I found this:

We introduced emitting a DeprecationWarning for PyArg_ParseTuple
integer arguments if a float was given. This doesn't affect functions
like file.seek which use PyInt_AsLong to parse their argument.
PyInt_AsLong calls the nb_int slot which silently converts floats
to ints.

Is that acceptable? If not, should PyInt_AsLong stop accepting other
numbers, or should the functions be changed?

Georg


From mal at egenix.com  Wed May 31 19:24:35 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 31 May 2006 19:24:35 +0200
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <17533.51096.462552.451772@montanaro.dyndns.org>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
	<17533.48558.364179.717911@montanaro.dyndns.org>
	<447DC377.4000504@egenix.com>
	<17533.51096.462552.451772@montanaro.dyndns.org>
Message-ID: <447DD153.8080202@egenix.com>

skip at pobox.com wrote:
>     MAL> Could you please forward such questions to me ?
> 
> I suppose, though what question were you referring to? 

Not sure - I thought you knew ;-)

> I was referring to
> Fredrik's thread about stringbench vs pybench for string/unicode tests,
> which I thought was posted to python-dev.  I assumed you were aware of the
> issue.

I'm aware of that thread, but Fredrik only posted some vague
comment to the checkins list, saying that they couldn't use
pybench. I asked for some more details, but he didn't get
back to me.

AFAIK, there were no real issues with pybench, only with the
fact that time.clock() (the timer used by pybench) is wall-time
on Windows and thus an MP3-player running in the background
will cause some serious noise in the measurements ;-)
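
The wall-time versus CPU-time distinction can be sketched in present-day
Python, where time.clock() no longer exists (it was removed in 3.8). This
is only an illustration under stated assumptions: time.perf_counter()
stands in for a wall-clock timer and time.process_time() for a CPU-time
timer, and the helper names measure, sleepy and busy are made up here;
none of this is pybench code.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent in one call of func."""
    wall_start = time.perf_counter()   # wall-clock time, includes waiting
    cpu_start = time.process_time()    # CPU time of this process only
    func()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return wall_used, cpu_used


def sleepy():
    # Hypothetical workload that waits without using the CPU.
    time.sleep(0.2)


def busy():
    # Hypothetical workload that keeps the CPU occupied.
    sum(i * i for i in range(200000))


if __name__ == "__main__":
    for workload in (sleepy, busy):
        wall, cpu = measure(workload)
        print("%-6s wall=%.3fs cpu=%.3fs" % (workload.__name__, wall, cpu))
```

The sleeping workload accumulates wall time but almost no CPU time, which
is why a benchmark driven by a wall-clock timer also measures whatever
else the machine happens to be doing.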

-- 
Marc-Andre Lemburg
eGenix.com


From skip at pobox.com  Wed May 31 19:31:33 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 31 May 2006 12:31:33 -0500
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <447DD153.8080202@egenix.com>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
	<17533.48558.364179.717911@montanaro.dyndns.org>
	<447DC377.4000504@egenix.com>
	<17533.51096.462552.451772@montanaro.dyndns.org>
	<447DD153.8080202@egenix.com>
Message-ID: <17533.54005.477327.673031@montanaro.dyndns.org>


    MAL> I'm aware of that thread, but Fredrik only posted some vague
    MAL> comment to the checkins list, saying that they couldn't use
    MAL> pybench. I asked for some more details, but he didn't get back to
    MAL> me.

I'm pretty sure I saw him (or maybe Andrew Dalke) post some timing
comparisons between pybench and stringbench.  Something about a change not
impacting performance showing a 60% slowdown on pybench but no change using
stringbench.  Maybe Fredrik had his iTunes volume cranked up too high... ;-)

Skip

From fredrik at pythonware.com  Wed May 31 19:47:20 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 31 May 2006 19:47:20 +0200
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <447DD153.8080202@egenix.com>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>	<17533.48558.364179.717911@montanaro.dyndns.org>	<447DC377.4000504@egenix.com>	<17533.51096.462552.451772@montanaro.dyndns.org>
	<447DD153.8080202@egenix.com>
Message-ID: <e5kkr6$14a$1@sea.gmane.org>

M.-A. Lemburg wrote:

> AFAIK, there were no real issues with pybench, only with the
> fact that time.clock() (the timer used by pybench) is wall-time
> on Windows and thus an MP3-player running in the background
> will cause some serious noise in the measurements 

oh, please; as I mentioned back then, PyBench reported massive slowdowns 
and huge speedups in code that wasn't touched, gave unrepeatable results 
on most platforms, and caused us to waste quite some time investigating 
potential regressions from 2.4 that simply didn't exist.

of about a dozen claimed slowdowns when comparing 2.4 to 2.5a2 on 
several platforms, only *one* slowdown could be independently confirmed 
with other tools.

and when we fixed that, and ended up with an implementation that was 
*faster* than in 2.4, PyBench didn't even notice the speedup.

the fact is that the results for individual tests in PyBench are 100% 
unreliable.  I have no idea why.

the accumulated result may be somewhat useful (at least after the "use 
minimum time instead of average" changes), but I wouldn't use it for any 
serious purpose.  at least PyStone is unusable in a well-defined way ;-)

</F>


From mal at egenix.com  Wed May 31 20:01:14 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 31 May 2006 20:01:14 +0200
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <17533.54005.477327.673031@montanaro.dyndns.org>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>
	<17533.48558.364179.717911@montanaro.dyndns.org>
	<447DC377.4000504@egenix.com>
	<17533.51096.462552.451772@montanaro.dyndns.org>
	<447DD153.8080202@egenix.com>
	<17533.54005.477327.673031@montanaro.dyndns.org>
Message-ID: <447DD9EA.2000608@egenix.com>

skip at pobox.com wrote:
>     MAL> I'm aware of that thread, but Fredrik only posted some vague
>     MAL> comment to the checkins list, saying that they couldn't use
>     MAL> pybench. I asked for some more details, but he didn't get back to
>     MAL> me.
> 
> I'm pretty sure I saw him (or maybe Andrew Dalke) post some timing
> comparisons between pybench and stringbench.  Something about a change not
> impacting performance showing a 60% slowdown on pybench but no change using
> stringbench.  Maybe Fredrik had his iTunes volume cranked up too high... ;-)

This is possible, since pybench uses a long-run strategy whereas
stringbench uses the short-run method of timeit.py.
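
For readers unfamiliar with the short-run method, here is a minimal
sketch using the standard timeit module: run a small statement many
times, repeat the whole measurement a few times, and keep the minimum.
The helper name best_time and the sample statement are assumptions made
for illustration; stringbench's actual code is not shown in this thread.

```python
import timeit


def best_time(stmt, setup="pass", repeat=5, number=10000):
    """Return the best per-loop time in seconds over `repeat` short runs."""
    # Each entry is the total time for `number` executions of stmt.
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings) / number


if __name__ == "__main__":
    per_loop = best_time("'-'.join(words)", setup="words = ['a'] * 10")
    print("best of 5: %.3g seconds per loop" % per_loop)
```

Taking the minimum rather than the average discards runs that were slowed
down by unrelated system activity.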

-- 
Marc-Andre Lemburg
eGenix.com


From guido at python.org  Wed May 31 20:02:06 2006
From: guido at python.org (Guido van Rossum)
Date: Wed, 31 May 2006 11:02:06 -0700
Subject: [Python-Dev] Arguments and PyInt_AsLong
In-Reply-To: <e5kjcu$rnt$1@sea.gmane.org>
References: <e5kjcu$rnt$1@sea.gmane.org>
Message-ID: <ca471dc20605311102l2518b234t5f2c86d8935fbacf@mail.gmail.com>

file.seek etc. should be changed to use PyNumber_AsIndex or whatever
it's called.
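
The difference between the two conversions is visible from Python itself.
A minimal sketch, assuming a present-day interpreter: int() performs the
nb_int-style conversion and silently truncates a float, while
operator.index() goes through the __index__ protocol and rejects it. The
helper as_offset is hypothetical and only stands in for an argument
parser such as the one behind file.seek.

```python
import operator


def as_offset(value):
    """Accept only true integers (objects with __index__); reject floats."""
    return operator.index(value)


if __name__ == "__main__":
    print(int(3.7))        # nb_int-style conversion: truncates without warning
    print(as_offset(3))    # a real integer passes through unchanged
    try:
        as_offset(3.7)
    except TypeError as exc:
        print("rejected:", exc)
```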

On 5/31/06, Georg Brandl <g.brandl at gmx.net> wrote:
> Looking at #1153226, I found this:
>
> We introduced emitting a DeprecationWarning for PyArg_ParseTuple
> integer arguments if a float was given. This doesn't affect functions
> like file.seek which use PyInt_AsLong to parse their argument.
> PyInt_AsLong calls the nb_int slot which silently converts floats
> to ints.
>
> Is that acceptable? If not, should PyInt_AsLong stop accepting other
> numbers, or should the functions be changed?
>
> Georg
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mal at egenix.com  Wed May 31 20:28:37 2006
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 31 May 2006 20:28:37 +0200
Subject: [Python-Dev] Python Benchmarks
In-Reply-To: <e5kkr6$14a$1@sea.gmane.org>
References: <AF774AE0-6E02-4ED9-A23E-5315B93B89BB@alum.mit.edu>	<17533.48558.364179.717911@montanaro.dyndns.org>	<447DC377.4000504@egenix.com>	<17533.51096.462552.451772@montanaro.dyndns.org>	<447DD153.8080202@egenix.com>
	<e5kkr6$14a$1@sea.gmane.org>
Message-ID: <447DE055.4040105@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
>> AFAIK, there were no real issues with pybench, only with the
>> fact that time.clock() (the timer used by pybench) is wall-time
>> on Windows and thus an MP3-player running in the background
>> will cause some serious noise in the measurements 
> 
> oh, please; as I mentioned back then, PyBench reported massive slowdowns 
> and huge speedups in code that wasn't touched, gave unrepeatable results 
> on most platforms, and caused us to waste quite some time investigating 
> potential regressions from 2.4 that simply didn't exist.

It would be nice if you could post examples of these results
and also the details of the platforms on which you tested.

> of about a dozen claimed slowdowns when comparing 2.4 to 2.5a2 on 
> several platforms, only *one* slowdown could be independently confirmed 
> with other tools.
> 
> and when we fixed that, and ended up with an implementation that was 
> *faster* than in 2.4, PyBench didn't even notice the speedup.

Which one was that ?

With which parameters did you run pybench ?

> the fact is that the results for individual tests in PyBench are 100% 
> unreliable.  I have no idea why.

If they are 100% unreliable, then perhaps we should simply
reverse the claims pybench makes ;-) (this would be like a
100% wrong weather report).

Seriously, I've been using and running pybench for years
and even though tweaks to the interpreter do sometimes
result in speedups or slow-downs where you wouldn't expect
them (due to the interpreter using the Python objects),
they are reproducible and often enough have uncovered
that optimizations in one area may well result in slow-downs
in other areas.

Often enough the results are related to low-level features
of the architecture you're using to run the code such as
cache size, cache lines, number of registers in the CPU or
on the FPU stack, etc. etc.

E.g. some years ago, I played around with the ceval loop,
ordered the switch statements in different ways, broke the
switch in two parts, etc. The results were impressive - but
mostly due to the switch statement being optimized for the
platform I was running on (AMD at the time). Moving a switch
option sometimes had a huge effect on the timings, most of
which was due to the code being more suitably aligned for
the CPU cache.

> the accumulated result may be somewhat useful (at least after the "use 
> minimum time instead of average" changes), but I wouldn't use it for any 
> serious purpose.  at least PyStone is unusable in a well-defined way ;-)

I think that most of your experience is related to the way
timing is done on Windows.

Like I said: I'm working on better timers for use in pybench.
Hopefully, this will make your experience a better one next
time you try.

-- 
Marc-Andre Lemburg
eGenix.com


From skip at pobox.com  Wed May 31 20:33:51 2006
From: skip at pobox.com (skip at pobox.com)
Date: Wed, 31 May 2006 13:33:51 -0500
Subject: [Python-Dev] Arguments and PyInt_AsLong
In-Reply-To: <ca471dc20605311102l2518b234t5f2c86d8935fbacf@mail.gmail.com>
References: <e5kjcu$rnt$1@sea.gmane.org>
	<ca471dc20605311102l2518b234t5f2c86d8935fbacf@mail.gmail.com>
Message-ID: <17533.57743.363625.820269@montanaro.dyndns.org>


    Guido> ... PyNumber_AsIndex or whatever it's called.

Maybe the API is getting a little fat if it doesn't fit comfortably in the
BDFL's brain...  Does that suggest it might need some streamlining for Py3k?

Skip

From martin at v.loewis.de  Wed May 31 22:38:20 2006
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Wed, 31 May 2006 22:38:20 +0200
Subject: [Python-Dev] Segmentation fault of Python if build on Solaris 9
	or10 with Sun Studio 11
In-Reply-To: <200605301953.38638.Andreas.Floeter@web.de>
References: <200605301953.38638.Andreas.Floeter@web.de>
Message-ID: <1149107900.447dfebc603d7@www.domainfactory-webmail.de>

Quoting Andreas Flöter <Andreas.Floeter at web.de>:

> Help would be appreciated

This strictly doesn't belong to python-dev: this is the list where
you say "I want to help", not so much "I need your help".

If you want to resolve this yourself, we can guide you through
that. I would start by running the binary in a debugger to find
out where it crashes. Maybe the bug in Python is easy to see
from that. But then, maybe the bug is in the compiler, and not
in Python...

Regards,
Martin



From facundobatista at gmail.com  Wed May 31 22:53:01 2006
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 31 May 2006 17:53:01 -0300
Subject: [Python-Dev] Segmentation fault of Python if build on Solaris 9
	or10 with Sun Studio 11
In-Reply-To: <1149107900.447dfebc603d7@www.domainfactory-webmail.de>
References: <200605301953.38638.Andreas.Floeter@web.de>
	<1149107900.447dfebc603d7@www.domainfactory-webmail.de>
Message-ID: <e04bdf310605311353x3a04243ay732f57ede9cd1766@mail.gmail.com>

2006/5/31, martin at v.loewis.de <martin at v.loewis.de>:

> This strictly doesn't belong to python-dev: this is the list where
> you say "I want to help", not so much "I need your help".

QOTW!

I love it!

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/